Case-insensitive word search in R

Question:

I have a vector of words:

# text
text <- c("R is a very essential tool for data analysis. While it 
          is regarded as domain specific, it is a very complete programming 
          language. Almost certainly, many people who would benefit from
          using R, do not use it")
# split the text into a vector of words using the stringr package
text <- unlist(stringr::str_match_all(text, '\\w+\\b'))



 text
 [1] "R"           "is"          "a"           "very"        "essential"   "tool"        "for"        
 [8] "data"        "analysis"    "While"       "it"          "is"          "regarded"    "as"         
[15] "domain"      "specific"    "it"          "is"          "a"           "very"        "complete"   
[22] "programming" "language"    "Almost"      "certainly"   "many"        "people"      "who"        
[29] "would"       "benefit"     "from"        "using"       "R"           "do"          "not"        
[36] "use"         "it"    

I want to find the word "using" in it:

text[text=="using"]
[1] "using"

This works and the word is found. But if the case differs slightly:

text[text=="Using"]
character(0)

the word is no longer found.

How can I make the word search case-insensitive?

Answer:

You can use the grep function:

grep("Using",text,ignore.case=TRUE,value=TRUE)

ignore.case=TRUE – match without regard to letter case

value=TRUE – return the matching elements themselves rather than their positions in the vector
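
If only exact whole-word matches are wanted, one possible alternative is to lowercase both sides before comparing with ==. The sketch below is a minimal illustration, not part of the original answer. It rebuilds the word vector with base R, and the names sentence, words, query, hits and positions are chosen here for illustration only.

```r
# Rebuild the word vector with base R only (no stringr needed for this sketch)
sentence <- "R is a very essential tool for data analysis. While it is regarded as domain specific, it is a very complete programming language. Almost certainly, many people who would benefit from using R, do not use it"
words <- unlist(regmatches(sentence, gregexpr("[[:alpha:]]+", sentence)))

# Lowercase both sides so the comparison ignores case
# but still requires the whole word to be equal
query <- "Using"
is_hit <- tolower(words) == tolower(query)

hits <- words[is_hit]        # the matching words themselves
positions <- which(is_hit)   # their positions in the vector

print(hits)
print(positions)
```

Because == compares whole elements, a query such as "it" would not pick up "benefit" with this approach.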

Update:

Note that grep matches the pattern anywhere inside each element, not only against whole words. Searching for "it" therefore also returns "benefit", which is not what we want:

grep("it",text,ignore.case=TRUE,value=TRUE,useBytes=TRUE)

[1] "it" "it" "benefit" "it"

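In this case, the regexpr function works better. The pattern "IT$" has to match at the end of the word, and keeping only the matches that start at position 1 leaves the words that are exactly "it":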

match<-regexpr("IT$",text,ignore.case=TRUE)
text[match==1]

[1] "it" "it" "it"
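
The same whole-word result could also be obtained with grep by anchoring the pattern at both ends. This is a minimal sketch, assuming the search word contains only letters, so no regex escaping is needed. find_word and the small words vector are hypothetical names introduced for illustration.

```r
# Small illustrative vector (not the vector from the question)
words <- c("it", "It", "benefit", "IT", "item", "unit")

# Hypothetical helper: whole-word, case-insensitive lookup.
# Assumes `word` contains only letters, so it is safe to paste into a regex.
find_word <- function(word, x) {
  pattern <- paste0("^", word, "$")  # ^ and $ force the whole element to match
  grep(pattern, x, ignore.case = TRUE, value = TRUE)
}

result <- find_word("it", words)
print(result)
```

With both anchors in place, "benefit", "item" and "unit" are rejected, and there is no need to inspect match positions afterwards.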
