Question:
I have a vector of words:
# text
text <- c("R is a very essential tool for data analysis. While it
is regarded as domain specific, it is a very complete programming
language. Almost certainly, many people who would benefit from
using R, do not use it")
# split the text into a vector of words using the stringr package
text <- unlist( stringr::str_match_all(text , '\\w+\\b') )
text
[1] "R" "is" "a" "very" "essential" "tool" "for"
[8] "data" "analysis" "While" "it" "is" "regarded" "as"
[15] "domain" "specific" "it" "is" "a" "very" "complete"
[22] "programming" "language" "Almost" "certainly" "many" "people" "who"
[29] "would" "benefit" "from" "using" "R" "do" "not"
[36] "use" "it"
I want to find the word "using" in it:
text[text=="using"]
[1] "using"
That works and the word is found, but if I change the case slightly:
text[text=="Using"]
character(0)
the word is no longer found.
How can I make the word search case-insensitive?
Answer:
The grep function can be used:
grep("Using",text,ignore.case=TRUE,value=TRUE)
ignore.case=TRUE: ignore case when matching
value=TRUE: return the matching elements of the vector rather than their positions
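If what you need is an exact, case-insensitive comparison (the direct counterpart of text=="using"), one possible alternative is to lower-case both sides with tolower() before comparing. This is only a minimal sketch: the shortened words vector below is made up for illustration and stands in for the vector from the question.

```r
# Illustrative vector (an assumption; a shortened stand-in for `text`)
words <- c("R", "is", "a", "very", "essential", "tool",
           "benefit", "from", "using", "R", "it")

# Lower-case both sides, then compare whole elements with ==
hits <- words[tolower(words) == tolower("Using")]
hits

# which() gives the positions instead of the values
r_positions <- which(tolower(words) == tolower("r"))
r_positions
```

Because == compares whole elements, this approach cannot pick up words that merely contain the search term.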
UPD
grep matches the pattern anywhere inside each element, so searching for "it" also returns words that merely contain it:
grep("it",text,ignore.case=TRUE,value=TRUE,useBytes=TRUE)
[1] "it" "it" "benefit" "it"
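In this case the regexpr function works better. It returns the position where the match starts in each element; because the pattern is anchored to the end of the element with $, a start position of 1 means the whole word matched: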
pos <- regexpr("IT$",text,ignore.case=TRUE)
text[pos==1]
[1] "it" "it" "it"
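The same whole-word match can probably be written more directly by anchoring the pattern on both sides with ^ and $ and using grepl(), which returns one TRUE/FALSE per element. A small sketch under some assumptions: the words vector is invented for illustration, and find_word is a hypothetical helper (not part of base R) that assumes the search word contains no regex metacharacters.

```r
# Illustrative vector (an assumption, not the data from the question)
words <- c("it", "is", "benefit", "it", "Item", "IT")

# ^ and $ tie the pattern to the start and end of each element,
# so "benefit" and "Item" are not matched
whole <- words[grepl("^it$", words, ignore.case = TRUE)]
whole

# Hypothetical helper: case-insensitive whole-word lookup.
# Assumes `word` has no regex metacharacters such as . or *
find_word <- function(word, x) {
  x[grepl(paste0("^", word, "$"), x, ignore.case = TRUE)]
}
found <- find_word("IT", words)
found
```

Unlike the pos==1 check, this form does not depend on where the match starts, so it reads the same for any search word.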