Apply, sapply, mapply, lapply, vapply, rapply, tapply, replicate, aggregate, by and related functions in R. When and how to use them?

Question:

What is the difference between apply, sapply, mapply, lapply, vapply, rapply, tapply, replicate, aggregate, by and related functions in R?

When and how to use each one?

Are there other packages that do something similar or can override these functions?

Answer:

Translated from here.

R has many *apply functions, which are well explained in the help (e.g. ?apply). Because there are so many, novice users may have difficulty deciding which one fits their situation, or even remembering them all.

  • apply: When you want to apply a function to the rows or columns of a matrix (or to the margins of a higher-dimensional array).

     # Two-dimensional matrix
     M <- matrix(seq(1, 16), 4, 4)

     # Apply min to the rows
     apply(M, 1, min)
     [1] 1 2 3 4

     # Apply max to the columns
     apply(M, 2, max)
     [1]  4  8 12 16

     # Three-dimensional array
     M <- array(seq(32), dim = c(4, 4, 2))

     # Apply sum to each M[*, , ], i.e. sum across the 2nd and 3rd dimensions
     apply(M, 1, sum)
     # The result is one-dimensional
     [1] 120 128 136 144

     # Apply sum to each M[*, *, ], i.e. sum across the 3rd dimension
     apply(M, c(1, 2), sum)
     # The result is two-dimensional
          [,1] [,2] [,3] [,4]
     [1,]   18   26   34   42
     [2,]   20   28   36   44
     [3,]   22   30   38   46
     [4,]   24   32   40   48
  • lapply: When you want to apply a function to each element of a list and get a list back.

    This is the workhorse behind many of the other *apply functions.

     x <- list(a = 1, b = 1:3, c = 10:100)

     lapply(x, FUN = length)
     $a
     [1] 1

     $b
     [1] 3

     $c
     [1] 91

     lapply(x, FUN = sum)
     $a
     [1] 1

     $b
     [1] 6

     $c
     [1] 5005
  • sapply: When you want to apply a function to each element of a list, but want a vector back instead of a list.

    Instead of using unlist(lapply(...)), consider using sapply.

     x <- list(a = 1, b = 1:3, c = 10:100)

     # Compare with the above: a named vector, not a list
     sapply(x, FUN = length)
      a  b  c
      1  3 91

     sapply(x, FUN = sum)
        a    b    c
        1    6 5005

    In more advanced uses, sapply will try to simplify the result to a multi-dimensional array where appropriate. For example, if our function returns vectors of the same length, sapply will use them as the columns of a matrix:

     sapply(1:5,function(x) rnorm(3,x))

    If our function returns a 2-dimensional array, sapply will essentially do the same thing, treating each array as a single vector:

     sapply(1:5,function(x) matrix(x,2,2))

    Unless we specify simplify = "array", in which case it will use the individual arrays to build a multi-dimensional array:

     sapply(1:5,function(x) matrix(x,2,2), simplify = "array")
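To make the three behaviours described above easier to compare, the following sketch stores each result and inspects its dimensions. The variable names (cols, flat, arr) and the set.seed call are illustrative additions, not part of the original answer.

```r
set.seed(1)  # only so the random draws are reproducible

# Equal-length vectors become the columns of a matrix
cols <- sapply(1:5, function(x) rnorm(3, x))

# Each 2 x 2 matrix is flattened into one column of length 4
flat <- sapply(1:5, function(x) matrix(x, 2, 2))

# With simplify = "array" the 2 x 2 matrices are stacked along a third dimension
arr <- sapply(1:5, function(x) matrix(x, 2, 2), simplify = "array")

dim(cols)  # 3 5
dim(flat)  # 4 5
dim(arr)   # 2 2 5
```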
  • vapply: For when you want to use sapply but perhaps need faster code.

    With vapply, you give R an example of the kind of value the function will return, which can improve performance.

     x <- list(a = 1, b = 1:3, c = 10:100)

     # Note that since the advantage here is mainly speed, this
     # example is only for illustration. We are telling R that
     # everything returned by length() should be an integer of
     # length 1.
     vapply(x, FUN = length, FUN.VALUE = 0L)
      a  b  c
      1  3 91
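Besides speed, declaring the return value also acts as a check. The sketch below is an illustration rather than part of the original answer: it runs the same function through sapply and vapply when the results do not all have length 1. The helper function and the message string are made up for the example.

```r
x <- list(a = 1, b = 1:3, c = 10:100)

# The results have different lengths, so sapply() cannot simplify them
# and quietly returns a list instead of a vector.
s <- sapply(x, function(v) v * 2)
is.list(s)

# vapply() was told to expect one numeric value per element, so it
# stops with an error as soon as a result does not match.
v <- tryCatch(
  vapply(x, function(v) v * 2, FUN.VALUE = numeric(1)),
  error = function(e) "vapply stopped with an error"
)
v
```

In scripts and packages this stricter behaviour is often the main reason to prefer vapply, because the shape of the result does not depend on the input.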
  • mapply: For when you have several data structures (e.g. vectors, lists) and you want to apply a function to the first elements of each, then the second elements, and so on, coercing the result to a vector or array as in sapply.

    In this case your function must accept multiple arguments.

     # Sum the 1st elements, the 2nd elements, etc.
     mapply(sum, 1:5, 1:5, 1:5)
     [1]  3  6  9 12 15

     # To do rep(1, 4), rep(2, 3), etc.
     mapply(rep, 1:4, 4:1)
     [[1]]
     [1] 1 1 1 1

     [[2]]
     [1] 2 2 2

     [[3]]
     [1] 3 3

     [[4]]
     [1] 4
  • rapply: For when you want to apply a function to each element of a nested list, recursively.

     # Append "!" to strings, otherwise increment
     myFun <- function(x) {
       if (is.character(x)) {
         return(paste(x, "!", sep = ""))
       } else {
         return(x + 1)
       }
     }

     # A nested list structure
     l <- list(a = list(a1 = "Boo", b1 = 2, c1 = "Eeek"),
               b = 3, c = "Yikes",
               d = list(a2 = 1, b2 = list(a3 = "Hey", b3 = 5)))

     # The result is a named vector, coerced to character
     rapply(l, myFun)

     # The result is a list like l, but with the values altered
     rapply(l, myFun, how = "replace")
  • tapply: For when you want to apply a function to subsets of a vector, where the subsets are defined by another vector (usually a factor).

    A vector:

     x <- 1:20

    The factor (of the same size!) defining the groups:

     y <- factor(rep(letters[1:5], each = 4))

    Add the values in x within each subgroup defined by y:

     tapply(x, y, sum)
      a  b  c  d  e
     10 26 42 58 74
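  • aggregate and by: It is relatively easy to aggregate data in R using one or more grouping (by) variables and a summary function.

     # Mean of every mtcars column for each combination of cyl and vs.
     # Columns are referenced with mtcars$ so attach()/detach() is not needed.
     aggdata <- aggregate(mtcars, by = list(mtcars$cyl, mtcars$vs),
                          FUN = mean, na.rm = TRUE)
     print(aggdata)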
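
The item above names by but only shows aggregate. As a minimal sketch on the same built-in mtcars data, by splits a data frame on a grouping variable and passes each piece, as a data frame, to a function. The anonymous function and the name mpg_by_cyl are illustrative choices, not taken from the original answer.

```r
# Mean mpg within each cyl group; each d is the subset of mtcars rows
# that share one value of cyl.
mpg_by_cyl <- by(mtcars, mtcars$cyl, function(d) mean(d$mpg))
print(mpg_by_cyl)

# For a single column, tapply gives the same numbers.
tapply(mtcars$mpg, mtcars$cyl, mean)
```

Because the function receives whole data frames, by is most useful when the summary needs several columns at once, for example fitting a model per group.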
