Gradient function return when using Optim and colSums functions

Question:

I can't understand why when using the gradlik function as an argument to the Optim function I get the following error:

Error in optim(beta, loglik, gradlik, method = "BFGS", hessian = T, control = list(fnscale = -1)) : 
gradiente em optim retorna um objeto de comprimento 9000 ao invés de 9

However, when I call the gradlik(beta) function it returns the gradient vector as expected!

Does anyone have any suggestions for fixing this code?

loglik <- function(beta) {
  NXS <- dim(model.matrix(~XS))[2]#Número de colunas de XS+1
  NXO <- dim(model.matrix(~XO))[2]#Número de colunas de XO+1
  ## parameter indices
  ibetaS <- 1:NXS
  ibetaO <- seq(tail(ibetaS, 1)+1, length=NXO)
  isigma <- tail(ibetaO, 1) + 1
  irho <- tail(isigma, 1) + 1
  g <- beta[ibetaS]
  b <- beta[ibetaO]
  sigma <- beta[isigma]
  if(sigma < 0) return(NA)
  rho <- beta[irho]
  if( ( rho < -1) || ( rho > 1)) return(NA)

  XS.g <- model.matrix(~XS) %*% g
  XO.b <- model.matrix(~XO) %*% b
  u2 <- YO - XO.b
  r <- sqrt( 1 - rho^2)
  B <- (XS.g + rho/sigma*u2)/r
  ll <- ifelse(YS == 0,
               (pnorm(-XS.g, log.p=TRUE)),
               dnorm(u2/sigma, log = TRUE) - log(sigma) +
                 (pnorm(B, log.p=TRUE))
  )
  sum(ll)
}

gradlik <- function(beta) {
  NXS <- dim(model.matrix(~XS))[2]#Número de colunas de XS+1
  NXO <- dim(model.matrix(~XO))[2]#Número de colunas de XO+1
  nObs <- length(YS)
  NO <- length(YS[YS > 0])
  nParam <- NXS + NXO + 2 #Total of parameters

  XS0 <- XS[YS==0,,drop=FALSE]
  XS1 <- XS[YS==1,,drop=FALSE]
  YO[is.na(YO)] <- 0
  YO1 <- YO[YS==1]
  XO1 <- XO[YS==1,,drop=FALSE]
  N0 <- sum(YS==0)
  N1 <- sum(YS==1)

  w  <- rep(1,N0+N1 )
  w0 <- rep(1,N0)
  w1 <- rep(1,N1)
  NXS <- dim(model.matrix(~XS))[2]#Número de colunas de XS+1
  NXO <- dim(model.matrix(~XO))[2]#Número de colunas de XO+1
  ## parameter indices
  ibetaS <- 1:NXS
  ibetaO <- seq(tail(ibetaS, 1)+1, length=NXO)
  isigma <- tail(ibetaO, 1) + 1
  irho <- tail(isigma, 1) + 1

  g <- beta[ibetaS]
  b <- beta[ibetaO]
  sigma <- beta[isigma]
  if(sigma < 0) return(matrix(NA, nObs, nParam))
  rho <- beta[irho]
  if( ( rho < -1) || ( rho > 1)) return(matrix(NA, nObs, nParam))
  XS0.g <- as.numeric(model.matrix(~XS0) %*% g)
  XS1.g <- as.numeric(model.matrix(~XS1) %*% g)
  XO1.b <- as.numeric(model.matrix(~XO1) %*% b)
  #      u2 <- YO1 - XO1.b
  u2 <- YO1 - XO1.b
  r <- sqrt( 1 - rho^2)
  #      B <- (XS1.g + rho/sigma*u2)/r
  B <- (XS1.g + rho/sigma*u2)/r
  lambdaB <- exp( dnorm( B, log = TRUE ) - pnorm( B, log.p = TRUE ) )
  gradient <- matrix(0, nObs, nParam)
  gradient[YS == 0, ibetaS] <- - w0 * model.matrix(~XS0) *
    exp( dnorm( -XS0.g, log = TRUE ) - pnorm( -XS0.g, log.p = TRUE ) )
  gradient[YS == 1, ibetaS] <- w1 * model.matrix(~XS1) * lambdaB/r
  gradient[YS == 1, ibetaO] <- w1 * model.matrix(~XO1) * (u2/sigma^2 - lambdaB*rho/sigma/r)
  gradient[YS == 1, isigma] <- w1 * ( (u2^2/sigma^3 - lambdaB*rho*u2/sigma^2/r) - 1/sigma )
  gradient[YS == 1, irho] <- w1 * (lambdaB*(u2/sigma + rho*XS1.g))/r^3
  return(colSums(gradient))
}

n=1000
X1 <- runif(n)
X2 <- runif(n)
XO <- cbind(X1,X2)
X3 <- runif(n)
XS <- cbind(X1,X2,X3)
YS <- sample(c(0,1),n,replace = TRUE)
YO <- sample(100:400,n,replace = TRUE)*YS
beta <- c(1,1,1,1,1,1,1,1,0.5)
#Note que a função abaixo compila normalmente:
gradlik(beta)
#Porém a função Optim não compila:
theta <-optim(beta,loglik, gradlik, method = "BFGS",hessian = T,control=list(fnscale=-1)) 
theta$par

Answer:

The gradient function needs to return a vector with the same size as the number of parameters.

In his role, there are two occasions when this does not happen.

When sigma <0 you return the following value in your function:

if(sigma < 0) return(matrix(NA, nObs, nParam))

This is going to be a 9000 x 9 matrix that the error optim() gives you. You have to change this piece of code to something that returns only 9 elements and not an array.

Likewise, when ( rho < -1) || ( rho > 1) your function returns:

if( ( rho < -1) || ( rho > 1)) return(matrix(NA, nObs, nParam))

What is a 9000 x 9 matrix in your example causing the error.

Scroll to Top
AllEscort