grad {numDeriv}                                                R Documentation

Numerical Gradient of a Function

Description

Calculate the gradient of a function by numerical approximation.

Usage

     grad(func, x, method="Richardson", method.args=list(), ...)

     ## Default S3 method:
     grad(func, x, method="Richardson",
          method.args=list(), ...)


Arguments

func

a function with a scalar real result (see details).

x

a real scalar or vector argument to func, indicating the point(s) at which the gradient is to be calculated.

method

one of "Richardson", "simple", or "complex" indicating the method to use for the approximation.

method.args

arguments passed to the method. Arguments not specified remain at their default values, as described in the Details section.

...

additional arguments passed to func. WARNING: None of these should have names matching other arguments of this function.

Details

The function grad calculates a numerical approximation of the first derivative of func at the point x. Any additional arguments in ... are also passed to func, but the gradient is not calculated with respect to these additional arguments. It is assumed that func is a scalar-valued function. If a vector x produces a scalar result, then grad returns the numerical approximation of the gradient at the point x (which has the same length as x). If a vector x produces a vector result, then the result must have the same length as x, and it is assumed that this corresponds to applying the function elementwise (for example, sin(x)). In this case grad returns the gradient at each of the points in x (which also has the same length as x, so be careful). An alternative for vector-valued functions is provided by jacobian.
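
For instance, the two cases, and the pass-through of additional arguments, can be compared directly (illustrative calls only, using the documented interface):

     grad(function(x) sum(x^2), c(1, 2, 3))  # scalar-valued: gradient 2*x
     grad(sin, c(0, pi/2, pi))               # elementwise: cos at each point
     f <- function(x, a) sum(a * sin(x))
     grad(f, c(0, pi/2), a=2)                # a is passed via ...; the gradient
                                             # is taken with respect to x only
     jacobian(function(x) x^2, c(1, 2, 3))   # full Jacobian for a vector result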

If method is "simple", the calculation is done using a simple epsilon difference. For method "simple", method.args=list(eps=1e-4) is the default. Only eps is used by this method.
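
Conceptually this amounts to a one-sided difference in each coordinate; a minimal sketch under that assumption (not the package's internal code):

     simple.grad <- function(func, x, eps=1e-4) {
       f0 <- func(x)
       sapply(seq_along(x), function(i) {
         xp <- x
         xp[i] <- xp[i] + eps
         (func(xp) - f0) / eps              # forward difference with step eps
       })
     }
     simple.grad(function(x) sum(sin(x)), c(0, pi/4))  # roughly cos(c(0, pi/4))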

If method is "complex", the calculation is done using the complex-step derivative approach described in Lyness and Moler (1967). This method requires that the function be able to handle complex-valued arguments and return the appropriate complex-valued result, even though the user may only be interested in the real-valued derivatives. For cases where it can be used, it is faster than Richardson's extrapolation, and it also provides gradients that are correct to machine precision (about 16 digits). For method "complex", method.args is ignored. The algorithm uses an eps of .Machine$double.eps, which cannot (and should not) be modified.
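
The underlying identity is that for an analytic function f, Im(f(x + i*h))/h approximates f'(x) with no subtractive cancellation, so h can be taken as small as machine precision allows; a minimal sketch:

     h <- .Machine$double.eps
     Im(sin(2 + 1i * h)) / h          # complex-step estimate of the derivative
     cos(2)                           # exact value, agreeing to machine precision
     grad(sin, 2, method="complex")   # the same estimate via grad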

If method is "Richardson", the calculation is done by Richardson's extrapolation (see e.g. Linfield and Penny, 1989, or Fornberg and Sloan, 1994). This method should be used if accuracy, as opposed to speed, is important (but see method "complex" above). For this method method.args=list(eps=1e-4, d=0.0001, zero.tol=sqrt(.Machine$double.eps/7e-7), r=4, v=2, show.details=FALSE) is set as the default. d gives the fraction of x to use for the initial numerical approximation; the default means the initial approximation uses 0.0001 * x. eps is used instead of d for elements of x which are zero (absolute value less than zero.tol). zero.tol is the tolerance used for deciding which elements of x are zero. r gives the number of Richardson improvement iterations (repetitions with successively smaller d). The default 4 generally provides good results, but this can be increased to 6 for improved accuracy at the cost of more evaluations. v gives the reduction factor. show.details is a logical indicating whether detailed calculations should be shown.
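
These arguments are supplied through method.args, so, for example, the extra accuracy mentioned above can be requested as follows:

     grad(sin, 1, method="Richardson", method.args=list(r=6))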

The general approach in the Richardson method is to iterate for r iterations, starting from an initial interval d and reducing the interval by the factor v at each iteration. The first-order approximation to the derivative with respect to x_i is

f'_i(x) = (f(x_1, ..., x_i + d, ..., x_n) - f(x_1, ..., x_i - d, ..., x_n)) / (2*d)

This is repeated r times with successively smaller d, and then Richardson extrapolation is applied.

If elements of x are near zero, the multiplicative interval calculation using d does not work, and for these elements an additive calculation using eps is done instead. The argument zero.tol is used to determine whether an element should be considered too close to zero. In the iterations, the interval is successively reduced, eventually becoming d/v^r, and the square of this value is used in second derivative calculations (see genD), so the default zero.tol=sqrt(.Machine$double.eps/7e-7) is set to ensure the interval is bigger than .Machine$double.eps with the default d, r, and v.
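
As a rough illustration of the extrapolation step (a sketch only; richardson.sketch is a made-up name and the weighting details differ from the package's actual implementation):

     richardson.sketch <- function(f, x, d=1e-4 * x, r=4, v=2) {
       ## central differences with intervals d, d/v, ..., d/v^(r-1)
       a <- sapply(0:(r - 1), function(k) {
         h <- d / v^k
         (f(x + h) - f(x - h)) / (2 * h)
       })
       ## combine adjacent estimates to cancel successive error terms
       for (m in 1:(r - 1))
         a <- (a[-1] * v^(2 * m) - a[-length(a)]) / (v^(2 * m) - 1)
       a
     }
     richardson.sketch(sin, 2) - cos(2)   # very small error after extrapolation
     grad(sin, 2) - cos(2)                # compare with the package routine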

Value

A real scalar or vector of the approximated gradient(s).

References

Linfield, G. R. and Penny, J. E. T. (1989) Microcomputers in Numerical Analysis. New York: Halsted Press.

Fornberg, B. and Sloan, D. M. (1994) "A review of pseudospectral methods for solving partial differential equations." Acta Numerica, 3, 203-267.

Lyness, J. N. and Moler, C. B. (1967) "Numerical Differentiation of Analytic Functions." SIAM Journal on Numerical Analysis, 4(2), 202-210.

See Also

jacobian, hessian, genD, numericDeriv

Examples

  grad(sin, pi)
  grad(sin, (0:10)*2*pi/10)
  func0 <- function(x){ sum(sin(x))  }
  grad(func0 , (0:10)*2*pi/10)

  func1 <- function(x){ sin(10*x) - exp(-x) }

  curve(func1, from=0, to=5)

  x <- 2.04
  numd1 <- grad(func1, x)
  exact <- 10*cos(10*x) + exp(-x)
  c(numd1, exact, (numd1 - exact)/exact)

  x <- 1:10
  numd1 <- grad(func1, x)
  numd2 <- grad(func1, x, "complex")
  exact <- 10*cos(10*x) + exp(-x)
  cbind(numd1, numd2, exact, (numd1 - exact)/exact, (numd2 - exact)/exact)

  sc2.f <- function(x){
    n <- length(x)
    sum((1:n) * (exp(x) - x)) / n
  }

  sc2.g <- function(x){
    n <- length(x)
    (1:n) * (exp(x) - 1) / n
  }

  x0 <- rnorm(100)
  exact <- sc2.g(x0)

  g <- grad(func=sc2.f, x=x0)
  max(abs(exact - g)/(1 + abs(exact)))

  gc <- grad(func=sc2.f, x=x0, method="complex")
  max(abs(exact - gc)/(1 + abs(exact)))
