Autodiff is a numerical optimization and linear algebra library for the Go programming language. It implements basic automatic differentiation for a large number of mathematical routines. The documentation of this package can be found here.
Autodiff provides two different scalar types. The Real type stores first- and second-order derivatives alongside its value, whereas the BareReal type is a plain float64 that cannot carry any derivative information. Every scalar supports the following set of functions (a short usage sketch follows the table):
Function | Description |
---|---|
Min | Minimum |
Max | Maximum |
Abs | Absolute value |
Sign | Sign |
Neg | Negation |
Add | Addition |
Sub | Subtraction |
Mul | Multiplication |
Div | Division |
Pow | Power |
Sqrt | Square root |
Exp | Exponential function |
Log | Logarithm |
Log1p | Logarithm of 1+x |
Log1pExp | Logarithm of 1+Exp(x) |
Logistic | Standard logistic function |
Erf | Error function |
Erfc | Complementary error function |
LogErfc | Log complementary error function |
Sigmoid | Numerically stable sigmoid function |
Sin | Sine |
Sinh | Hyperbolic sine |
Cos | Cosine |
Cosh | Hyperbolic cosine |
Tan | Tangent |
Tanh | Hyperbolic tangent |
LogAdd | Addition on log scale |
LogSub | Subtraction on log scale |
SmoothMax | Differentiable maximum |
LogSmoothMax | Differentiable maximum on log scale |
Gamma | Gamma function |
Lgamma | Log gamma function |
Mlgamma | Multivariate log gamma function |
GammaP | Lower incomplete gamma function |
BesselI | Modified Bessel function of the first kind |
LogBesselI | Log of the Modified Bessel function of the first kind |
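As a quick illustration of the scalar API (a minimal sketch, assuming the unary functions above follow the same receiver-stores-result convention as Add, which is shown further below):

```go
x := NewReal(2.0)
r := NewReal(0.0)
r.Exp(x)   // r = exp(2)
r.Log1p(r) // r = log(1 + exp(2)), i.e. Log1pExp applied in two steps
```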
Autodiff supports vectors and matrices, including basic linear algebra operations. Vectors support the following operations (a short sketch follows the table):
Function | Description |
---|---|
VaddV | Element-wise addition |
VsubV | Element-wise subtraction |
VmulV | Element-wise multiplication |
VdivV | Element-wise division |
VaddS | Addition of a scalar |
VsubS | Subtraction of a scalar |
VmulS | Multiplication with a scalar |
VdivS | Division by a scalar |
VdotV | Dot product |
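A minimal sketch of the vector operations, assuming they are in-place methods on the result vector (in analogy to the MdotV example further below):

```go
v1 := NewVector(RealType, []float64{1, 2})
v2 := NewVector(RealType, []float64{3, 4})
r  := NullVector(RealType, 2)
r.VaddV(v1, v2)          // r = (4, 6)
r.VmulS(r, NewReal(2.0)) // r = (8, 12)
```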
Matrices support the following linear algebra operations (again, a sketch follows the table):
Function | Description |
---|---|
MaddM | Element-wise addition |
MsubM | Element-wise subtraction |
MmulM | Element-wise multiplication |
MdivM | Element-wise division |
MaddS | Addition of a scalar |
MsubS | Subtraction of a scalar |
MmulS | Multiplication with a scalar |
MdivS | Division by a scalar |
MdotM | Matrix product |
Outer | Outer product |
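Note in particular the distinction between MmulM (element-wise product) and MdotM (matrix product). A sketch under the same in-place convention:

```go
a := NewMatrix(RealType, 2, 2, []float64{1, 2, 3, 4})
r := NullMatrix(RealType, 2, 2)
r.MmulM(a, a) // element-wise: (1, 4, 9, 16)
r.MdotM(a, a) // matrix product: (7, 10, 15, 22)
```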
The algorithm package contains more complex linear algebra and optimization routines:
Package | Description |
---|---|
bfgs | Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm |
blahut | Blahut algorithm (channel capacity) |
cholesky | Cholesky and LDL factorization |
determinant | Matrix determinants |
eigensystem | Eigenvalues and eigenvectors |
gaussJordan | Gauss-Jordan algorithm |
gradientDescent | Vanilla gradient descent algorithm |
gramSchmidt | Gram-Schmidt algorithm |
hessenbergReduction | Matrix Hessenberg reduction |
lineSearch | Line-search (satisfying the Wolfe conditions) |
matrixInverse | Matrix inverse |
msqrt | Matrix square root |
msqrtInv | Inverse matrix square root |
newton | Newton's method (root finding and optimization) |
qrAlgorithm | QR-Algorithm for computing Schur decompositions |
rprop | Resilient backpropagation |
svd | Singular Value Decomposition (SVD) |
saga | SAGA stochastic average gradient descent method |
Import the autodiff library with

```go
import . "github.com/pbenner/autodiff"
```
A scalar holding the value 1.0 can be defined in several ways, e.g.

```go
a := NewScalar(RealType, 1.0)
b := NewReal(1.0)
c := NewBareReal(1.0)
```
Here a and b are both Reals, but a has the interface type Scalar whereas b has the concrete type *Real, which implements Scalar. Variable c is of type *BareReal and cannot carry any derivatives. Basic operations such as addition are defined on all Scalars, i.e.

```go
a.Add(a, b)
```
which stores the result of adding a and b in a. If autodiff/simple is imported, one may also use
```go
d := Add(a, b)
```
where the result is stored in a new variable d. The ConstReal type can be used to define real constants without allocating additional memory. For instance,
```go
a.Add(a, ConstReal(1.0))
```
adds the constant 1.0 to a, where the type conversion ConstReal(1.0) defines the constant.
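Putting these pieces together, a complete program might look as follows (a minimal sketch that only combines the calls shown above):

```go
package main

import   "fmt"

import . "github.com/pbenner/autodiff"
import . "github.com/pbenner/autodiff/simple"

func main() {
  a := NewReal(1.0)
  b := NewReal(2.0)
  // in-place addition: a = a + b
  a.Add(a, b)
  // allocating addition from autodiff/simple: d = a + b
  d := Add(a, b)
  // add a constant without allocating a new scalar
  d.Add(d, ConstReal(1.0))
  fmt.Println(a.GetValue(), d.GetValue()) // prints: 3 6
}
```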
To differentiate the function

```go
f := func(x, y Scalar) Scalar {
  return Add(Mul(x, Pow(y, NewReal(3))), NewReal(4))
}
```
first two Reals are defined

```go
x := NewReal(2)
y := NewReal(4)
```
that store the value at which the derivative of f should be evaluated. Afterwards, x and y must be defined as variables with
```go
Variables(2, x, y)
```
where the first argument specifies that derivatives up to second order should be computed. After evaluating f, i.e.
```go
z := f(x, y)
```
the function value at (x, y) = (2, 4) can be retrieved with z.GetValue(). The first and second partial derivatives can be accessed with z.GetDerivative(i) and z.GetHessian(i, j), where the arguments specify the index of the respective variable. For instance, the derivative of f with respect to x is returned by z.GetDerivative(0), whereas the derivative with respect to y is returned by z.GetDerivative(1).
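Assembled into a complete program, the example reads (a sketch combining only the calls introduced above):

```go
package main

import   "fmt"

import . "github.com/pbenner/autodiff"
import . "github.com/pbenner/autodiff/simple"

func main() {
  // f(x, y) = x*y^3 + 4
  f := func(x, y Scalar) Scalar {
    return Add(Mul(x, Pow(y, NewReal(3))), NewReal(4))
  }
  x := NewReal(2)
  y := NewReal(4)
  // track derivatives up to second order for x and y
  Variables(2, x, y)
  z := f(x, y)
  fmt.Println("f(x,y)   :", z.GetValue())       // 2*4^3 + 4 = 132
  fmt.Println("df/dx    :", z.GetDerivative(0)) // y^3      = 64
  fmt.Println("df/dy    :", z.GetDerivative(1)) // 3*x*y^2  = 96
  fmt.Println("d2f/dxdy :", z.GetHessian(0, 1)) // 3*y^2    = 48
}
```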
Vectors and matrices can be created with
```go
v := NewVector(RealType, []float64{1, 2})
m := NewMatrix(RealType, 2, 2, []float64{1, 2, 3, 4})
```
where v has length 2 and m is a 2x2 matrix. With
```go
v := NullVector(RealType, 2)
m := NullMatrix(RealType, 2, 2)
```
all values are initially set to zero. Vector and matrix elements can be accessed with the At method, which returns a reference to the Scalar, i.e.
```go
m.At(1, 1).Add(v.At(0), v.At(1))
```
adds the first two values in v and stores the result in the lower right element of the matrix m. Autodiff supports basic linear algebra operations; for instance, the matrix-vector product can be computed with
```go
w := NullVector(RealType, 2)
w.MdotV(m, v)
```
where the result is stored in w. Other operations, such as computing the eigenvalues and eigenvectors of a matrix, require importing the respective package from the algorithm library, i.e.
import "github.com/pbenner/autodiff/algorithm/eigensystem"
lambda, _, _ := eigensystem.Run(m)
Compare vanilla gradient descent with resilient backpropagation:

```go
import . "github.com/pbenner/autodiff"
import   "github.com/pbenner/autodiff/algorithm/gradientDescent"
import   "github.com/pbenner/autodiff/algorithm/rprop"
import . "github.com/pbenner/autodiff/simple"

f := func(x Vector) Scalar {
  // f(x) = x^4 - 3x^3 + 2
  return Add(Sub(Pow(x.At(0), NewReal(4)), Mul(NewReal(3), Pow(x.At(0), NewReal(3)))), NewReal(2))
}
x0 := NewVector(RealType, []float64{8})
// vanilla gradient descent
xn1, _ := gradientDescent.Run(f, x0, 0.0001, gradientDescent.Epsilon{1e-8})
// resilient backpropagation
xn2, _ := rprop.Run(f, x0, 0.0001, 0.4, rprop.Epsilon{1e-8})
```
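Both optimizers should converge to the same minimizer: since f'(x) = 4x^3 - 9x^2 = x^2(4x - 9), the only local minimum lies at x = 9/4 = 2.25, which xn1 and xn2 should both approach.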
Compute the inverse r of a matrix m by minimizing the Frobenius norm ||mr - I||:

```go
import . "github.com/pbenner/autodiff"
import   "github.com/pbenner/autodiff/algorithm/rprop"
import . "github.com/pbenner/autodiff/simple"

m := NewMatrix(RealType, 2, 2, []float64{1, 2, 3, 4})
I := IdentityMatrix(RealType, 2)
r := m.Clone()
// objective function: Frobenius norm of the matrix product m*r minus I
f := func(x Vector) Scalar {
  r.SetValues(x)
  s := Mnorm(MsubM(MdotM(m, r), I))
  return s
}
x, _ := rprop.Run(f, r.GetValues(), 0.01, 0.1, rprop.Epsilon{1e-12})
r.SetValues(x)
```
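For this particular m the result is easy to verify by hand: det(m) = -2, so the exact inverse is (-2, 1; 1.5, -0.5), which r should approach as the Frobenius norm is driven towards zero.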
Find the root of a function f with initial value x0 = (1, 1):

```go
import . "github.com/pbenner/autodiff"
import   "github.com/pbenner/autodiff/algorithm/newton"
import . "github.com/pbenner/autodiff/simple"

f := func(x Vector) Vector {
  y := NullVector(RealType, 2)
  // y1 = x1^2 + x2^2 - 6
  y.At(0).Sub(Add(Pow(x.At(0), NewReal(2)), Pow(x.At(1), NewReal(2))), NewReal(6))
  // y2 = x1^3 - x2^2
  y.At(1).Sub(Pow(x.At(0), NewReal(3)), Pow(x.At(1), NewReal(2)))
  return y
}
x0 := NewVector(RealType, []float64{1, 1})
xn, _ := newton.RunRoot(f, x0, newton.Epsilon{1e-8})
```
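For reference, substituting x2^2 = x1^3 from the second equation into the first yields x1^3 + x1^2 = 6, so Newton's method started at (1, 1) should converge to the positive solution of this cubic.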
Compare Newton's method, BFGS and Rprop for minimizing Rosenbrock's function:

```go
import   "fmt"

import . "github.com/pbenner/autodiff"
import   "github.com/pbenner/autodiff/algorithm/rprop"
import   "github.com/pbenner/autodiff/algorithm/bfgs"
import   "github.com/pbenner/autodiff/algorithm/newton"
import . "github.com/pbenner/autodiff/simple"

f := func(x Vector) (Scalar, error) {
  // f(x1, x2) = (a - x1)^2 + b(x2 - x1^2)^2
  // a = 1, b = 100
  // minimum: (x1, x2) = (a, a^2) = (1, 1)
  a := NewReal(  1.0)
  b := NewReal(100.0)
  s := Pow(Sub(a, x.At(0)), NewReal(2.0))
  t := Mul(b, Pow(Sub(x.At(1), Mul(x.At(0), x.At(0))), NewReal(2.0)))
  return Add(s, t), nil
}
hook_bfgs := func(x, gradient Vector, y Scalar) bool {
  fmt.Println("x       :", x)
  fmt.Println("gradient:", gradient)
  fmt.Println("y       :", y)
  fmt.Println()
  return false
}
hook_rprop := func(gradient, step []float64, x Vector, y Scalar) bool {
  fmt.Println("x       :", x)
  fmt.Println("gradient:", gradient)
  fmt.Println("y       :", y)
  fmt.Println()
  return false
}
hook_newton := func(x, gradient Vector, hessian Matrix, y Scalar) bool {
  fmt.Println("x       :", x)
  fmt.Println("gradient:", gradient)
  fmt.Println("y       :", y)
  fmt.Println()
  return false
}
x0 := NewVector(RealType, []float64{-0.5, 2})

rprop.Run(f, x0, 0.05, []float64{1.2, 0.8},
  rprop.Hook{hook_rprop},
  rprop.Epsilon{1e-10})

bfgs.Run(f, x0,
  bfgs.Hook{hook_bfgs},
  bfgs.Epsilon{1e-10})

newton.RunMin(f, x0,
  newton.HookMin{hook_newton},
  newton.Epsilon{1e-8},
  newton.HessianModification{"LDL"})
```
Maximize the function f(x, y) = x + y subject to the constraint x^2 + y^2 = 1 by finding the critical point of the corresponding Lagrangian:

```go
import . "github.com/pbenner/autodiff"
import   "github.com/pbenner/autodiff/algorithm/newton"
import . "github.com/pbenner/autodiff/simple"

// define the Lagrangian
f := func(x Vector) (Scalar, error) {
  // L(x, y, lambda) = x + y + lambda*(x^2 + y^2 - 1)
  y := Add(Add(x.At(0), x.At(1)), Mul(x.At(2), Sub(Add(Mul(x.At(0), x.At(0)), Mul(x.At(1), x.At(1))), NewReal(1))))
  return y, nil
}
// initial value
x0 := NewVector(RealType, []float64{3, 5, 1})
// run Newton's method
xn, _ := newton.RunCrit(
  f, x0,
  newton.Epsilon{1e-8})
```
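Setting the gradient of the Lagrangian to zero gives 1 + 2*lambda*x = 0, 1 + 2*lambda*y = 0 and x^2 + y^2 = 1, hence x = y = 1/sqrt(2) at the maximum, where f(x, y) = sqrt(2).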