GD : Num pType => Neg pType => Random pType => {auto mon : ComMonoid pType} -> FromDouble pType => {default 0.001 _ : pType} -> Optimiser (Const pType) ()

Gradient descent optimiser. Has trivial state.
@lr is the learning rate
Visibility: public export

GA : Num pType => Neg pType => Random pType => {auto mon : ComMonoid pType} -> FromDouble pType => {default 0.001 _ : pType} -> Optimiser (Const pType) ()

Gradient ascent optimiser. Has trivial state.
@lr is the learning rate

Visibility: public export
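GD and GA differ only in the sign of the step. A minimal sketch of the underlying update, as hypothetical standalone functions on Double (not part of this module):

```idris
-- Hypothetical sketch of the updates GD and GA perform, on Double.
-- lr corresponds to the {default 0.001 _ : pType} learning rate above.
gdStep : (lr, grad, param : Double) -> Double
gdStep lr grad param = param - lr * grad  -- descend: step against the gradient

gaStep : (lr, grad, param : Double) -> Double
gaStep lr grad param = param + lr * grad  -- ascend: step along the gradient
```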
momentumUpdate : Num pType => Neg pType => pType -> pType -> pType -> pType -> (pType, pType)

Visibility: public export
lookAhead : Num pType => pType -> pType -> pType -> pType

Visibility: public export
GDMomentum : Num pType => Neg pType => Random pType => {auto mon : ComMonoid pType} -> FromDouble pType => {default False _ : Bool} -> {default 0.001 _ : pType} -> {default 0.9 _ : pType} -> Optimiser (Const pType) pType

Gradient descent with momentum, optionally with Nesterov acceleration.

Visibility: public export
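The argument orders of momentumUpdate and lookAhead are not documented above, so the following is only a hedged sketch of the standard recurrences they presumably implement, with hypothetical names and Double for concreteness:

```idris
-- Hypothetical sketch of classical momentum and the Nesterov look-ahead.
momentumStep : (lr, mu, v, grad, param : Double) -> (Double, Double)
momentumStep lr mu v grad param =
  let v' = mu * v + lr * grad  -- accumulate velocity
  in (v', param - v')          -- (new velocity, new parameter)

-- Nesterov acceleration evaluates the gradient at the anticipated next
-- position (one common convention; signs vary between formulations).
lookAheadPoint : (mu, v, param : Double) -> Double
lookAheadPoint mu v param = param - mu * v
```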
adamUpdate : Num pType => Neg pType => Fractional pType => Sqrt pType => pType -> pType -> pType -> pType -> pType -> pType -> pType -> pType -> pType -> (pType, (pType, (pType, (pType, pType))))

Adam update step.
State is (m, v, beta1^t, beta2^t), where m and v are the first and second moment estimates, and beta1^t, beta2^t are running powers of the decay rates, kept for bias correction.

Visibility: public export
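Written out, with g_t the gradient at step t, these are the standard Adam recurrences from the paper cited below:

```latex
m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t, \qquad
v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2,

\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^t}, \qquad
\theta_t = \theta_{t-1} - \mathrm{lr}\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
```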
Adam : Num pType => Neg pType => Random pType => {auto mon : ComMonoid pType} -> FromDouble pType => Fractional pType => Sqrt pType => {default 0.001 _ : pType} -> {default 0.9 _ : pType} -> {default 0.999 _ : pType} -> {default 1e-8 _ : pType} -> Optimiser (Const pType) (pType, (pType, (pType, pType)))

Adam optimiser (Kingma & Ba, 2014).
The state is kept as four separate components for efficiency.
@lr is the learning rate
@beta1 is the exponential decay rate for the first moment estimate
@beta2 is the exponential decay rate for the second moment estimate
@epsilon is a small constant for numerical stability
Visibility: public export
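A self-contained sketch of one Adam step on Double, threading the four state components described above. adamStep and its argument order are hypothetical illustrations, not the library's adamUpdate:

```idris
-- Hypothetical sketch of a single Adam step on Double.
-- State: (m, v, b1t, b2t) = (first moment, second moment, beta1^t, beta2^t).
adamStep : (lr, beta1, beta2, eps : Double)
        -> (m, v, b1t, b2t : Double)
        -> (grad, param : Double)
        -> (Double, (Double, (Double, (Double, Double))))
adamStep lr beta1 beta2 eps m v b1t b2t grad param =
  let m'   = beta1 * m + (1 - beta1) * grad         -- first moment estimate
      v'   = beta2 * v + (1 - beta2) * grad * grad  -- second moment estimate
      b1t' = b1t * beta1                            -- running power beta1^t
      b2t' = b2t * beta2                            -- running power beta2^t
      mHat = m' / (1 - b1t')                        -- bias-corrected m
      vHat = v' / (1 - b2t')                        -- bias-corrected v
  in (param - lr * mHat / (sqrt vHat + eps), (m', (v', (b1t', b2t'))))

-- e.g. with the defaults above: adamStep 0.001 0.9 0.999 1.0e-8 ...
-- starting from state (0, 0, 1, 1), so that b1t' = beta1 after step one.
```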