logSumExp : Exp a => Ord a => Neg a => Foldable (Tensor [i]) => AllAlgebra [i] a => Tensor [i] a -> Maybe a

Numerically stable log-sum-exp operation
LSE(x) = max(x) + log(Σᵢ exp(xᵢ - max(x)))
See https://gregorygundersen.com/blog/2020/02/09/log-sum-exp/
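The max-shift identity above can be sketched numerically; this is a hypothetical Python analogue of the operation, not the library's implementation. It returns `None` for empty input, mirroring the `Maybe a` in the signature.

```python
import math

def log_sum_exp(xs):
    """Numerically stable log-sum-exp over a list of floats.

    LSE(x) = max(x) + log(sum_i exp(x_i - max(x))). Shifting by the
    maximum keeps every exponent <= 0, so exp never overflows.
    """
    if not xs:
        return None  # analogue of Nothing for an empty tensor
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))
```

For example, `log_sum_exp([1000.0, 1000.0])` returns `1000.0 + log(2)`, whereas the naive `log(sum(exp(x)))` would overflow at `exp(1000.0)`.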
Visibility: public export

logSoftargmax : Exp a => Ord a => Neg a => Foldable (Tensor [i]) => AllAlgebra [i] a => Tensor [i] a -> Tensor [i] a

Log(softargmax(x)), but computationally efficient and numerically stable
Used for computing cross-entropy loss
Returns empty tensor for empty input
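The efficiency claim rests on the identity log(softargmax(x))ᵢ = xᵢ - LSE(x): computing the stable LSE once avoids exponentiating and then re-taking the log of each entry. A hypothetical Python sketch of that identity (not the library's code):

```python
import math

def log_softargmax(xs):
    """log(softargmax(x))_i = x_i - LSE(x), with LSE computed stably.

    Avoids the naive log(exp(x_i) / sum_j exp(x_j)), which overflows
    for large inputs. Returns an empty list for empty input, mirroring
    the documented behaviour on empty tensors.
    """
    if not xs:
        return []
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]
```

Exponentiating the result recovers a probability vector summing to 1, which is why this form is convenient inside cross-entropy loss.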
Visibility: public export

softargmaxImpl : Fractional a => Exp a => Ord a => Neg a => Foldable (Tensor [i]) => AllAlgebra [i] a => {default 1 _ : a} -> Tensor [i] a -> Tensor [i] a

Commonly known as 'softmax'
When `temperature=0` it reduces to `argmax`
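The temperature behaviour can be illustrated with a hedged Python sketch (function name and default mirror the entry above, but this is an illustration, not the library's implementation). Note this sketch divides by the temperature, so `temperature=0` itself would divide by zero; the reduction to argmax holds in the limit as the temperature approaches 0.

```python
import math

def softargmax(xs, temperature=1.0):
    """Softargmax ('softmax') with a temperature parameter.

    Logits are divided by the temperature, then exponentiated with a
    max-shift for stability and normalised. Low temperatures sharpen
    the distribution toward a one-hot argmax; high temperatures
    flatten it toward uniform.
    """
    scaled = [x / temperature for x in xs]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, `softargmax([1.0, 2.0], temperature=0.01)` puts essentially all mass on the larger entry.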
Visibility: public export

softargmax : Fractional a => Exp a => Ord a => Neg a => Foldable (Tensor [i]) => AllAlgebra [i] a => Tensor [i] a --> Tensor [i] a

Softargmax as a parametric map, with temperature as a parameter
TODO: the output type should be a distribution tensor, since distributions
are applicative? https://glaive-research.org/2025/02/11/Generalized-Transformers-from-Applicative-Functors.html
Visibility: public export