Idris2Doc : NN.Architectures.Softargmax

NN.Architectures.Softargmax

Definitions

logSumExp : Exp a => Ord a => Neg a => Foldable (Tensor [i]) => AllAlgebra [i] a => Tensor [i] a -> Maybe a
  Numerically stable log-sum-exp operation
LSE(x) = max(x) + log(Σᵢ exp(xᵢ - max(x)))
See https://gregorygundersen.com/blog/2020/02/09/log-sum-exp/

Visibility: public export
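
The max-shift in the formula above can be illustrated with a minimal sketch. It assumes plain `List Double` rather than the library's `Tensor`, and `logSumExpList` is a hypothetical name, not part of this module. Returning `Nothing` for empty input is likewise an assumption about what the `Maybe` in the real signature is for.

```idris
-- Hypothetical stand-in for `logSumExp`, specialised to `List Double`
-- so that it compiles against the Prelude alone.
logSumExpList : List Double -> Maybe Double
logSumExpList [] = Nothing                  -- assumption: no maximum, so no result
logSumExpList (x :: rest) =
  let m = foldl max x rest                  -- max(x)
      s = sum (map (\v => exp (v - m)) (x :: rest))  -- sum of exp(x_i - max(x))
   in Just (m + log s)                      -- LSE(x) = max(x) + log(...)

main : IO ()
main = do
  -- Shifted form: every exponent is <= 0, so nothing overflows.
  printLn (logSumExpList [1000.0, 1000.0])
  -- Naive form: exp 1000.0 overflows a Double, so the result is infinite.
  printLn (log (exp 1000.0 + exp 1000.0))
```

Because the largest shifted exponent is exactly 0, the sum inside the logarithm is at least 1, which is why the shifted form stays finite where the naive one does not.
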
logSoftargmax : Exp a => Ord a => Neg a => Foldable (Tensor [i]) => AllAlgebra [i] a => Tensor [i] a -> Tensor [i] a
  Log(softargmax(x)), but computationally efficient and numerically stable
Used for computing cross-entropy loss
Returns empty tensor for empty input

Visibility: public export
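
One common way to obtain a stable log-softargmax is to subtract the log-sum-exp from every entry, using log(softargmax(x))ᵢ = xᵢ - LSE(x). The sketch below assumes that approach and works on `List Double`. `logSoftargmaxList` is a hypothetical name, and the library's actual implementation may differ.

```idris
-- Hypothetical list-based sketch of a stable log-softargmax.
logSoftargmaxList : List Double -> List Double
logSoftargmaxList [] = []                   -- empty input gives empty output
logSoftargmaxList xs@(x :: rest) =
  let m   = foldl max x rest                          -- max(x)
      lse = m + log (sum (map (\v => exp (v - m)) xs)) -- LSE(x)
   in map (\v => v - lse) xs                           -- x_i - LSE(x)

main : IO ()
main = do
  -- Large logits would overflow in log (exp x_i / sum (exp x_j)),
  -- but every intermediate value here stays finite.
  printLn (logSoftargmaxList [1000.0, 1001.0, 1002.0])
```

For a target class at index k, the cross-entropy loss is the negation of the k-th entry of this result, so no separate `exp` followed by `log` is needed.
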
softargmaxImpl : Fractional a => Exp a => Ord a => Neg a => Foldable (Tensor [i]) => AllAlgebra [i] a => {default 1 _ : a} -> Tensor [i] a -> Tensor [i] a
  Commonly known as 'softmax'
When `temperature=0` it reduces to `argmax`, i.e. the limit of softargmax as the temperature tends to 0 from above

Visibility: public export
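
The effect of the temperature can be sketched on `List Double` as follows. `softargmaxList` and the argument name `temperature` are hypothetical, chosen to mirror the `{default 1 _ : a}` implicit in the signature above. The sketch divides by the temperature, so it only illustrates small positive values and does not handle `temperature=0` itself.

```idris
-- Hypothetical list-based sketch of softargmax with a temperature
-- supplied as a default implicit argument (default 1).
softargmaxList : {default 1 temperature : Double} -> List Double -> List Double
softargmaxList [] = []
softargmaxList {temperature} (x :: rest) =
  let scaled = map (\v => v / temperature) (x :: rest)  -- x_i / T
      m      = foldl max (x / temperature) scaled       -- max of scaled inputs
      es     = map (\v => exp (v - m)) scaled           -- shifted exponentials
      z      = sum es                                   -- normalising constant
   in map (\v => v / z) es

main : IO ()
main = do
  -- Default temperature of 1: the usual softmax.
  printLn (softargmaxList [1.0, 2.0, 3.0])
  -- Small temperature: nearly all mass sits on the largest entry.
  printLn (softargmaxList {temperature = 0.01} [1.0, 2.0, 3.0])
```

Subtracting the maximum before exponentiating leaves the result unchanged, since the common factor cancels in the normalisation, while keeping every exponent at or below 0.
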
softargmax : Fractional a => Exp a => Ord a => Neg a => Foldable (Tensor [i]) => AllAlgebra [i] a => Tensor [i] a -\-> Tensor [i] a
  Softargmax as a parametric map, with temperature as a parameter
TODO the output type should be a distribution tensor, since distributions
are applicative? https://glaive-research.org/2025/02/11/Generalized-Transformers-from-Applicative-Functors.html

Visibility: public export