Transformer : Num a => Ord a => {auto ac : NewAxisConsistent inputStructure [features]} -> TensorMonoid (inputStructure .cont) => TensorMonoid (features .cont) => AllAlgebra [inputStructure, features] a => {default id _ : (Tensor [inputStructure, inputStructure] a -> Tensor [inputStructure, inputStructure] a)} -> (Tensor [inputStructure] a -> Tensor [inputStructure] a) -> Tensor [inputStructure, features] a -\-> Tensor [inputStructure, features] a
Single-head transformer layer.
Only layer normalization is missing; the definition is otherwise complete.
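
For orientation, the computation usually meant by a single-head transformer layer without layer normalization can be sketched as follows. This is the textbook formulation, not something taken from this definition's source. The weight matrices W_Q, W_K, W_V, W_O, the width d_k, the feed-forward map FFN, and the residual connections are all assumptions. So is the reading of the two function arguments in the signature: here m stands for the map on [inputStructure, inputStructure] tensors (default id, for example a causal mask) and sigma for the map on [inputStructure] tensors (for example softmax).

```latex
% X : input of shape [inputStructure, features]
% m, \sigma : hypothetical names for the two function arguments in the signature
\begin{aligned}
Q &= X W_Q, \qquad K = X W_K, \qquad V = X W_V \\
A &= \sigma\!\left( m\!\left( \frac{Q K^{\top}}{\sqrt{d_k}} \right) \right)
   && \text{($\sigma$ applied to each row)} \\
H &= X + A V W_O
   && \text{(attention with residual, no layer norm)} \\
Y &= H + \mathrm{FFN}(H)
   && \text{(feed-forward with residual, no layer norm)}
\end{aligned}
```

Under this reading, the input X and the output Y have the same shape, which is consistent with the signature mapping Tensor [inputStructure, features] a to itself.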