SiLU
SiLU Activation (Swish)
Sigmoid Linear Unit: x * sigmoid(x). A self-gated activation that often outperforms ReLU in deep networks.
Shape Contract:
- Input: [*shape] arbitrary shape
- Output: [*shape] same shape as input
Notes:
- Formula: SiLU(x) = x * sigmoid(x)
- Also known as Swish activation (Google Brain, 2017)
- Smooth and non-monotonic; the Swish variant generalizes it with a learnable scale β: x * sigmoid(β·x), with SiLU being the β = 1 case
- Used in EfficientNet, GPT-NeoX, LLaMA, and modern architectures
- Slightly more costly to compute than ReLU, but often yields better accuracy
- Element-wise operation (preserves all dimensions); see the reference sketch after this list
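A minimal NumPy sketch of the formula and shape contract (illustrative only; the function name and test values below are assumptions, not part of the core library):

import numpy as np

def silu(x):
    # SiLU(x) = x * sigmoid(x), applied element-wise, so the output keeps the input shape
    return x * (1.0 / (1.0 + np.exp(-x)))

x = np.array([[-2.0, -1.0, 0.0, 1.0, 2.0]])
print(silu(x))        # approx [[-0.238, -0.269, 0.0, 0.731, 1.762]]
print(silu(x).shape)  # (1, 5), same shape as the input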
Signature
neuron SiLU()
Ports
Inputs:
default:[*shape]
Outputs:
default:[*shape]
Implementation
Source { source: "core", path: "activations/SiLU" }
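For comparison with a mainstream framework, the same element-wise operation is available in PyTorch as torch.nn.SiLU. This is an equivalence sketch assuming PyTorch is installed, not the core implementation referenced above:

import torch

act = torch.nn.SiLU()
x = torch.randn(4, 8, 16)                       # arbitrary input shape
y = act(x)
assert y.shape == x.shape                       # output shape matches input
assert torch.allclose(y, x * torch.sigmoid(x))  # SiLU(x) = x * sigmoid(x)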