Tagged “research”

  1. Hacking "vanilla" FlashAttention for variable-length inputs
  2. Visualizing equivariances in transformer neural networks