Recap on ML

Notes on AI agents, ML, and dev tooling — by qte77

Transformer distilled, Part 1 of 2

July 1, 2022

An overview of the core mathematical components of the Transformer architecture, covering scaled dot-product attention, softmax, multi-head attention, linear layers, and learned embeddings.