GPUs Go Brrr (via) Fascinating, detailed low-level notes on how to get the most out of NVIDIA's H100 GPUs (currently selling for around $40,000 a piece) from the research team at Stanford who created FlashAttention, among other things.
The swizzled memory layouts are flat-out incorrectly documented, which took considerable time for us to figure out.
Recent articles
- The last six months in LLMs, illustrated by pelicans on bicycles - 6th June 2025
- Tips on prompting ChatGPT for UK technology secretary Peter Kyle - 3rd June 2025
- How often do LLMs snitch? Recreating Theo's SnitchBench with LLM - 31st May 2025