4th October 2023 - Link Blog
Think before you speak: Training Language Models With Pause Tokens. Another example of how much low hanging fruit remains to be discovered in basic Large Language Model research: this team from Carnegie Mellon and Google Research note that, since LLMs get to run their neural networks once for each token of input and output, inserting “pause” tokens that don’t output anything at all actually gives them extra opportunities to “think” about their output.
Recent articles
- My fireside chat about agentic engineering at the Pragmatic Summit - 14th March 2026
- Perhaps not Boring Technology after all - 9th March 2026
- Can coding agents relicense open source through a “clean room” implementation of code? - 5th March 2026