21st November 2024 - Link Blog
A warning about tiktoken, BPE, and OpenAI models. Tom MacWright warns that OpenAI's tiktoken Python library has a surprising performance profile: it's superlinear with the length of input, meaning someone could potentially denial-of-service you by sending you a 100,000 character string if you're passing that directly to tiktoken.encode().
There's an open issue about this (now over a year old), so for safety today it's best to truncate on characters before attempting to count or truncate using tiktoken.
Recent articles
- Is Claude Code going to cost $100/month? Probably not - it's all very confusing - 22nd April 2026
- Where's the raccoon with the ham radio? (ChatGPT Images 2.0) - 21st April 2026
- Changes in the system prompt between Claude Opus 4.6 and 4.7 - 18th April 2026