The most surprising part of DeepSeek-R1 is that it takes only ~800k samples of 'good' RL reasoning to convert other models into RL reasoners. Now that DeepSeek-R1 is available, people will be able to refine samples out of it to convert any other model into an RL reasoner.