The most surprising part of DeepSeek-R1 is that it only takes ~800k samples of 'good' RL reasoning to convert other models into RL-reasoners. Now that DeepSeek-R1 is available people will be able to refine samples out of it to convert any other model into an RL reasoner.
Recent articles
- Distributing Go binaries like sqlite-scanner through PyPI using go-to-wheel - 4th February 2026
- Moltbook is the most interesting place on the internet right now - 30th January 2026
- Adding dynamic features to an aggressively cached website - 28th January 2026