We believe that AI tools are at their best when they incorporate and represent the full diversity and breadth of human intelligence and experience. [...] Because copyright today covers virtually every sort of human expression– including blog posts, photographs, forum posts, scraps of software code, and government documents–it would be impossible to train today’s leading AI models without using copyrighted materials. Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.
Recent articles
- My review of Claude's new Code Interpreter, released under a very confusing name - 9th September 2025
- Recreating the Apollo AI adoption rate chart with GPT-5, Python and Pyodide - 9th September 2025
- GPT-5 Thinking in ChatGPT (aka Research Goblin) is shockingly good at search - 6th September 2025