Yeah, unfortunately vision prompting has been a tough nut to crack. We've found it's very challenging to improve Claude's actual "vision" through just text prompts, but we can of course improve its reasoning and thought process once it extracts info from an image.
In general, I think vision is still in its early days, although 3.5 Sonnet is noticeably better than older models.
— Alex Albert, Anthropic
Recent articles
- How Rob Pike got spammed with an AI slop "act of kindness" - 26th December 2025
- A new way to extract detailed transcripts from Claude Code - 25th December 2025
- Cooking with Claude - 23rd December 2025