Google released a new Gemini 2.5 Computer Use model today, specially designed to help operate a GUI interface by interacting with visible elements using a virtual mouse and keyboard.
I tried the demo hosted by Browserbase at gemini.browserbase.com and was delighted and slightly horrified when it appeared to kick things off by first navigating to Google.com and solving their CAPTCHA in order to run a search!
I wrote a post about it and included this screenshot, but then learned that Browserbase itself has CAPTCHA solving built in and, as shown in this longer video, it was Browserbase that solved the CAPTCHA even while Gemini was thinking about doing so itself.
I deeply regret this error. I've deleted various social media posts about the original entry and linked back to this retraction instead.
Recent articles
- Vibe engineering - 7th October 2025
- OpenAI DevDay 2025 live blog - 6th October 2025
- Embracing the parallel coding agent lifestyle - 5th October 2025