Simon Willison’s Weblog

Changes to Agentic manual testing

March 6, 2026, 5:38 p.m.

Body

---
+++
@@ -67,7 +67,7 @@
 - Saying "use `uvx rodney --help`" causes the agent to run `rodney --help` via the [uvx](https://docs.astral.sh/uv/guides/tools/) package management tool, which automatically installs Rodney the first time it is called.
 - The `rodney --help` command is specifically designed to give agents everything they need to know to both understand and use the tool. Here's [that help text](https://github.com/simonw/rodney/blob/main/help.txt).
-- Saying "look at screenshots" hints to the agent that it should use the `rodney screenshot` command and remind it that it can use its own vision abilities against the resulting image files to evaluate the visual appearance of the page.
+- Saying "look at screenshots" hints to the agent that it should use the `rodney screenshot` command and reminds it that it can use its own vision abilities against the resulting image files to evaluate the visual appearance of the page.
 That's a whole lot of manual testing baked into a short prompt!

March 6, 2026, 5:37 p.m.

Body

---
+++
@@ -64,6 +64,7 @@
 Start a dev server and then use `uvx rodney --help` to test the new homepage, look at screenshots to confirm the menu is in the right place
 ```
 There are three tricks in this prompt:
+
 - Saying "use `uvx rodney --help`" causes the agent to run `rodney --help` via the [uvx](https://docs.astral.sh/uv/guides/tools/) package management tool, which automatically installs Rodney the first time it is called.
 - The `rodney --help` command is specifically designed to give agents everything they need to know to both understand and use the tool. Here's [that help text](https://github.com/simonw/rodney/blob/main/help.txt).
 - Saying "look at screenshots" hints to the agent that it should use the `rodney screenshot` command and remind it that it can use its own vision abilities against the resulting image files to evaluate the visual appearance of the page.

March 6, 2026, 4:48 p.m.

Body

---
+++
@@ -52,7 +52,7 @@
 The most powerful of these today is **[Playwright](https://playwright.dev/)**, an open source library developed by Microsoft. Playwright offers a full-featured API with bindings in multiple popular programming languages and can automate any of the popular browser engines.
-Simply telling your agent to "test that with Playwright" may be enough. The agent can then select the language binding that makes the most sense, or use Playwright's default CLI tool.
+Simply telling your agent to "test that with Playwright" may be enough. The agent can then select the language binding that makes the most sense, or use Playwright's [playwright-cli](https://github.com/microsoft/playwright-cli) tool.
 Coding agents work really well with dedicated CLIs. [agent-browser](https://github.com/vercel-labs/agent-browser) by Vercel is a comprehensive CLI wrapper around Playwright specially designed for coding agents to use.

March 6, 2026, 5:44 a.m.

Draft status changed from draft to published.

March 6, 2026, 5:43 a.m.

Body

---
+++
@@ -48,7 +48,7 @@
 Historically these have been difficult to test from code, but the past decade has seen notable improvements in systems for automating real web browsers. Running a real Chrome or Firefox or Safari browser against an application can uncover all sorts of interesting problems in a realistic setting.
-The coding agents know how to use these tools extremely well.
+Coding agents know how to use these tools extremely well.
 The most powerful of these today is **[Playwright](https://playwright.dev/)**, an open source library developed by Microsoft. Playwright offers a full-featured API with bindings in multiple popular programming languages and can automate any of the popular browser engines.

March 6, 2026, 5:43 a.m.

Body

---
+++
@@ -50,7 +50,7 @@
 The coding agents know how to use these tools extremely well.
-The most powerful of these today is **Playwright**, an open source library developed by Microsoft. Playwright offers a full-featured API with bindings in multiple popular programming languages and can automate any of the popular browser engines.
+The most powerful of these today is **[Playwright](https://playwright.dev/)**, an open source library developed by Microsoft. Playwright offers a full-featured API with bindings in multiple popular programming languages and can automate any of the popular browser engines.
 Simply telling your agent to "test that with Playwright" may be enough. The agent can then select the language binding that makes the most sense, or use Playwright's default CLI tool.

March 6, 2026, 5:41 a.m.

Initial version.