8th November 2024 - Link Blog
ChainForge. I'm still on the hunt for good options for running evaluations against prompts. ChainForge offers an interesting approach, calling itself "an open-source visual programming environment for prompt engineering".
The interface is one of those boxes-and-lines visual programming tools, which reminds me of Yahoo Pipes.
It's open source (from a team at Harvard) and written in Python, which means you can run a local copy instantly via uvx like this:
uvx chainforge serve
You can then configure it with API keys to various providers (OpenAI worked for me, Anthropic models returned JSON parsing errors due to a 500 page from the ChainForge proxy) and start trying it out.
The "Add Node" menu shows the full list of capabilities.
The JavaScript and Python evaluation blocks are particularly interesting: the JavaScript one runs outside of a sandbox using plain eval(), while the Python one still runs in your browser but uses Pyodide in a Web Worker.
Recent articles
- Is Claude Code going to cost $100/month? Probably not - it's all very confusing - 22nd April 2026
- Where's the raccoon with the ham radio? (ChatGPT Images 2.0) - 21st April 2026
- Changes in the system prompt between Claude Opus 4.6 and 4.7 - 18th April 2026

