Archive for Monday, 9th February 2026

Monday, 9th February 2026

Release datasette-sqlite-history-json 0.1a0 — Track changes to data using triggers and hidden JSON tables

9th Feb 2026, 12:03 am

Release showboat v0.3.0 — Create executable documents that demonstrate an agent's work

9th Feb 2026, 1:42 am · showboat

Release showboat v0.4.0 — Create executable documents that demonstrate an agent's work

9th Feb 2026, 5:38 am · showboat

AI Doesn’t Reduce Work—It Intensifies It (via) Aruna Ranganathan and Xingqi Maggie Ye from Berkeley Haas School of Business report initial findings in the HBR from their April to December 2025 study of 200 employees at a "U.S.-based technology company".

This captures an effect I've been observing in my own work with LLMs: the productivity boost these things can provide is exhausting.

AI introduced a new rhythm in which workers managed several active threads at once: manually writing code while AI generated an alternative version, running multiple agents in parallel, or reviving long-deferred tasks because AI could “handle them” in the background. They did this, in part, because they felt they had a “partner” that could help them move through their workload.

While this sense of having a “partner” enabled a feeling of momentum, the reality was a continual switching of attention, frequent checking of AI outputs, and a growing number of open tasks. This created cognitive load and a sense of always juggling, even as the work felt productive.

I'm frequently finding myself with work on two or three projects running parallel. I can get so much done, but after just an hour or two my mental energy for the day feels almost entirely depleted.

I've had conversations with people recently who are losing sleep because they're finding building yet another feature with "just one more prompt" irresistible.

The HBR piece calls for organizations to build an "AI practice" that structures how AI is used to help avoid burnout and counter effects that "make it harder for organizations to distinguish genuine productivity gains from unsustainable intensity".

I think we've just disrupted decades of existing intuition about sustainable working practices. It's going to take a while and some discipline to find a good new balance.

# 4:43 pm / careers, ai, generative-ai, llms, ai-assisted-programming, ai-ethics, cognitive-debt

Release sqlite-history-json 0.4 — SQLite table history tracking using a JSON audit log

9th Feb 2026, 5:32 pm

Research Rod: Go Library for Chrome Automation -- Comprehensive API Reference — Rod is an advanced Go library designed to automate Chrome browsers using the Chrome DevTools Protocol, providing a comprehensive API for web scraping, browser control, element interaction, and robust waiting strategies. With high-level convenience methods (such as Must-prefixed methods for fast scripting) and direct protocol access, Rod enables streamlined workflows from simple scraping to complex automation scenarios, all without third-party drivers.

9th Feb 2026, 6:50 pm

Research rod-cli: Chrome Automation from the Command Line — Leveraging the rod browser automation library, rod-cli provides a lightweight Go-based command-line tool for scripting persistent headless Chrome sessions. Each CLI command connects to and manipulates the same long-running Chrome instance via DevTools Protocol, enabling seamless multi-step browser automation in shell scripts or interactive use.

9th Feb 2026, 7:10 pm

Release rodney v0.1.0 — CLI tool for interacting with the web

9th Feb 2026, 11:18 pm

Structured Context Engineering for File-Native Agentic Systems (via) New paper by Damon McMillan exploring challenging LLM context tasks involving large SQL schemas (up to 10,000 tables) across different models and file formats:

Using SQL generation as a proxy for programmatic agent operations, we present a systematic study of context engineering for structured data, comprising 9,649 experiments across 11 models, 4 formats (YAML, Markdown, JSON, Token-Oriented Object Notation [TOON]), and schemas ranging from 10 to 10,000 tables.

Unsurprisingly, the biggest impact was the models themselves - with frontier models (Opus 4.5, GPT-5.2, Gemini 2.5 Pro) beating the leading open source models (DeepSeek V3.2, Kimi K2, Llama 4).

Those frontier models benefited from filesystem based context retrieval, but the open source models had much less convincing results with those, which reinforces my feeling that the filesystem coding agent loops aren't handled as well by open weight models just yet. The Terminal Bench 2.0 leaderboard is still dominated by Anthropic, OpenAI and Gemini.

The "grep tax" result against TOON was an interesting detail. TOON is meant to represent structured data in as few tokens as possible, but it turns out the model's unfamiliarity with that format led to them spending significantly more tokens over multiple iterations trying to figure it out:

# 11:56 pm / ai, prompt-engineering, generative-ai, llms, paper-review, context-engineering

← Sunday, 8th February 2026

Tuesday, 10th February 2026 →

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28

Simon Willison’s Weblog

Monday, 9th February 2026