Simon Willison’s Weblog

Subscribe
Atom feed for datasette Random

1,455 posts tagged “datasette”

Datasette is an open source tool for exploring and publishing data.

2025

Release datasette-llm-usage 0.1a1 — Track usage of LLM tokens in a SQLite table
Release datasette-enrichments-llm 0.1a1 — Enrich data by prompting LLMs
Release datasette-debug-events 0.1a1 — Print Datasette events to standard error
Release datasette-acl 0.4a5 — Advanced permission management for Datasette

Claude can write complete Datasette plugins now

This isn’t necessarily surprising, but it’s worth noting anyway. Claude Sonnet 4.5 is capable of building a full Datasette plugin now.

[... 1,296 words]

Two of my public Datasette instances - for my TILs and my blog's backup mirror - were getting hammered with misbehaving bot traffic today. Scaling them up to more Fly instances got them running again but I'd rather not pay extra just so bots can crawl me harder.

The log files showed the main problem was facets: Datasette provides these by default on the table page, but they can be combined in ways that keep poorly written crawlers busy visiting different variants of the same page over and over again.

So I turned those off. I'm now running those instances with --setting allow_facet off (described here), and my logs are full of lines that look like this. The "400 Bad Request" means a bot was blocked from loading the page:

GET /simonwillisonblog/blog_entry?_facet_date=created&_facet=series_id&_facet_size=max&_facet=extra_head_html&_sort=is_draft&created__date=2012-01-30 HTTP/1.1" 400 Bad Request

# 6th October 2025, 12:56 am / crawling, datasette

llm-openrouter 0.5. New release of my LLM plugin for accessing models made available via OpenRouter. The release notes in full:

  • Support for tool calling. Thanks, James Sanford. #43
  • Support for reasoning options, for example llm -m openrouter/openai/gpt-5 'prove dogs exist' -o reasoning_effort medium. #45

Tool calling is a really big deal, as it means you can now use the plugin to try out tools (and build agents, if you like) against any of the 179 tool-enabled models on that platform:

llm install llm-openrouter
llm keys set openrouter
# Paste key here
llm models --tools | grep 'OpenRouter:' | wc -l
# Outputs 179

Quite a few of the models hosted on OpenRouter can be accessed for free. Here's a tool-usage example using the llm-tools-datasette plugin against the new Grok 4 Fast model:

llm install llm-tools-datasette
llm -m openrouter/x-ai/grok-4-fast:free -T 'Datasette("https://datasette.io/content")' 'Count available plugins'

Outputs:

There are 154 available plugins.

The output of llm logs -cu shows the tool calls and SQL queries it executed to get that result.

# 21st September 2025, 12:24 am / llm, llm-reasoning, llm-tool-use, ai, llms, generative-ai, projects, openrouter, datasette

Release datasette-demo-dbs 0.1.1 — Load demo DBs when Datasette starts
Release datasette-demo-dbs 0.1 — Load demo DBs when Datasette starts

LLM 0.27, the annotated release notes: GPT-5 and improved tool calling

I shipped LLM 0.27 today (followed by a 0.27.1 with minor bug fixes), adding support for the new GPT-5 family of models from OpenAI plus a flurry of improvements to the tool calling features introduced in LLM 0.26. Here are the annotated release notes.

[... 1,174 words]

Release datasette-queries 0.1.2 — Save SQL queries in Datasette
Release datasette-queries 0.1.1 — Save SQL queries in Datasette
Release datasette-public 0.3a3 — Make selected Datasette databases and tables visible to the public
Release datasette-queries 0.1 — Save SQL queries in Datasette
Release datasette-load 0.1a3 — API and UI for bulk loading data into Datasette from a URL

We're hosting the sixth in our series of Datasette Public Office Hours livestream sessions this Friday, 6th of June at 2pm PST (here's that time in your location).

The topic is going to be tool support in LLM, as introduced here.

I'll be walking through the new features, and we're also inviting five minute lightning demos from community members who are doing fun things with the new capabilities. If you'd like to present one of those please get in touch via this form.

Datasette Public Office Hours #06 - Tool Support in LLM! Friday June 6th, 2025 @ 2pm PST Hosted in the Datasette Discord https://discord.gg/M4tFcgVFXf

Here's a link to add it to Google Calendar.

# 3rd June 2025, 7:42 pm / datasette-public-office-hours, llm, datasette, generative-ai, llm-tool-use, ai, llms

Saying Bye to Glitch (via) Pirijan, co-creator of Glitch - who stopped working on it six years ago, so has the benefit of distance:

Here lies Glitch, a place on the web you could go to write up a website or a node.js server that would be hosted and updated as you type. 🥀 RIP 2015 – 2025.

Pirijan continues with a poignant retrospective about Glitch's early origins at Fog Greek with the vision of providing "web development with real code that was as easy as editing a Google Doc". Their conclusion:

I still believe there’s a market for easy and fun web development and hosting, but a product like this needs power-users and enthusiasts willing to pay for it. To build any kind of prosumer software, you do have to be an optimist and believe that enough of the world still cares about quality and craft.

Glitch will be shutting down project hosting and user profiles on July 8th.

Code will be available to download until the end of the year. Glitch have an official Python export script that can download all of your projects and assets.

Jenn Schiffer, formerly Director of Community at Glitch and then Fastly, is a little more salty:

all that being said, i do sincerely want to thank fastly for giving glitch the opportunity to live to its 3-year acqui-versary this week. they generously took in a beautiful flower and placed it upon their sunny window sill with hopes to grow it more. the problem is they chose to never water it, and anyone with an elementary school education know what happens then. i wish us all a merry august earnings call season.

I'm very sad to see Glitch go. I've been pointing people to my tutorial on Running Datasette on Glitch for 5 years now, it was a fantastic way to help people quickly get started hosting their own projects.

# 29th May 2025, 8:36 pm / glitch, fastly, datasette

Release llm-tools-datasette 0.1 — Expose Datasette instances to LLM as a tool
Release llm-tools-datasette 0.1a1 — Expose Datasette instances to LLM as a tool
Release llm-tools-datasette 0.1a0 — Expose Datasette instances to LLM as a tool

In addition to my workshop the other day I'm also participating in the poster session at PyCon US this year.

This means that tomorrow (Sunday 18th May) I'll be hanging out next to my poster from 10am to 1pm in Hall A talking to people about my various projects.

I'll confess: I didn't pay close enough attention to the poster information, so when I first put my poster up it looked a little small:

My Datasette poster on a huge black poster board. It looks a bit lonely in the middle surrounded by empty space.

... so I headed to the nearest CVS and printed out some photos to better represent my interests and personality. I'm going for a "teenage bedroom" aesthetic here, I'm very happy with the result:

My Datasette poster is now surrounded by nearly 100 photos - mostly of pelicans, SVGs of pelicans and niche museums I've been to.

Here's the poster in the middle (also available as a PDF). It has columns for Datasette, sqlite-utils and LLM.

Datasette: An ecosystem of tools for finding stories in data. Three projects: Datasette is a tool for exploring and publishing data. It helps data journalists (and everyone else) take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API. There's a screenshot of the table interface against a legislators table. Datasette has over 180 plugins adding features for visualizing, editing and transforming data. datasette-cluster-map, datasette-graphql, datasette-publish-cloudrun, datasette-comments, datasette-query-assistant, datasette-extract. datasette.io. sqlite-utils is a Python library and CLI tool for manipulating SQLite databases. It aims to make the gap from “I have data” to “that data is in SQLite” as small as possible. There's a code example showing inserting three chickens into a database and configuring full-text search. And in the terminal: sqlite-utils transform places.db roadside_attractions  --rename pk id  --default name Untitled  --drop address.  sqlite-utils.datasette.io. LLM is a Python library and CLI tool for interacting with Large Language Models. It provides a plugin-based abstraction over hundreds of different models, both local and hosted, and logs every interaction with them to SQLite. LLMs are proficient at SQL and extremely good at extracting structured data from unstructured text, images and documents. LLM’s asyncio Python library powers several Datasette plugins, including datasette-query-assistant, datasette-enrichments and datasette-extract. llm.datasette.io

If you're at PyCon I'd love to talk to you about things I'm working on!

Update: Thanks to everyone who came along. Here's a 6MB photo of the poster setup. The museums were all from my www.niche-museums.com site and the pelicans riding a bicycle SVGs came from my pelican-riding-a-bicycle tag.

# 17th May 2025, 8:34 pm / pycon, llm, datasette, sqlite-utils, pelican-riding-a-bicycle, museums

django-simple-deploy. Eric Matthes presented a lightning talk about this project at PyCon US this morning. "Django has a deploy command now". You can run it like this:

pip install django-simple-deploy[fly_io]
# Add django_simple_deploy to INSTALLED_APPS.
python manage.py deploy --automate-all

It's plugin-based (inspired by Datasette!) and the project has stable plugins for three hosting platforms: dsd-flyio, dsd-heroku and dsd-platformsh.

Currently in development: dsd-vps - a plugin that should work with any VPS provider, using Paramiko to connect to a newly created instance and run all of the commands needed to start serving a Django application.

# 17th May 2025, 12:49 pm / fly, heroku, datasette, plugins, django, paramiko, python

Release datasette-chronicle 0.3 — Enable sqlite-chronicle against tables in Datasette
Release datasette-enrichments 0.5.1 — Tools for running enrichments against data stored in Datasette
Release datasette-remote-actors 0.1a5 — Datasette plugin for fetching details of actors from a remote endpoint

Introducing Datasette for Newsrooms. We're introducing a new product suite today called Datasette for Newsrooms - a bundled collection of Datasette Cloud features built specifically for investigative journalists and data teams. We're describing it as an all-in-one data store, search engine, and collaboration platform designed to make working with data in a newsroom easier, faster, and more transparent.

If your newsroom could benefit from a managed version of Datasette we would love to hear from you. We're offering it to nonprofit newsrooms for free for the first year (they can pay us in feedback), and we have a two month trial for everyone else.

Get in touch at hello@datasette.cloud if you'd like to try it out.

One crucial detail: we will help you get started - we'll load data into your instance for you (you get some free data engineering!) and walk you through how to use it, and we will eagerly consume any feedback you have for us and prioritize shipping anything that helps you use the tool. Our unofficial goal: we want someone to win a Pulitzer for investigative reporting where our tool played a tiny part in their reporting process.

Here's an animated GIF demo (taken from our new Newsrooms landing page) of my favorite recent feature: the ability to extract structured data into a table starting with an unstructured PDF, using the latest version of the datasette-extract plugin.

Animated demo. Starts with a PDF file of the San Francisco Planning Commission, which includes a table of data of members and their term ending dates. Switches to a Datasette Cloud with an interface for creating a table - the table is called planning_commission and has Seat Number (integer), Appointing Authority, Seat Holder and Term Ending columns - Term Ending has a hint of YYYY-MM-DD. The PDF is dropped onto the interface and the Extract button is clicked - this causes a loading spinner while the rows are extracted one by one as JSON, then the page refreshes as a table view showing the imported structured data.

# 24th April 2025, 9:51 pm / datasette-cloud, structured-extraction, datasette, projects, data-journalism, journalism

Release datasette 1.0a19 — An open source multi-tool for exploring and publishing data
Release datasette-extract 0.1a10 — Import unstructured data (text and images) into structured tables
Release datasette 1.0a18 — An open source multi-tool for exploring and publishing data