Weeknotes: Datasette, sqlite-utils, Datasette Desktop
20th July 2022
A flurry of releases this week, including a new Datasette alpha and a fixed Datasette Desktop.
datasette 0.62a1
Work on Datasette Cloud continues—the last 10% of the work needed for the beta launch is trending towards taking 90% of the time. It’s been driving all sorts of fixes to the wider Datasette ecosystem, which is nice.
I ran into a bug which would have been easier to investigate using Sentry. The datasette-sentry plugin wasn’t working right, and it turned out I needed a new handle_exception() plugin hook to fix it. This was the impetus I needed to push out a new Datasette alpha—I like to get new hooks into an alpha as quickly as possible so I can depend on that Datasette version from alpha releases of plugins.
Here are some other highlights from the alpha’s release notes:
- The render_cell() plugin hook is now also passed a
row
argument, representing thesqlite3.Row
object that is being rendered. (#1300)
A neat thing about Pluggy is that you can add new arguments to existing plugin hooks without breaking plugins that already use them.
- New
--nolock
option for ignoring file locks when opening read-only databases. (#1744)
Since the very start of the project Datasette has suggested trying the following command to start exploring your Google Chrome history, which is stored using SQLite:
datasette ~/Library/Application\ Support/Google/Chrome/Default/History
I’m not sure when this changed, but I tried running the command recently and got the following error:
sqlite3.OperationalError: database is locked
Since Datasette opens databases in read-only mode I didn’t see why a lock like this should be respected. It turns out SQLite can be told to ignore locks like so:
sqlite3.connect( "file:places.sqlite?mode=ro&nolock=1", uri=True )
So I added a --nolock
option to Datasette which does exactly that:
datasette ~/Library/Application\ Support/Google/Chrome/Default/History --nolock
- Datasette now has a Discord community.
Inspired by 6 More Things I Learned Building Snowpack to 20,000 Stars (Part 2) by Fred K. Schott I finally setup a chat community for Datasette, using Discord.
It’s attracted 88 members already! You can join it here. I wrote detailed notes on how I configured it in this issue.
- Database file downloads now implement conditional GET using ETags. (#1739)
This is a change I made to support Datasette Lite—I noticed that the WASM version of Datasette was downloading a fresh database every time, so I added ETag support to encourage browsers to avoid a duplicate download and use a cached copy of the database file instead, provided it hasn’t changed.
Datasette Desktop
Datasette Desktop was hanging on launch. Paul Everitt figured out a fix, which it took me way too long to get around to applying.
I finally shipped that in Datasette Desktop 0.2.2, but I wanted to reduce the chances of this happening again as much as possible. Datasette Desktop’s Elecron tests used the spectron test harness, but that’s marked as deprecated.
I’m a big fan of Playwright and I was optimistic to see that it has support for testing Electron apps. I figured out how to use that with Datasette Desktop and run the tests in GitHub Actions: I wrote up what I learned in a TIL, Testing Electron apps with Playwright and GitHub Actions.
sqlite-utils 3.28
Annotated release notes:
- New table.duplicate(new_name) method for creating a copy of a table with a matching schema and row contents. Thanks, David. (#449)
- New
sqlite-utils duplicate data.db table_name new_name
CLI command for Duplicating tables. (#454)
davidleejy suggested the table.duplicate()
method and contributed an implementation. This was the impetus for pushing out a fresh release.
I added the CLI equivalent, sqlite-utils duplicate
.
sqlite_utils.utils.rows_from_file()
is now a documented API. It can be used to read a sequence of dictionaries from a file-like object containing CSV, TSV, JSON or newline-delimited JSON. It can be passed an explicit format or can attempt to detect the format automatically. (#443)sqlite_utils.utils.TypeTracker
is now a documented API for detecting the likely column types for a sequence of string rows, see Detecting column types using TypeTracker. (#445)sqlite_utils.utils.chunks()
is now a documented API for splitting an iterator into chunks. (#451)
I have a policy that any time I need to use an undocumented method from sqlite-utils
in some other project I file an issue to add that to the documented API surface.
I had used rows_from_file()
and TypeTracker
in datasette-socrata.
This was inspired by my TIL Ignoring errors in a section of a Bash script—a trick I had to figure out because one of my scripts needed to add columns and enable FTS but only if those changes had not been previously applied.
In looking into that I spotted inconsistencies in the design of the sqlite-utils
commands, so I fixed those as much as I could while still maintaining backwards compatibility with the 3.x releases.
Releases this week
-
s3-ocr: 0.5—(5 releases total)—2022-07-19
Tools for running OCR against files stored in S3 -
datasette-graphql: 2.1.1—(36 releases total)—2022-07-18
Datasette plugin providing an automatic GraphQL API for your SQLite databases -
datasette-sentry: 0.2a0—(3 releases total)—2022-07-18
Datasette plugin for configuring Sentry -
datasette: 0.62a1—(112 releases total)—2022-07-18
An open source multi-tool for exploring and publishing data -
sqlite-utils: 3.28—(102 releases total)—2022-07-15
Python CLI utility and library for manipulating SQLite databases -
datasette-publish-vercel: 0.14—(21 releases total)—2022-07-13
Datasette plugin for publishing data using Vercel -
datasette-app: 0.2.2—(4 releases total)—2022-07-13
The Datasette macOS application -
datasette-app-support: 0.11.6—(19 releases total)—2022-07-12
Part of https://github.com/simonw/datasette-app
TIL this week
More recent articles
- Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac - 12th November 2024
- Visualizing local election results with Datasette, Observable and MapLibre GL - 9th November 2024
- Project: VERDAD - tracking misinformation in radio broadcasts using Gemini 1.5 - 7th November 2024