Simon Willison’s Weblog

1,520 posts tagged “datasette”

Datasette is an open source tool for exploring and publishing data.

2023

Release datasette-explain 0.1a0 — Explain and validate SQL queries as you type them into Datasette

Making SQLite extensions pip install-able (via) Alex Garcia figured out how to bundle compiled SQLite extensions in Python wheels (building different wheels for different platforms) and publish them to PyPI. This is a huge leap forward in terms of the usability of SQLite extensions, which have previously been pretty difficult to actually install and run. Alex also created Datasette plugins that depend on his packages, so you can now “datasette install datasette-sqlite-regex” (or datasette-sqlite-ulid, datasette-sqlite-fastrand, datasette-sqlite-jsonschema) to gain access to his custom SQLite extensions in your Datasette instance. It even works with “datasette publish --install” to deploy to Vercel, Fly.io and Cloud Run.

# 6th February 2023, 7:44 pm / pip, plugins, python, sqlite, datasette, alex-garcia
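
For context on why installation used to be awkward: loading a compiled extension by hand means knowing where the shared library lives and switching on extension loading for the connection. Here is a minimal sketch using only Python's standard `sqlite3` module. The helper name `try_load_extension` and the path are hypothetical, and this is not how Alex's packages work internally.

```python
import sqlite3


def try_load_extension(conn, path):
    """Attempt to load a compiled SQLite extension from `path`.

    Returns True on success and False if the file cannot be loaded or
    this Python build was compiled without extension support.
    """
    # Some Python builds omit extension loading entirely.
    if not hasattr(conn, "enable_load_extension"):
        return False
    conn.enable_load_extension(True)
    try:
        conn.load_extension(path)
        return True
    except sqlite3.OperationalError:
        # Raised when the shared library is missing or cannot be opened.
        return False
    finally:
        # Switch loading back off so later SQL cannot load more libraries.
        conn.enable_load_extension(False)


conn = sqlite3.connect(":memory:")
# Hypothetical path: nothing exists here, so this returns False.
loaded = try_load_extension(conn, "/nonexistent/path/to/extension")
```

A wheel that ships the compiled file for each platform removes the need to find or build that path yourself.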

datasette-scraper, Big Local News and other weeknotes

In addition to exploring the new MusicCaps training and evaluation data I’ve been working on the big Datasette JSON refactor, and getting excited about a Datasette project that I didn’t work on at all.

[... 1,744 words]

datasette-scraper walkthrough on YouTube (via) datasette-scraper is Colin Dellow’s new plugin that turns Datasette into a powerful web scraping tool, with a web UI based on plugin-driven customizations to the Datasette interface. It’s really impressive, and this ten minute demo shows quite how much it is capable of: it can crawl sitemaps and fetch pages, caching them (using zstandard with optional custom dictionaries for extra compression) to speed up subsequent crawls... and you can add your own plugins to extract structured data from crawled pages and save it to a separate SQLite table!

# 29th January 2023, 5:23 am / plugins, scraping, datasette, colin-dellow
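
To illustrate why a custom dictionary helps when caching many similar pages, here is a rough sketch. It swaps in the standard library's `zlib` preset dictionary support instead of zstandard, so it shows the same idea without being what datasette-scraper actually runs. The sample HTML and function names are made up for the example.

```python
import zlib


def compress_with_dict(data, dictionary):
    """Compress `data` using a preset dictionary of commonly repeated bytes."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob, dictionary):
    """Reverse compress_with_dict; the same dictionary must be supplied."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# Boilerplate shared by every page on a hypothetical site.
shared = (
    b"<html><head><title>Example site</title></head>"
    b"<body><nav>Home | About | Contact</nav>"
)
page = shared + b"<p>Unique content</p></body></html>"

plain = zlib.compress(page, 9)
with_dict = compress_with_dict(page, shared)
restored = decompress_with_dict(with_dict, shared)
```

The dictionary lets the compressor refer back to the shared boilerplate instead of storing it again for every page, which is where the extra savings on small, similar documents come from.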

Examples of sites built using Datasette (via) I gave the examples page on the Datasette website a significant upgrade today: it now includes screenshots (taken using shot-scraper) of six projects chosen to illustrate the variety of problems Datasette can be used to tackle.

# 29th January 2023, 3:40 am / projects, datasette, shot-scraper

We’ve built many tools for publishing to the web - but I want to make the claim that we have underdeveloped the tools and platforms for publishing collections, indexes and small databases. It’s too hard to build these kinds of experiences, too hard to maintain them and a lack of collaborative tools.

Tom Critchlow

# 28th January 2023, 4:43 pm / datasette

Release datasette-render-markdown 2.1.1 — Datasette plugin for rendering Markdown

Exploring MusicCaps, the evaluation data released to accompany Google’s MusicLM text-to-music model

Google Research just released MusicLM: Generating Music From Text. It’s a new generative AI model that takes a descriptive prompt and produces a “high-fidelity” music track. Here’s the paper (and a more readable version using arXiv Vanity).

[... 1,323 words]

Release datasette-youtube-embed 0.1 — Turn YouTube URLs into embedded players in Datasette

datasette-granian (via) Granian is a new Python web server—similar to Gunicorn—written in Rust. I built a small plugin that adds a “datasette granian” command starting a Granian server that serves Datasette’s ASGI application, using the same pattern as my existing datasette-gunicorn plugin.

# 20th January 2023, 2:12 am / rust, datasette, asgi
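
For anyone unfamiliar with ASGI, here is a minimal sketch of the kind of application object that an ASGI server such as Granian or Gunicorn (with an ASGI worker) can serve. It is a toy app, not Datasette's application and not the plugin's code. The `call_app` helper is a hypothetical stand-in for the server, used only to drive the app directly.

```python
import asyncio


async def app(scope, receive, send):
    """A minimal ASGI application: answers every HTTP request with plain text."""
    assert scope["type"] == "http"
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"text/plain; charset=utf-8")],
        }
    )
    await send({"type": "http.response.body", "body": b"Hello from ASGI"})


async def call_app(path="/"):
    """Play the role of the server: call the app and collect what it sends."""
    sent = []

    async def receive():
        # An empty request body, delivered as a single message.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent


messages = asyncio.run(call_app())
```

Because the server only needs that one `app(scope, receive, send)` callable, swapping one ASGI server for another does not require changes to the application itself.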

Release datasette-granian 0.1a0 — Run Datasette using the Granian HTTP server
Release datasette-faiss 0.2 — Maintain a FAISS index for specified Datasette tables

Datasette is my data hammer (via) Jeremia Kimelman—a data journalist at CalMatters in Sacramento—enthuses about how he uses Datasette as his default hammer for all kinds of data projects—in particular how much he appreciates Datasette’s focus on URLs. So nice to see this!

# 17th January 2023, 5:23 pm / data-journalism, datasette

Weeknotes: AI hacking and a SpatiaLite tutorial

Short weeknotes this time because the key things I worked on have already been covered here:

How to implement Q&A against your documentation with GPT3, embeddings and Datasette

If you’ve spent any time with GPT-3 or ChatGPT, you’ve likely thought about how useful it would be if you could point them at a specific, current collection of text or documentation and have it use that as part of its input for answering questions.

[... 3,447 words]

Release datasette-cookies-for-magic-parameters 0.1.2 — UI for setting cookies to populate magic parameters
Release datasette-cookies-for-magic-parameters 0.1.1 — UI for setting cookies to populate magic parameters
Release datasette-cookies-for-magic-parameters 0.1 — UI for setting cookies to populate magic parameters
Release datasette 0.64.1 — An open source multi-tool for exploring and publishing data
Release datasette-faiss 0.1a0 — Maintain a FAISS index for specified Datasette tables

Datasette 0.64, with a warning about SpatiaLite

I released Datasette 0.64 this morning. This release is mainly a response to the realization that it’s not safe to run Datasette with the SpatiaLite extension loaded if that Datasette instance is configured to enable arbitrary SQL queries from untrusted users.

[... 675 words]

Release datasette-auth-passwords 1.1 — Datasette plugin for authentication using passwords
Release datasette 0.64 — An open source multi-tool for exploring and publishing data
Release datasette-publish-fly 1.3 — Datasette plugin for publishing data using Fly

2022

Weeknotes: Datasette 0.63.3, datasette-ripgrep

We’re back in the UK to see family over Christmas (our first trip back since 2019). Here are a few notes from the past couple of weeks.

[... 801 words]

Release datasette-gunicorn 0.1.1 — Plugin for running Datasette using Gunicorn
Release datasette 0.63.3 — An open source multi-tool for exploring and publishing data

Datasette 1.0a2: Upserts and finely grained permissions

I’ve released the third alpha of Datasette 1.0. The 1.0a2 release introduces upsert support to the new JSON API and makes some major improvements to the Datasette permissions system.

[... 2,844 words]
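
As a reminder of what "upsert" means, here is a small sketch in plain SQLite via Python's `sqlite3` module: rows whose primary key already exists are updated, and the rest are inserted. The `docs` table and `upsert_rows` helper are invented for the example. It illustrates the semantics only and makes no claim about how Datasette's JSON API implements them.

```python
import sqlite3


def upsert_rows(conn, rows):
    """Insert each row, or update the title if its id is already present."""
    conn.executemany(
        """
        INSERT INTO docs (id, title) VALUES (:id, :title)
        ON CONFLICT(id) DO UPDATE SET title = excluded.title
        """,
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT)")

upsert_rows(conn, [{"id": 1, "title": "First"}, {"id": 2, "title": "Second"}])
# id 2 already exists so it is updated; id 3 is new so it is inserted.
upsert_rows(conn, [{"id": 2, "title": "Second, revised"}, {"id": 3, "title": "Third"}])

rows = conn.execute("SELECT id, title FROM docs ORDER BY id").fetchall()
```

The `ON CONFLICT ... DO UPDATE` syntax needs SQLite 3.24 or later.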

Release datasette 1.0a2 — An open source multi-tool for exploring and publishing data
Release datasette-ripgrep 0.8 — Web interface for searching your code using ripgrep, built as a Datasette plugin