Simon Willison’s Weblog

1,455 posts tagged “datasette”

Datasette is an open source tool for exploring and publishing data.

2024

Weeknotes: Llama 3, AI for Data Journalism, llm-evals and datasette-secrets

Llama 3 landed on Thursday. I ended up updating a whole bunch of different plugins to work with it, described in Options for accessing Llama 3 from the terminal using LLM.

[... 1,030 words]

Release datasette-secrets 0.1a1 — Manage secrets such as API keys for use with other Datasette plugins
Release datasette-secrets 0.1a0 — Manage secrets such as API keys for use with other Datasette plugins

AI for Data Journalism: demonstrating what we can do with this stuff right now

I gave a talk last month at the Story Discovery at Scale data journalism conference hosted at Stanford by Big Local News. My brief was to go deep into the things we can use Large Language Models for right now, illustrated by a flurry of demos to help provide starting points for further conversations at the conference.

[... 6,081 words]

Release datasette-cors 1.0.1 — Datasette plugin for configuring CORS headers
Release datasette-enrichments-gpt 0.4 — Datasette enrichment for analyzing row data using OpenAI's GPT models

Extracting data from unstructured text and images with Datasette and GPT-4 Turbo. Datasette Extract is a new Datasette plugin that uses GPT-4 Turbo (released to general availability today) and GPT-4 Vision to extract structured data from unstructured text and images.

I put together a video demo of the plugin in action today, and posted it to the Datasette Cloud blog along with screenshots and a tutorial describing how to use it.

# 9th April 2024, 11:03 pm / datasette-cloud, openai, gpt-4, ai, llms, datasette, generative-ai, projects, vision-llms, structured-extraction

Release datasette-public 0.2.3 — Make selected Datasette databases and tables visible to the public
Release datasette-enrichments 0.3.2 — Tools for running enrichments against data stored in Datasette
Release datasette-cors 1.0 — Datasette plugin for configuring CORS headers
Release datasette-embeddings 0.1a3 — Store and query embedding vectors in Datasette tables

datasette-import. A new plugin for importing data into Datasette. This is a replacement for datasette-paste, duplicating and extending its functionality. datasette-paste had grown beyond just dealing with pasted CSV/TSV/JSON data—it handles file uploads as well now—which inspired the new name.

# 6th April 2024, 10:40 pm / projects, datasette, plugins

Release datasette-studio 0.1a3 — Datasette pre-configured with useful plugins. Experimental alpha.
Release datasette-paste 0.1a5 — Paste data to create tables in Datasette
Release datasette-import 0.1a4 — Tools for importing data into Datasette
Release datasette-enrichments-quickjs 0.1a2 — Enrich data with a custom JavaScript function
Release datasette-embeddings 0.1a2 — Store and query embedding vectors in Datasette tables
Release datasette-paste 0.1a4 — Paste data to create tables in Datasette
Release datasette-embeddings 0.1a0 — Store and query embedding vectors in Datasette tables
Release datasette-paste 0.1a3 — Paste data to create tables in Datasette
Release datasette-paste 0.1a2 — Paste data to create tables in Datasette
Release datasette-paste 0.1a1 — Paste data to create tables in Datasette
Release datasette-enrichments 0.3.1 — Tools for running enrichments against data stored in Datasette
Release datasette-studio 0.1a2 — Datasette pre-configured with useful plugins. Experimental alpha.
Release datasette-write 0.3.2 — Datasette plugin providing a UI for executing SQL writes against the database
Release datasette-enrichments 0.3 — Tools for running enrichments against data stored in Datasette

Add ETag header for static responses. I’ve been procrastinating for several years on adding better caching headers for the static assets (JavaScript and CSS) served by Datasette, because I wanted to implement the perfect solution: far-future cache headers on every asset, with URLs that change whenever an asset is updated.

Agustin Bacigalup just submitted the best kind of pull request: he observed that adding ETag support for static assets would side-step the complexity while adding much of the benefit, and implemented it along with tests.

It’s a substantial performance improvement for any Datasette instance with a number of JavaScript plugins... like the ones we are building on Datasette Cloud. I’m just annoyed we didn’t ship something like this sooner!
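To illustrate the general idea, here is a minimal sketch of ETag handling for a static asset. This is not the code from the pull request: the function names, the choice of SHA-256 and the `(status, headers, body)` return shape are all assumptions made for this example. The server derives a tag from the file content, and if the browser sends that same tag back in `If-None-Match` it can answer with an empty 304 instead of the full file.

```python
import hashlib
from typing import Dict, Optional, Tuple


def compute_etag(content: bytes) -> str:
    """Return a quoted ETag derived from the asset's bytes (hypothetical helper)."""
    return '"' + hashlib.sha256(content).hexdigest() + '"'


def static_response(
    content: bytes, if_none_match: Optional[str] = None
) -> Tuple[int, Dict[str, str], bytes]:
    """Build (status, headers, body) for a static asset request.

    if_none_match is the raw value of the request's If-None-Match header,
    or None if the browser did not send one.
    """
    etag = compute_etag(content)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = []
        # The header may list several comma-separated tags, or "*"
        for tag in if_none_match.split(","):
            tag = tag.strip()
            if tag.startswith("W/"):
                # If-None-Match uses weak comparison, so ignore the W/ prefix
                tag = tag[2:]
            candidates.append(tag)
        if "*" in candidates or etag in candidates:
            # The cached copy is still current: send headers but no body
            return 304, headers, b""
    # No match (or first visit): send the full asset along with its ETag
    return 200, headers, content
```

The first request for a file gets a 200 with the full body and an `ETag` header. A repeat request that echoes the tag gets a 304 with no body, and any change to the file's bytes produces a different tag and so a fresh 200. The URLs never need to change, which is the complexity this approach side-steps.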

# 17th March 2024, 7:25 pm / datasette-cloud, datasette, web-performance, etags

Weeknotes: the aftermath of NICAR

NICAR was fantastic this year. Alex and I ran a successful workshop on Datasette and Datasette Cloud, and I gave a lightning talk demonstrating two new GPT-4 powered Datasette plugins—datasette-enrichments-gpt and datasette-extract. I need to write more about the latter one: it enables populating tables from unstructured content (using a variant of this technique) and it’s really effective. I got it working just in time for the conference.

[... 1,430 words]

Release datasette-export-database 0.2.1 — Export a copy of a mutable SQLite database on demand
Release datasette-export-database 0.2 — Export a copy of a mutable SQLite database on demand