Simon Willison on datasette

437 posts tagged “datasette”

Datasette is an open source tool for exploring and publishing data.

2022

Datasette’s new JSON write API: The first alpha of Datasette 1.0

This week I published the first alpha release of Datasette 1.0, with a significant new feature: Datasette core now includes a JSON API for creating and dropping tables and inserting, updating and deleting data.

[... 2,817 words]

11:15 pm / 2nd December 2022 / projects, apis, json, datasette
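
To make the write API entry above a little more concrete, here is a minimal sketch of what building an insert request might look like. The entry itself only says that the API covers creating and dropping tables and inserting, updating and deleting data; the endpoint path (`/<database>/<table>/-/insert`), the `{"rows": [...]}` payload shape, the bearer-token header, and the names `build_insert_request`, `data`, `notes` and `example-token` are all assumptions for illustration and should be checked against the Datasette documentation. Nothing is sent over the network: the code only constructs the request object.

```python
import json
import urllib.request


def build_insert_request(base_url, database, table, rows, token):
    """Build (but do not send) a POST request that would insert rows.

    Assumed, not confirmed by the post excerpt: the URL layout, the
    {"rows": [...]} body and the Bearer token header.
    """
    url = f"{base_url.rstrip('/')}/{database}/{table}/-/insert"
    body = json.dumps({"rows": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical local instance, database, table and token
req = build_insert_request(
    "http://localhost:8001",
    "data",
    "notes",
    [{"id": 1, "body": "hello"}],
    "example-token",
)
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it, assuming a running instance that accepts the token.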

Weeknotes: Implementing a write API, Mastodon distractions

Everything is so distracting at the moment. The ongoing Twitter catastrophe, the great migration (at least amongst most of the people I pay attention to) to Mastodon, the FTX calamity. It’s been very hard to focus!

[... 916 words]

4:57 am / 23rd November 2022 / mastodon, datasette, weeknotes

Tracking Mastodon user numbers over time with a bucket of tricks

Mastodon is definitely having a moment. User growth is skyrocketing as more and more people migrate over from Twitter.

[... 1,534 words]

7 am / 20th November 2022 / github-actions, git-history, mastodon, datasette-lite, s3-credentials, git-scraping, datasette, projects, observable, github, cors

Datasette Lite: Loading JSON data (via) I added a new feature to Datasette Lite: you can now pass it the URL to a JSON file (hosted on a CORS-compatible hosting provider such as GitHub or GitHub Gists) and it will load that file into a database table for you. It expects an array of objects, but if your file has an object as the root it will search through it looking for the first key that is an array of objects and load those instead.

# 18th November 2022, 6:43 pm / datasette-lite, json, projects, datasette, cors
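
The root-object fallback described in the Datasette Lite entry above can be sketched roughly as follows. This is not the Datasette Lite implementation, just an illustration of the described behaviour. It assumes that only top-level keys are searched and that an empty list does not count as an array of objects; the function names `is_array_of_objects` and `find_rows` are hypothetical.

```python
import json
from typing import Any, Optional


def is_array_of_objects(value: Any) -> bool:
    """True for a non-empty list whose items are all dicts (JSON objects)."""
    return (
        isinstance(value, list)
        and len(value) > 0
        and all(isinstance(item, dict) for item in value)
    )


def find_rows(data: Any) -> Optional[list]:
    """Return the rows to load from parsed JSON, or None if none are found.

    An array of objects at the root is used directly. If the root is an
    object, its keys are checked in order and the first value that is an
    array of objects is returned. Assumption: nested objects are not searched.
    """
    if is_array_of_objects(data):
        return data
    if isinstance(data, dict):
        for value in data.values():
            if is_array_of_objects(value):
                return value
    return None


# Root is an object, so the first key holding an array of objects is used
doc = json.loads('{"count": 2, "items": [{"id": 1}, {"id": 2}]}')
rows = find_rows(doc)
```

Python dictionaries preserve insertion order and `json.loads` fills them in document order, so "first key" here means first in the file.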

Datasette is 5 today: a call for birthday presents

Five years ago today I published the first release of Datasette, in Datasette: instantly create and publish an API for your SQLite databases.

[... 548 words]

7:27 pm / 13th November 2022 / projects, datasette

Designing a write API for Datasette

Building out Datasette Cloud has made one thing clear to me: Datasette needs a write API for ingesting new data into its attached SQLite databases.

[... 1,493 words]

7:44 pm / 9th November 2022 / api-design, projects, datasette, sqlite-utils, weeknotes, datasette-cloud

Datasette 0.63: The annotated release notes

I released Datasette 0.63 today. These are the annotated release notes.

[... 1,531 words]

10:13 pm / 27th October 2022 / projects, annotated-release-notes, datasette

Weeknotes: DjangoCon, SQLite in Django, datasette-gunicorn

I spent most of this week at DjangoCon in San Diego—my first outside-of-the-Bay-Area conference since the before-times.

[... 1,184 words]

7:58 pm / 23rd October 2022 / djangocon, sqlite, datasette, projects, gunicorn, django, my-talks, shot-scraper, weeknotes, carlton-gibson

Measuring traffic during the Half Moon Bay Pumpkin Festival

This weekend was the 50th annual Half Moon Bay Pumpkin Festival.

[... 2,693 words]

3:41 pm / 19th October 2022 / datasette-lite, projects, git-scraping, data-journalism, git-history, datasette, sqlite, half-moon-bay, natalie-downe

Automating screenshots for the Datasette documentation using shot-scraper

I released shot-scraper back in March as a tool for keeping screenshots in documentation up-to-date.

[... 1,810 words]

11:44 pm / 14th October 2022 / shot-scraper, datasette, documentation, github-actions

Weeknotes: Publishing data using Datasette Cloud

My initial preview releases of Datasette Cloud (the SaaS version of my Datasette open source project) have focused on private data collaboration.

[... 582 words]

8:08 pm / 12th October 2022 / datasette-cloud, plugins, datasette, projects, weeknotes

Figure out how to serve an AWS Lambda function with a Function URL from a custom subdomain (via) This took me five hours and 77 issue comments to figure out, but I finally managed to serve an AWS Lambda function running Datasette on a custom subdomain with an HTTPS certificate. I was going to write this up as a TIL but I’m exhausted so I decided to share my private notes thread instead.

# 3rd October 2022, 12:29 am / aws, lambda, datasette

Weeknotes: Datasette Cloud preview invitations

This week I finally started sending out invitations for people to try out the preview of the new Datasette Cloud, my SaaS offering for Datasette.

[... 713 words]

11:05 pm / 30th September 2022 / projects, datasette-cloud, datasette, weeknotes

Exploring 10m scraped Shutterstock videos used to train Meta’s Make-A-Video text-to-video model

Make-A-Video is a new “state-of-the-art AI system that generates videos from text” from Meta AI. It looks incredible—it really is DALL-E / Stable Diffusion for video. And it appears to have been trained on 10m video preview clips scraped from Shutterstock.

[... 923 words]

7:31 pm / 29th September 2022 / machine-learning, datasette, facebook, projects, ai, generative-ai, ethics, training-data, text-to-video, ai-ethics

Deploying Python web apps as AWS Lambda functions. After literally years of failed half-hearted attempts, I finally managed to deploy an ASGI Python web application (Datasette) to an AWS Lambda function! Here are my extensive notes.

# 19th September 2022, 4:05 am / serverless, lambda, datasette, python, aws, asgi

Weeknotes: Datasette Lite, s3-credentials, shot-scraper, datasette-edit-templates and more

Despite distractions from AI I managed to make progress on a bunch of different projects this week, including new releases of s3-credentials and shot-scraper, a new datasette-edit-templates plugin and a small but neat improvement to Datasette Lite.

[... 1,562 words]

2:55 am / 16th September 2022 / shot-scraper, datasette, plugins, datasette-lite, projects, s3-credentials, weeknotes, github-copilot

Spevktator: OSINT analysis tool for VK. This is a really cool project that came out of a recent Bellingcat hackathon. Spevktator takes 67,000 posts from five popular Russian news channels on VK (a popular Russian social media platform) and makes them available in Datasette, along with automated translations to English, post sharing metrics and sentiment analysis scores. This README includes some detailed analysis of the data, plus a link to an Observable notebook that implements custom visualizations driven by queries run directly against the Datasette instance.

# 5th September 2022, 8:48 pm / political-hacking, bellingcat, datasette, observable

Exploring the training data behind Stable Diffusion

Two weeks ago, the Stable Diffusion image generation model was released to the public. I wrote about this last week, in Stable Diffusion is a really big deal—a post which has since become one of the top ten results for “stable diffusion” on Google and shown up in all sorts of different places online.

[... 2,897 words]

12:18 am / 5th September 2022 / stable-diffusion, datasette, sqlite-utils, andy-baio, search, ai, weeknotes, fly, generative-ai, ethics, laion, training-data, text-to-image, ai-ethics

Building a searchable archive for the San Francisco Microscopical Society

The San Francisco Microscopical Society was founded in 1870 by a group of scientists dedicated to advancing the field of microscopy.

[... 1,845 words]

5:24 pm / 25th August 2022 / ocr, projects, datasette, weeknotes, pdf

Digitizing 55,000 pages of civic meetings (via) Philip James has been building public, searchable archives of city council meetings for various cities (Oakland and Alameda so far) using my s3-ocr script to run Textract OCR against the PDFs of the minutes, and deploying them to Fly using Datasette. This is a really cool project, and very much the kind of thing I’ve been hoping to support with the tools I’ve been building.

# 22nd August 2022, 4:26 pm / archiving, ocr, fly, datasette, political-hacking

Analyzing ScotRail audio announcements with Datasette—from prototype to production

Scottish train operator ScotRail released a two-hour-long MP3 file containing all of the components of its automated station announcements. Messing around with them is proving to be a huge amount of fun.

[... 4,428 words]

2:04 am / 21st August 2022 / datasette-lite, projects, datasette

The Datasette Newsletter: Datasette Lite, Datasette Tutorials, Datasette Cloud. It’s been quite a while since I’ve sent one of these out now—hoping to get this on to a more regular schedule.

# 19th August 2022, 1:20 am / datasette

Plugin support for Datasette Lite

I’ve added a new feature to Datasette Lite, my distribution of Datasette that runs entirely in the browser using Python and SQLite compiled to WebAssembly. You can now install additional Datasette plugins by passing them in the URL.

[... 865 words]

6:20 pm / 17th August 2022 / webassembly, datasette, plugins, datasette-lite, projects, pypi, pyodide, cors

Litestream backups for Datasette Cloud (and weeknotes)

My main focus this week has been adding robust backups to the forthcoming Datasette Cloud.

[... 1,604 words]

5:19 pm / 11th August 2022 / litestream, gpt-3, datasette-cloud, dalle, datasette, ocr, s3, weeknotes, fly

datasette on Open Source Insights (via) Open Source Insights is "an experimental service developed and hosted by Google to help developers better understand the structure, security, and construction of open source software packages". It calculates scores for packages using various automated heuristics. A JSON version of the resulting score card can be accessed using https://deps.dev/_/s/pypi/p/{package_name}/v/

# 11th August 2022, 1:06 am / open-source, security, datasette

Introducing sqlite-html: query, parse, and generate HTML in SQLite (via) Another brilliant SQLite extension module from Alex Garcia, this time written in Go. sqlite-html adds a whole family of functions to SQLite for parsing and constructing HTML strings, built on the Go goquery and cascadia libraries. Once again, Alex uses an Observable notebook to describe the new features, with embedded interactive examples that are backed by a Datasette instance running in Fly.

# 3rd August 2022, 5:31 pm / sqlite, datasette, go, html, alex-garcia

Cleaning data with sqlite-utils and Datasette (via) I wrote a new tutorial for the Datasette website, showing how to use sqlite-utils to import a CSV file, clean up the resulting schema, fix date formats and extract some of the columns into a separate table. It’s accompanied by a ten-minute video originally recorded for the HYTRADBOI conference.

# 31st July 2022, 7:57 pm / tutorials, sqlite-utils, datasette, documentation