Simon Willison’s Weblog

November 2022

64 posts: 8 entries, 18 links, 3 quotes, 35 beats

Nov. 1, 2022

TIL GitHub Pages: The Missing Manual — [GitHub Pages](https://pages.github.com/) is an excellent free hosting platform, but the documentation is missing out on some crucial details.

RFC 7807: Problem Details for HTTP APIs (via) This RFC has been brewing for quite a while, and is currently in last call (ends 2022-11-03). I’m designing the JSON error messages for Datasette at the moment so this could not be more relevant for me.

# 3:15 am / errors, http, json, mark-nottingham, rfc, standards
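
As a rough illustration of the shape RFC 7807 describes, here is a minimal sketch that builds a problem details object with the standard `type`, `title`, `status`, `detail` and `instance` members. The helper name `problem_details` and the `errors` extension member are hypothetical, chosen for this example; this is not Datasette's actual error format.

```python
import json

# Media type registered by RFC 7807 for JSON problem details.
PROBLEM_JSON = "application/problem+json"


def problem_details(status, title, detail=None, type_uri="about:blank",
                    instance=None, **extensions):
    """Build an RFC 7807 problem details object as a plain dict.

    Extra keyword arguments become extension members, which the RFC allows.
    """
    problem = {"type": type_uri, "title": title, "status": status}
    if detail is not None:
        problem["detail"] = detail
    if instance is not None:
        problem["instance"] = instance
    problem.update(extensions)
    return problem


# "errors" is a hypothetical extension member, not part of the RFC itself.
problem = problem_details(
    400,
    "Invalid row",
    detail="Column 'age' must be an integer",
    errors=[{"column": "age", "message": "expected an integer"}],
)
body = json.dumps(problem)
```

A server would send `body` with a `Content-Type` of `application/problem+json` and an HTTP status matching the `status` member.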

Nov. 3, 2022

TIL Getting Mastodon running on a custom domain — This TIL is mainly a rehash of these two articles by Jacob and Andrew:

Nov. 4, 2022

Don’t Read Off The Screen (via) Stuart Langridge provides a fantastic set of public speaking tips in a five minute lightning talk remix of Sunscreen. Watch with sound.

# 4:02 pm / speaking, stuart-langridge

Nov. 5, 2022

TIL Export a Mastodon timeline to SQLite — I've been playing around with [the Mastodon timelines API](https://docs.joinmastodon.org/methods/timelines/). It's pretty fun!

Nikodemus’ Guide to Mastodon (via) I’ve been reading a bunch of different Mastodon guides and this one had pretty much exactly the information I needed to see when I first started out.

# 4:18 am / mastodon

It looks like I’m moving to Mastodon

Elon Musk laid off about half of Twitter this morning. There are many terrible stories emerging about how this went down, but one that particularly struck me was that he laid off the entire accessibility team. For me this feels like a microcosm of the whole situation. Twitter’s priorities are no longer even remotely aligned with my own.

[... 1,546 words]

GOV.UK: Rules for getting production access (via) Fascinating piece of internal documentation on GOV.UK describing their rules, procedures and granted permissions for their deployment and administrative ops roles.

# 6:25 pm / security, gov-uk

Nov. 6, 2022

What to blog about

You should start a blog. Having your own little corner of the internet is good for the soul!

[... 520 words]

Nov. 7, 2022

Blessed.rs Crate List (via) Rust doesn’t have a very large standard library, so part of learning Rust is figuring out which third-party crates are best for tackling common problems. This is an opinionated guide to those crates, and it looks like it could be really useful.

# 7:25 pm / rust

Nov. 8, 2022

Mastodon is just blogs

And that’s great. It’s also the return of Google Reader!

[... 1,560 words]

Nov. 9, 2022

Designing a write API for Datasette

Building out Datasette Cloud has made one thing clear to me: Datasette needs a write API for ingesting new data into its attached SQLite databases.

[... 1,493 words]

Inside the mind of a frontend developer: Hero section. Ahmad Shadeed provides a fascinating, hyper-detailed breakdown of his approach to implementing a “hero section” component using HTML and CSS, including notes on CSS grids and gradient backgrounds.

# 7:54 pm / css, ahmad-shadeed

Semantic text search using embeddings. Example Python notebook from OpenAI demonstrating how to build a search engine using embeddings rather than straight up token matching. This is a fascinating way of implementing search, providing results that match the intent of the search (“delicious beans” for example) even if none of the keywords are actually present in the text.

# 7:57 pm / machine-learning, search, openai, embeddings
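
The core ranking step behind this kind of search can be sketched in a few lines: embed every document and the query as vectors, then sort documents by cosine similarity to the query. The three-dimensional vectors below are made-up stand-ins, assumed purely for illustration; real embeddings come from a model and have hundreds or thousands of dimensions. This is a sketch of the general idea, not the notebook's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, documents, top_n=3):
    """Return the top_n (score, text) pairs, most similar first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in documents]
    scored.sort(reverse=True)
    return scored[:top_n]


# Toy vectors standing in for model-generated embeddings (assumption).
documents = [
    ("Great tasting black beans", [0.9, 0.1, 0.0]),
    ("The can arrived dented", [0.1, 0.9, 0.1]),
    ("Fast shipping", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]  # pretend this is the embedding of "delicious beans"

results = search(query, documents, top_n=1)
```

Because the comparison happens in vector space, a document can rank first without sharing any keyword with the query.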

PyScript Updates: Bytecode Alliance, Pyodide, and MicroPython. Absolutely huge news about Python on the Web tucked into this announcement: Anaconda have managed to get a version of MicroPython compiled to WebAssembly running in the browser. Pyodide weighs in at around 6.5MB compressed, but the MicroPython build is just 303KB—the size of a large image. This makes Python in the web browser applicable to so many more potential areas.

# 10:26 pm / python, webassembly, pyodide

Nov. 11, 2022

Home invasion: Mastodon’s Eternal September begins. Hugh Rundle’s thoughtful write-up of the impact of the massive influx of new users from Twitter on the existing Mastodon community. If you’re new to Mastodon (like me) you should read this and think carefully about how best to respectfully integrate with your new online space.

# 12:47 am / mastodon

Release datasette 0.63.1 — An open source multi-tool for exploring and publishing data

Nov. 13, 2022

Datasette is 5 today: a call for birthday presents

Five years ago today I published the first release of Datasette, in Datasette: instantly create and publish an API for your SQLite databases.

[... 548 words]

TIL Generating OpenAPI specifications using GPT-3 — I wanted to start playing around with [OpenAPI](https://www.openapis.org/). I decided to see if I could get GPT-3 to generate the first version of a specification for me.

Nov. 14, 2022

TIL JSON Pointer — I'm [looking at options](https://github.com/simonw/datasette/issues/1875) for representing JSON validation errors in more JSON. The recent [RFC 7807: Problem Details for HTTP APIs](https://datatracker.ietf.org/doc/draft-ietf-httpapi-rfc7807bis/) looks relevant here.
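
For readers unfamiliar with JSON Pointer (RFC 6901), here is a minimal sketch of how a pointer such as `/rows/1/id` resolves against a parsed document. The function name `resolve_pointer` and the sample data are hypothetical, and the sketch skips some details of the RFC, such as rejecting array indexes with leading zeros.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON data."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("JSON Pointer must be empty or start with '/'")
    current = document
    for raw in pointer[1:].split("/"):
        # Unescape in the order the RFC specifies: ~1 first, then ~0.
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Hypothetical payload in which the second row has an invalid id.
payload = {"rows": [{"id": 1, "name": "a"}, {"id": "x", "name": "b"}], "a/b": 1}

bad_value = resolve_pointer(payload, "/rows/1/id")
```

A validation error could then carry the pointer string to tell the client exactly which value was rejected.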

Nov. 15, 2022

TIL Writing tests with Copilot — I needed to write a relatively repetitive collection of tests, for a number of different possible error states.
TIL HTML datalist — A [Datasette feature suggestion](https://github.com/simonw/datasette/issues/1890) concerning autocomplete against a list of known values inspired me to learn how to use the HTML `<datalist>` element ([see MDN](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/datalist)).

Nov. 16, 2022

JSON Changelog with SQLite (via) One of my favourite database challenges is how to track changes to rows over time. This is a neat recipe from 2018 which uses SQLite triggers and the SQLite JSON functions to serialize older versions of the rows and store them in TEXT columns.

# 3:41 am / databases, json, sqlite
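
The general pattern can be sketched with Python's `sqlite3` module: an `AFTER UPDATE` trigger serializes the old row with `json_object()` and stores it in a TEXT column. The table, column and trigger names here are hypothetical, and this is a simplified sketch of the idea rather than the linked recipe. It assumes a SQLite build with the JSON functions available.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);

-- One changelog row per update; "previous" holds the old row as JSON text.
CREATE TABLE users_changelog (
    id INTEGER PRIMARY KEY,
    user_id INTEGER,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP,
    previous TEXT
);

CREATE TRIGGER users_track_update
AFTER UPDATE ON users
BEGIN
    INSERT INTO users_changelog (user_id, previous)
    VALUES (OLD.id, json_object('name', OLD.name, 'email', OLD.email));
END;
""")

conn.execute("INSERT INTO users (name, email) VALUES ('Ana', 'ana@example.com')")
conn.execute("UPDATE users SET email = 'ana@example.org' WHERE id = 1")

# The changelog now holds the row as it looked before the update.
row = conn.execute("SELECT user_id, previous FROM users_changelog").fetchone()
previous = json.loads(row[1])
```

Storing the whole previous row keeps the trigger simple; storing only the changed columns would save space at the cost of a more involved trigger.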

fasiha/yamanote (via) Yamanote is “a guerrilla bookmarking server” by Ahmed Fasih. It works using a bookmarklet that grabs a full serialized copy of the page (the innerHTML of both the head and body elements) and passes it to the server, which stores it in a SQLite database. The files are then served with a `Content-Security-Policy: default-src 'self'` header to prevent stored pages from fetching ANY external assets when they are viewed.

# 3:48 am / bookmarks, sqlite, content-security-policy

TIL How to create a tarball of a git repository using "git archive" — I figured this out in [a Gist in 2016](https://gist.github.com/simonw/a44af92b4b255981161eacc304417368) which has attracted a bunch of comments over the years. Now I'm upgrading it to a retroactive TIL.
TIL Verifying your GitHub profile on Mastodon — Mastodon has a really neat way of implementing verification, using the [rel=me microformat](https://microformats.org/wiki/rel-me).

These kinds of biases aren’t so much a technical problem as a sociotechnical one; ML models try to approximate biases in their underlying datasets and, for some groups of people, some of these biases are offensive or harmful. That means in the coming years there will be endless political battles about what the ‘correct’ biases are for different models to display (or not display), and we can ultimately expect there to be as many approaches as there are distinct ideologies on the planet. I expect to move into a fractal ecosystem of models, and I expect model providers will ‘shapeshift’ a single model to display different biases depending on the market it is being deployed into. This will be extraordinarily messy.

Jack Clark

# 11:04 pm / machine-learning, ai, generative-ai, jack-clark, llms

Nov. 18, 2022

Release datasette-search-all 1.1.1 — Datasette plugin for searching all searchable tables at once
Release datasette-ripgrep 0.7.1 — Web interface for searching your code using ripgrep, built as a Datasette plugin
Release datasette-socrata 0.3.1 — Import data from Socrata into Datasette
