Simon Willison’s Weblog

December 2019

Dec. 3, 2019

datasette-atom: Define an Atom feed using a custom SQL query

I’ve been having a ton of fun iterating on www.niche-museums.com. I put together some notes on how the site works last week, and I’ve been taking advantage of the Thanksgiving break to continue exploring ways in which Datasette can be used to quickly build database-backed static websites.

[... 1,084 words]

Let’s agree that no matter what we call the situation that the humans who are elsewhere are at a professional disadvantage. There is a communication, culture, and context tax applied to the folks who are distributed. Your job as a leader is to actively invest in reducing that tax.

Michael Lopp

# 1:34 pm / meetings, rands, management, remote, leadership

Dec. 4, 2019

flk: A LISP that runs wherever Bash is (via) This is a heck of a project: an implementation of LISP written entirely in Bash, meaning you can run it as a script on any machine that has a Bash installation.

# 5:19 am / lisp, bash

Dec. 5, 2019

Two malicious Python libraries caught stealing SSH and GPG keys. Nasty. Two typosquatting libraries were spotted on PyPI, targeting dateutil and jellyfish with tricky variants of their names. They attempted to exfiltrate SSH and GPG keys and send them to a server identified by its IP address. npm has seen this kind of activity too, so it’s important to consider this when installing packages.

# 6:07 am / npm, security, pypi

Dec. 10, 2019

Better presentations through storytelling and STAR moments

Last week I completed GSBGEN 315: Strategic Communication at the Stanford Graduate School of Business.

[... 643 words]

The Blue Tape List (via) I’ve often thought there’s something magical about your first month at a new job—you can meet anyone and ask any question, taking advantage of your “newbie” status. I like this suggestion by Michael Lopp to encourage your new hires to take notes on things that they think are broken but reserve acting on them for long enough to gain fuller context of how the new organization works.

# 6:09 pm / management, rands

Dec. 12, 2019

London Silver Vaults on Niche Museums. I’m keeping up my streak of posting a new museum I’ve visited to niche-museums.com daily—today’s entry is the London Silver Vaults, which I think are one of London’s best kept secrets: 30 specialist silver merchants in a network of vaults five storeys below Chancery Lane.

# 2:40 am / london, museums

Dec. 16, 2019

Monarch Bear Grove on Niche Museums (via) Monarch Bear Grove is my favourite hidden corner of Golden Gate Park in San Francisco. It has stone circles formed from pieces of a Spanish monastery that was exported to the USA by press baron William Randolph Hearst. And there are druids. You should read the whole thing. (I added paragraph breaks for this using datasette-render-markdown—Niche Museums is basically a full-blown blog now.)

# 9:19 pm / projects, san-francisco, museums

Logging to SQLite using ASGI middleware

I had some fun playing around with ASGI middleware and logging during our flight back to England for the holidays.

[... 2,535 words]
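
As a rough illustration of the idea (not the code from the post), here is a minimal sketch of ASGI middleware that writes one row per HTTP request to SQLite. The class name `SQLiteLogMiddleware`, the `requests` table and its columns are all assumptions made for this example, and the write happens synchronously inside the event loop, which a real deployment would probably want to avoid.

```python
import asyncio
import sqlite3
import time


class SQLiteLogMiddleware:
    """Hypothetical middleware: wraps an ASGI app and logs each HTTP request."""

    def __init__(self, app, db_path):
        self.app = app
        self.conn = sqlite3.connect(db_path)
        # Assumed schema, invented for this sketch.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS requests "
            "(path TEXT, method TEXT, status INTEGER, duration_ms REAL)"
        )

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan/websocket traffic straight through.
            await self.app(scope, receive, send)
            return

        start = time.perf_counter()
        status = None

        async def wrapped_send(message):
            # Capture the status code as the response starts.
            nonlocal status
            if message["type"] == "http.response.start":
                status = message["status"]
            await send(message)

        try:
            await self.app(scope, receive, wrapped_send)
        finally:
            duration_ms = (time.perf_counter() - start) * 1000
            with self.conn:  # commits the insert
                self.conn.execute(
                    "INSERT INTO requests VALUES (?, ?, ?, ?)",
                    (scope["path"], scope["method"], status, duration_ms),
                )


async def hello_app(scope, receive, send):
    """Tiny ASGI app used only to exercise the middleware."""
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"hello"})


async def demo():
    app = SQLiteLogMiddleware(hello_app, ":memory:")

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        pass  # discard the response in this demo

    scope = {"type": "http", "method": "GET", "path": "/demo"}
    await app(scope, receive, send)
    return app.conn.execute(
        "SELECT path, method, status FROM requests"
    ).fetchall()


rows = asyncio.run(demo())
print(rows)  # [('/demo', 'GET', 200)]
```

Wrapping `send` is what lets the middleware see the status code without touching the wrapped application.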

Dec. 18, 2019

Microbrowsers are Everywhere (via) Colin Bendell introduces a new-to-me term, “microbrowsers”, to describe the user-agents which hit websites to generate unfurled link previews in messenger apps. Twitter and Facebook first popularized them, but today you’re likely getting far more preview-generating traffic from chat clients such as iMessage, WhatsApp and Slack (which won’t execute script and ignore cookies, and hence won’t show up in Google Analytics). Lots of great tips here—one example: if you provide three og:image meta tags iMessage will render them as a collage.
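
To make the og:image tip concrete, here is a small sketch that renders one `og:image` meta tag per image. The helper name `og_image_tags` and the example.com URLs are made up for illustration, and how any given client lays out multiple images is up to that client.

```python
from html import escape


def og_image_tags(image_urls):
    """Return one <meta property="og:image"> tag per URL, in order."""
    return "\n".join(
        f'<meta property="og:image" content="{escape(url, quote=True)}">'
        for url in image_urls
    )


# Hypothetical image URLs, for illustration only.
tags = og_image_tags([
    "https://example.com/a.jpg",
    "https://example.com/b.jpg",
    "https://example.com/c.jpg",
])
print(tags)
```

The output belongs in the page’s `<head>`; escaping the URL keeps query strings containing `&` from producing invalid HTML.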

# 8:32 am / urls, 24-ways, metadata

GitHub Actions ci.yml for deno. Spotted this today: it’s one of the cleanest examples I’ve seen of a complex CI configuration for GitHub Actions, testing, linting, benchmarking and building Ryan Dahl’s Deno JavaScript runtime.

# 8:49 am / continuous-integration, ryan-dahl, github, deno, github-actions

athena-sqlite (via) Amazon Athena is the AWS tool for querying data stored in S3 (as CSV, JSON or Apache Parquet files) using SQL. It’s an interesting way of building a very cheap data warehouse on top of S3 without having to run any additional services. Athena recently added a query federation SDK which lets you define additional custom data sources using Lambda functions. Damon Cortesi used this to write a custom connector for SQLite, which lets you run queries against data stored in SQLite files that you have uploaded to S3. You can then run joins between that data and other Athena sources.

# 9:05 am / sqlite, sql, aws, athena, s3

Dec. 20, 2019

Building tools to bring data-driven reporting to more newsrooms. I wrote about my fellowship project so far and my goals for the next few months for the JSK Medium publication. My next priority: an invite-only hosted version for newsrooms so that figuring out how to install and manage the software isn’t the biggest barrier to entry.

# 11:17 am / jsk, data-journalism, datasette

Dec. 23, 2019

The Guardian’s nifty old-article trick is a reminder of how news organizations can use metadata to limit misinformation (via) The Guardian displays prominent banners on news stories from more than a year ago, warning that each is an older article, to help prevent accidental or intentional spread of misinformation using their content as ammunition. Impressively, they also display the year prominently on the card images they serve as social media previews for older articles.

# 9:36 am / guardian, news, misinformation

Weeknotes: Datasette 0.33

I released Datasette 0.33 yesterday. The release represents an accumulation of small changes and features since Datasette 0.32 back in November. Duplicating the release notes:

[... 678 words]

Dec. 26, 2019

free-for.dev (via) It’s pretty amazing how much you can build on free tiers these days—perfect for experimenting with side-projects. free-for.dev collects free SaaS tools for developers via pull request, and has had contributions from over 500 people.

# 10:03 am / tools, crowdsourcing

For creative work, you can't cheat. My belief is that there are 5 creative hours in everyone's day. All I ask of people at Shopify is that 4 of those are channeled into the company.

Tobi Lutke

# 7:06 pm / productivity, management

Dec. 30, 2019

sqlite-utils 2.0: real upserts

I just released version 2.0 of my sqlite-utils library/CLI tool to PyPI.

[... 1,140 words]
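
For readers unfamiliar with the distinction in the title, here is a sketch in plain `sqlite3` of how a replace differs from an upsert. It is not taken from sqlite-utils: the `dogs` table and the helper names `replace_row` and `upsert_row` are invented for this example, and `INSERT OR IGNORE` followed by `UPDATE` is just one common way to get upsert semantics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, for illustration only.
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("INSERT INTO dogs VALUES (1, 'Cleo', 4)")


def replace_row(conn, row):
    """INSERT OR REPLACE: the old row is deleted, so omitted columns become NULL."""
    cols = ", ".join(row)  # column names come from trusted code, not user input
    params = ", ".join(f":{c}" for c in row)
    conn.execute(f"INSERT OR REPLACE INTO dogs ({cols}) VALUES ({params})", row)


def upsert_row(conn, row):
    """Create the row if it is missing, then update only the supplied columns."""
    conn.execute("INSERT OR IGNORE INTO dogs (id) VALUES (:id)", {"id": row["id"]})
    cols = [c for c in row if c != "id"]
    if cols:
        assignments = ", ".join(f"{c} = :{c}" for c in cols)
        conn.execute(f"UPDATE dogs SET {assignments} WHERE id = :id", row)


def fetch(conn, dog_id):
    return conn.execute(
        "SELECT id, name, age FROM dogs WHERE id = ?", (dog_id,)
    ).fetchone()


upsert_row(conn, {"id": 1, "age": 5})
after_upsert = fetch(conn, 1)
print(after_upsert)  # (1, 'Cleo', 5): name survives

upsert_row(conn, {"id": 2, "name": "Pancakes"})
new_row = fetch(conn, 2)
print(new_row)  # (2, 'Pancakes', None): row created

replace_row(conn, {"id": 1, "age": 6})
after_replace = fetch(conn, 1)
print(after_replace)  # (1, None, 6): name lost
```

The upsert leaves columns you did not mention untouched, while the replace silently resets them.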

Machine Learning on Mobile and at the Edge: 2019 industry year-in-review (via) This is a fantastic detailed overview of advances made in the field of machine learning on the edge (primarily on mobile devices) over 2019. I’m really excited about this trend: I love the improved privacy implications of running models on my phone without uploading data to a server, and it’s great to see techniques like Federated Learning (from Google Labs) which enable devices to privately train models in a distributed way without having to upload their training data.

# 10:17 pm / machine-learning, mobile

Guide To Using Reverse Image Search For Investigations (via) Detailed guide from Bellingcat’s Aric Toler on using reverse image search for investigative reporting. Surprisingly Google Image Search isn’t the state of the art: Russian search engine Yandex offers a much more powerful solution, mainly because it’s the largest public-facing image search engine to integrate scary levels of face recognition.

# 10:23 pm / bellingcat, journalism, search

Scaling React Server-Side Rendering (via) Outstanding, detailed essay from 2017 on challenges and solutions for scaling React server-side rendering at Kijiji, Canada’s largest classified site (owned by eBay). There’s a lot of great stuff in here, including a detailed discussion of different approaches to load balancing, load shedding, component caching, client-side rendering fallbacks and more.

# 10:26 pm / scaling, react, load-balancing
