Simon Willison’s Weblog

November 2017

Nov. 1, 2017

Live htop. Neat, simplest-thing-that-could-possibly-work implementation of a tool that continually pipes the output of the htop command to a browser over a WebSocket. The htopgen.sh script loops every 2 seconds, runs htop, pipes it through a utility to convert the output to HTML and writes that to a file. Then the server.js Node.js script watches for changes to that file and pipes the entire file contents to the browser via socket.io. The index.html page in the browser subscribes to the WebSocket and updates the entire page using innerHTML every time it receives an event.
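
The watch-and-rebroadcast step can be sketched in a few lines of Python. This is an illustration only, not the project's server.js: `FileBroadcaster` and its methods are hypothetical names, plain callbacks stand in for WebSocket clients, and the sketch polls and compares file contents instead of using a filesystem watcher.

```python
import os
import tempfile


class FileBroadcaster:
    """Hypothetical stand-in for server.js: resend the whole file on change."""

    def __init__(self, path):
        self.path = path
        self.subscribers = []   # callbacks standing in for connected browsers
        self._last_seen = None  # contents sent by the most recent broadcast

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def poll(self):
        """Read the file and broadcast it if it differs from the last send."""
        with open(self.path, encoding="utf-8") as f:
            contents = f.read()
        if contents == self._last_seen:
            return False
        self._last_seen = contents
        for callback in self.subscribers:
            callback(contents)  # send the entire file, like the original
        return True


received = []
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "htop.html")
    with open(path, "w", encoding="utf-8") as f:
        f.write("<pre>frame 1</pre>")
    broadcaster = FileBroadcaster(path)
    broadcaster.subscribe(received.append)
    broadcaster.poll()  # first frame is sent
    broadcaster.poll()  # file unchanged, nothing is sent
    with open(path, "w", encoding="utf-8") as f:
        f.write("<pre>frame 2</pre>")
    broadcaster.poll()  # second frame is sent

print(received)
```

Sending the full contents each time keeps the client trivial, since it can just replace the page body with whatever arrives.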

# 6:07 pm / websockets, nodejs

Hemingway Editor. Hemingway is a web-based editor that applies style checks to your writing. It looks for complicated words, unnecessary adverbs and sentences that are hard to read. It highlighted the previous sentence as hard to read. It gave this whole paragraph a Grade 8 readability score.

# 8:38 pm / writing

Nov. 2, 2017

A Minimalist Guide to SQLite. Pretty comprehensive actually—covers the sqlite3 command line app, importing CSVs, integrating with Python, Pandas and Jupyter notebooks, visualization and more.
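
As a rough illustration of the CSV import and Python integration steps, here is a minimal sketch using only the standard library `csv` and `sqlite3` modules. The table name, column names and data are made up for the example, and `load_csv` is a hypothetical helper, not something taken from the guide.

```python
import csv
import io
import sqlite3

# Hypothetical CSV data; a real script would open a file on disk instead.
CSV_TEXT = """name,city,score
Ada,London,91
Grace,New York,88
Linus,Helsinki,75
"""


def load_csv(conn, table, csv_file):
    """Create `table` from the CSV header row and insert every data row."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join('"{}"'.format(name) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
    conn.executemany(
        'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), reader
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_csv(conn, "people", io.StringIO(CSV_TEXT))

# Values arrive as text, so cast before comparing numerically.
rows = conn.execute(
    "SELECT name, city FROM people "
    "WHERE CAST(score AS INTEGER) > 80 ORDER BY name"
).fetchall()
print(rows)
```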

# 1:23 am / sqlite, pandas, jupyter, python

The idea that our 5 committees would sanction further cute graphic characters based on this should embarrass absolutely everyone who votes yes on such an excrescence. Will we have a CRYING PILE OF POO next? PILE OF POO WITH TONGUE STICKING OUT? PILE OF POO WITH QUESTION MARKS FOR EYES? PILE OF POO WITH KARAOKE MIC? Will we have to encode a neutral FACELESS PILE OF POO?

Michael Everson

# 4:41 pm / unicode, emoji

I'm concerned that this character will open the floodgates for an open-ended set of PILE OF POO emoji with emotions, such as CRYING PILE OF POO, PILE OF POO WITH LOOK OF TRIUMPH, PILE OF POO SCREAMING IN FEAR, etc. Is there really any need to add a range of emotions to PILE OF POO? I personally think that changing PILE OF POO to a de facto SMILING PILE OF POO was wrong, but adding FROWNING PILE OF POO as a counterpart is even worse. If this is accepted then there will be no neutral, expressionless PILE OF POO, so at least a PILE OF POO WITH NO FACE would be required to be encoded to restore some balance.

Andrew West

# 4:45 pm / unicode, emoji

How Adversarial Attacks Work. Adversarial attacks against machine learning classifiers involve constructing an input that deliberately produces the wrong classification. This article shows how these can be constructed, and includes examples generated using PyTorch which produce a sports car that gets identified as a toaster and a photo of Sylvester Stallone that gets classified as Keanu Reeves.
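
To make the idea concrete, here is a toy sketch in plain Python rather than PyTorch. It applies one gradient-sign step (in the style of the fast gradient sign method, which may not be the exact method the article uses) to a tiny logistic classifier. The weights, bias, input and epsilon are all made up for illustration.

```python
import math

# Hypothetical parameters for a toy two-class linear classifier.
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = -0.25


def predict_proba(x):
    """Probability that input x belongs to class 1."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    return (value > 0) - (value < 0)


def adversarial_example(x, true_label, epsilon):
    """Nudge each feature by epsilon in the direction that increases the loss."""
    p = predict_proba(x)
    # For logistic loss, the gradient with respect to the input is (p - y) * w.
    gradient = [(p - true_label) * w for w in WEIGHTS]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


original = [0.4, 0.3, 0.2]
adversarial = adversarial_example(original, true_label=1, epsilon=0.3)

print(predict_proba(original) > 0.5)     # classified as class 1
print(predict_proba(adversarial) > 0.5)  # classification flips to class 0
```

No feature moves by more than epsilon, yet the predicted class changes. Attacks on image classifiers rely on the same effect across many more dimensions.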

# 8:25 pm / machine-learning, python

Nov. 3, 2017

Connecting to Google Sheets with Python. Useful guide to interacting with Google Sheets via the gspread Python library, including how to work with Google’s unintuitive “service account keys”.

# 4:13 am / googlespreadsheet, python

The Story Behind the Chicago Newspaper That Bought a Bar (via) Absolutely fascinating story—the Chicago Sun-Times bought a bar back in 1976 to investigate corrupt city inspectors, staffing it with journalists and with photographers hidden in a back room.

# 3:27 pm / journalism

Nov. 4, 2017

My blog: Items tagged “askmetafilter”. I’ve imported all 55 of my answers to questions on Ask MetaFilter (to accompany my previous Quora import) going back to 2005.

# 4:43 am / ask-metafilter

Apple reserves new emojis for point releases, instead of major upgrades, to incentivize people to keep updating. Very smart strategy.

SwiftOnSecurity

# 4:15 pm / apple, emoji

Using “import refs” to iteratively import data into Django

I’ve been writing a few scripts to backfill my blog with content I originally posted elsewhere. So far I’ve imported answers I posted on Quora (background), answers I posted on Ask MetaFilter and content I recovered from the Internet Archive.

[... 559 words]

Animoji karaoke performing Bohemian Rhapsody (via) Animoji just might be the most important advance in computer science in a decade.

# 7:29 pm / emoji

gillyb/sensitive: A native desktop version of the kibana sense plugin. I love using the Sense UI for developing against Elasticsearch, but it’s infuriatingly hard to obtain these days. You can install it as a Kibana plugin but I work with multiple Elasticsearch instances and I don’t want to have to get it installed on all of them. Until recently I was using a Chrome extension for it, but that’s now been disabled as containing malware and removed from the Chrome extension store. I’ve now switched to Sensitive, which packages Sense up as a native OS X application using Electron.

# 7:35 pm / elasticsearch, electron

Nov. 5, 2017

Running a load testing Go utility using Docker for Mac

I’m playing around with Zeit Now at the moment (see my previous entry) and decided to hit it with some traffic using Apache Bench. I got this SSL handshake error:

[... 818 words]

Docker Containers on the Desktop (via) Jessie Frazelle’s classic explanation from 2015 of how she runs every desktop application on her Linux machine in its own Docker container.

# 4:16 am / docker, linux

On Being A Senior Engineer. Thoughts on characteristics of a mature engineer from John Allspaw back in 2012. So much good thinking in here—my favourite piece of writing on the subject.

# 6:16 am / john-allspaw, careers

Docker.qcow2 never shrinks—disk space usage leak in docker for mac (via) Interesting year-long thread on disk usage by Docker for Mac, including a bunch of potential workarounds for if it swallows too much disk space.

# 3:06 pm / docker

Super Fast String Matching in Python (via) Interesting technique for calculating string similarity at scale in Python, with much better performance than Levenshtein distances. The trick here uses TF/IDF against N-Grams, plus a CSR (Compressed Sparse Row) scipy matrix to run the calculations. Includes clear explanations of each of these concepts.
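
The core of the technique can be sketched in plain Python. Note the swap: this version stores each TF/IDF vector as a dictionary instead of a row in a scipy CSR matrix, so it shows the scoring but not the performance trick. The function names, the trigram size, the IDF smoothing and the sample company names are all assumptions made for the example.

```python
import math
from collections import Counter


def ngrams(text, n=3):
    """Split a string into overlapping lower-cased character n-grams."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf_vectors(names, n=3):
    """Return one L2-normalised sparse TF/IDF vector (a dict) per name."""
    docs = [Counter(ngrams(name, n)) for name in names]
    # Number of names each n-gram appears in.
    doc_freq = Counter(gram for doc in docs for gram in doc)
    total = len(names)
    vectors = []
    for doc in docs:
        # Smoothed IDF, one common choice among several.
        vec = {
            gram: count * (math.log((1 + total) / (1 + doc_freq[gram])) + 1)
            for gram, count in doc.items()
        }
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors.append({g: v / norm for g, v in vec.items()} if norm else {})
    return vectors


def cosine(a, b):
    """Dot product of two normalised sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b.get(gram, 0.0) for gram, value in a.items())


names = ["McDonalds Corp", "Mc Donalds Corporation", "Burger King"]
vectors = tfidf_vectors(names)
close = cosine(vectors[0], vectors[1])
far = cosine(vectors[0], vectors[2])
print(close > far)
```

With a sparse matrix, all pairwise scores come from multiplying the matrix by its own transpose, which is where the speed-up over pairwise Levenshtein comparisons comes from.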

# 3:26 pm / scipy, python

Try hosting on PyPy by simonw. I had a go at hosting my blog on PyPy. Thanks to the combination of Travis CI, Sentry and Heroku it was pretty easy to give it a go—I had to swap psycopg2 for psycopg2cffi and switch to the currently undocumented pypy3-5.8.0 Heroku runtime (pypy3-5.5.0 is only compatible with Python 3.3, which Django 2.0 does not support). I ran it in production for a few minutes and didn’t get any Sentry errors but did end up using more Heroku dyno memory than I’m comfortable with—see the graph I posted in a comment. I’m going to stick with CPython 3.6 for the moment. Amusingly I did almost all of the work on this on my phone! Travis CI means it’s easy to create and test a branch through GitHub’s web UI, and deploying a tested branch to Heroku is then just a button click.

# 7:17 pm / python, pypy, heroku, travis, sentry

Landsat on AWS (via) TIL Amazon make data from the Landsat 8 satellite available for free on S3 (though they are no doubt hoping you’ll pay for EC2 instances to process the data). “All new Landsat 8 scenes are made available each day, often within hours of production. The satellite images the entire Earth every 16 days at a roughly 30 meter resolution”.

# 7:56 pm / satellite, aws

direnv (via) A shell extension (for bash, zsh and others) which can automatically set and unset environment variables when you cd into specific directories. Useful for managing things like a project’s GOPATH or automatically activating Python virtual environments.

# 7:59 pm / shell, bash, zsh

Nov. 6, 2017

walrus. Fascinating collection of Python utilities for working with Redis, by Charles Leifer. There are a ton of interesting ideas in here. It starts with Python object wrappers for Redis so you can interact with lists, sets, sorted sets and Redis hashes using Python-like objects. Then it gets really interesting: walrus ships with implementations of autocomplete, rate limiting, a graph engine (using a sorted set hexastore) and an ORM-style models mechanism which manages secondary indexes and even implements basic full-text search.

# 1:14 am / redis, python, charles-leifer

Alt-texts: The Ultimate Guide. By Daniel Göransson, a web developer with vision impairment who uses a screen reader. This is the best, most practical guide to writing image alt text I’ve seen. Just one of the neat tips contained within: consider ending your alt text in a period, so the screen reader knows to pause.

# 4:54 pm / alt-attribute, accessibility

Skip the title text! Nobody uses them – they don’t work on touch screens and on desktop they require that the user hovers for a while over an image, which nobody does. Also, adding a title-text makes some screen readers both read the title-text and the alt-text, which becomes redundant.

Daniel Göransson

# 4:56 pm / accessibility

How technology helped a blind athlete run free at the New York Marathon. Fascinating piece on technology to help blind people better navigate the world, combining GPS and chest-mounted ultrasonic sonar.

# 4:58 pm / accessibility

Nov. 7, 2017

I’m a Unicorn. I got to try out Animoji on an iPhone X, and it was amazing.

# 1:50 am / emoji

Secondary indexing with Redis. I haven’t seen this section of the official Redis documentation before, and it’s absolutely fantastic—well worth reading the whole thing. It talks through various ways in which you can set up indexes in Redis, mainly by leaning on sorted sets—which it turns out will binary lexicographically sort items with the same score. This makes it easy to implement autocomplete with Redis—but if you use them creatively you can implement subject/predicate/object graph searches or even N-dimensional range queries as well.
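
The autocomplete trick can be simulated without a Redis server. In Redis you would add every member with the same score (for example with ZADD and a score of 0) and then query a prefix range with ZRANGEBYLEX. The sketch below imitates that behaviour with a sorted list of bytes and the `bisect` module; `LexIndex` is a hypothetical class for illustration, not a Redis client.

```python
import bisect


class LexIndex:
    """Hypothetical in-memory stand-in for a same-score Redis sorted set."""

    def __init__(self):
        self._members = []  # kept in binary lexicographic order

    def add(self, member):
        data = member.encode("utf-8")
        pos = bisect.bisect_left(self._members, data)
        # Sorted sets hold unique members, so skip duplicates.
        if pos == len(self._members) or self._members[pos] != data:
            self._members.insert(pos, data)

    def complete(self, prefix, limit=10):
        """Return members starting with prefix, like a ZRANGEBYLEX range."""
        start = prefix.encode("utf-8")
        stop = start + b"\xff"  # 0xff never occurs in valid UTF-8
        lo = bisect.bisect_left(self._members, start)
        hi = bisect.bisect_right(self._members, stop)
        return [m.decode("utf-8") for m in self._members[lo:hi][:limit]]


index = LexIndex()
for word in ["banana", "bar", "baz", "apple", "band"]:
    index.add(word)

print(index.complete("ba"))   # ['banana', 'band', 'bar', 'baz']
print(index.complete("ban"))  # ['banana', 'band']
```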

# 2 am / redis

How Balanced does Database Migrations with Zero-Downtime. I’m fascinated by the idea of “pausing” traffic during a blocking site maintenance activity (like a database migration) and then un-pausing when the operation is complete, so end clients just see some of their requests taking a few seconds longer than expected. I first saw this trick described by Braintree. Balanced wrote about a neat way of doing this using just HAProxy, which lets you reconfigure the maxconn setting for your backend live, dropping it to zero (causing traffic to be queued up) and then raising it again a few seconds later to un-pause those requests.
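
The pause-and-queue behaviour can be illustrated in-process with asyncio. This is an analogy only, not HAProxy configuration: `PausableGate` is a hypothetical class, and the short sleep stands in for the blocking migration.

```python
import asyncio


class PausableGate:
    """Hypothetical gate: requests wait while paused, then proceed."""

    def __init__(self):
        self._open = asyncio.Event()
        self._open.set()  # traffic flows by default

    def pause(self):
        self._open.clear()

    def resume(self):
        self._open.set()

    async def handle(self, request_id, log):
        await self._open.wait()  # queued here while the gate is paused
        log.append("served {}".format(request_id))


async def demo():
    gate = PausableGate()
    log = []
    gate.pause()
    # These requests arrive during the maintenance window and are held.
    tasks = [asyncio.ensure_future(gate.handle(i, log)) for i in range(3)]
    await asyncio.sleep(0.01)  # stand-in for the blocking migration
    log.append("migration done")
    gate.resume()
    await asyncio.gather(*tasks)
    return log


log = asyncio.run(demo())
print(log)
```

None of the held requests fail. Each one completes after the migration, which matches what clients would see: slightly slower responses rather than errors.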

# 11:36 am / highavailability, migrations, http, scaling, haproxy, zero-downtime

The only thing that would have been nice is that after the project had been finished and the chip deployed, that someone from Intel would have told me, just as a courtesy, that MINIX 3 was now probably the most widely used operating system in the world on x86 computers. That certainly wasn't required in any way, but I think it would have been polite to give me a heads up, that's all.

Andrew S. Tanenbaum

# 11:50 am / intel, open-source

In the official timeline, Peppa is appropriately reassured by a kindly dentist. In the version above, she is basically tortured, before turning into a series of Iron Man robots and performing the Learn Colours dance. A search for “peppa pig dentist” returns the above video on the front page, and it only gets worse from here.

James Bridle

# 12:34 pm / youtube, james-bridle