November 2021
60 posts: 5 entries, 9 links, 3 quotes, 43 beats
Nov. 15, 2021
Weeknotes: git-history, created for a Git scraping workshop
My main project this week was a 90 minute workshop I delivered about Git scraping at Coda.Br 2021, a Brazilian data journalism conference, on Friday. This inspired the creation of a brand new tool, git-history, plus smaller improvements to a range of other projects.
[... 1,239 words]Nov. 16, 2021
Nov. 17, 2021
Nov. 18, 2021
Cookiecutter Data Science (via) Some really solid thinking in this documentation for the DrivenData cookiecutter template. They emphasize designing data science projects for repeatability, such that just the src/ and data/ folders can be used to recreate all of the other analysis from scratch. I like the suggestion to give each project a dedicated S3 bucket for keeping immutable copies of the original raw data that might be too large for GitHub.
Many Web3 boosters see themselves as disruptors, but “tokenize all the things” is nothing if not an obedient continuation of “market-ize all the things”, the campaign started in the 1970s, hugely successful, ongoing. I think the World Wide Web was the real rupture — “Where … is the money?”—which Web 2.0 smoothed over and Web3 now attempts to seal totally.
Nov. 19, 2021
Nov. 20, 2021
Nov. 21, 2021
Nov. 22, 2021
Hurl (via) Hurl is “a command line tool that runs HTTP requests defined in a simple plain text format”—written in Rust on top of curl, it lets you run HTTP requests and then execute assertions against the response, defined using JSONPath or XPath for HTML. It can even assert that responses were returned within a specified duration.
Weeknotes: Apache proxies in Docker containers, refactoring Datasette
Updates to six major projects this week, plus finally some concrete progress towards Datasette 1.0.
[... 1,630 words]Introduction to heredocs in Dockerfiles
(via)
This is a fantastic upgrade to Dockerfile syntax, enabled by BuildKit and a new frontend for executing the Dockerfile that can be specified with a #syntax= directive. I often like to create a standalone Dockerfile that works without needing other files from a directory, so being able to use <<EOF syntax to populate configure files from inline blocks of code is really handy.
htmlspecialchars was a very early function. Back when PHP had less than 100 functions and the function hashing mechanism was strlen(). In order to get a nice hash distribution of function names across the various function name lengths names were picked specifically to make them fit into a specific length bucket. This was circa late 1994 when PHP was a tool just for my own personal use and I wasn't too worried about not being able to remember the few function names.
Nov. 25, 2021
PHP 8.1 release notes (via) PHP is gaining “Fibers” for lightweight cooperative concurrency—very similar to Python asyncio. Interestingly you don’t need to use separate syntax like “await fn()” to call them—calls to non-blocking functions are visually indistinguishable from calls to blocking functions. Considering how much additional library complexity has emerged in Python world from having these two different colours of functions it’s noteworthy that PHP has chosen to go in a different direction here.
