Simon Willison’s Weblog

Subscribe

21 items tagged “julia-evans”

2024

SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL (via) A new paper from Google Research describing custom syntax for analytical SQL queries that has been rolling out inside Google since February, reaching 1,600 "seven-day-active users" by August 2024.

A key idea is here is to fix one of the biggest usability problems with standard SQL: the order of the clauses in a query. Starting with SELECT instead of FROM has always been confusing, see SQL queries don't start with SELECT by Julia Evans.

Here's an example of the new alternative syntax, taken from the Pipe query syntax documentation that was added to Google's open source ZetaSQL project last week.

For this SQL query:

SELECT component_id, COUNT(*)
FROM ticketing_system_table
WHERE
  assignee_user.email = 'username@email.com'
  AND status IN ('NEW', 'ASSIGNED', 'ACCEPTED')
GROUP BY component_id
ORDER BY component_id DESC;

The Pipe query alternative would look like this:

FROM ticketing_system_table
|> WHERE
    assignee_user.email = 'username@email.com'
    AND status IN ('NEW', 'ASSIGNED', 'ACCEPTED')
|> AGGREGATE COUNT(*)
   GROUP AND ORDER BY component_id DESC;

The Google Research paper is released as a two-column PDF. I snarked about this on Hacker News:

Google: you are a web company. Please learn to publish your research papers as web pages.

This remains a long-standing pet peeve of mine. PDFs like this are horrible to read on mobile phones, hard to copy-and-paste from, have poor accessibility (see this Mastodon conversation) and are generally just bad citizens of the web.

Having complained about this I felt compelled to see if I could address it myself. Google's own Gemini Pro 1.5 model can process PDFs, so I uploaded the PDF to Google AI Studio and prompted the gemini-1.5-pro-exp-0801 model like this:

Convert this document to neatly styled semantic HTML

This worked surprisingly well. It output HTML for about half the document and then stopped, presumably hitting the output length limit, but a follow-up prompt of "and the rest" caused it to continue from where it stopped and run until the end.

Here's the result (with a banner I added at the top explaining that it's a conversion): Pipe-Syntax-In-SQL.html

I haven't compared the two completely, so I can't guarantee there are no omissions or mistakes.

The figures from the PDF aren't present - Gemini Pro output tags like <img src="figure1.png" alt="Figure 1: SQL syntactic clause order doesn't match semantic evaluation order. (From [25].)"> but did nothing to help me create those images.

Amusingly the document ends with <p>(A long list of references, which I won't reproduce here to save space.)</p> rather than actually including the references from the paper!

So this isn't a perfect solution, but considering it took just the first prompt I could think of it's a very promising start. I expect someone willing to spend more than the couple of minutes I invested in this could produce a very useful HTML alternative version of the paper with the assistance of Gemini Pro.

One last amusing note: I posted a link to this to Hacker News a few hours ago. Just now when I searched Google for the exact title of the paper my HTML version was already the third result!

I've now added a <meta name="robots" content="noindex, follow"> tag to the top of the HTML to keep this unverified AI slop out of their search index. This is a good reminder of how much better HTML is than PDF for sharing information on the web!

# 24th August 2024, 11 pm / sql, gemini, seo, llms, slop, google, generative-ai, pdf, julia-evans, ai

Migrating Mess With DNS to use PowerDNS (via) Fascinating in-depth write-up from Julia Evans about how she upgraded her "mess with dns" playground application to use PowerDNS, an open source DNS server with a comprehensive JSON API.

If you haven't explored mess with dns it's absolutely worth checking out. No login required: when you visit the site it assigns you a random subdomain (I got garlic299.messwithdns.com just now) and then lets you start adding additional sub-subdomains with their own DNS records - A records, CNAME records and more.

The interface then shows a live (WebSocket-powered) log of incoming DNS requests and responses, providing instant feedback on how your configuration affects DNS resolution.

# 19th August 2024, 10:12 pm / dns, go, julia-evans

Someone elsewhere left a comment like "I CAN’T BELIEVE IT TOOK HER 15 YEARS TO LEARN BASIC READLINE COMMANDS". those comments are very silly and I'm going to keep writing “it took me 15 years to learn this basic thing" forever because I think it's important for people to know that it's normal to take a long time to learn “basic" things

Julia Evans

# 8th July 2024, 11:15 pm / julia-evans

Reasons to use your shell’s job control. Julia Evans summarizes an informal survey of useful things you can do with shell job control features - fg, bg, Ctrl+Z and the like. Running tcdump in the background so you can see its output merged in with calls to curl is a neat trick.

# 7th July 2024, 4:30 pm / unix, julia-evans

Inside .git. This single diagram filled in all sorts of gaps in my mental model of how git actually works under the hood.

# 25th January 2024, 2:59 pm / julia-evans, git

2023

Notes on using a single-person Mastodon server. Julia Evans experiences running a single-person Mastodon server (on masto.host—the same host I use for my own) pretty much exactly match what I’ve learned so far as well. The biggest disadvantage is the missing replies issue, where your server only shows replies to posts that come from people who you follow—so it’s easy to reply to something in a way that duplicates other replies that are invisible to you.

# 16th September 2023, 10:35 pm / mastodon, julia-evans

Lima VM—Linux Virtual Machines On macOS (via) This looks really useful: “brew install lima” to install, then “limactl start default” to start an Ubuntu VM running and “lima” to get a shell. Julia Evans wrote about the tool this morning, and here Adam Gordon Bell includes details on adding a writable directory (by default lima mounts your macOS home directory in read-only mode).

# 10th July 2023, 7:01 pm / virtualization, macosx, linux, julia-evans

In general my approach to running arbitrary untrusted code is 20% sandboxing and 80% making sure that it’s an extremely low value attack target so it’s not worth trying to break in.

Programs are terminated after 1 second of runtime, they run in a container with no network access, and the machine they’re running on has no sensitive data on it and a very small CPU.

Julia Evans

# 25th May 2023, 8:12 pm / sandboxing, security, julia-evans

Implement DNS in a weekend (via) Fantastically clear and useful guide to implementing DNS lookups, from scratch, using Python’s struct, socket and dataclass modules—Julia Evans plans to follow this up with one for TLS which I am very much looking forward to.

# 12th May 2023, 6:14 pm / dns, julia-evans, python

Writing Javascript without a build system (via) Julia Evans perfectly captures why I prefer not to use build systems in the majority of my projects that use JavaScript: “... my experience with build systems (not just Javascript build systems!), is that if you have a 5-year-old site, often it’s a huge pain to get the site built again. And because most of my websites are pretty small, the advantage of using a build system is pretty small.”

# 18th February 2023, 5:25 am / julia-evans, javascript

Examples of floating point problems (via) I learned so much practical stuff from this post by Julia Evans. There are no 32-bit floating point numbers between 262144.0 and 262144.03125, which breaks code that attempts to keep incrementing by 0.01. I knew about the JavaScript tweet ID problem (JavaScript can’t handle numbers like 1612850010110005250) but I didn’t realize it affected jq as well. Lots more great examples in here.

# 13th January 2023, 3:41 pm / julia-evans, jq, javascript

2022

sqlite-utils: a nice way to import data into SQLite for analysis (via) Julia Evans on my sqlite-utils Python library and CLI tool.

# 13th May 2022, 6:17 pm / sqlite-utils, julia-evans, sqlite

2021

New tool: an nginx playground. Julia Evans built a sandbox tool for interactively trying out an nginx configuration and executing test requests through it. I love this kind of tool, and Julia’s explanation of how they built it using a tiny fly.io instance and a network namespace to reduce the amount of damage any malicious usage could cause is really interesting.

# 24th September 2021, 6:44 pm / fly, nginx, security, julia-evans

How to look at the stack with gdb. Useful short tutorial on gdb from first principles.

# 24th May 2021, 6:23 pm / c, julia-evans, debugger

No feigning surprise (via) Don’t feign surprise if someone doesn’t know something that you think they should know. Even better: even if you are surprised, don’t let them know! “When people feign surprise, it’s usually to make them feel better about themselves and others feel worse.”

# 17th May 2021, 4:30 pm / julia-evans, teaching, communication

2020

entr: rerun your build when files change. “WHY DID NOBODY TELL ME ABOUT THIS BEFORE?!?!” is one of my favourite genres of blog post.

# 1st July 2020, 3:58 pm / julia-evans

2019

SQL queries don’t start with SELECT. This is really useful. Understanding that SELECT (and associated window functions) happen after the WHERE, GROUP BY and HAVING helps explain why you can’t filter a query based on the results of a window function for example.

# 3rd October 2019, 8:56 pm / sql, julia-evans

2018

Build impossible programs. Delightful talk by Julia Evans describing how she went about building a Ruby profiler in Rust despite having no knowledge of Ruby internals and only beginner’s knowledge of Rust.

# 19th September 2018, 6:38 pm / ruby, rust, julia-evans

Using flamegraphs. I really like flamegraphs as a profiling tool—we have support for them baked into our Tikibar debugging toolbar at Eventbrite—but interpreting them isn’t particularly intuitive on first glance. Julia Evans has put together a great explanation of how to read them as part of the documentation for her rbspy Ruby profiler.

# 21st March 2018, 8:56 pm / profiling, julia-evans

2017

How do Ruby & Python profilers work? Julia Evans: “As a precursor to writing a Ruby profiler I wanted to do a survey of how existing Ruby & Python profilers work.”

# 18th December 2017, 12:12 pm / profiler, julia-evans

How to teach technical concepts with cartoons. Julia Evans: “This post is about a few patterns I use when illustrating ideas about computers. If you are interested in using drawings to teach people about your very favorite computer topics, hopefully this will help you!”

# 28th October 2017, 2:55 pm / teaching, julia-evans