29 items tagged “fly”
2024
Fly: We’re Cutting L40S Prices In Half (via) Interesting insider notes from Fly.io on customer demand for GPUs:
If you had asked us in 2023 what the biggest GPU problem we could solve was, we’d have said “selling fractional A100 slices”. [...] We guessed wrong, and spent a lot of time working out how to maximize the amount of GPU power we could deliver to a single Fly Machine. Users surprised us. By a wide margin, the most popular GPU in our inventory is the A10.
[…] If you’re trying to do something GPU-accelerated in response to an HTTP request, the right combination of GPU, instance RAM, fast object storage for datasets and model parameters, and networking is much more important than getting your hands on an H100.
Making Machines Move. Another deep technical dive into Fly.io infrastructure from Thomas Ptacek, this time describing how they can quickly boot up an instance with a persistent volume on a new host (for things like zero-downtime deploys) using a block-level cloning operation, so the new instance gets a volume that becomes accessible instantly, serving proxied blocks of data until the new volume has been completely migrated from the old host.
GPUs on Fly.io are available to everyone! We’ve been experimenting with GPUs on Fly for a few months for Datasette Cloud. They’re well documented and quite easy to use—any example Python code you find that uses NVIDIA CUDA stuff generally Just Works. Most interestingly of all, Fly GPUs can scale to zero—so while they cost $2.50/hr for a A100 40G (VRAM) and $3.50/hr for a A100 80G you can configure them to stop running when the machine runs out of things to do.
We’ve successfully used them to run Whisper and to experiment with running various Llama 2 LLMs as well.
To look forward to: “We are working on getting some lower-cost A10 GPUs in the next few weeks”.
Macaroons Escalated Quickly (via) Thomas Ptacek’s follow-up on Macaroon tokens, based on a two year project to implement them at Fly.io. The way they let end users calculate new signed tokens with additional limitations applied to them (“caveats” in Macaroon terminology) is fascinating, and allows for some very creative solutions.
2023
Introducing datasette-write-ui: a Datasette plugin for editing, inserting, and deleting rows. Alex García is working with me on Datasette Cloud for the next few months, graciously sponsored by Fly. We will be working in public, releasing open source code and documenting how to build a multi-tenant SaaS product using Fly Machines.
Alex’s first project is datasette-write-ui, a plugin that finally lets you directly edit data stored inside Datasette. Alex wrote about the plugin on our new Datasette Cloud blog.
Vector Search. Amjith Ramanujam provides a very thorough tutorial on implementing vector similarity search using SentenceTransformers embeddings (all-MiniLM-L6-v2) executed using sqlite-utils, then served via datasette-sqlite-vss and deployed using Fly.
Carving the Scheduler Out of Our Orchestrator (via) Thomas Ptacek describes Fly’s new custom-built alternative to Nomad and Kubernetes in detail, including why they eventually needed to build something custom to best serve their platform. In doing so he provides the best explanation I’ve ever seen of what an orchestration system actually does.
2022
Stringing together several free tiers to host an application with zero cost using fly.io, Litestream and Cloudflare. Alexander Dahl provides a detailed description (and code) for his current preferred free hosting solution for small sites: SQLite (and a Go application) running on Fly’s free tier, with the database replicated up to Cloudflare’s R2 object storage (again on a free tier) by Litestream.
Introducing LiteFS (via) LiteFS is the new SQLite replication solution from Fly, now ready for beta testing. It’s from the same author as Litestream but has a very different architecture; LiteFS works by implementing a custom FUSE filesystem which spies on SQLite transactions being written to the journal file and forwards them on to other nodes in the cluster, providing full read-replication. The signature Litestream feature of streaming a backup to S3 should be coming within the next few months.
[SQLite is] a database that in full-stack culture has been relegated to "unit test database mock" for about 15 years that is (1) surprisingly capable as a SQL engine, (2) the simplest SQL database to get your head around and manage, and (3) can embed directly in literally every application stack, which is especially interesting in latency-sensitive and globally-distributed applications.
Reason (3) is clearly our ulterior motive here, so we're not disinterested: our model user deploys a full-stack app (Rails, Elixir, Express, whatever) in a bunch of regions around the world, hoping for sub-100ms responses for users in most places around the world. Even within a single data center, repeated queries to SQL servers can blow that budget. Running an in-process SQL server neatly addresses it.
Exploring the training data behind Stable Diffusion
Two weeks ago, the Stable Diffusion image generation model was released to the public. I wrote about this last week, in Stable Diffusion is a really big deal—a post which has since become one of the top ten results for “stable diffusion” on Google and shown up in all sorts of different places online.
[... 2,897 words]Digitizing 55,000 pages of civic meetings (via) Philip James has been building public, searchable archives of city council meetings for various cities—Oakland and Alamedia so far—using my s3-ocr script to run Textract OCR against the PDFs of the minutes, and deploying them to Fly using Datasette. This is a really cool project, and very much the kind of thing I’ve been hoping to support with the tools I’ve been building.
Litestream backups for Datasette Cloud (and weeknotes)
My main focus this week has been adding robust backups to the forthcoming Datasette Cloud.
[... 1,604 words]How SQLite Helps You Do ACID (via) Ben Johnson’s series of posts explaining the internals of SQLite continues with a deep look at how the rollback journal works. I’m learning SO much from this series.
SOC2 is about the security of the company, not the company’s products. A SOC2 audit would tell you something about whether the customer support team could pop a shell on production machines; it wouldn’t tell you anything about whether an attacker could pop a shell with a SQL Injection vulnerability.
Weeknotes: Datasette Cloud ready to preview
I made an absolute ton of progress building Datasette Cloud on Fly this week, and also had a bunch of fun playing with GPT-3.
[... 370 words]Weeknotes: Building Datasette Cloud on Fly Machines, Furo for documentation
Hosting provider Fly released Fly Machines this week. I got an early preview and I’ve been working with it for a few days—it’s a fascinating new piece of technology. I’m using it to get my hosting service for Datasette ready for wider release.
[... 1,005 words]Using SQLite and Datasette with Fly Volumes
A few weeks ago, Fly announced Free Postgres Databases as part of the free tier of their hosting product. Their announcement included this snippet:
[... 1,463 words]Help scraping: track changes to CLI tools by recording their --help using Git
I’ve been experimenting with a new variant of Git scraping this week which I’m calling Help scraping. The key idea is to track changes made to CLI tools over time by recording the output of their --help
commands in a Git repository.
Weeknotes: python_requires, documentation SEO
Fixed Datasette on Python 3.6 for the last time. Worked on documentation infrastructure improvements. Spent some time with Fly Volumes.
[... 1,497 words]2021
New tool: an nginx playground. Julia Evans built a sandbox tool for interactively trying out an nginx configuration and executing test requests through it. I love this kind of tool, and Julia’s explanation of how they built it using a tiny fly.io instance and a network namespace to reduce the amount of damage any malicious usage could cause is really interesting.
API Tokens: A Tedious Survey. Thomas Ptacek reviews different approaches to implementing secure API tokens, from simple random strings stored in a database through various categories of signed token to exotic formats like Macaroons and Biscuits, both new to me.
Macaroons carry a signed list of restrictions with them, but combine it with a mechanism where a client can add their own additional restrictions, sign the combination and pass the token on to someone else.
Biscuits are similar, but “embed Datalog programs to evaluate whether a token allows an operation”.
Last Mile Redis (via) Fly.io article about running a local redis cache in each of their geographic regions—“Cache data overlaps a lot less than you assume it will. For the most part, people in Singapore will rely on a different subset of data than people in Amsterdam or São Paulo or New Jersey.” But then they note that Redis has the ability to act as both a replica of a primary AND a writable server at the same time (“replica-read-only no”), which actually makes sense for a cache—it lets you cache local data but send out cluster-wide cache purges if necessary.
Multi-region PostgreSQL on Fly (via) Really interesting piece of architectural design from Fly here. Fly can run your application (as a Docker container run using Firecracker) in multiple regions around the world, and they’ve now quietly added PostgreSQL multi-region support. The way it works is that all-but-one region can have a read-only replica, and requests sent to application servers can perform read-only queries against their local region’s replica. If a request needs to execute a SQL update your application code can return a “fly-replay: region=scl” HTTP header and the Fly CDN will transparently replay the request against the region containing the leader database. This also means you can implement tricks like setting a 10s expiring cookie every time the user performs a write, such that their requests in the next 10s will go straight to the leader and avoid them experiencing any replication lag that hasn’t caught up with their latest update.
2020
Sandboxing and Workload Isolation (via) Fly.io run other people’s code in containers, so workload isolation is a Big Deal for them. This blog post goes deep into the history of isolation and the various different approaches you can take, and fills me with confidence that the team at Fly.io know their stuff. I got to the bottom and found it had been written by Thomas Ptacek, which didn’t surprise me in the slightest.
How CDNs Generate Certificates. Thomas Ptacek (now at Fly) describes in intricate detail the challenges faced by large-scale hosting providers that want to securely issue LetsEncrypt certificates for customer domains. Lots of detail here on the different ACME challenges supported by LetsEncrypt and why the new tls-alpn-01 challenge is the right option for operating at scale.
Making Datasets Fly with Datasette and Fly (via) It’s always exciting to see a Datasette tutorial that wasn’t written by me! This one is great—it shows how to load Central Park Squirrel Census data into a SQLite database, explore it with Datasette and then publish it to the Fly hosting platform using datasette-publish-fly and datasette-cluster-map.
Weeknotes: Datasette 0.39 and many other projects
This week’s theme: Well, I’m not going anywhere. So a ton of progress to report on various projects.
[... 806 words]datasette-publish-fly (via) Fly is a neat new Docker hosting provider with a very tempting pricing model: Just $2.67/month for their smallest always-on instance, and they give each user $10/month in free credit. datasette-publish-fly is the first plugin I’ve written using the publish_subcommand plugin hook, which allows extra hosting providers to be added as publish targets. Install the plugin and you can run “datasette publish fly data.db” to deploy SQLite databases to your Fly account.