April 2024
147 posts: 5 entries, 59 links, 26 quotes, 57 beats
April 1, 2024
OpenAI: Start using ChatGPT instantly. ChatGPT no longer requires signing in with an account in order to use the GPT-3.5 version, at least in some markets. I can access the service without logging in from an incognito browser window here in California.
The login-free free version includes “additional content safeguards for this experience, such as blocking prompts and generations in a wider range of categories”, with no more details provided as to what that means.
Interestingly, even logged-out free users get the option (off by default) to opt out of having their conversations used to “improve our models for everyone”.
OpenAI say that this initiative is to support “the aim to make AI accessible to anyone curious about its capabilities.” This makes sense to me: there are still a huge number of people who haven’t tried any of the LLM chat tools due to the friction of creating an account.
Diving Deeper into AI Package Hallucinations. Bar Lanyado noticed that LLMs frequently hallucinate the names of packages that don’t exist in their answers to coding questions, which can be exploited as a supply chain attack.
He gathered 2,500 questions across Python, Node.js, Go, .NET and Ruby and ran them through a number of different LLMs, noting any hallucinated packages and whether any of those hallucinations were repeated.
One repeat example was “pip install huggingface-cli” (the correct package is “huggingface_hub[cli]”). Bar then published a harmless package under that name in January, and observed 30,000 downloads of that package in the three months that followed.
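A minimal sketch of one defensive habit this suggests: before running an LLM-suggested "pip install", ask PyPI's JSON API whether the project exists at all. The function name and the injectable `opener` parameter are my own illustration, not something from Bar's research. Existence proves nothing about safety (the whole attack relies on someone registering the hallucinated name first), so this only catches names nobody has claimed yet.

```python
import urllib.error
import urllib.parse
import urllib.request

# PyPI's public JSON API returns a 404 for project names that are not registered
PYPI_URL = "https://pypi.org/pypi/{name}/json"


def package_exists(name, opener=urllib.request.urlopen):
    """Return True if PyPI has a project called `name`, False on a 404.

    `opener` is injectable so the function can be exercised without network access.
    """
    url = PYPI_URL.format(name=urllib.parse.quote(name, safe=""))
    try:
        with opener(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # any other HTTP error is not a verdict either way
```

A name that does exist still deserves a look at its release history and maintainers before it gets installed.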
PEP 738 – Adding Android as a supported platform (via) The BeeWare project got PEP 730 (Adding iOS as a supported platform) accepted by the Python Steering Council in December, and now it’s Android’s turn. Both iOS and Android will be supported platforms for CPython 3.13.
It’s been possible to run custom compiled Python builds on those platforms for years, but official support means that they’ll be included in Python’s own CI and release process.
April 2, 2024
LLMs are like a trained circus bear that can make you porridge in your kitchen. It's a miracle that it's able to do it at all, but watch out because no matter how well they can act like a human on some tasks, they're still a wild animal. They might ransack your kitchen, and they could kill you, accidentally or intentionally!
Bringing Python to Workers using Pyodide and WebAssembly (via) Cloudflare Workers is Cloudflare’s serverless hosting tool for deploying server-side functions to edge locations in their CDN.
They just released Python support, accompanied by an extremely thorough technical explanation of how they got that to work. The details are fascinating.
Workers runs on V8 isolates, and the new Python support was implemented using Pyodide (CPython compiled to WebAssembly) running inside V8.
Getting this to work performantly and ergonomically took a huge amount of work.
There are too many details in here to effectively summarize, but my favorite detail is this one:
“We scan the Worker’s code for import statements, execute them, and then take a snapshot of the Worker’s WebAssembly linear memory. Effectively, we perform the expensive work of importing packages at deploy time, rather than at runtime.”
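The first step in that quote, scanning code for import statements, can be sketched with Python's standard `ast` module. This is only an illustration of the idea under my own assumptions (the function name is hypothetical, and Cloudflare's actual scanner may work quite differently); the memory snapshot step is not shown.

```python
import ast


def top_level_imports(source):
    """Return the sorted top-level module names imported anywhere in `source`."""
    modules = set()
    # ast.walk visits every node, so imports nested inside functions are found too
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import (from . import x); skip those
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return sorted(modules)


worker_source = """
import json
from urllib.parse import urlparse
import numpy as np
from . import helpers
"""
print(top_level_imports(worker_source))  # ['json', 'numpy', 'urllib']
```

Because the source is parsed rather than executed, collecting the list is cheap; the expensive part, actually running the imports, is what the quote describes moving to deploy time.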
Cally: Accessibility statement (via) Cally is a neat new open source date (and date range) picker Web Component by Nick Williams.
It’s framework agnostic and weighs less than 9KB gzipped, but the best feature is this detailed page of documentation covering its accessibility story, including how it was tested in JAWS, NVDA and VoiceOver.
I’d love to see other open source JavaScript libraries follow this example.
April 3, 2024
Enforcing conventions in Django projects with introspection (via) Luke Plant shows how to use the Django system checks framework to introspect models on startup and warn if a DateTime or Date model field has been added that doesn’t conform to a specific naming convention.
Luke also proposes “*_at” as a convention for DateTimes, contrasting with “*_on” or “*_date” (I prefer the latter) for Dates.
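Here is a rough sketch of what such a check could look like. It is not Luke's code: the function names and the "conventions.W001" check ID are hypothetical, and it accepts both “*_on” and “*_date” for Dates. The naming rule is a plain function, and the Django wiring is imported lazily so the rule can be exercised on its own.

```python
def naming_problem(field_name, kind):
    """Return a warning message if the name breaks the convention, else None.

    `kind` is "datetime" or "date".
    """
    suffixes = {"datetime": ("_at",), "date": ("_on", "_date")}[kind]
    if field_name.endswith(suffixes):
        return None
    expected = " or ".join(f"*{suffix}" for suffix in suffixes)
    return f"{kind} field '{field_name}' should be named {expected}"


def register_django_check():
    """Register the system check; intended to be called from AppConfig.ready()."""
    from django.apps import apps
    from django.core import checks
    from django.db import models

    @checks.register(checks.Tags.models)
    def check_date_field_names(app_configs, **kwargs):
        warnings = []
        for model in apps.get_models():
            for field in model._meta.get_fields():
                # DateTimeField subclasses DateField, so test for it first
                if isinstance(field, models.DateTimeField):
                    kind = "datetime"
                elif isinstance(field, models.DateField):
                    kind = "date"
                else:
                    continue
                problem = naming_problem(field.name, kind)
                if problem:
                    warnings.append(
                        checks.Warning(problem, obj=field, id="conventions.W001")
                    )
        return warnings

    return check_date_field_names
```

As written this inspects every installed model, including third-party ones, so a real project would probably want to filter down to its own apps.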
April 4, 2024
Kobold letters (via) Konstantin Weddige explains a sophisticated HTML email phishing vector he calls Kobold emails.
When you forward a message, most HTML email clients will indent the forward by nesting it inside another element.
This means CSS rules within the email can be used to cause an element that was invisible in the original email to become visible when it is forwarded—allowing tricks like a forwarded innocuous email from your boss adding instructions for wiring money from the company bank account.
Gmail strips style blocks before forwarding, which it turns out isn’t protection against this: you can put a style block in the original email that hides the attack text, and Gmail will then strip that block for you when the email is forwarded, revealing the text.
The cost of AI reasoning over time (via) Karina Nguyen from Anthropic provides a fascinating visualization illustrating the cost of different levels of LLM over the past few years, plotting their cost-per-token against their scores on the MMLU benchmark.
Claude 3 Haiku currently occupies the lowest cost to score ratio, over on the lower right hand side of the chart.
llm-command-r. Cohere released Command R Plus today—an open weights (non commercial/research only) 104 billion parameter LLM, a big step up from their previous 35 billion Command R model.
Both models are fine-tuned for both tool use and RAG. The commercial API has features to expose this functionality, including a web-search connector which lets the model run web searches as part of answering the prompt and return documents and citations as part of the JSON response.
I released a new plugin for my LLM command line tool this morning adding support for the Command R models.
In addition to the two models it also adds a custom command for running prompts with web search enabled and listing the referenced documents.
Before Google Reader was shut down, they were internally looking for maintainers. It turned out you have to deal with three years of infra migrations if you sign up to be the new owner of Reader. No one wanted that kind of job for a product that is not likely to grow 10x.
April 5, 2024
s3-credentials 0.16.
I spent entirely too long this evening trying to figure out why files in my new supposedly public S3 bucket were unavailable to view. It turns out these days you need to set a PublicAccessBlockConfiguration of {"BlockPublicAcls": false, "IgnorePublicAcls": false, "BlockPublicPolicy": false, "RestrictPublicBuckets": false}.
The s3-credentials --create-bucket --public option now does that for you. I also added a s3-credentials debug-bucket name-of-bucket command to help figure out why a bucket isn't working as expected.
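For anyone doing this by hand, a minimal sketch of applying that configuration with boto3's `put_public_access_block` call. The helper name is mine, and this is not necessarily how s3-credentials implements it. The client is passed in as an argument, so the block itself needs no imports.

```python
# The four settings from the configuration above, all switched off
PUBLIC_ACCESS_BLOCK_OFF = {
    "BlockPublicAcls": False,
    "IgnorePublicAcls": False,
    "BlockPublicPolicy": False,
    "RestrictPublicBuckets": False,
}


def allow_public_access(s3_client, bucket):
    """Switch off all four public access blocks on `bucket`.

    `s3_client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration=PUBLIC_ACCESS_BLOCK_OFF,
    )
```

Usage would be along the lines of `allow_public_access(boto3.client("s3"), "name-of-bucket")`. Removing the block only makes public access possible; the bucket still needs a policy or ACLs that actually grant public reads.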
Everything I Know About the XZ Backdoor (via) Evan Boehs provides the most detailed timeline I’ve seen of the recent xz story, where a backdoor was inserted into the xz compression library in an attempt to compromise OpenSSH.
April 6, 2024
datasette-import. A new plugin for importing data into Datasette. This is a replacement for datasette-paste, duplicating and extending its functionality. datasette-paste had grown beyond just dealing with pasted CSV/TSV/JSON data—it handles file uploads as well now—which inspired the new name.
April 7, 2024
The lifecycle of a code AI completion (via) Philipp Spiess provides a deep dive into how Sourcegraph's Cody code completion assistant works. Lots of fascinating details in here:
"One interesting learning was that if a user is willing to wait longer for a multi-line request, it usually is worth it to increase latency slightly in favor of quality. For our production setup this means we use a more complex language model for multi-line completions than we do for single-line completions."
This article is from October 2023 and talks about Claude Instant. The code for Cody is open source so I checked to see if they have switched to Haiku yet and found a commit from March 25th that adds Haiku as an A/B test.
April 8, 2024
in July 2023, we [Hugging Face] wanted to experiment with a custom license for this specific project [text-generation-inference] in order to protect our commercial solutions from companies with bigger means than we do, who would just host an exact copy of our cloud services.
The experiment however wasn't successful.
It did not lead to licensing-specific incremental business opportunities by itself, while it did hamper or at least complicate the community contributions, given the legal uncertainty that arises as soon as you deviate from the standard licenses.