Archive for September 2024

September 2024

Sept. 25, 2024

Llama 3.2. In further evidence that AI labs are terrible at naming things, Llama 3.2 is a huge upgrade to the Llama 3 series - they've released their first multi-modal vision models!

Today, we’re releasing Llama 3.2, which includes small and medium-sized vision LLMs (11B and 90B), and lightweight, text-only models (1B and 3B) that fit onto edge and mobile devices, including pre-trained and instruction-tuned versions.

The 1B and 3B text-only models are exciting too, with a 128,000 token context length and optimized for edge devices (Qualcomm and MediaTek hardware get called out specifically).

Meta partnered directly with Ollama to help with distribution, here's the Ollama blog post. They only support the two smaller text-only models at the moment - this command will get the 3B model (2GB):

ollama run llama3.2

And for the 1B model (a 1.3GB download):

ollama run llama3.2:1b

I had to first upgrade my Ollama by clicking on the icon in my macOS task tray and selecting "Restart to update".

The two vision models are coming to Ollama "very soon".

Once you have fetched the Ollama model you can access it from my LLM command-line tool like this:

pipx install llm
llm install llm-ollama
llm chat -m llama3.2:1b

I tried running my djp codebase through that tiny 1B model just now and got a surprisingly good result - by no means comprehensive, but way better than I would ever expect from a model of that size:

files-to-prompt **/*.py -c | llm -m llama3.2:1b --system 'describe this code'

Here's a portion of the output:

The first section defines several test functions using the @djp.hookimpl decorator from the djp library. These hook implementations allow you to intercept and manipulate Django's behavior.

test_middleware_order: This function checks that the middleware order is correct by comparing the MIDDLEWARE setting with a predefined list.

test_middleware: This function tests various aspects of middleware:

It retrieves the response from the URL /from-plugin/ using the Client object, which simulates a request to this view.

It checks that certain values are present in the response:

X-DJP-Middleware-After

X-DJP-Middleware

X-DJP-Middleware-Before [...]

I found the GGUF file that had been downloaded by Ollama in my ~/.ollama/models/blobs directory. The following command let me run that model directly in LLM using the llm-gguf plugin:

llm install llm-gguf
llm gguf register-model ~/.ollama/models/blobs/sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45 -a llama321b
llm chat -m llama321b

Meta themselves claim impressive performance against other existing models:

Our evaluation suggests that the Llama 3.2 vision models are competitive with leading foundation models, Claude 3 Haiku and GPT4o-mini on image recognition and a range of visual understanding tasks. The 3B model outperforms the Gemma 2 2.6B and Phi 3.5-mini models on tasks such as following instructions, summarization, prompt rewriting, and tool-use, while the 1B is competitive with Gemma.

Here's the Llama 3.2 collection on Hugging Face. You need to accept the new Llama 3.2 Community License Agreement there in order to download those models.

You can try the four new models out via the Chatbot Arena - navigate to "Direct Chat" there and select them from the dropdown menu. You can upload images directly to the chat there to try out the vision features.

# 8:28 pm / ai, generative-ai, llama, local-llms, llms, vision-llms, meta, ollama, llm-release, files-to-prompt, chatbot-arena

Sept. 26, 2024

I think individual creators or publishers tend to overestimate the value of their specific content in the grand scheme of [AI training]. […]

We pay for content when it’s valuable to people. We’re just not going to pay for content when it’s not valuable to people. I think that you’ll probably see a similar dynamic with AI, which my guess is that there are going to be certain partnerships that get made when content is really important and valuable. I’d guess that there are probably a lot of people who have a concern about the feel of it, like you’re saying. But then, when push comes to shove, if they demanded that we don’t use their content, then we just wouldn’t use their content. It’s not like that’s going to change the outcome of this stuff that much.

— Mark Zuckerberg

# 1:56 am / meta, generative-ai, training-data, mark-zuckerberg, ai

django-plugin-datasette. I did some more work on my DJP plugin mechanism for Django at the DjangoCon US sprints today. I added a new plugin hook, asgi_wrapper(), released in DJP 0.3 and inspired by the similar hook in Datasette.

The hook only works for Django apps that are served using ASGI. It allows plugins to add their own wrapping ASGI middleware around the Django app itself, which means they can do things like attach entirely separate ASGI-compatible applications outside of the regular Django request/response cycle.

Datasette is one of those ASGI-compatible applications!

django-plugin-datasette uses that new hook to configure a new URL, /-/datasette/, which serves a full Datasette instance that scans through Django’s settings.DATABASES dictionary and serves an explore interface on top of any SQLite databases it finds there.

It doesn’t support authentication yet, so this will expose your entire database contents - probably best used as a local debugging tool only.

I did borrow some code from the datasette-mask-columns plugin to ensure that the password column in the auth_user column is reliably redacted. That column contains a heavily salted hashed password so exposing it isn’t necessarily a disaster, but I like to default to keeping hashes safe.

# 9:57 pm / django, plugins, projects, sqlite, datasette, djp

Sept. 27, 2024

Consumer products have had growth hackers for many years optimizing every part of the onboarding funnel. Dev tools should do the same. Getting started shouldn't be an afterthought after you built the product. Getting started is the product!

And I mean this to the point where I think it's worth restructuring your entire product to enable fast onboarding. Get rid of mandatory config. Make it absurdly easy to set up API tokens. Remove all the friction. Make it possible for users to use your product on their laptop in a couple of minutes, tops.

— Erik Bernhardsson

# 2:33 pm / usability, developers

Niche Museums: The Vincent and Ethel Simonetti Historic Tuba Collection. DjangoCon was in Durham, North Carolina this year and thanks to Atlas Obscura I found out about the fabulous Vincent and Ethel Simonetti Historic Tuba Collection. We got together a group of five for a visit and had a wonderful time being shown around the collection by curator Vincent Simonetti. This is my first update to Niche Museums in quite a while, it's nice to get that project rolling again.

More than a dozen varied and beautiful tubas, each with a neat attached label.

# 10:23 pm / museums, music

Themes from DjangoCon US 2024

I just arrived home from a trip to Durham, North Carolina for DjangoCon US 2024. I’ve already written about my talk where I announced a new plugin system for Django; here are my notes on some of the other themes that resonated with me during the conference.

[... 1,470 words]

11:36 pm / accessibility, conferences, django, djangocon, jacob-kaplan-moss, python, redis, dsf, pydantic, htmx

Some Go web dev notes. Julia Evans on writing small, self-contained web applications in Go:

In general everything about it feels like it makes projects easy to work on for 5 days, abandon for 2 years, and then get back into writing code without a lot of problems.

Go 1.22 introduced HTTP routing in February of this year, making it even more practical to build a web application using just the Go standard library.

# 11:43 pm / go, http, web-development, julia-evans

Sept. 28, 2024

DjangoTV (via) Brand new site by Jeff Triplett gathering together videos from Django conferences around the world. Here's Jeff's blog post introducing the project.

# 4:48 am / django, jeff-triplett

OpenFreeMap (via) New free map tile hosting service from Zsolt Ero:

OpenFreeMap lets you display custom maps on your website and apps for free. […] Using our public instance is completely free: there are no limits on the number of map views or requests. There’s no registration, no user database, no API keys, and no cookies. We aim to cover the running costs of our public instance through donations.

The site serves static vector tiles that work with MapLibre GL. It deliberately doesn’t offer any other services such as search or routing.

From the project README looks like it’s hosted on two Hetzner machines. I don’t think the public server is behind a CDN.

Part of the trick to serving the tiles efficiently is the way it takes advantage of Btrfs:

Production-quality hosting of 300 million tiny files is hard. The average file size is just 450 byte. Dozens of tile servers have been written to tackle this problem, but they all have their limitations.

The original idea of this project is to avoid using tile servers altogether. Instead, the tiles are directly served from Btrfs partition images + hard links using an optimised nginx config.

The self-hosting guide describes the scripts that are provided for downloading their pre-built tiles (needing a fresh Ubuntu server with 300GB of SSD and 4GB of RAM) or building the tiles yourself using Planetiler (needs 500GB of disk and 64GB of RAM).

Getting started is delightfully straightforward:

const map = new maplibregl.Map({
  style: 'https://tiles.openfreemap.org/styles/liberty',
  center: [13.388, 52.517],
  zoom: 9.5,
  container: 'map',
})

I got Claude to help build this demo showing a thousand random markers dotted around San Francisco. The 3D tiles even include building shapes!

Map of San Francisco in 3D with building shapes and small blue random markers dotted around.

Zsolt built OpenFreeMap based on his experience running MapHub over the last 9 years. Here’s a 2018 interview about that project.

It’s pretty incredible that the OpenStreetMap and open geospatial stack has evolved to the point now where it’s economically feasible for an individual to offer a service like this. I hope this turns out to be sustainable. Hetzner charge just €1 per TB for bandwidth (S3 can cost $90/TB) which should help a lot.

# 9:41 pm / geospatial, maps, openstreetmap, ai-assisted-programming

OpenAI’s revenue in August more than tripled from a year ago, according to the documents, and about 350 million people — up from around 100 million in March — used its services each month as of June. […]

Roughly 10 million ChatGPT users pay the company a $20 monthly fee, according to the documents. OpenAI expects to raise that price by $2 by the end of the year, and will aggressively raise it to $44 over the next five years, the documents said.

— Mike Isaac and Erin Griffith

# 11:41 pm / chatgpt, openai, new-york-times, ai

Sept. 29, 2024

If you use a JavaScript framework you should:

be able to justify with evidence, how using JavaScript would benefit users

be aware of any negative impacts and be able to mitigate them

consider whether the benefits of using it outweigh the potential problems

only use the framework for parts of the user interface that cannot be built using HTML and CSS alone

design each part of the user interface as a separate component

Having separate components means that if the JavaScript fails to load, it will only be that single component that fails. The rest of the page will load as normal.

— GOV.UK service manual

# 12:32 am / gov-uk, progressive-enhancement, javascript

Ensuring a block is overridden in a Django template (via) Neat Django trick by Tom Carrick: implement a Django template tag that raises a custom exception, then you can use this pattern in your templates:

{% block title %}{% ensure_overridden %}{% endblock %}

To ensure you don't accidentally extend a base template but forget to fill out a critical block.

# 7:25 pm / django, python

In the future, we won't need programmers; just people who can describe to a computer precisely what they want it to do.

— Jason Gorman

# 8:21 pm / ai-assisted-programming, llms, ai, generative-ai

mlx-vlm (via) The MLX ecosystem of libraries for running machine learning models on Apple Silicon continues to expand. Prince Canuma is actively developing this library for running vision models such as Qwen-2 VL and Pixtral and LLaVA using Python running on a Mac.

I used uv to run it against this image with this shell one-liner:

uv run --with mlx-vlm \
  python -m mlx_vlm.generate \
  --model Qwen/Qwen2-VL-2B-Instruct \
  --max-tokens 1000 \
  --temp 0.0 \
  --image https://static.simonwillison.net/static/2024/django-roadmap.png \
  --prompt "Describe image in detail, include all text"

The --image option works equally well with a URL or a path to a local file on disk.

Django roadmap diagram, described below

This first downloaded 4.1GB to my ~/.cache/huggingface/hub/models--Qwen--Qwen2-VL-2B-Instruct folder and then output this result, which starts:

The image is a horizontal timeline chart that represents the release dates of various software versions. The timeline is divided into years from 2023 to 2029, with each year represented by a vertical line. The chart includes a legend at the bottom, which distinguishes between different types of software versions.

Legend

Mainstream Support:

4.2 (2023)

5.0 (2024)

5.1 (2025)

5.2 (2026)

6.0 (2027) [...]

# 9:38 pm / apple, python, ai, generative-ai, local-llms, llms, vision-llms, uv, qwen, mlx, prince-canuma

NotebookLM’s automatically generated podcasts are surprisingly effective

Audio Overview is a fun new feature of Google’s NotebookLM which is getting a lot of attention right now. It generates a one-off custom podcast against content you provide, where two AI hosts start up a “deep dive” discussion about the collected content. These last around ten minutes and are very podcast, with an astonishingly convincing audio back-and-forth conversation.

[... 1,489 words]

11:59 pm / audio, google, podcasts, text-to-speech, ai, prompt-engineering, generative-ai, llms, gemini, rag, notebooklm

Sept. 30, 2024

But in terms of the responsibility of journalism, we do have intense fact-checking because we want it to be right. Those big stories are aggregations of incredible journalism. So it cannot function without journalism. Now, we recheck it to make sure it's accurate or that it hasn't changed, but we're building this to make jokes. It's just we want the foundations to be solid or those jokes fall apart. Those jokes have no structural integrity if the facts underneath them are bullshit.

— John Oliver

# 4:08 pm / comedy, journalism

llama-3.2-webgpu (via) Llama 3.2 1B is a really interesting models, given its 128,000 token input and its tiny size (barely more than a GB).

This page loads a 1.24GB q4f16 ONNX build of the Llama-3.2-1B-Instruct model and runs it with a React-powered chat interface directly in the browser, using Transformers.js and WebGPU. Source code for the demo is here.

It worked for me just now in Chrome; in Firefox and Safari I got a “WebGPU is not supported by this browser” error message.

# 4:27 pm / ai, webassembly, generative-ai, llama, llms, transformers-js, webgpu

Conflating Overture Places Using DuckDB, Ollama, Embeddings, and More. Drew Breunig's detailed tutorial on "conflation" - combining different geospatial data sources by de-duplicating address strings such as RESTAURANT LOS ARCOS,3359 FOOTHILL BLVD,OAKLAND,94601 and LOS ARCOS TAQUERIA,3359 FOOTHILL BLVD,OAKLAND,94601.

Drew uses an entirely offline stack based around Python, DuckDB and Ollama and finds that a combination of H3 geospatial tiles and mxbai-embed-large embeddings (though other embedding models should work equally well) gets really good results.

# 5:24 pm / gis, python, ai, duckdb, embeddings, drew-breunig, overture, ollama

Weeknotes: Three podcasts, two trips and a new plugin system

I fell behind a bit on my weeknotes. Here’s most of what I’ve been doing in September.

[... 693 words]

5:43 pm / django, podcasts, weeknotes, psf, llms, djp

I listened to the whole 15-minute podcast this morning. It was, indeed, surprisingly effective. It remains somewhere in the uncanny valley, but not at all in a creepy way. Just more in a “this is a bit vapid and phony” way. [...] But ultimately the conversation has all the flavor of a bowl of unseasoned white rice.

— John Gruber

# 6:56 pm / llms, generative-ai, notebooklm, ai, john-gruber, podcasts

Bop Spotter (via) Riley Walz: "I installed a box high up on a pole somewhere in the Mission of San Francisco. Inside is a crappy Android phone, set to Shazam constantly, 24 hours a day, 7 days a week. It's solar powered, and the mic is pointed down at the street below."

Some details on how it works from Riley on Twitter:

The phone has a Tasker script running on loop (even if the battery dies, it’ll restart when it boots again)

Script records 10 min of audio in airplane mode, then comes out of airplane mode and connects to nearby free WiFi.

Then uploads the audio file to my server, which splits it into 15 sec chunks that slightly overlap. Passes each to Shazam’s API (not public, but someone reverse engineered it and made a great Python package). Phone only uses 2% of power every hour when it’s not charging!

# 7:03 pm / android, hardware-hacking, music

Gergely Orosz’s edited clip of me talking about Open Source. Gergely Orosz released this clip to help promote our podcast conversation AI tools for software engineers, but without the hype - it's a neat bite-sized version of my argument for why Open Source has provided the single biggest enhancement to developer productivity I've seen in my entire career.

One of the big challenges everyone talked about was software reusability. Like, why are we writing the same software over and over again?

And at the time, people thought OOP was the answer. They were like, oh, if we do everything as classes in Java, then we can subclass those classes, and that's how we'll solve reusable software.

That wasn't the fix. The fix was open source. The fix was having a diverse and vibrant open source community releasing software that's documented and you can package and install and all of those kinds of things.

That's been incredible. The cost of building software today is a fraction of what it was 20 years ago, purely thanks to open source.

# 8:24 pm / open-source, gergely-orosz, podcast-appearances

«« first « previous page 3 / 3