Archive for September 2023

A practical guide to deploying Large Language Models Cheap, Good *and* Fast. Joel Kang’s extremely comprehensive notes on what he learned trying to run Vicuna-13B-v1.5 on an affordable cloud GPU server (a T4 at $0.615/hour). The space is in so much flux right now—Joel ended up using MLC but the best option could change any minute.

Vicuna 13B quantized to 4-bit integers needed 7.5GB of the T4’s 16GB of VRAM, and returned tokens at 20/second.
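A rough back-of-envelope check on that memory figure, assuming exactly 13 billion parameters and decimal gigabytes (both assumptions, not numbers from Joel's notes). The weights alone at 4 bits each come to about 6.5GB, so the remaining ~1GB would plausibly be quantization metadata, KV cache and runtime overhead:

```python
def weight_size_gb(n_params: float, bits_per_param: float) -> float:
    """Size of the model weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 13e9        # assumption: "13B" taken literally
REPORTED_VRAM_GB = 7.5  # figure quoted in the notes

int4_gb = weight_size_gb(N_PARAMS, 4)   # 4-bit quantized weights
fp16_gb = weight_size_gb(N_PARAMS, 16)  # unquantized half-precision weights
overhead_gb = REPORTED_VRAM_GB - int4_gb

print(f"4-bit weights:  {int4_gb:.1f} GB")
print(f"fp16 weights:   {fp16_gb:.1f} GB (would not fit in 16 GB)")
print(f"implied overhead on top of weights: {overhead_gb:.1f} GB")
```

The same arithmetic shows why quantization matters here: at 16 bits per parameter the weights alone would exceed the T4's 16GB.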

An open challenge when running MLC right now is batching and concurrency: “I did try making 3 concurrent requests to the endpoint, and while they all stream tokens back and the server doesn’t OOM, the output of all 3 streams seem to actually belong to a single prompt.”

# 1:43 pm / ai, generative-ai, llama, llms, mlc, vicuna

Release llm-cluster 0.1 — LLM plugin for clustering embeddings

4th Sep 2023, 4:01 pm · llm

Release llm-cluster 0.2 — LLM plugin for clustering embeddings

4th Sep 2023, 4:34 pm · llm

LLM now provides tools for working with embeddings

LLM is my Python library and command-line tool for working with language models. I just released LLM 0.9 with a new set of features that extend LLM to provide tools for working with embeddings.

[... 3,521 words]

8:32 pm / cli, open-source, projects, sqlite, ai, generative-ai, vector-search, llms, embeddings, llm, rag

Wikipedia search-by-vibes through millions of pages offline (via) Really cool demo by Lee Butterman, who built embeddings of 2 million Wikipedia pages and figured out how to serve them directly to the browser, where they are used to implement “vibes based” similarity search returning results in 250ms. Lots of interesting details about how he pulled this off, using Arrow as the file format and ONNX to run the model in the browser.
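To illustrate the general idea behind embedding-based similarity search, here is a minimal sketch using cosine similarity over toy three-dimensional vectors. The vectors, page titles and the choice of cosine similarity are all assumptions for illustration; the demo's actual model, vector size and distance metric are not described here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, pages, k=2):
    """Return the titles of the k pages whose vectors are closest to the query."""
    scored = [(cosine_similarity(query, vec), title) for title, vec in pages.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:k]]


# Hypothetical embeddings: real ones have hundreds of dimensions.
PAGES = {
    "Espresso": [0.9, 0.1, 0.0],
    "Cappuccino": [0.8, 0.3, 0.1],
    "Volcano": [0.0, 0.2, 0.9],
}

results = most_similar([1.0, 0.2, 0.0], PAGES)
print(results)
```

A brute-force scan like this is linear in the number of pages, so serving millions of pages quickly in a browser presumably depends on compact vector storage as well.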

# 9:13 pm / embedding, search, wikipedia, webassembly

Sept. 5, 2023

A token-wise likelihood visualizer for GPT-2. Linus Lee built a superb visualization to help demonstrate how Large Language Models work, in the form of a video essay where each word is coloured to show how “surprising” it is to the model. It’s worth carefully reading the text in the video as each term is highlighted to get the full effect.
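One common way to quantify how “surprising” a token is to a model is surprisal: the negative log of the probability the model assigned to that token. The sketch below assumes that measure and uses made-up probabilities; whether the visualizer uses exactly this formula is an assumption.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Surprisal in bits: rarer (lower probability) tokens score higher."""
    return -math.log2(probability)


# Hypothetical per-token probabilities, not real GPT-2 output.
TOKEN_PROBS = [
    ("The", 0.5),
    ("cat", 0.25),
    ("sat", 0.125),
    ("defenestrated", 0.001),
]

for token, p in TOKEN_PROBS:
    print(f"{token:>15}  p={p:<6}  surprisal={surprisal_bits(p):.2f} bits")
```

A visualization could then map each token's surprisal onto a colour scale, so unlikely words stand out.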

# 3:39 am / ai, generative-ai, llms, gpt-2

Release llm 0.10a0 — Access large language models from the command-line

5th Sep 2023, 6:43 am · llm

Release symbex 1.4 — Find the Python code for specified symbols

5th Sep 2023, 3:32 pm

Symbex 1.4. New release of my Symbex tool for finding symbols (functions, methods and classes) in a Python codebase. Symbex can now output matching symbols in JSON, CSV or TSV in addition to plain text.

I designed this feature for compatibility with the new “llm embed-multi” command—so you can now use Symbex to find every Python function in a nested directory and then pipe them to LLM to calculate embeddings for every one of them.

I tried it on my projects directory and embedded over 13,000 functions in just a few minutes! Next step is to figure out what kind of interesting things I can do with all of those embeddings.
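A minimal sketch of the underlying idea, using only Python's standard `ast` module: find every function in some source code and emit one JSON record per function, ready to be embedded. This is not Symbex's implementation, and the `id` and `code` keys and the `file:line` ID format are assumptions chosen for illustration.

```python
import ast
import json


def iter_functions(source: str, filename: str):
    """Yield one record per function or method found in the source text."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            yield {
                # Hypothetical ID format: file name plus starting line number.
                "id": f"{filename}:{node.lineno}",
                "code": ast.get_source_segment(source, node),
            }


SAMPLE = '''def add(a, b):
    return a + b

class Greeter:
    def hello(self):
        return "hi"
'''

records = list(iter_functions(SAMPLE, "sample.py"))
for record in records:
    # Newline-delimited JSON: one object per line.
    print(json.dumps(record))
```

Each line of output is a standalone JSON object, which makes the stream easy to pipe into another tool.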

# 5:29 pm / projects, ai, generative-ai, embeddings, symbex, llm

Sept. 6, 2023

Perplexity: interactive LLM visualization (via) I linked to a video of Linus Lee's GPT visualization tool the other day. Today he's released a new version of it that people can actually play with: it runs entirely in a browser, powered by a 120MB version of the GPT-2 ONNX model loaded using the brilliant Transformers.js JavaScript library.

# 3:33 am / javascript, ai, webassembly, generative-ai, llms, transformers-js

Using ChatGPT Code Interpreter (aka “Advanced Data Analysis”) to analyze your ChatGPT history. I posted a short thread showing how to upload your ChatGPT history to ChatGPT itself, then prompt it with “Build a dataframe of the id, title, create_time properties from the conversations.json JSON array of objects. Convert create_time to a date and plot it daily”.
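For a sense of what that prompt asks for, here is a standard-library sketch that counts conversations per day. It assumes `create_time` is a Unix timestamp in seconds and buckets by UTC date; the sample records are invented, and plotting is left out.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def daily_counts(conversations):
    """Count conversations per UTC calendar day, sorted by date."""
    days = Counter()
    for conv in conversations:
        # Assumption: create_time is seconds since the Unix epoch.
        day = datetime.fromtimestamp(conv["create_time"], tz=timezone.utc).date()
        days[day.isoformat()] += 1
    return dict(sorted(days.items()))


# Invented stand-in for the contents of conversations.json.
SAMPLE_JSON = """[
  {"id": "a1", "title": "First chat", "create_time": 1693958400.0},
  {"id": "b2", "title": "Second chat", "create_time": 1693962000.0},
  {"id": "c3", "title": "Third chat", "create_time": 1694044900.0}
]"""

counts = daily_counts(json.loads(SAMPLE_JSON))
print(counts)
```

With a real export you would load the file with `json.load()` instead of the inline sample, then hand the per-day counts to whatever plotting tool you prefer.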

# 3:42 pm / ai, generative-ai, chatgpt, llms

hubcap.php (via) This PHP script by Dave Hulbert delights me. It’s 24 lines of code that takes a specified goal, then calls my LLM utility on a loop to request the next shell command to execute in order to reach that goal... and pipes the output straight into `exec()` after a 3s wait so the user can panic and hit Ctrl+C if it’s about to do something dangerous!
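The shape of that loop can be sketched in a few lines of Python. This is not a port of hubcap.php: the function names, the `DONE` stop signal and the step limit are all hypothetical, and the demo uses fake stand-ins for both the model and the shell so that nothing is actually executed.

```python
import time


def run_agent(goal, next_command, execute, max_steps=5, delay_seconds=3.0, sleep=time.sleep):
    """Ask for a command, pause so the user can abort, run it, repeat."""
    history = []
    for _ in range(max_steps):
        command = next_command(goal, history)
        if command.strip() == "DONE":  # hypothetical stop signal
            break
        print(f"About to run: {command} (Ctrl+C to abort)")
        sleep(delay_seconds)  # the window in which to panic
        output = execute(command)
        history.append(f"$ {command}\n{output}")
    return history


# Fake model: replays a fixed script instead of calling a language model.
SCRIPT = iter(["ls", "cat notes.txt", "DONE"])


def fake_next_command(goal, history):
    return next(SCRIPT)


def fake_execute(command):
    # Fake shell: describes the command rather than running it.
    return f"(pretend output of {command!r})"


history = run_agent(
    "summarize my notes",
    fake_next_command,
    fake_execute,
    sleep=lambda seconds: None,  # skip the real delay in this demo
)
print("\n".join(history))
```

Passing the model call and the executor in as functions keeps the dangerous part swappable, which makes the loop easy to try out safely.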

# 3:45 pm / php, security, ai, generative-ai, llms, llm, ai-agents

Release datasette-edit-schema 0.6a0 — Datasette plugin for modifying table schemas

6th Sep 2023, 10:30 pm · datasette

Release datasette-edit-schema 0.6a1 — Datasette plugin for modifying table schemas

6th Sep 2023, 11:26 pm · datasette

Sept. 7, 2023

Release datasette-edit-schema 0.6a2 — Datasette plugin for modifying table schemas

7th Sep 2023, 3:54 pm · datasette

Release datasette-graphql 3.0a0 — Datasette plugin providing an automatic GraphQL API for your SQLite databases

7th Sep 2023, 4:51 pm · datasette

Sept. 8, 2023

Release datasette-remote-actors 0.1a0 — Datasette plugin for fetching details of actors from a remote endpoint

8th Sep 2023, 3:33 am · datasette

Release datasette-debug-actors-from-ids 0.1a0 — Datasette plugin for trying out the actors_from_ids hook

8th Sep 2023, 3:57 am · datasette

Release datasette 1.0a6 — An open source multi-tool for exploring and publishing data

8th Sep 2023, 4:45 am · datasette

Release datasette-remote-actors 0.1a1 — Datasette plugin for fetching details of actors from a remote endpoint

8th Sep 2023, 4:56 am · datasette

Release datasette-debug-actors-from-ids 0.1a1 — Datasette plugin for trying out the actors_from_ids hook

8th Sep 2023, 4:56 am · datasette

TIL Running Datasette on Hugging Face Spaces — [Julien Chaumond](https://twitter.com/julien_c/status/1700142113713758438), this morning (replying to my tweet about [my Hugging Face TheBloke model git scraper](https://twitter.com/simonw/status/1700130557638869140)):

8th Sep 2023, 3:28 pm
