Simon Willison’s Weblog


756 items tagged “llms”

Large Language Models (LLMs) are the class of technology behind generative text AI systems like OpenAI's ChatGPT, Google's Gemini and Anthropic's Claude.

2023

Eight Things to Know about Large Language Models (via) This unpublished paper by Samuel R. Bowman is succinct, readable and dense with valuable information to help understand the field of modern LLMs.

# 5th April 2023, 3:36 am / gpt-3, llms, ai, generative-ai

Scaling laws allow us to precisely predict some coarse-but-useful measures of how capable future models will be as we scale them up along three dimensions: the amount of data they are fed, their size (measured in parameters), and the amount of computation used to train them (measured in FLOPs). [...] Our ability to make this kind of precise prediction is unusual in the history of software and unusual even in the history of modern AI research. It is also a powerful tool for driving investment since it allows R&D teams to propose model-training projects costing many millions of dollars, with reasonable confidence that these projects will succeed at producing economically valuable systems.

Sam Bowman

# 5th April 2023, 3:32 am / llms, ai, generative-ai

Weeknotes: A new llm CLI tool, plus automating my weeknotes and newsletter


I started publishing weeknotes in 2019 partly as a way to hold myself accountable but mainly as a way to encourage myself to write more.

[... 830 words]

Guess we could start calling this a ‘hallucitation’? Kate Crawford coins an excellent neologism for hallucinated citations in LLMs like ChatGPT.

# 4th April 2023, 10:21 pm / chatgpt, llms

ROOTS search tool (via) BLOOM is one of the most interesting completely openly licensed language models. The ROOTS corpus is the training data that was collected for it, and this tool lets you run searches directly against that corpus. I tried searching for my own name and got an interesting insight into what it knows about me.

# 3rd April 2023, 8:40 pm / llms, ai, generative-ai, bloom, training-data

Think of language models like ChatGPT as a “calculator for words”

One of the most pervasive mistakes I see people make with large language model tools like ChatGPT is trying to use them as a search engine.

[... 1,162 words]

What AI can do for you on the Theory of Change podcast

Matthew Sheffield invited me on his show Theory of Change to talk about how AI models like ChatGPT, Bing and Bard work, and about practical things you can do with them.

[... 548 words]

You’ll often find prompt engineers come from a history, philosophy, or English language background, because it’s wordplay. You’re trying to distill the essence or meaning of something into a limited number of words.

Albert Phelps

# 31st March 2023, 5:54 pm / prompt-engineering, ai, llms

How to use AI to do practical stuff: A new guide (via) Ethan Mollick’s guide to practical usage of large language model chatbots like ChatGPT 3.5 and 4, Bing, Claude and Bard is the best I’ve seen so far. He includes useful warnings about common traps and things that these models are both useful for and useless at.

# 31st March 2023, 6:17 am / chatgpt, bing, bard, ai, llms, ethan-mollick, claude

Downloading and converting the original models (Cerebras-GPT) (via) Georgi Gerganov added support for the Apache 2 licensed Cerebras-GPT language model to his ggml C++ inference library, as used by llama.cpp.

# 31st March 2023, 4:28 am / open-source, llama, edge-llms, llms, cerebras

Schillace Laws of Semantic AI (via) Principles for prompt engineering against large language models, developed by Microsoft’s Sam Schillace.

# 30th March 2023, 12:20 am / prompt-engineering, ai, generative-ai, llms

gpt4all. Similar to Alpaca, here’s a project which takes the LLaMA base model and fine-tunes it on instruction examples generated by GPT-3—in this case, it’s 800,000 examples generated using the ChatGPT GPT 3.5 turbo model (Alpaca used 52,000 generated by regular GPT-3). This is currently the easiest way to get a LLaMA derived chatbot running on your own computer: the repo includes compiled binaries for running on M1/M2, Intel Mac, Windows and Linux and provides a link to download the 3.9GB 4-bit quantized model.

# 29th March 2023, 6:03 pm / llama, open-source, ai, generative-ai, edge-llms, llms, fine-tuning

I would say ChatGPT (mostly the new GPT-4 model), with a lot of hand-holding and cajoling from me, wrote 60-70% of the code (PHP, Javascript, CSS, SQL) for this AMA site. And we easily did it in a third of the time it would have taken me by myself, without having to look something up on Stack Overflow every four minutes or endlessly consulting CSS and PHP reference guides or tediously writing tests, etc. etc. etc. In fact, I never would have even embarked on building this little site-let had ChatGPT not existed...I would have done something much simpler and more manual instead. And it was a blast. I had so much fun and learned so much along the way.

Jason Kottke

# 28th March 2023, 10:36 pm / chatgpt, ai, jason-kottke, llms

Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models (via) The latest example of an open source large language model you can run on your own hardware. This one is particularly interesting because the entire thing is under the Apache 2 license. Cerebras are an AI hardware company offering a product with 850,000 cores—this release was trained on their hardware, presumably to demonstrate its capabilities. The model comes in seven sizes from 111 million to 13 billion parameters, and the smaller sizes can be tried directly on Hugging Face.
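
To give a sense of how approachable the smaller checkpoints are, here’s a rough sketch of loading the 111M model with the Hugging Face transformers library—the model ID matches the one listed on the Hub, but treat the generation settings as illustrative rather than anything from the announcement:

    # Minimal sketch: load the smallest Cerebras-GPT checkpoint from Hugging Face
    # and generate a short completion. Generation parameters are illustrative.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "cerebras/Cerebras-GPT-111M"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer("Large language models are", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, temperature=0.8)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))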

# 28th March 2023, 10:05 pm / gpt-3, open-source, ai, generative-ai, edge-llms, llms, cerebras

Announcing Open Flamingo (via) New from LAION: “OpenFlamingo is a framework that enables training and evaluation of large multimodal models (LMMs)”. Multimodal here means it can answer questions about images—their interactive demo includes tools for image captioning, animal recognition, counting objects and visual question answering. They’ve released the OpenFlamingo-9B model built on top of LLaMA 7B and CLIP ViT-L/14—the model checkpoint is a 5.24 GB download from Hugging Face, and is available under a non-commercial research license.

# 28th March 2023, 9:59 pm / laion, ai, generative-ai, llama, llms, clip

By gaining mastery of language, A.I. is seizing the master key to civilization, from bank vaults to holy sepulchers.

What would it mean for humans to live in a world where a large percentage of stories, melodies, images, laws, policies and tools are shaped by nonhuman intelligence, which knows how to exploit with superhuman efficiency the weaknesses, biases and addictions of the human mind — while knowing how to form intimate relationships with human beings?

Yuval Harari, Tristan Harris and Aza Raskin

# 28th March 2023, 7:09 pm / ai, ethics, generative-ai, llms

LLaMA voice chat, with Whisper and Siri TTS. llama.cpp author Georgi Gerganov has stitched together the LLaMA language model, the Whisper voice to text model (with his whisper.cpp library) and the macOS “say” command to create an entirely offline AI agent that he can talk to with his voice and that can speak replies straight back to him.
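
The same loop is easy to sketch in Python. This is not Gerganov’s code—he drives whisper.cpp and llama.cpp directly—it uses the openai-whisper Python package for transcription, a placeholder llama_generate() function standing in for whatever local LLaMA setup you have, and the macOS say command for speech:

    # Hedged sketch of an offline voice loop: speech -> text -> local LLM -> speech.
    # llama_generate() is a placeholder, not a real library call.
    import subprocess
    import whisper

    stt = whisper.load_model("base")

    def llama_generate(prompt: str) -> str:
        # Wire this up to your local LLaMA / llama.cpp installation.
        raise NotImplementedError

    def voice_turn(audio_path: str) -> None:
        text = stt.transcribe(audio_path)["text"]   # speech to text
        reply = llama_generate(text)                # local model reply
        subprocess.run(["say", reply])              # macOS text to speech

    voice_turn("question.wav")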

# 27th March 2023, 9:06 pm / llama, ai, macosx, generative-ai, whisper, edge-llms, llms, text-to-speech

Every wave of technological innovation has been unleashed by something costly becoming cheap enough to waste. Software production has been too complex and expensive for too long, which has caused us to underproduce software for decades, resulting in immense, society-wide technical debt. This technical debt is about to contract in a dramatic, economy-wide fashion as the cost and complexity of software production collapses, releasing a wave of innovation.

Paul Kedrosky and Eric Norlin

# 27th March 2023, 5:14 pm / software-development, ai, generative-ai, llms, technical-debt

AI-enhanced development makes me more ambitious with my projects


The thing I’m most excited about in our weird new AI-enhanced reality is the way it allows me to be more ambitious with my projects.

[... 3,334 words]

I think it’s likely that soon all computer users will have the ability to develop small software tools from scratch, and to describe modifications they’d like made to software they’re already using.

Geoffrey Litt

# 27th March 2023, 6:10 am / ai, generative-ai, llms, geoffrey-litt

After three decades of working with software, I'm also seeing myself learning faster using ChatGPT. So apparently it works even for us more seasoned programmers.

Salvatore Sanfilippo

# 26th March 2023, 2:55 pm / salvatore-sanfilippo, chatgpt, ai, llms

scrapeghost (via) Scraping is a really interesting application for large language model tools like GPT-3. James Turk’s scrapeghost is a very neatly designed entrant into this space—it’s a Python library and CLI tool that can be pointed at any URL along with a roughly defined schema (expressed in a neat mini schema language); it then uses GPT-3 to scrape the page and tries to return the results in the supplied format.
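
From memory of the project README the API looks roughly like this—treat the exact class name and schema syntax as assumptions and check the repo before relying on it. It also expects an OpenAI API key in the OPENAI_API_KEY environment variable:

    # Approximate scrapeghost usage, paraphrased from the README—verify against
    # the repo for the current API. Assumes OPENAI_API_KEY is set.
    from scrapeghost import SchemaScraper

    scrape_legislators = SchemaScraper(
        schema={
            "name": "string",
            "url": "url",
            "district": "string",
            "party": "string",
        }
    )
    response = scrape_legislators("https://www.ilga.gov/house/rep.asp?MemberID=3071")
    print(response.data)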

# 26th March 2023, 5:29 am / scraping, gpt-3, generative-ai, gpt-4, ai, llms

Hello Dolly: Democratizing the magic of ChatGPT with open models. A team at Databricks applied the fine-tuning data that Stanford Alpaca used against LLaMA to a much older model—EleutherAI’s GPT-J 6B, first released in May 2021. As with Alpaca, they found that instruction tuning took the raw model—which was extremely difficult to interact with—and turned it into something that felt a lot more like ChatGPT. It’s a shame they reused the license-encumbered 52,000 training samples from Alpaca, but I doubt it will be long before someone recreates a freely licensed alternative to that training set.
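
For context on what that training set looks like: the Alpaca data is a JSON file of 52,000 records, each with instruction, input and output keys, which get flattened into prompt/response pairs before fine-tuning. Here’s a simplified version of that formatting step (the real Alpaca template adds a longer preamble):

    # Simplified sketch of turning Alpaca-style records into prompt/response pairs.
    # The real Alpaca prompt template includes a longer instruction preamble.
    import json

    with open("alpaca_data.json") as f:
        records = json.load(f)  # [{"instruction": ..., "input": ..., "output": ...}, ...]

    def to_pair(record):
        prompt = f"### Instruction:\n{record['instruction']}\n"
        if record["input"]:
            prompt += f"### Input:\n{record['input']}\n"
        prompt += "### Response:\n"
        return prompt, record["output"]

    pairs = [to_pair(r) for r in records]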

# 24th March 2023, 5:05 pm / llama, ai, generative-ai, edge-llms, llms, dolly, chatgpt, fine-tuning

I built a ChatGPT plugin to answer questions about data hosted in Datasette


Yesterday OpenAI announced support for ChatGPT plugins. It’s now possible to teach ChatGPT how to make calls out to external APIs and use the responses to help generate further answers in the current conversation.
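
The mechanism is surprisingly simple: ChatGPT fetches a manifest from /.well-known/ai-plugin.json on your domain, which points at an OpenAPI schema describing the endpoints the model is allowed to call. Here’s the general shape of that manifest, sketched as a Python dict—the values are illustrative placeholders, not the ones from my Datasette plugin:

    # General shape of a ChatGPT plugin manifest, served from
    # /.well-known/ai-plugin.json. Values here are placeholders.
    manifest = {
        "schema_version": "v1",
        "name_for_human": "Example Plugin",
        "name_for_model": "example",
        "description_for_human": "Query an example API.",
        "description_for_model": "Use this to answer questions by querying the example API.",
        "auth": {"type": "none"},
        "api": {"type": "openapi", "url": "https://example.com/openapi.json"},
        "logo_url": "https://example.com/logo.png",
        "contact_email": "hello@example.com",
        "legal_info_url": "https://example.com/legal",
    }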

[... 1,801 words]

If you ask Microsoft’s Bing chatbot if Google’s Bard chatbot has been shut down, it says yes, citing as evidence a news article that discusses a tweet in which a user asked Bard when it would be shut down and Bard said it already had, itself citing a comment from Hacker News in which someone joked about this happening, and someone else used ChatGPT to write fake news coverage about the event.

James Vincent

# 23rd March 2023, 12:10 am / bard, bing, ai, google, llms, chatgpt

Weeknotes: AI won’t slow down, a new newsletter and a huge Datasette refactor

I’m a few weeks behind on my weeknotes, but it’s not through lack of attention to my blog. AI just keeps getting weirder and more interesting.

[... 1,255 words]

Don’t trust AI to talk accurately about itself: Bard wasn’t trained on Gmail


Earlier this month I wrote about how ChatGPT can’t access the internet, even though it really looks like it can. Consider this part two in the series. Here’s another common and non-intuitive mistake people make when interacting with large language model AI systems: asking them questions about themselves.

[... 1,950 words]

GPT-4, like GPT-3 before it, has a capability overhang; at the time of release, neither OpenAI nor its various deployment partners have a clue as to the true extent of GPT-4's capability surface - that's something that we'll get to collectively discover in the coming years. This also means we don't know the full extent of plausible misuses or harms.

Jack Clark

# 22nd March 2023, 12:40 am / jack-clark, generative-ai, openai, gpt-4, ai, llms

The Age of AI has begun. Bill Gates calls GPT-class large language models “the most important advance in technology since the graphical user interface”. His essay here focuses on the philanthropy angle, mostly from the point of view of AI applications in healthcare and education, plus concerns about keeping access to these new technologies as equitable as possible.

# 21st March 2023, 9:14 pm / gpt-3, generative-ai, openai, bill-gates, ai, ethics, llms

Here are some absurdly expensive things you can do on a trip to Tokyo: Buy a golden toilet. There is a toilet in Tokyo that is made of gold and costs around 10 million yen. If you are looking for a truly absurd experience, you can buy this toilet and use it for your next bowel movement. [...]

Google Bard

# 21st March 2023, 6:27 pm / ai, google, generative-ai, bard, llms