Language models can explain neurons in language models

Fascinating interactive paper by OpenAI, describing how they used GPT-4 to analyze the concepts tracked by individual neurons in their much older GPT-2 model. “We generated cluster labels by embedding each neuron explanation using the OpenAI Embeddings API, then clustering them and asking GPT-4 to label each cluster.”
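The quoted pipeline has three steps: embed each explanation, cluster the embeddings, then label each cluster. Here is a minimal sketch of that shape in Python. It is not OpenAI's code: `toy_embed` is a bag-of-words stand-in for the Embeddings API, `toy_label` is a most-common-word stand-in for asking GPT-4, the clustering is a plain k-means chosen for illustration (the paper's quote does not say which algorithm they used), and the example explanations are made up.

```python
import math
import random
from collections import Counter


def toy_embed(text, vocab):
    """Hypothetical stand-in for an embeddings API: normalized bag-of-words."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means (an assumption); returns a cluster index per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid (squared Euclidean distance).
        assignments = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
            for v in vectors
        ]
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


def toy_label(explanations):
    """Hypothetical stand-in for asking a language model to name a cluster."""
    words = Counter(w for e in explanations for w in e.lower().split())
    return words.most_common(1)[0][0]


def label_clusters(explanations, k):
    """Embed, cluster, then label. Returns a list of (label, members) pairs."""
    vocab = sorted({w for e in explanations for w in e.lower().split()})
    vectors = [toy_embed(e, vocab) for e in explanations]
    assignments = kmeans(vectors, k)
    clusters = {}
    for explanation, cluster in zip(explanations, assignments):
        clusters.setdefault(cluster, []).append(explanation)
    return [(toy_label(members), members) for members in clusters.values()]


# Made-up neuron explanations, purely for illustration.
explanations = [
    "dates and years",
    "years in dates",
    "python code tokens",
    "tokens in python code",
]
for label, members in label_clusters(explanations, k=2):
    print(label, members)
```

Swapping the two stand-ins for real API calls would leave `label_clusters` unchanged, which is the point of the sketch: the embedding and labeling steps are the only parts that need a model.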

Posted 9th May 2023 at 5:35 pm

Simon Willison’s Weblog
