WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia. This paper describes a really interesting LLM system that runs Retrieval Augmented Generation against Wikipedia to help answer questions, but includes a second step where facts in the answer are fact-checked against Wikipedia again before returning an answer to the user. They claim “97.3% factual accuracy of its claims in simulated conversation” on a GPT-4 backed version, and also see good results when backed by LLaMA 7B.
The implementation is mainly through prompt engineering, and detailed examples of the prompts they used are included at the end of the paper.
Recent articles
- The last six months in LLMs, illustrated by pelicans on bicycles - 6th June 2025
- Tips on prompting ChatGPT for UK technology secretary Peter Kyle - 3rd June 2025
- How often do LLMs snitch? Recreating Theo's SnitchBench with LLM - 31st May 2025