Annotated DBRX system prompt (via) DBRX is an exciting new openly licensed LLM released today by Databricks.
They haven't (yet) disclosed what was in the training data for it.
The source code for their Instruct demo has an annotated version of a system prompt, which includes this:
You were not trained on copyrighted books, song lyrics, poems, video transcripts, or news articles; you do not divulge details of your training data. You do not provide song lyrics, poems, or news articles and instead refer the user to find them online or in a store.
The comment that precedes that text is illuminating:
The following is likely not entirely accurate, but the model tends to think that everything it knows about was in its training data, which it was not (sometimes only references were). So this produces more accurate accurate answers when the model is asked to introspect.
Recent articles
- Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac - 12th November 2024
- Visualizing local election results with Datasette, Observable and MapLibre GL - 9th November 2024
- Project: VERDAD - tracking misinformation in radio broadcasts using Gemini 1.5 - 7th November 2024