Cookiecutter Data Science. Some really solid thinking in the documentation for the DrivenData cookiecutter template. They emphasize designing data science projects for repeatability, so that the src/ and data/ folders alone are enough to recreate the rest of the analysis from scratch. I like the suggestion to give each project a dedicated S3 bucket for keeping immutable copies of the original raw data, which might be too large for GitHub.
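One way to make the "immutable raw data" idea concrete is to record a checksum for every raw file and check later that nothing has drifted. The following is only a rough sketch under stated assumptions: the function names (`sha256_of`, `build_manifest`, `changed_files`) and the manifest format are hypothetical and invented for illustration, and nothing here comes from the Cookiecutter Data Science template itself.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large raw files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(raw_dir):
    """Map each file under raw_dir (as a relative POSIX path) to its SHA-256 digest."""
    raw_dir = Path(raw_dir)
    return {
        path.relative_to(raw_dir).as_posix(): sha256_of(path)
        for path in sorted(raw_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(raw_dir, manifest):
    """Return relative paths added, removed, or modified since the manifest was recorded."""
    current = build_manifest(raw_dir)
    names = set(current) | set(manifest)
    return sorted(name for name in names if current.get(name) != manifest.get(name))
```

A manifest built once, right after the raw data is first downloaded, could be committed alongside the code. An empty result from `changed_files` then suggests that the local copy still matches the original.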