A test of how seriously your firm is taking AI: when o-1 (& the new Gemini) came out this week, were there assigned folks who immediately ran the model through internal, validated, firm-specific benchmarks to see how useful it as? Did you update any plans or goals as a result?
Or do you not have people (including non-technical people) assigned to test the new models? No internal benchmarks? No perspective on how AI will impact your business that you keep up-to-date?
No one is going to be doing this for organizations, you need to do it yourself.
Recent articles
- LLM 0.22, the annotated release notes - 17th February 2025
- Run LLMs on macOS using llm-mlx and Apple's MLX framework - 15th February 2025
- URL-addressable Pyodide Python environments - 13th February 2025