A brief update with some numbers for hardware load-balanced mongrels: 4,000 requests/second across 48 mongrels behind a hardware load balancer.