Thursday
Room 3
15:00 - 16:00
(UTC+11)
Talk (60 min)
Production-Grade LLM Architecture: Lessons from Processing 50 Million AI Requests
Most AI talks show you how to call an API. This one shows you how to build production systems that don't fall over when real users arrive. Over 18 months, we scaled from a "ChatGPT prototype" to a production AI platform processing 50 million LLM requests monthly.
AI
We learned that LLMs aren't just APIs with fancy responses—they're distributed systems with non-deterministic failure modes, unpredictable costs, and reliability challenges that break traditional architectural patterns. This talk is a technical deep-dive into building AI infrastructure that survives production.
