Video
QCon London: A blueprint for agentic AI services
1 minute read
About this session
We’re all excited to build and deliver agentic AI services. But what about running at the exponentially greater scale that agents create? LLMs suffer from high latency and availability problems. More frequent model training drives more frequent updates to agentic services. Most of all, the LLM cost of running at agentic scale breaks the bank, fast.
So, what can you do?
In this session, we dug into how engineering and operations can address:
- Making agentic services fail-proof when their LLMs are not
- Managing a two-order-of-magnitude increase in TPS, including a 2M TPS RAG case study
- Navigating cost vs. quality tradeoffs, with LLM calls costing up to 100,000x more than a database transaction
- Continuously redeploying agents that require frequent retraining
Click here to view the presentation slides.