Video
QCon London: A blueprint for agentic AI services
1 minute read
About this session
We’re all excited to build and deliver agentic AI services. But what about running at the exponentially greater scale that agents create? LLMs suffer from high latency and availability problems. More frequent model training drives more frequent updates to agentic services. Most of all, the LLM cost of running at agentic scale breaks the bank, fast.
So, what can you do?
In this session, we dug into how engineering and operations can address:
- Making agentic services fail-proof when their LLMs are not
- Managing a two-order-of-magnitude increase in TPS, including a 2M TPS RAG case study
- Navigating cost vs. quality tradeoffs, with LLM calls costing up to 100,000x more than a database transaction
- Continuously redeploying agents that require frequent retraining
Click here to view the presentation slides.