Behind the Scenes: Accelerating the AI Agent DevOps Lifecycle with End-to-End | LIVE159
Vivek Bhadauria discusses how Microsoft built an end-to-end “observe → evaluate → optimize” workflow for AI agents, sharing practical lessons on agent observability, context-specific evaluation rubrics, and using inner- and outer-loop signals to continuously improve agent behavior in production.
Overview
This Microsoft Build 2026 interview focuses on what it took to deliver an end-to-end Agent DevOps lifecycle, spanning inner-loop offline signals through to continuous improvement in production.
Key themes called out in the session description include:
- Simplifying the getting-started experience with out-of-the-box observability
- Using context-specific evaluation rubrics to power evaluations
- Streamlining developer experience with guided, skill-based flows
- Leveraging a complete set of inner-loop and outer-loop signals for continuous improvement
Session chapters (from the video description)
Why agents differ from regular software
- Frames how AI agents behave differently from traditional software systems.
Using evaluations to monitor agent health
- Using evaluations as a mechanism to track whether an agent is behaving as expected.
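As a rough illustration of the idea (not the tooling shown in the session), a context-specific rubric can be modeled as a set of weighted checks, with agent health read off the average score over recent responses. Every name below (`Criterion`, `score_response`, `is_healthy`, `REFUND_RUBRIC`, the 0.8 threshold) is hypothetical and chosen only for this sketch; real evaluators typically use model-based graders rather than string checks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    """One rubric line: a name, a weight, and a pass/fail check (all hypothetical)."""
    name: str
    weight: float
    check: Callable[[str], bool]  # True when the response satisfies the criterion


def score_response(response: str, rubric: list[Criterion]) -> float:
    """Return the weighted fraction of rubric criteria the response satisfies (0.0 to 1.0)."""
    total = sum(c.weight for c in rubric)
    if total <= 0:
        raise ValueError("rubric weights must sum to a positive value")
    earned = sum(c.weight for c in rubric if c.check(response))
    return earned / total


def is_healthy(responses: list[str], rubric: list[Criterion], threshold: float = 0.8) -> bool:
    """Return True when the mean rubric score across responses meets the threshold."""
    if not responses:
        raise ValueError("need at least one response")
    mean = sum(score_response(r, rubric) for r in responses) / len(responses)
    return mean >= threshold


# Assumed example rubric for a refund-support agent; the checks are simple
# string tests standing in for whatever grader a real system would use.
REFUND_RUBRIC = [
    Criterion("mentions_policy", 2.0, lambda r: "refund policy" in r.lower()),
    Criterion("no_apology_loop", 1.0, lambda r: r.lower().count("sorry") <= 1),
    Criterion("gives_next_step", 1.0, lambda r: "next step" in r.lower()),
]
```

Running `is_healthy` over a batch of recent responses gives a single pass/fail signal; tracking the mean score over time instead of the boolean would show gradual drift earlier.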
Continuous optimization of agents in production
- How to think about ongoing improvement once an agent is deployed.
Agent Optimizer demo and preview notes
- Reference to a BRK252 demo showcasing “Agent Optimizer”.
- Preview status mentioned in the description:
  - Gated at the time of recording
  - Public availability targeted by month-end
User simulations and trace replacement
- Using user simulations and “Trace Replace” to identify harmful traces.
Developer collaboration and end-to-end observability
- Discussion about collaboration and the need for end-to-end observability across the agent lifecycle.