Run AI SREs without burning token budgets | ODSP928

Natan Yellin breaks down why AI SRE-style investigations can cost around $2 per alert and shows practical ways to reduce LLM token spend so enterprises can run AI investigations across high alert volumes without blowing budgets.

Overview

This Microsoft Build 2026 session focuses on the economics of using LLMs for SRE-style alert investigations at enterprise scale, and the optimizations that can make “AI investigations on every alert” financially viable.

Naive cost model for large alert volumes
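
As a rough illustration of the arithmetic behind this heading, the sketch below multiplies the roughly $2 per alert figure from the session summary by an alert volume. The volume of 10,000 alerts per day, the 30-day month, and the function name `naive_investigation_cost` are hypothetical values chosen for illustration only; they do not come from the session.

```python
def naive_investigation_cost(
    alerts_per_day: int,
    cost_per_alert_usd: float = 2.0,  # approximate per-alert figure quoted in the summary
    days: int = 30,                   # assumed billing period, not from the session
) -> float:
    """Return the cost of investigating every alert with no optimization applied."""
    return alerts_per_day * cost_per_alert_usd * days


if __name__ == "__main__":
    # Hypothetical volume; substitute your own alert counts.
    monthly = naive_investigation_cost(alerts_per_day=10_000)
    print(f"Naive monthly spend: ${monthly:,.0f}")
```

Because the naive cost is linear in alert volume, any reduction in either the per-alert cost or the number of investigations actually run lowers the total proportionally.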

Optimization: using cheaper models

Optimization: LLM-native grouping instead of deterministic rules

Optimization: reusing cached context windows

Why cost optimization changes the operating model