Agentic infrastructure needs agentic observability | ODSP933

Microsoft Developer argues that observability pipelines built around humans manually standardizing logs, traces, and metrics are falling behind as AI systems generate services and infrastructure faster than those pipelines can adapt.

Overview

The session explores a shift from deterministic, event-centric observability toward an approach in which the observability layer reasons over telemetry and adapts as systems change. It frames this as a response to agent-driven workflows whose behavior is stochastic and context-dependent, and which can fail silently even when the underlying infrastructure appears healthy.

Core concept: agentic infrastructure requires agentic observability

New analytical focus: understanding agent reasoning (not just events)

Limits of deterministic observability with stochastic agent behavior

Sampling and scalability challenges under high span counts

Silent failures in agent systems

Human cognition constraints shape current observability

Critique of MCP demos: partial context for external agents

A new model: move LLM up, push analysis down

Data quality as a prerequisite for reliable agent reasoning

Session metadata