Low-latency AI for developer workflows | ODSP930

Microsoft Developer shares a Build 2026 session in which developer experience leaders from Cerebras and OpenAI discuss how low-latency inference changes developer workflows, including coding agents, voice-driven interactions, and automation scenarios that depend on responses arriving within seconds.

Overview

The session focuses on how faster model responses (low-latency inference) can enable new interaction patterns in developer tooling and agentic workflows.

Topics called out in the session description include:

Session chapters (from the video metadata)