Low-latency AI for developer workflows | ODSP930
Microsoft Developer shares a Build 2026 session, featuring developer experience leaders from Cerebras and OpenAI, on how low-latency inference changes developer workflows. It covers coding agents, voice-driven interactions, and automation scenarios that depend on responses arriving in seconds.
Overview
The session focuses on how faster model responses (low-latency inference) make new interaction patterns practical in developer tooling and agentic workflows.
Topics called out in the session description include:
- Coding agents that can iterate more quickly when responses land in seconds
- Voice-driven workflows where latency directly affects usability
- Slide generation scenarios that benefit from rapid feedback loops
- Workplace automation examples that chain multiple steps together
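To make the point about chained steps concrete, here is a minimal sketch of how per-step latency adds up when a workflow runs its steps one after another. The function names and all latency figures are hypothetical and chosen only for illustration; they are not measurements or code from the session.

```python
def chain_wall_clock(step_latencies_s):
    """Total wait, in seconds, for steps that run strictly one after another."""
    return sum(step_latencies_s)


def passes_per_budget(budget_s, step_latencies_s):
    """How many full passes through the chain fit in a fixed time budget."""
    return int(budget_s // chain_wall_clock(step_latencies_s))


# Hypothetical numbers for illustration only (not from the session):
# a five-step chain at 20 s per model call versus 2 s per model call.
slow_chain = [20.0] * 5
fast_chain = [2.0] * 5

print(chain_wall_clock(slow_chain))        # 100.0
print(chain_wall_clock(fast_chain))        # 10.0
print(passes_per_budget(600, slow_chain))  # 6
print(passes_per_budget(600, fast_chain))  # 60
```

Under these assumed numbers, the same ten-minute budget allows ten times as many full passes through the chain, which is the kind of difference that matters for iterative agents and multi-step automation.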
Session chapters (from the video metadata)
- Preview of demos showcasing Codex Spark speed
- Jason’s first automation project at OpenAI
- Automation for monitoring topics and product launches
- Sub-agents analyzing Slack, Google Drive, and Google Meet data
- Detecting duplicate issues and tracking a project pipeline
- Comparing Codex workflow speed and productivity challenges
- Adding a “See More” link to the bottom right of every cell
- Overview of multiple ways to leverage Spark within the Codex app
- Shift in thinking: Spark as always-on infrastructure