Build multimodal agents that reason, interact, and take action | DEM330

Henk Boelman live-codes a real-time, voice-first multimodal agent in Azure AI Foundry using the Voice Live API. The session shows how to combine speech input, model reasoning, and speech output, and then how to connect the agent to external tools via the Model Context Protocol (MCP) so it can take real actions.
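
The flow described above (speech in, reasoning, optional tool call, speech out) can be sketched as a single conversational turn. This is a toy illustration only, not the code from the session: every name here (VoiceAgent, transcribe, synthesize, the weather tool) is hypothetical, the speech steps are faked with plain byte/string conversion, and nothing calls the Voice Live API or an MCP SDK.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class VoiceAgent:
    """Hypothetical stand-in for a voice-first agent; not a real SDK class."""
    transcribe: Callable[[bytes], str]   # speech input -> text
    synthesize: Callable[[str], bytes]   # text -> speech output
    # Hypothetical tool registry; in the session, tools are reached via MCP.
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)

    def reason(self, text: str) -> str:
        # Toy "reasoning": if the first word names a tool, call it with the rest.
        name, _, arg = text.partition(" ")
        if name in self.tools:
            return self.tools[name](arg)
        return f"You said: {text}"

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)    # 1. speech input
        self.history.append(text)
        reply = self.reason(text)        # 2. reasoning, possibly a tool call
        return self.synthesize(reply)    # 3. speech output


# Fake speech components: "audio" is just UTF-8 bytes in this sketch.
agent = VoiceAgent(
    transcribe=lambda audio: audio.decode("utf-8"),
    synthesize=lambda text: text.encode("utf-8"),
    tools={"weather": lambda city: f"It is sunny in {city}."},
)

print(agent.handle_turn(b"weather Amsterdam"))  # b'It is sunny in Amsterdam.'
```

In a real agent the model, rather than a first-word match, would decide when to call a tool, and the audio would stream in both directions instead of arriving as one complete buffer per turn.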

Overview

What this session builds

Key technologies and concepts

Resources

Session context