Building Agentic Apps with Azure Speech and Voice Live API
Microsoft Events presents a session on Azure Speech in which Jacky Kang, James McMahon, Heiko Rahmel, and Sidd Shah show how developers can build agentic AI apps and voice-enabled agents with the latest Azure Speech APIs.
Speakers: Jacky Kang, James McMahon, Heiko Rahmel, Sidd Shah
Session: BRK198 | Microsoft Ignite 2025 | Advanced (300)
Explore how Azure Speech enables next-generation voice experiences for real-time agentic applications and multimodal translation tools.
Key Highlights
- Voice Live API (GA): Build real-time voice agents for fast, contextual conversations.
- LLM Speech API: Uses large language models to improve transcription accuracy and context understanding in speech recognition.
- Prompt Engineering for Agents: Select generative AI models and set up prompts for smarter voice agent interactions.
- Custom Speech Models: Demonstration of audio enhancements, customizable voice options, and branded voice creation.
- Neural HD Voices: Introduction of 41 upgraded voices supporting over 100 locales for global applications.
- Custom Avatars: Create branded digital avatars with HD voice capabilities for unique customer engagement.
- Audio Solutions: Address audio interruptions and ensure clear speech recognition for healthcare and other use cases.
- Future Expansion: Roadmap includes real-time translation capabilities and additional API enhancements.
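As a rough illustration of how the Voice Live, prompt, and audio highlights above might fit together, the sketch below assembles the kind of session-configuration message a real-time voice agent could send after opening its connection. It is not taken from the session: the event name `session.update`, every field name, the default model, and the voice name are assumptions modeled on realtime-style conversational APIs, so verify them against the current Voice Live API reference before relying on them.

```python
import json


def build_session_update(instructions, voice_name, model="gpt-4o"):
    """Return a JSON string describing a hypothetical voice-agent session.

    Every key below is an illustrative assumption, not a confirmed schema.
    """
    if not instructions.strip():
        raise ValueError("instructions must not be empty")
    message = {
        "type": "session.update",  # assumed event name
        "session": {
            "model": model,  # assumed: generative model behind the agent
            "instructions": instructions,  # prompt that steers the agent
            "voice": {"name": voice_name},  # assumed: voice used for replies
            # assumed switches for the audio clean-up mentioned in the highlights
            "input_audio": {
                "noise_suppression": True,
                "echo_cancellation": True,
            },
            # assumed: the service decides when the user has finished speaking
            "turn_detection": {"type": "server_vad"},
        },
    }
    return json.dumps(message, indent=2)


if __name__ == "__main__":
    print(build_session_update(
        instructions="You are a concise clinic scheduling assistant.",
        voice_name="en-US-ExampleNeural",  # placeholder, not a real voice name
    ))
```

Keeping the prompt, model, and voice in one plain data structure makes it easy to swap any of them per scenario without touching the connection code.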
Developer Takeaways
- Rapidly integrate Azure Speech APIs to enable natural language understanding, adaptive conversation, and multimodal agentic workflows.
- Use custom and branded voices to differentiate the user experience.
- Choose a generative AI model and tune its prompt to tailor agent behavior.
- Deploy scalable and resilient voice-enabled AI assistants for customer service and global solutions.
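To make the voice-selection takeaway concrete, here is a minimal sketch that wraps text in an SSML document naming a specific voice and locale, which is the standard way to request a particular voice from a speech synthesis service. The helper `build_ssml` and the voice name are hypothetical placeholders; a real standard, HD, or custom voice name would come from your own Azure Speech resource, and the resulting string would then be passed to whichever synthesis call you use.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # W3C SSML namespace
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace of xml:lang


def build_ssml(text, voice_name, locale="en-US"):
    """Wrap text in a minimal SSML document that selects one voice."""
    # Serialize SSML elements without a prefix (default namespace).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(f"{{{SSML_NS}}}speak", {
        "version": "1.0",
        f"{{{XML_NS}}}lang": locale,
    })
    voice = ET.SubElement(speak, f"{{{SSML_NS}}}voice", {"name": voice_name})
    voice.text = text  # ElementTree escapes &, <, and > when serializing
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # "en-US-ExampleNeural" is a placeholder, not a real voice name.
    print(build_ssml("Welcome back. How can I help?", "en-US-ExampleNeural"))
```

Building the markup with an XML library rather than string formatting means user-supplied text cannot break the document structure.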
Learn how Azure Speech advances intelligent voice agent architectures and empowers developers to build faster, smarter, and more engaging AI-powered applications.