Build smarter AI systems in Foundry as models and costs evolve (BRK230)
Yina Arenas and Naomi Moneypenny walk through an evaluation-first approach to building AI systems in Azure AI Foundry, covering how to select models, validate quality with benchmarking and evaluators, and optimize for both performance and cost.
Overview
The session focuses on building “smarter” AI systems by treating model choice, evaluation, and optimization as a continuous workflow inside Microsoft Foundry (Azure AI Foundry).
Key themes covered
Selecting models in a fast-changing landscape
- How to navigate a large and evolving set of model options.
- Techniques for choosing models quickly while keeping quality and cost constraints in view.
Shifting QA and evaluation earlier in the workflow
- Moving evaluation and quality checks to the beginning of AI development rather than treating them as a final step.
- Using benchmarking to validate model behavior and performance before committing to a model in production.
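As an illustration of benchmarking candidates before committing to one, here is a minimal sketch in plain Python. It does not use any Foundry API. The `Model` callable type, the two toy candidates, and the exact-match metric are all assumptions made for the example; a real benchmark would call deployed models and use richer metrics.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumption: a "model" is any callable that maps a prompt to an answer.
Model = Callable[[str], str]


@dataclass
class BenchmarkResult:
    model_name: str
    accuracy: float


def exact_match(expected: str, actual: str) -> bool:
    """Case-insensitive, whitespace-insensitive comparison."""
    return expected.strip().lower() == actual.strip().lower()


def run_benchmark(
    models: Dict[str, Model], cases: List[Tuple[str, str]]
) -> List[BenchmarkResult]:
    """Score every candidate on the same cases, best accuracy first."""
    if not cases:
        raise ValueError("benchmark needs at least one test case")
    results = []
    for name, model in models.items():
        correct = sum(exact_match(expected, model(prompt)) for prompt, expected in cases)
        results.append(BenchmarkResult(name, correct / len(cases)))
    return sorted(results, key=lambda r: r.accuracy, reverse=True)


# Hypothetical candidates that answer from a lookup table.
def candidate_a(prompt: str) -> str:
    return {"2+2": "4", "capital of France": "Paris"}.get(prompt, "unknown")


def candidate_b(prompt: str) -> str:
    return {"2+2": "4"}.get(prompt, "unknown")


CASES = [("2+2", "4"), ("capital of France", "paris"), ("largest planet", "Jupiter")]
RESULTS = run_benchmark({"candidate_a": candidate_a, "candidate_b": candidate_b}, CASES)
```

Running every candidate against the same fixed case set keeps the comparison fair when a new model is added later.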
Session structure: selection, evaluation, optimization, scaling
- A structured approach to iterating on AI systems:
  - Selection
  - Evaluation
  - Optimization
  - Scaling
Agentic workflow and repository-based demos
- A walkthrough of code-repository-driven demos that illustrate an agentic workflow.
- Practical examples of integrating model selection and evaluation into a development workflow.
Custom model routing and synthetic data for evaluation
- Demonstration of a custom model router.
- Generating synthetic data to support evaluation scenarios.
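The router shown in the session is not reproduced here. The following is only a rough sketch of the general idea, paired with template-based synthetic prompts to exercise it. The deployment names, the keyword heuristic, the word-count threshold, and the prompt templates are all hypothetical.

```python
import random
from typing import List

# Hypothetical deployment names, not real Foundry identifiers.
SMALL_MODEL = "small-model"
LARGE_MODEL = "large-model"

# Assumption: these keywords signal a request that needs more reasoning.
REASONING_HINTS = ("why", "explain", "compare", "step by step")


def route(prompt: str, max_small_words: int = 30) -> str:
    """Send short, simple prompts to the small model and the rest to the large one."""
    text = prompt.lower()
    needs_reasoning = any(hint in text for hint in REASONING_HINTS)
    is_long = len(text.split()) > max_small_words
    return LARGE_MODEL if needs_reasoning or is_long else SMALL_MODEL


def synthetic_prompts(n: int, seed: int = 0) -> List[str]:
    """Generate reproducible test prompts from fixed templates."""
    rng = random.Random(seed)
    templates = [
        "What is the status of order {id}?",
        "Explain why order {id} was delayed.",
        "Compare shipping options for order {id}.",
    ]
    return [rng.choice(templates).format(id=rng.randint(1000, 9999)) for _ in range(n)]


# Route a small synthetic batch to see how traffic would split.
ROUTED = [(p, route(p)) for p in synthetic_prompts(5)]
```

A seeded generator keeps the synthetic set stable between runs, so a change in the routing split reflects a change in the router rather than in the data.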
Foundry evaluators, including rubric-based evaluators
- Overview of Foundry’s built-in evaluators.
- Introduction to rubric-based evaluators for more structured, criteria-driven assessment.
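To make the idea of criteria-driven assessment concrete, here is a minimal sketch of a weighted rubric in plain Python. It is not Foundry's evaluator API. The `Criterion` type, the three criteria, and their weights are assumptions, and the simple string checks stand in for whatever grader (often a judge model) a real rubric-based evaluator would use.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Criterion:
    name: str
    weight: float
    check: Callable[[str], float]  # returns a score in [0, 1]


def score_response(response: str, rubric: List[Criterion]) -> float:
    """Weighted average of per-criterion scores, in [0, 1]."""
    total_weight = sum(c.weight for c in rubric)
    weighted = sum(c.weight * c.check(response) for c in rubric)
    return weighted / total_weight


# Hypothetical rubric for a customer-support reply.
RUBRIC = [
    Criterion("mentions_refund", 2.0, lambda r: 1.0 if "refund" in r.lower() else 0.0),
    Criterion("is_concise", 1.0, lambda r: 1.0 if len(r.split()) <= 50 else 0.0),
    Criterion(
        "polite_tone",
        1.0,
        lambda r: 1.0 if any(w in r.lower() for w in ("please", "thank")) else 0.0,
    ),
]

GOOD = score_response("Thank you. Refunds are processed in 5 days.", RUBRIC)
POOR = score_response("No.", RUBRIC)
```

Keeping each criterion separate makes it possible to see which part of the rubric a response failed, rather than only a single overall score.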
Optimization: balancing architecture, quality, and cost
- Architectural decision-making with model cost as a first-class constraint.
- Multi-dimensional optimization (not just “pick a better model”), using multiple quality levers.
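One way to treat cost as a first-class constraint is to compare candidates on quality and cost together. The sketch below is an assumption-laden illustration: the candidate names, quality scores, and prices are invented for the example and do not come from the session.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    quality: float              # evaluation score in [0, 1]
    usd_per_1k_requests: float  # estimated serving cost


def cheapest_meeting_bar(
    candidates: List[Candidate], min_quality: float
) -> Optional[Candidate]:
    """Pick the lowest-cost candidate whose quality clears the bar."""
    eligible = [c for c in candidates if c.quality >= min_quality]
    return min(eligible, key=lambda c: c.usd_per_1k_requests, default=None)


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate beats on both quality and cost."""
    front = []
    for c in candidates:
        dominated = any(
            o.quality >= c.quality
            and o.usd_per_1k_requests <= c.usd_per_1k_requests
            and (o.quality > c.quality or o.usd_per_1k_requests < c.usd_per_1k_requests)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


# Illustrative numbers only.
CANDIDATES = [
    Candidate("large", 0.92, 12.0),
    Candidate("medium", 0.88, 3.0),
    Candidate("legacy", 0.80, 5.0),
    Candidate("small", 0.75, 0.5),
]

CHOICE = cheapest_meeting_bar(CANDIDATES, min_quality=0.85)
FRONT = [c.name for c in pareto_front(CANDIDATES)]
```

With these made-up numbers, the "legacy" option drops off the front because "medium" is both cheaper and better, which is the kind of trade-off a single quality ranking would hide.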
Quality levers and optimization techniques
- A discussion of different levers teams can use to improve outcomes beyond swapping models.
Fine-tuning and distillation
- Fine-tuning and distillation as techniques to improve policy adherence and overall behavior.
Resources
- Session page: https://aka.ms/build26/BRK230
- Foundry Discord: https://aka.ms/build/foundrydiscord
Speakers
- Yina Arenas
- Naomi Moneypenny