Run AI at scale with Ray + Kubernetes using Anyscale on Azure | ODSP914
Katarina Stanley and Daniel Arrizza present an end-to-end look at running distributed AI workloads with Anyscale on Azure, which is built on Ray and Azure Kubernetes Service (AKS).
Overview
The session shows how to build multimodal data pipelines, train and fine-tune models, and deploy them as inference services without directly managing the underlying infrastructure. It walks through the architecture and includes a live demo of building, training, and serving AI workloads inside an Azure subscription using a Python-native interface.
What the session covers
- Why teams choose to own their AI stack
- Common challenges when scaling AI workloads
- Ray as a distributed compute framework for AI/ML workloads
- Anyscale on Azure as a production-ready platform that runs Ray on AKS
- An end-to-end example: a multimodal e-commerce recommendation engine
- Training, fine-tuning, embeddings, and serving workflows using Ray
Speakers
- Katarina Stanley
- Daniel Arrizza
Chapters (from the video)
- 00:00:00 - Introduction and Session Overview by Katarina Stanley
- 00:00:27 - Why Teams Choose to Own Their AI Stack
- 00:01:31 - Challenges of Scaling AI Workloads
- 00:02:17 - Introduction to Ray as a Distributed Compute Framework
- 00:02:43 - Overview of Anyscale on Azure and Its Core Features
- 00:03:38 - Daniel Arrizza Demonstrates Setting Up Anyscale Cloud on Azure
- 00:06:02 - Exploring Workspaces, Jobs, and Services in Anyscale Console
- 00:10:00 - Use Case: Building a Multimodal E-commerce Recommendation Engine
- 00:13:01 - Fine-Tuning, Training, Embedding, and Serving with Ray
- 00:22:14 - Conclusion and Next Steps with QR Codes and Build Conference Info