Brendan Burns explains how Microsoft and Anyscale are collaborating to bring managed Ray to Azure, empowering developers to scale distributed AI and ML workloads seamlessly with Python and Kubernetes.

Powering Distributed AI/ML at Scale with Azure and Anyscale

Author: Brendan Burns

Scaling artificial intelligence (AI) and machine learning (ML) workloads from experimentation to production is complex. Microsoft and Anyscale have partnered to bring Anyscale’s managed Ray service to Azure in private preview, simplifying distributed Python workloads with an enterprise-ready, cloud-native experience.

Ray: Distributed Computing Framework for Python

Ray is an open-source framework designed to make distributed computing accessible to Python developers. Developers can:

Scale applications from a single laptop to large clusters with minimal code changes
Leverage Pythonic APIs that transform functions and classes into distributed tasks and actors
Use native libraries for:
- Ray Train (distributed training)
- Ray Data (data processing)
- Ray Serve (model serving)
- Ray Tune (hyperparameter tuning)
Integrate with PyTorch, TensorFlow, and other ML libraries

Ray abstracts distributed infrastructure concerns, enabling teams to focus on model development and innovation.

Anyscale: Enterprise-Grade Ray on Azure

Anyscale, the company founded by Ray’s creators, delivers a managed Ray service—now tightly integrated with Azure. This new service introduces RayTurbo, a performance-focused Ray runtime. Key capabilities on Azure include:

Instantly spinning Ray clusters from the Azure Portal or CLI (no Kubernetes expertise needed)
Dynamic allocation of tasks across CPUs, GPUs, and mixed clusters
Elastic scaling and support for Azure spot VMs
Production reliability: automatic fault recovery, zero-downtime upgrades, built-in observability
Data locality and governance: Clusters run inside your Azure subscription with unified billing and compliance

Azure Kubernetes Service (AKS): Foundation for Distributed AI/ML

AKS provides:

Dynamic resource orchestration (scaling clusters as demand shifts)
High availability (self-healing and failover)
Elastic scaling for clusters of any size
Native integration with Azure services like Azure Monitor, Microsoft Entra ID, and Blob Storage
Enterprise governance, security, and compliance

AKS supports Ray and Anyscale at every scale, from development clusters to global, mission-critical deployments.

Empowering Teams: AI and ML Innovation at Scale

Microsoft and Anyscale’s partnership enables organizations to:

Easily move from prototype to production for AI/ML
Adopt distributed Python computing without deep infrastructure expertise
Integrate Ray’s open ecosystem with Azure-native tools and governance
Start small for experimentation and scale to full enterprise deployments

Developers benefit from more choice, flexibility, and focus—with less operational effort and complexity. Further information about preview access and Azure Marketplace availability is provided in the article.

Learn More

This post appeared first on “Microsoft All Things Azure Blog”. Read the entire article here