Omar Khan explores the expanded collaboration between Microsoft and NVIDIA, describing new Azure AI Foundry services, advanced GPU integration, and key hybrid cloud AI orchestration innovations aimed at enterprise AI scalability and performance.

Microsoft and NVIDIA Announce Major AI Advancements for Azure AI and Edge

Author: Omar Khan

Microsoft has announced significant enhancements to its AI platform through an expanded partnership with NVIDIA, introducing enterprise-grade AI solutions built for scalability, performance, and flexibility across hybrid and edge environments.

Azure AI Foundry: Enabling Scalable Enterprise AI

Azure AI Foundry provides a robust platform for organizations to build, deploy, and scale AI applications and agents, leveraging NVIDIA’s advanced GPU technologies.
New NVIDIA Nemotron and Cosmos models are available in Azure AI Foundry, supporting multimodal reasoning, coding, scientific analysis, robotics planning, and document intelligence.
Microsoft TRELLIS, available via Microsoft Research, uses generative AI to create accurate 3D assets for digital twins, immersive retail and gaming experiences, and simulation workflows.

Expanded GPU Support on Azure Local

Azure Local now features NVIDIA RTX PRO 6000 Blackwell Server Edition GPU support, enabling distributed AI and visual compute workloads in hybrid, edge, and sovereign environments.
This supports edge use cases such as healthcare diagnostics, public safety video analytics, predictive maintenance, and secure low-latency inferencing in disconnected or regulated environments.
Azure Arc integration lets customers manage on-premises AI workloads with the simplicity of the cloud.
Supported hardware includes Dell AX-770, HPE ProLiant DL380 Gen12, and Lenovo ThinkAgile MX650a V4.

Key Edge and Hybrid AI Innovations

Edge Retrieval Augmented Generation (RAG) empowers sovereign AI deployments with secure, scalable inferencing on local data.
Azure AI Video Indexer (enabled by Azure Arc) allows for real-time and recorded video analytics at the edge.
Virtual desktop and AI-enhanced graphical capabilities are bolstered by NVIDIA vGPU and Multi-Instance GPU (MIG) features.

Enterprise GPU Orchestration with NVIDIA Run:ai on Azure

NVIDIA Run:ai is now integrated with Azure, allowing enterprises to dynamically allocate and manage GPU resources across Azure NC- and ND-series VMs, AKS, and Azure AI Foundry environments.
Benefits include one-click job submission, automated queueing, and integrated governance.
Orchestration tools help teams efficiently share and prioritize GPU resources, reducing operational friction and driving AI innovation.
Vertical use cases cover medical image analysis, financial modeling, manufacturing quality control, and personalized retail (recommendation engines).
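
The queueing and prioritization idea described above can be illustrated with a minimal sketch. This is not the NVIDIA Run:ai API: the `GpuQueue` class, its methods, and the job names are all hypothetical, and the policy (strict priority order over a fixed GPU pool, stopping at the first job that does not fit) is an assumption chosen for brevity.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Job:
    priority: int                      # lower value is scheduled first
    seq: int                           # tie-breaker: submission order
    name: str = field(compare=False)
    gpus: int = field(compare=False)


class GpuQueue:
    """Toy scheduler (hypothetical): a fixed GPU pool shared by prioritized jobs."""

    def __init__(self, total_gpus: int) -> None:
        self.free = total_gpus
        self._heap = []
        self._seq = count()

    def submit(self, name: str, gpus: int, priority: int) -> None:
        """Queue a job; nothing starts until schedule() is called."""
        heapq.heappush(self._heap, Job(priority, next(self._seq), name, gpus))

    def schedule(self) -> list:
        """Start queued jobs in priority order; stop at the first job that does not fit."""
        started = []
        while self._heap and self._heap[0].gpus <= self.free:
            job = heapq.heappop(self._heap)
            self.free -= job.gpus
            started.append(job.name)
        return started

    def release(self, gpus: int) -> None:
        """Return GPUs to the pool when a job finishes."""
        self.free += gpus
```

With an 8-GPU pool, a 6-GPU training job at priority 0 and a 2-GPU evaluation job at priority 1 would start together, while a 4-GPU job at priority 2 waits until GPUs are released. A real orchestrator would add quotas, preemption, and fairness across teams, none of which are modeled here.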

Cutting-Edge AI Infrastructure: NVIDIA GB300 NVL72 Supercomputing Cluster

Microsoft is the first to deploy an NVIDIA GB300 NVL72 cluster at scale on Azure, with more than 4,600 NVIDIA Blackwell Ultra GPUs. Each NVL72 rack pairs 72 GPUs with 36 Grace CPUs, and the system is designed for demanding AI workloads.
The cluster is enhanced with Azure Boost and integrated hardware security modules for strong security and high I/O performance.
This infrastructure supports the training and deployment of frontier and multimodal AI models at unprecedented speeds and efficiency.
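
As a rough sense of scale, the figures above can be turned into a back-of-the-envelope rack estimate. The sketch assumes 72 GPUs per rack (as the NVL72 name indicates) and treats 4,600 as a lower bound; the resulting rack and CPU counts are estimates derived here, not figures stated in the announcement.

```python
import math

GPUS_PER_RACK = 72    # assumption: NVL72 denotes 72 GPUs per rack
CPUS_PER_RACK = 36    # Grace CPUs per rack, as stated in the article
CLUSTER_GPUS = 4600   # lower bound; the article says "4600+"


def racks_needed(total_gpus: int, gpus_per_rack: int = GPUS_PER_RACK) -> int:
    """Smallest number of full racks that can hold total_gpus."""
    return math.ceil(total_gpus / gpus_per_rack)


racks = racks_needed(CLUSTER_GPUS)
grace_cpus = racks * CPUS_PER_RACK
print(racks)       # 64
print(grace_cpus)  # 2304
```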

High-Performance Inference with ND GB200-v6 VMs and NVIDIA Dynamo

Microsoft and NVIDIA have optimized stack performance for large-scale inference workloads, combining ND GB200-v6 VMs, NVIDIA Dynamo, and AKS.
The stack demonstrated production performance on the gpt-oss 120b model, with throughput of up to 1.2 million tokens per second.
The Dynamo framework offers distributed inference, LLM-aware routing, and advanced caching to improve efficiency on Blackwell GPUs.
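
To make "LLM-aware routing" with caching more concrete, here is a minimal sketch of one possible policy: send each request to the worker that already holds the longest matching token prefix in its cache, and break ties by current load. This is an illustration only, not NVIDIA Dynamo's API or its actual algorithm; `Worker`, `route`, and the other names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    active_requests: int = 0
    cached_prefixes: list = field(default_factory=list)  # tuples of token ids


def shared_prefix_len(a, b) -> int:
    """Number of leading tokens that two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def best_cache_hit(worker: Worker, tokens) -> int:
    """Longest prefix of `tokens` already present in the worker's cache."""
    return max((shared_prefix_len(p, tokens) for p in worker.cached_prefixes), default=0)


def route(workers, tokens) -> Worker:
    """Prefer the longest cached prefix; break ties by fewest active requests."""
    chosen = max(workers, key=lambda w: (best_cache_hit(w, tokens), -w.active_requests))
    chosen.active_requests += 1
    chosen.cached_prefixes.append(tuple(tokens))  # the request's prefix is now cached there
    return chosen
```

Reusing a cached prefix lets a worker skip recomputing attention state for tokens it has already processed, which is why a busier worker with a cache hit can still be the better choice. A production router would also weigh queue depth and cache eviction, which this sketch ignores.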

Learn More

These developments signal a major leap in making advanced, scalable AI accessible to organizations via Microsoft Azure and NVIDIA’s combined technologies.

This post appeared first on “The Azure Blog”.