Satya Nadella, Microsoft CEO, announces an industry milestone: Azure and NVIDIA achieved 1.1 million tokens/sec on a single rack, setting a new benchmark for AI infrastructure and cloud performance.

Azure and NVIDIA Set Industry Record: 1.1M Tokens/sec on ND GB300 v Rack

Author: Satya Nadella (announced via LinkedIn)

Microsoft Azure and NVIDIA have reached a significant milestone: 1.1 million tokens per second from a single rack of ND GB300 v GPUs in the Azure fleet. The result is described as an industry record and credited to a sustained engineering partnership between the two companies.
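To put the rack-level figure in perspective, it can be broken down into an average per-GPU rate. The sketch below is only illustrative: the announcement does not state how many GPUs the rack contains, so the count of 72 is an assumption (a common configuration for this class of rack), and the helper name `per_gpu_throughput` is hypothetical.

```python
# Illustrative arithmetic only; not taken from the announcement itself.
RACK_TOKENS_PER_SEC = 1_100_000  # rack-level throughput quoted in the announcement
ASSUMED_GPUS_PER_RACK = 72       # ASSUMPTION: GPU count is not given in the announcement


def per_gpu_throughput(rack_tokens_per_sec: float, gpus_per_rack: int) -> float:
    """Return the average tokens/sec per GPU for a rack-level throughput figure."""
    if gpus_per_rack <= 0:
        raise ValueError("gpus_per_rack must be positive")
    return rack_tokens_per_sec / gpus_per_rack


if __name__ == "__main__":
    rate = per_gpu_throughput(RACK_TOKENS_PER_SEC, ASSUMED_GPUS_PER_RACK)
    print(f"Average per-GPU throughput: {rate:,.0f} tokens/sec")
```

Under that assumption the average works out to roughly 15,000 tokens/sec per GPU; a different GPU count would change the figure proportionally.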

Key Highlights

  • Industry Record Performance:
    • Achieved 1.1 million tokens/sec throughput with Azure ND GB300 v GPUs, demonstrating extremely high compute density and AI throughput.
    • Enables new possibilities for running large AI and language models at production scale.
  • Hardware-Software Collaboration:
    • The accomplishment is a joint effort, leveraging NVIDIA’s specialized hardware and Microsoft’s extensive experience in running large-scale, cloud-based AI infrastructure.
    • The boundary between model optimization and infrastructure design continues to blur, enabling more efficient deployment of advanced AI workloads.
  • AI Infrastructure Innovation:
    • This milestone signals a new era in AI performance engineering, where compute density, memory hierarchy, and innovative networking fabric jointly support the demands of large-scale AI training and inference.
    • Commenters from several organizations noted the technological implications, describing the result as pushing the limits of high-performance computing (HPC) in cloud environments.

Community Acknowledgments

  • Industry professionals and AI engineers commented that this is more than a performance statistic: it reflects a leap in infrastructure, governance, and operational trust for AI at scale.
  • Governance, traceability, and responsible AI practices are identified as the next critical frontiers as technical throughput reaches unprecedented levels.
  • CTOs and partner organizations cite this as proof of hardware-software synergy and a blueprint for future AI-driven cloud solutions.

The announcement marks a new performance ceiling for cloud-based AI platforms and provides a foundation for future research, enterprise AI deployments, and responsible, scalable compute engineering.

This post appeared first on “Microsoft News”.