Build AI across client and cloud with AMD ROCm and Microsoft | BRKSP93
Anush Elangovan explains how AMD ROCm and Microsoft enable building and optimizing AI workloads across client devices, cloud, and on-prem environments, with an emphasis on portability and performance across different AMD hardware targets.
Overview
As AI moves from experimentation to production, the session focuses on practical approaches for building, testing, and optimizing AI workloads across multiple deployment targets (client, cloud, and on-prem).
Key themes covered in the session description and chapter outline include:
- Using AMD ROCm as a common AI software foundation across:
  - Radeon
  - Ryzen AI
  - AMD Instinct
- Improving portability with fewer code changes via integrations with:
  - PyTorch
  - ONNX
  - MIGraphX
- Optimizing performance per target while keeping a consistent software layer.
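To make the portability theme concrete, here is a minimal sketch of how one script could pick an inference backend per target without other code changes. It is an illustration, not something shown in the session. The helper `select_providers` and the constant `PREFERRED_PROVIDERS` are hypothetical names. The provider strings are ones ONNX Runtime has used for AMD GPUs and CPU fallback; which of them a given ONNX Runtime build actually ships is an assumption to verify on the target machine.

```python
# Hypothetical helper (not from the session): choose ONNX Runtime execution
# providers in preference order, falling back to CPU when no GPU backend exists.

# Assumed preference order; adjust per deployment target.
PREFERRED_PROVIDERS = [
    "MIGraphXExecutionProvider",  # AMD MIGraphX graph compiler backend
    "ROCMExecutionProvider",      # ROCm backend, present only in some builds
    "CPUExecutionProvider",       # portable fallback
]


def select_providers(available, preferred=PREFERRED_PROVIDERS):
    """Return the preferred providers that are actually available, in order."""
    chosen = [name for name in preferred if name in available]
    if not chosen:
        raise RuntimeError("none of the preferred execution providers is available")
    return chosen


if __name__ == "__main__":
    # Example with a hand-written availability list (no ONNX Runtime required).
    print(select_providers(["CPUExecutionProvider", "MIGraphXExecutionProvider"]))
```

With ONNX Runtime installed, the `available` list could come from `onnxruntime.get_available_providers()`, and the result could be passed as the `providers` argument of `onnxruntime.InferenceSession`. The same script then runs on a machine with or without an AMD GPU backend.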
Session segments (from the published chapters)
Execution velocity and intent-driven development
- Focus on execution velocity as a key advantage.
- Shifting value from syntax implementation to intent-driven execution.
- Parallel development and accelerated feedback loops.
Unified software layer and AMD platform capabilities
- Building a unified and pervasive software layer.
- Enhanced model support and memory capabilities on AMD platforms.
Client application architecture and proxy workflow
- Explanation of a client application architecture and a proxy-based workflow.
Model routing for safety and privacy
- Description of loaded models used for:
  - Domain classification
  - Jailbreak detection
  - PII detection
- Handling private data by routing to a local model for security.
- Configuration flexibility for routing queries between models (local vs other targets).
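The routing idea above can be sketched as a small decision function. This is a toy illustration under stated assumptions, not the implementation shown in the session. The regular expressions and keyword lists stand in for the real domain, jailbreak, and PII models. All names (`RoutingConfig`, `route`, the target labels `"local"`, `"cloud"`, `"blocked"`) are hypothetical, and blocking a detected jailbreak attempt is an assumed policy.

```python
import re
from dataclasses import dataclass, field

# Toy stand-ins for the classifier models; real detectors would be ML models.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
JAILBREAK_PHRASES = ("ignore previous instructions", "ignore all previous instructions")
CODE_KEYWORDS = ("python", "compile", "function")


@dataclass
class RoutingConfig:
    """Hypothetical configuration: which target handles which kind of query."""
    private_target: str = "local"    # where queries containing PII go
    default_target: str = "cloud"    # where everything else goes
    domain_targets: dict = field(default_factory=dict)  # per-domain overrides


def contains_pii(text):
    return bool(EMAIL_RE.search(text) or PHONE_RE.search(text))


def looks_like_jailbreak(text):
    lowered = text.lower()
    return any(phrase in lowered for phrase in JAILBREAK_PHRASES)


def classify_domain(text):
    lowered = text.lower()
    return "code" if any(word in lowered for word in CODE_KEYWORDS) else "general"


def route(query, config):
    """Return the target label for a query: 'blocked' or a configured target."""
    if looks_like_jailbreak(query):
        return "blocked"                  # assumed policy, not from the session
    if contains_pii(query):
        return config.private_target      # keep private data on the local model
    return config.domain_targets.get(classify_domain(query), config.default_target)


if __name__ == "__main__":
    cfg = RoutingConfig(domain_targets={"code": "local"})
    for q in ("Email me at jane@example.com", "What is the capital of France?"):
        print(q, "->", route(q, cfg))
```

Keeping the policy in a config object rather than in the function body mirrors the configuration flexibility described above: changing `domain_targets` or `private_target` redirects queries without touching the routing logic.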