Content by cormacgarvey (3)
Cormac Garvey examines the deployment and benchmarking of the DeepSeek R1 reasoning model on Azure ND H100 v5 nodes using vLLM, providing practical insights into infrastructure demands and performance.
Cormac Garvey evaluates the inference performance and cost-efficiency of Llama 3.1 8B using vLLM across Azure GPU and CPU virtual machines, offering actionable benchmarks and deployment strategies for enterprise AI workloads.
Cormac Garvey offers an in-depth benchmarking study of Llama 3.1 8B model inference using vLLM on Azure ND H100 v5 GPUs. The article explains the stages of AI inference and key optimization techniques, and reports throughput and latency results along with deployment tips for enterprise AI practitioners.