Browse Machine Learning Community (36)
Michael Flanakin summarizes FinOps toolkit 14, including a Copilot Studio agent template for querying FinOps hub data with KQL, a new recommendations pipeline that ingests Azure Advisor and Resource Graph results, a simplified hub deployment UI, and a preview dataset for commitment discount eligibility.
FaizaanMerchant explains a Zero Trust network design for Azure Databricks that avoids public workspace exposure by fronting external access with Azure Application Gateway WAF and routing traffic to the workspace through Private Endpoints, while keeping internal access on private connectivity (VPN/ExpressRoute).
robece announces General Availability of Stripe as a partner event source for Azure Event Grid, and outlines how to route Stripe events into Azure services (Functions, Logic Apps, Event Hubs, Service Bus) and Microsoft Fabric Eventstream for real-time processing and analytics.
mscagliola shows how to use GitHub Copilot skills for spec-driven development, turning a Medallion Architecture blog post into a repeatable repo that generates Terraform for Azure platform setup and Databricks bundle files for workloads, while enforcing strict placeholder/TODO rules to avoid invented environment values.
pauledwards explains how to cut “model weight pre-flight” time on multi-node Azure GPU clusters by sharding downloads from Azure storage and broadcasting the remaining data over InfiniBand using MPI, with practical launch patterns for both Slurm and AKS.
KonstantinaF outlines a practical, phased disaster recovery strategy for Azure Databricks, focused on cross-region resilience for lakehouse workloads. The post explains RTO/RPO trade-offs, compares active-active vs warm standby patterns, and details how to replicate Unity Catalog metadata and Delta data using IaC, CI/CD, and repeatable DR pipelines.
Amit Damle and RK Iyer describe a “Discovery” utility for Azure Databricks that inventories workspace assets into Unity Catalog-backed Delta tables and a Lakeview dashboard, helping platform teams quickly understand clusters, jobs, warehouses, pipelines, security settings, and DBU usage.
sameeraman explains how Microsoft Discovery can automate a scientific simulation workflow using a coordinated set of AI agents, reducing manual scripting and job monitoring while keeping scientific decision-making with researchers.
PrabalDeb lays out a practical reference architecture for running diffusion model workloads on Azure Kubernetes Service (AKS), focusing on GPU/CPU lane separation, dispatch and autoscaling options (Kubernetes-native vs Service Bus + KEDA), secure ingress and identity, durable storage for outputs and model caches, and end-to-end observability for both apps and GPU hardware.
Parvathy_R_Pillai compares traditional ML pipelines with Azure AI Foundry, focusing on the shift from model-centric delivery to operating end-to-end AI applications (including agents) with built-in governance, evaluation, and observability for production use.
kmalkov shares a real-world fintech lending ML decisioning workload evaluated using Microsoft’s Analog Optical Computer (AOC) digital twin on Azure, focusing on production-scale volumes, weighted ensemble models, and end-to-end explainability and auditability for credit, affordability, and risk decisions.
PeterTHLee shares a validated Azure reference architecture for drone-based industrial inspections that combines deterministic computer vision with Azure OpenAI reasoning. The post breaks down an event-driven pipeline (Blob Storage → Functions → Vision/AML → OpenAI → Foundry evaluation → Cosmos DB → Power BI) and calls out security controls needed for production use.
Subhajit1994 breaks down the real design choices behind a Bronze/Silver/Gold medallion framework, focusing on where responsibilities should live (staging, cleaning, modeling, marts), and how to make decisions around load patterns, orchestration, retries, observability, schema evolution, and replayability.
ankitasarkar explains why a pure RAG approach can produce inconsistent or logically wrong matches in enterprise document mapping, and how adding a knowledge-graph layer to constrain retrieval improves consistency, relevance, and explainability.
GalimahB shares a Microsoft Build //local host kit overview, listing breakout sessions and hands-on labs you can run in your city—covering GitHub Copilot agentic workflows, Microsoft Foundry (agents, models, evals), and Azure topics like Container Apps, AKS, databases, and Cobalt VMs.
Moaz_Mirza outlines a reference architecture for “agentic” data governance across hybrid/multi-cloud estates using Azure Arc, Microsoft Purview, and Microsoft Fabric, with a Copilot-style agent (via Power Platform/Teams) that reports on compliance and can enforce selected controls through Azure Functions and policy-driven actions.
NaufalPrawironegoro walks through setting up Microsoft Fabric Operations Agent end-to-end: capacity and Eventhouse prerequisites, enabling the preview in the Admin Portal, wiring a KQL database as a knowledge source, and triggering Power Automate actions via Teams when conditions (like failed pipeline runs) are detected.
In this community deep dive, junjieli walks through the GA release of Microsoft Foundry Toolkit for Visual Studio Code—covering model experimentation, agent development (no-code and code-first), evaluations, deployment to Microsoft Foundry Agent Service, and workflows for converting, profiling, and fine-tuning local models on Windows.
Gapandey lays out a practical, end-to-end MLOps template on Azure: train a scikit-learn model from data in Azure Blob Storage, package it as a self-contained pickle bundle, register it in an Azure ML Registry with auto-versioning, and deploy it to an Azure ML Managed Online Endpoint via an Azure DevOps multi-stage pipeline.
AnjaliSadhukhan argues that AI agents fail on enterprise questions mainly due to fragmented data and missing semantics, and outlines how Microsoft Fabric (OneLake, semantic models, Data Agents) and Azure AI Foundry can work together to provide governed, agent-ready access to business data.
ShivaniThadiyan explains how Azure SQL Managed Instance is evolving from a SQL Server-compatible PaaS into an AI-enabled platform, covering built-in operational intelligence, vector search, in-database Python/R machine learning, and Copilot-assisted diagnostics with security and governance considerations.
Vaibhav Pandey shares a production-oriented “Bring Your Own Model” (BYOM) pattern for Azure AI applications, showing how to package, register, and deploy a custom model on Azure Machine Learning with secure identity, networking, and scalable managed endpoints.
In this post, robece explains how to route Stripe events into Azure Event Grid to build scalable, real-time payment workflows, and how to extend those streams into Microsoft Fabric Real-Time Intelligence for live analytics.
ashish-chhabria argues that Azure Event Hubs is the practical default for Kafka-style streaming on Azure, focusing on its Kafka-compatible endpoint, managed scaling, tier capabilities (Standard/Premium/Dedicated), and integrations like Capture to Azure Data Lake Storage and streaming into Microsoft Fabric for real-time analytics.
Connected-Seth shares March 2026 updates for Azure Event Grid MQTT Broker, covering protocol support (MQTT v3.1.1/v5, HTTP publish), security options (Entra ID/OAuth JWT, X.509, webhook auth, TLS 1.2+), scaling characteristics, and native routing into Azure services like Fabric Eventstreams, Azure Data Explorer, Event Hubs, Functions, and Logic Apps.
AnaviNahar walks through a near-real-time ingestion and transformation setup on Azure Databricks using Lakeflow (Connect, Spark Declarative Pipelines, and Jobs), covering CDC from SQL Server, streaming telemetry ingestion, Bronze/Silver/Gold modeling, Unity Catalog governance, and monitoring via system tables.
AbhishekTiwari (with Azure Networking leaders) explains how Azure Front Door improved recovery time objectives by hardening its local configuration cache, avoiding fleet-wide rebuilds, and introducing ML-driven lazy loading so recovery scales with active traffic rather than total tenants.
Coryskimming from Microsoft introduces the packed line-up for Azure at KubeCon Europe 2026, spotlighting hands-on AKS labs, AI/ML workload sessions, security, cloud-native DevOps practices, and open-source solutions from Microsoft's top engineers.
damocelj offers a practical walkthrough on securely deploying LLM inferencing with vLLM and NVIDIA NIM microservices in air-gapped Azure Kubernetes Service clusters, tackling network isolation, GPU configuration, and model artifact challenges.
bobmital shares a hands-on playbook for optimizing enterprise LLM inference on Azure, guiding technical teams through architecture, hardware selection, quantization, and model serving best practices across AKS, Ray Serve, and vLLM.
bobmital examines the architectural and economic challenges of large language model inference at enterprise scale, with a focus on Azure and Anyscale’s Ray integration for distributed AI workloads.
bobmital examines the unique challenges of enterprise-scale LLM inference, focusing on the interplay of accuracy, latency, and cost in Azure deployments using Anyscale Ray and AKS. This article provides actionable insights for architects and engineers deploying AI workloads in the cloud.
AnaviNahar introduces Azure Databricks Lakebase, now generally available, highlighting its serverless architecture and AI-native features for building real-time, intelligent applications on Azure.
bobmital presents a comprehensive and practical guide for deploying and optimizing large language model inference on Azure Kubernetes Service, focusing on engineering tradeoffs, GPU efficiency strategies, open-source model evaluation, and robust enterprise security architecture.
Chunlong Yu and co-authors present GenRec Direct Learning (DirL), a Microsoft-driven approach that transforms traditional ranking pipelines by leveraging end-to-end token-native sequence modeling, with experiments and production deployment on Azure Machine Learning.
Yongguang Zhang presents an in-depth view of Microsoft’s AI-powered RAN and intelligent edge strategy, showing how AI, Azure, and advanced platforms are set to revolutionize the future of telecom networks through automation, edge intelligence, and innovative new services.