Browse DevOps Community (179)

mosiddi explains how the Agent Governance Toolkit (AGT) tackles post-hoc accountability for autonomous agents: proving who authorized an action, what scope was delegated across multi-agent chains, and whether audit evidence was tampered with, using cryptographic identities, signed delegation links, and append-only audit logs.
SurenderSinghMalik breaks down recent Azure App Service (Linux) changes that make Python deployments faster and more reliable for AI-heavy workloads, including new compression and packaging defaults, fewer expensive file operations, and client-side improvements that reduce transient deployment failures.
EfratNauerman announces a public preview update for the Azure Copilot Observability Agent in Azure Monitor, focused on using chat-driven investigations and exploration to speed up triage and root-cause analysis across logs, metrics, traces, and alerts in distributed systems.
TulikaC introduces Platform Release Channel for Azure App Service for Linux, a setting that lets teams control how quickly runtime patch updates are applied so they can balance security updates with validation time in production.
RavinderGupta outlines a “self-healing” CI/CD pattern where an agent observes Azure DevOps pipeline failures, uses Azure OpenAI (via Microsoft AI Foundry) to analyze build logs, and then proposes or applies fixes, such as updating Terraform for Azure Internal Load Balancer configuration, by opening a pull request for review.
Jingwei Wang introduces “Open in VS Code” from Azure Copilot in the Azure Portal, a guided workflow that takes AI-generated Terraform configurations into an Azure-hosted VS Code environment so teams can validate, configure state backends, and deploy to Azure with fewer handoffs.
kinfey explains why AI agents running model-generated code need stronger isolation than standard containers, then walks through deploying a GitHub Copilot SDK agent on AKS using Kata Containers (kata-vm-isolation) plus layered hardening like seccomp, NetworkPolicy egress allowlists, and deny-by-default tool permissions.
mohit-kanojia explains what AKS Arc is and how Azure Arc extends Azure’s control plane to run and manage Kubernetes on-premises, at the edge, and in multicloud. The post covers core components (Arc agents, custom locations, logical networks), a CLI-driven deployment flow, and practical networking and troubleshooting guidance.
Alex-wdy explains why Azure CLI on macOS is moving away from Homebrew Core and introducing new Preview installation options in Azure CLI 2.86.0, including a Homebrew Cask package and an offline tarball for restricted environments, with a focus on signed, notarized binaries and future enterprise authentication needs.
osmancokakoglu announces the winners of the AI Dev Days Hackathon and summarizes the projects and the Microsoft stack they used, including Azure AI Foundry, Azure OpenAI models, and the Microsoft Agent Framework, plus common Azure services and DevOps practices used to ship production-grade agentic apps.
robece announces General Availability of Stripe as a partner event source for Azure Event Grid, and outlines how to route Stripe events into Azure services (Functions, Logic Apps, Event Hubs, Service Bus) and Microsoft Fabric Eventstream for real-time processing and analytics.
Paulams732 describes a reusable Azure DevOps YAML pipeline template for scaling GitHub Advanced Security across many repositories by detecting repo contents, running CodeQL only when relevant, and adding IaC scanning with centralized reporting and SARIF artifacts.
SagarPatra explains how enterprise QA teams can use GitHub Copilot to reduce the mechanical overhead of writing and maintaining automated tests, while keeping trust through human review, governance, and intentional test design that supports reliable regression cycles.
ranjan_ashish explains why Azure Resource Manager deployments can fail with the DeploymentQuotaExceeded error once a resource group reaches its limit of 800 deployments in the deployment history, especially in high-frequency CI/CD scenarios using Bicep or ARM templates, and outlines practical cleanup and prevention approaches.
Brian Benz summarizes the first independent security audit of Inspektor Gadget, an eBPF-based Kubernetes observability and Linux host inspection tool, including the vulnerabilities found, the fixes shipped in recent releases, and practical hardening recommendations for teams running it in production.
shwetayadav explains how index-based Terraform for_each keys can trigger destructive disk churn on Azure, and shows a safer migration approach using stable keys plus terraform state mv, with a reusable GitHub Copilot skill to generate deterministic state-move commands.
mscagliola shows how to use GitHub Copilot skills for spec-driven development, turning a Medallion Architecture blog post into a repeatable repo that generates Terraform for Azure platform setup and Databricks bundle files for workloads, while enforcing strict placeholder/TODO rules to avoid invented environment values.
SagarPatra explains how their QA team used GitHub Copilot as a practical assistant for test design, automation scaffolding, and maintenance work, while keeping human review and responsible AI practices non-negotiable.
Steven Bucher announces the public preview of the Azure Resource Manager MCP Server, a remote MCP server that lets AI agents query and operate on Azure resources via Azure Resource Manager and Azure Resource Graph, including generating KQL queries from natural language and deploying ARM templates from within VS Code.
hcamposu introduces the Logic Apps Migration Agent, an open-source, AI-assisted approach for migrating BizTalk Server (and other integration platforms) to Azure Logic Apps Standard, with a structured workflow, human review checkpoints, and a code-first experience via VS Code and GitHub Copilot.
KimVaddi lays out a reference architecture for governing “agent sprawl” with a multi-region AI agent landing zone on Azure, using layered control planes to enforce policy, safety, evaluation, and observability across agents, models, and tools.
Akshita Bajpai explains how a validation-driven Terraform approach made Azure Functions deployments more predictable by shifting common failures from terraform apply to PR and plan stages, using PR checks, Azure pre-flight checks, and Terraform-native validations with clearer, actionable errors.
lapadman lays out a practical phased-parallel cutover approach for enterprise Azure PaaS migrations, with a focus on keeping downtime near zero while avoiding message loss and split-brain scenarios. It covers traffic shifting with Azure Front Door, Service Bus relay patterns, HA/DR design, observability, and rollback criteria.
pauledwards explains how to cut “model weight pre-flight” time on multi-node Azure GPU clusters by sharding downloads from Azure storage and broadcasting the remaining data over InfiniBand using MPI, with practical launch patterns for both Slurm and AKS.
KonstantinaF outlines a practical, phased disaster recovery strategy for Azure Databricks, focused on cross-region resilience for lakehouse workloads. The post explains RTO/RPO trade-offs, compares active-active vs warm standby patterns, and details how to replicate Unity Catalog metadata and Delta data using IaC, CI/CD, and repeatable DR pipelines.
Amit Damle and RK Iyer describe a “Discovery” utility for Azure Databricks that inventories workspace assets into Unity Catalog-backed Delta tables and a Lakeview dashboard, helping platform teams quickly understand clusters, jobs, warehouses, pipelines, security settings, and DBU usage.
mohashaikh shows how to use GitHub Copilot Spaces plus a dedicated Markdown “engineering knowledge base” repo to make Copilot answer questions and generate code in line with your team’s standards, with optional in-repo instruction files and reusable prompt-file slash commands for consistent reviews.
gurkirat explains how GitHub Copilot can speed up Azure Landing Zone work by shifting engineers from writing Terraform and pipelines by hand to prompting for a structured draft and then reviewing it, with examples spanning management groups, networking, OIDC, GitHub Actions, and policy assignments.
toddysm (with the Azure Container Registry team) explains how ACR Artifact Cache can now use another Azure Container Registry as its upstream, enabling ACR-to-ACR pull-through caching for scenarios like image promotion and hub-and-spoke registry topologies, with managed identity support and clear networking/auth constraints.
Shivani650 shares practical, migration-tested guidance for moving workloads from AWS to Azure, focusing on target architecture, infrastructure as code, networking, observability, and security. The post highlights common pitfalls (like skipping landing zone design) and outlines concrete practices to reduce drift, improve stability, and standardize deployments.
Roslin_Nivetha explains how to deploy Azure Managed HSM, create an HSM-backed encryption key, assign RBAC roles, and configure an Azure resource (like a Storage account) to use customer-managed keys (CMK) via Bicep, including common permission and key-rotation pitfalls.
sutandan explains spec-driven development as a more reliable alternative to the “prompt → retry → guess” loop when using AI coding tools, showing how a lightweight specification (inputs, outputs, constraints, edge cases) can make generated code more consistent for APIs and refactoring tasks.
ranjsharma outlines an approach for validating Azure infrastructure consistency by comparing an Excel “source of truth” against Terraform configuration and the actual deployed resources, producing a drift report that highlights missing resources and mismatched settings like region and SKU.
Devi Priya explains how GitHub Copilot Workspace supports intent-driven, multi-file refactoring across a repository, including a practical walkthrough that modernizes an app’s authentication flow and highlights planning, review, and adoption best practices.
Devi Priya walks through creating an Azure Logic Apps (Standard) project in Visual Studio Code, running and debugging it locally with the required tooling, and then packaging and deploying the workflow to Azure using a YAML-based CI/CD pipeline.
varghesejoji introduces the Application Resilience Framework and a companion tool that turns architecture artifacts into a measurable resilience model. The guide walks through what to import, how to prioritize workflows and failure modes, how to validate mitigations (including chaos tests), and how to map risks to health signals and governance.
Vybava Ramadoss (coauthored with Nerdio) explains how Azure Files can improve Azure Virtual Desktop profile performance and simplify identity at scale, focusing on FSLogix profile containers, Provisioned v2 performance tuning, and Entra ID authentication for SMB access.
ruchitapradhan shares a practical performance-tuning journey for a high-volume Azure workload, showing why “scale out more” can backfire and how controlled concurrency, better Durable Functions orchestration, and proactive SQL maintenance can improve throughput and stability while reducing cost.
Valini Sunthwal describes a practical pattern for running multi-subscription Azure AI infrastructure with drift detection and “self-healing” using Terraform, multi-repo boundaries, and a daily reconciliation pipeline that cross-checks deployment metadata against Terraform state and a central registry.
jiteshhp outlines a practical approach to validating high availability for Azure Kubernetes Service (AKS) in a single region using Availability Zones, focusing on controlled failure simulations (pod, node, zone, network, and dependency) and the metrics and checks needed to confirm real runtime resiliency under traffic.
