Samantha_Fernandez introduces the preview of AI-powered Kubernetes troubleshooting agents in Azure Copilot, offering automated root cause analysis, actionable solutions, and integrated support for AKS clusters.

Azure Copilot Adds Advanced Kubernetes Troubleshooting Agents for AKS

What’s New?

Microsoft is previewing a Kubernetes troubleshooting agent in Azure Copilot. The agent provides a guided experience for detecting, triaging, and resolving issues in Azure Kubernetes Service (AKS) clusters. Using Kubernetes-specific keywords and targeted kubectl commands, it analyzes cluster configuration, events, and resource metrics, and diagnoses problems such as pod failures or scaling bottlenecks. Users receive a root cause analysis and actionable steps directly in Azure Copilot, so they can troubleshoot complex issues on their own.

How It Works

  • The troubleshooting agent automatically investigates AKS cluster issues by executing relevant kubectl commands.
  • It detects errors such as failing or pending pods, problematic cluster events, and abnormal resource utilization.
  • The agent correlates signals across metrics and events, provides clear, step-by-step guidance for remediation, and offers one-click solutions for many common issues.
  • If automated resolution is not possible, Azure Copilot generates a support request with all necessary diagnostics to facilitate assistance from Microsoft Support.
  • This capability is available via Azure Copilot in the Azure Portal.
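
The post does not describe how the agent is implemented, so the following is only a rough Python sketch of the kind of check that "detecting failing or pending pods" implies, not the agent's actual logic. It assumes the input is the parsed output of `kubectl get pods -A -o json`; the function names and the `PROBLEM_REASONS` set are hypothetical choices made for illustration.

```python
import json
import subprocess

# Container waiting reasons that usually signal a problem rather than
# normal startup. This set is an illustrative assumption, not an
# exhaustive or official list.
PROBLEM_REASONS = {"CrashLoopBackOff", "ImagePullBackOff", "ErrImagePull"}


def find_unhealthy_pods(pod_list):
    """Return (namespace, name, reason) tuples for pods that look unhealthy.

    pod_list is the parsed JSON from `kubectl get pods -A -o json`.
    """
    findings = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        ns = meta.get("namespace", "default")
        name = meta.get("name", "")

        # Collect problem reasons reported by individual containers.
        reasons = []
        for cs in status.get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") in PROBLEM_REASONS:
                reasons.append(waiting["reason"])

        if reasons:
            findings.extend((ns, name, r) for r in reasons)
        elif status.get("phase") == "Pending":
            # No container-level reason, but the pod has not been scheduled
            # or started yet.
            findings.append((ns, name, "Pending"))
    return findings


def load_pods_from_cluster():
    """Fetch all pods with kubectl (requires a configured kubeconfig)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-A", "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

Calling `find_unhealthy_pods(load_pods_from_cluster())` against a cluster you have access to yields a short list of pods worth inspecting further. Correlating those findings with events and metrics, as the agent is described as doing, is beyond this sketch.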

Getting Started

  • Admins can request preview access for agents at the tenant level in the Azure Copilot admin center.
  • Once enabled, users will see an Agent mode toggle in the Copilot chat interface.
  • Capacity is limited, so sign up early to participate in the preview.
  • To help shape future agentic cloud operations, join the customer feedback program.

Troubleshooting Sample Prompts

When accessing an AKS cluster resource, click Kubernetes troubleshooting with Copilot to use the agent with specific resources. Example prompts include:

  • “My pod keeps restarting; can you help me figure out why?”
  • “Pods are stuck pending; what is blocking them from being scheduled?”
  • “I am getting ImagePullBackOff; how do I fix this?”
  • “One of my nodes is NotReady; what is causing it?”
  • “My service cannot reach the backend pod; what should I check?”

Ensure agent mode is enabled in the chat window.
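
For comparison, each sample prompt corresponds to checks that can also be run by hand. The Python sketch below maps the five symptoms to standard kubectl commands that are commonly a reasonable starting point. The symptom keys, the `manual_checks` helper, and the choice of commands are assumptions made for illustration; the post does not list which commands the agent itself runs.

```python
# Hypothetical symptom keys mapped to ordinary kubectl commands.
# Placeholders in braces are filled in by manual_checks().
MANUAL_CHECKS = {
    "restarting": [
        "kubectl describe pod {pod} -n {ns}",
        "kubectl logs {pod} -n {ns} --previous",
    ],
    "pending": [
        "kubectl describe pod {pod} -n {ns}",
        "kubectl get events -n {ns} --sort-by=.lastTimestamp",
    ],
    "image_pull": [
        "kubectl describe pod {pod} -n {ns}",
        "kubectl get pod {pod} -n {ns} -o yaml",
    ],
    "node_not_ready": [
        "kubectl get nodes",
        "kubectl describe node {node}",
    ],
    "service_unreachable": [
        "kubectl describe service {svc} -n {ns}",
        "kubectl get endpoints {svc} -n {ns}",
        "kubectl get pods -n {ns} --show-labels",
    ],
}


def manual_checks(symptom, **names):
    """Return the kubectl commands for a symptom with names filled in."""
    return [cmd.format(**names) for cmd in MANUAL_CHECKS[symptom]]


if __name__ == "__main__":
    # Example resource names are made up.
    for line in manual_checks("restarting", pod="api-1", ns="web"):
        print(line)
```

The helper only builds command strings and does not execute anything, which keeps it safe to run without cluster access.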

Author: Samantha_Fernandez

Version 1.0. Updated Nov 18, 2025.

This post appeared first on “Microsoft Tech Community”.