Weekly GitHub Copilot Roundup: Models, Agents, and MCP Ops

Mar 23, 2026 by TechHub

This week's Copilot story is less about one headline feature and more about Copilot settling into three practical layers teams run every day: (1) clearer model choice and governance, (2) agent workflows with the observability and safety controls teams expect, and (3) broader MCP tool access so Copilot can act with real platform context (Azure DevOps, GitHub scanners, Azure resources, Fabric) instead of relying on chat history guesses. Building on last week's themes (auto model selection across IDEs, repo-visible instruction files and hooks, and enterprise observability), this week adds more of the operational layer needed for scale: stable model windows, adjustable validations, and more traceable agent execution.

Models, defaults, and enterprise governance

Copilot's model lineup changed in two ways that matter for operations, and both extend last week's model-management thread (JetBrains Auto Model Selection GA, model attribution, and plan/policy routing).

First, OpenAI's GPT-5.4 mini is rolling out in general availability as a faster agentic coding option aimed at repo exploration and “grep-style” tool workflows. It appears in the model picker across VS Code (chat/ask/edit/agent), Visual Studio, JetBrains, Xcode, Eclipse, github.com, GitHub Mobile, and GitHub CLI; availability depends on paid plans and Business/Enterprise admin policy. GitHub also notes a 0.33x premium request multiplier (subject to change), which can help keep exploration costs lower. That matters more now that auto-routed requests can carry pricing modifiers and teams increasingly treat model choice as a policy and cost-control surface, not just personal preference.

Second, GitHub added an enterprise stability option: long-term support (LTS) for GPT-5.3-Codex on Copilot Business and Enterprise. LTS models remain available for 12 months, and GPT-5.3-Codex is the first, available through 2027-02-04. GPT-5.3-Codex will also become the default base model (replacing GPT-4.1) for orgs that have not explicitly approved or selected alternatives. GitHub notes automatic enablement within 60 days and calls out a base-model date of 2026-05-17, so even teams that never touch the model picker should plan a validation window if they care about style, security patterns, or dependency choices. Together with last week's push to show which model responded, this is about fewer surprises: clearer defaults, longer-lived enterprise options, and better attribution for governance and audit.
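To make the cost effect of the 0.33x multiplier concrete, here is a minimal arithmetic sketch. It assumes each prompt consumes one premium request scaled by the model's multiplier; the workload of 300 prompts and the function name are hypothetical, and the multiplier itself is the figure quoted above, which GitHub says may change.

```python
from decimal import Decimal

# Multiplier quoted in this roundup for GPT-5.4 mini (subject to change).
GPT_5_4_MINI_MULTIPLIER = Decimal("0.33")


def premium_requests_consumed(prompts: int, multiplier: Decimal) -> Decimal:
    """Return premium-request units used by `prompts` prompts at `multiplier`.

    Assumes one prompt = one request scaled by the multiplier.
    """
    if prompts < 0:
        raise ValueError("prompts must be non-negative")
    return prompts * multiplier


if __name__ == "__main__":
    # 300 exploration prompts is a hypothetical workload, not a GitHub figure.
    print(premium_requests_consumed(300, GPT_5_4_MINI_MULTIPLIER))  # 99.00
    print(premium_requests_consumed(300, Decimal("1")))             # 300
```

Under those assumptions, the same exploration workload draws roughly a third of the premium-request budget it would on a 1x model.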

Copilot coding agent: faster starts, smarter search, tighter quality gates, and better traceability

Several updates landed around the Copilot coding agent (the hosted/background agent that is assigned issues, run from Agents, or driven via @copilot on PRs). Last week's focus on autonomy trade-offs (auto-approval switches, IDE autopilot modes) and “run agents safely in real repos” continues here, with speed improvements plus more reviewable, repo-tunable controls and audit hooks.

On performance, GitHub says the agent now starts work ~50% faster, reducing “time to first action” before it begins changing code. On effectiveness, it now uses semantic code search alongside text matching, helping it find code by intent; GitHub says this cut ~2% off task time in internal tests. This pairs with last week's parallelism story (CLI /fleet, multi-agent editor workflows): faster orientation and better code location make delegated tasks easier to use in practice.

The most practical change this week is control and inspectability. Repos can now configure which validation tools the coding agent runs when it changes code. By default it runs tests and lint plus GitHub security/quality checks (CodeQL, Advisory Database, secret scanning, Copilot code review) and tries to fix findings before returning results. Admins can now tune that set in Repository Settings → Copilot → Coding agent, including disabling heavier checks (like slow CodeQL) when they do not match the repo's feedback loop, even though GitHub says these validations are free and on by default. This continues last week's governance theme: as agents act more autonomously, teams want visible “speed vs safety” controls aligned with CI, not buried in prompt habits.

Two related improvements also help with audit and review. Session logs are now clearer and more actionable (explicit setup steps like clone + “agent firewall” start; inline output from copilot-setup-steps.yml; subagent work collapsed by default with an “in progress” indicator). Agent commits also now include an Agent-Logs-Url trailer so reviewers can jump from a commit to the exact session logs that produced it, which helps both review context and later traceability. This extends last week's observability push (VS Code OpenTelemetry, hook troubleshooting, debug snapshots) into Git history.
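As a rough illustration of how the Agent-Logs-Url trailer could feed review tooling, the sketch below pulls a trailer value out of a commit message. Only the trailer name comes from the announcement; the function name, the sample message, and the example.com URL are hypothetical, and the parsing assumes the usual Git convention that trailers are `Key: value` lines in the final paragraph of the message.

```python
from typing import Optional


def find_trailer(commit_message: str, key: str = "Agent-Logs-Url") -> Optional[str]:
    """Return the value of the first `key: value` line in the last paragraph.

    Assumes trailers sit in the final paragraph of the commit message.
    """
    last_paragraph = commit_message.strip().split("\n\n")[-1]
    prefix = key.lower() + ":"
    for line in last_paragraph.splitlines():
        if line.lower().startswith(prefix):
            return line[len(prefix):].strip()
    return None


if __name__ == "__main__":
    # Hypothetical commit message; the URL is a placeholder, not a real log link.
    message = (
        "Fix null check in parser\n"
        "\n"
        "Handle empty input before tokenizing.\n"
        "\n"
        "Agent-Logs-Url: https://example.com/agent-sessions/123\n"
    )
    print(find_trailer(message))
```

The message text for a real commit can be obtained with `git log -1 --format=%B <sha>` and passed to the function.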

VS Code Copilot agents: Autopilot, multi-agent workflows, troubleshooting, and customization management

VS Code's Copilot agent experience kept moving toward longer-running, more autonomous workflows, while adding guardrails and diagnostics for when agents go off track. This continues last week's 1.111 autonomy modes discussion (default approvals, bypass approvals, autopilot) and the parallel effort to make customization debuggable (instruction discovery conventions, /troubleshoot, hook inspection).

In an Insiders walkthrough, VS Code previewed Autopilot mode and chat UX updates for multi-step sessions: “shimmers” to clarify generation state, collapsed containers to keep long runs readable, and input UI changes that surface agent controls. Autopilot reduces constant confirmations, but adds explicit approval modes, a permissions picker, max-retry limits, and a “task_complete” stop condition to keep unattended runs bounded. That matches last week's framing that autonomy is a posture you choose, not a single switch.

The VS Code 1.112 update video covered details that show up in CLI/background sessions: you can now steer an in-progress run (steer after current tool call, add to queue, or stop-and-send). Startup is also safer with uncommitted changes: the session prompts what to do and shows an expandable file list so you know what will be copied, moved, or skipped. Autopilot extends to CLI sessions via chat.autopilot.enabled, with guidance to use isolated environments like dev containers or Codespaces when bypassing approvals. That echoes last week's “safer automation” and sandboxing focus.

Troubleshooting also got more concrete: a new /troubleshoot command analyzes agent debug logs (with JSONL logging enabled) to explain why instructions or skills are not applied. Logs can now be exported/imported as JSON for team debugging, with a reminder they may contain sensitive content.

Version 1.112 also added image analysis for workspace images, symbol paste that preserves location context (sym: references), better monorepo customization discovery (including parent .github customizations via chat.useCustomizationsInParentRepositories), and a unified “Open Customizations” view to manage agents, skills, instructions, and MCP servers (including disabling servers per session or workspace). This continues last week's “move customization to files” direction: fewer implicit behaviors, more repo-visible configuration, and fewer mysteries when a skill or instruction does not apply.

For teams testing parallel agents, another video showed multiple agent sessions side-by-side as separate workstreams (feature, storage wiring, docs) so you are not blocked on one tool run. It complements last week's CLI /fleet story by showing parallelization in-editor as a practical way to deal with tool latency.

MCP and plugins: giving Copilot real tools (Azure DevOps, GitHub secret scanning, Azure resources, Fabric)

MCP keeps becoming the plumbing that connects Copilot to real systems in a way teams can deploy and govern. After last week's MCP momentum (bridging chat to permissioned tool calls) and the reminder that auto-approve is a policy choice, this week's emphasis is moving from local experiments to operational deployments: hosted endpoints, security scanning in the inner loop, and platform skills that need repeatable setup.

For Azure DevOps, Microsoft announced a public preview Remote Azure DevOps MCP Server: a hosted MCP endpoint in Azure DevOps that uses streamable HTTP transport and removes the need to run a local server process. Setup is essentially an org-scoped URL in MCP config (for example, https://mcp.dev.azure.com/{organization}), but the prerequisite matters: the org must be Entra-backed (not MSA-only). Today, it works without extra onboarding in Visual Studio + Copilot and VS Code + Copilot; other clients (Copilot CLI, Claude Desktop/Code, ChatGPT) depend on upcoming Entra OAuth dynamic registration. Microsoft also signaled that local-server support remains “for now,” but the repo is expected to be archived once remote reaches GA. In practice, this is the step that makes MCP easier to run in an enterprise: fewer local daemons, more centrally managed auth and endpoints.

On GitHub's MCP side, a public preview lets AI coding agents trigger GitHub secret scanning via the GitHub MCP Server to catch leaked credentials in uncommitted changes (pre-commit/pre-PR). It requires GitHub Secret Protection. Setup differs by host: in Copilot CLI you install the Advanced Security plugin and add the MCP tool (run_secret_scanning); in VS Code you install the advanced-security agent plugin and run /secret-scanning from Copilot Chat. The key shift is bringing secret detection into the inner loop while agents generate or refactor config, scripts, and sample code, which is where secrets often slip in. That reinforces last week's defense-in-depth guidance: wire scanners in as tools instead of assuming the model will remember.

For Azure, a guide walked through installing and verifying the Azure Skills Plugin across Copilot CLI, VS Code, and Claude Code, with an emphasis on proving tools are actually called. It covers Node.js 18+ (MCP servers via npx), az login, azd auth login, and smoke tests like “list my resource groups.” It also calls out a common operational issue: Copilot CLI needs /mcp reload after install, and token or skill budgets can silently prevent skills from activating when multiple plugins are loaded. This matches last week's “prompt less, context more” point: many “Copilot ignored me” cases are tool/config visibility problems.

Fabric also showed up via the FabCon/SQLCon recap: Fabric MCP (local MCP GA as open source; remote MCP in public preview) and “Agent Skills for Fabric” plugins so GitHub Copilot in the terminal can perform Fabric tasks via MCP tooling, alongside Git integration/CI/CD improvements and agent-enabled operational workflows. It follows the same arc as the Azure DevOps remote MCP announcement: move from “connect a tool” to “repeatable platform capability.”
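To show what “an org-scoped URL in MCP config” might look like for the Remote Azure DevOps MCP Server, here is a small sketch that builds the endpoint and prints a config entry. The URL pattern is the one quoted above. Everything else is an assumption to check against current documentation: the `servers` / `type` / `url` layout follows the VS Code mcp.json convention as I understand it, and the `azure-devops` server name, the `contoso` organization, and the function names are placeholders.

```python
import json


def remote_ado_mcp_url(organization: str) -> str:
    """Build the org-scoped endpoint using the URL pattern quoted in the text."""
    org = organization.strip()
    if not org or "/" in org or " " in org:
        raise ValueError(f"not a plausible organization name: {organization!r}")
    return f"https://mcp.dev.azure.com/{org}"


def mcp_config(organization: str, server_name: str = "azure-devops") -> str:
    """Return a JSON config entry for the remote server.

    The key layout is an assumption modeled on VS Code's mcp.json format.
    """
    config = {
        "servers": {
            server_name: {
                "type": "http",  # remote server reached over HTTP, no local process
                "url": remote_ado_mcp_url(organization),
            }
        }
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # "contoso" is a placeholder organization name.
    print(mcp_config("contoso"))
```

Generating the entry from one function keeps the organization name in a single place if a team templates this file across many repositories.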

Copilot CLI and repo-native agent workflows: multi-model review, session forensics, and reproducible “agent memory”

Copilot CLI and “agents in the repo” content lined up on one theme: treat agent output as inspectable, repeatable, and standardizable, not one-off chat. This continues last week's CLI direction (requesting Copilot code review from gh, /fleet parallelism, terminal-first workflows) and pushes further into repeatability and post-hoc analysis.

A Copilot CLI tip showed using /review as a local pre-PR gate scoped to bugs, security, and performance, then doing multi-agent review by using multiple model backends (for example, Gemini, Codex, Opus) and combining results. That can help surface different failure modes before CI, and it fits last week's point that model choice is becoming explicit and policy-shaped. Here, the goal is coverage rather than debating a single “best model.”

Another advanced video focused on forensics: Copilot CLI stores session history in a local SQLite database, which you can query to retrieve prior prompts and outputs, reconstruct past work, or feed history back into Copilot to critique prompting patterns. It complements last week's “memory default-on” and “debug snapshots” story: even when memory sharing is not desired, teams still want reproducible records of what happened.

GitHub also highlighted Squad, an open-source, Copilot-based multi-agent orchestration approach that lives in the repo. It is repo-native: install a CLI, run squad init, and it generates an AI “team” (coordinator plus specialists) that works via branches and PRs, uses repo files for shared memory (for example, a committed “versioned decisions.md”), and keeps an audit trail under .squad/. Version-controlled memory (instead of an external vector DB) keeps it inspectable and portable. This matches last week's instruction-file and hook movement: put agent behavior in on-disk artifacts you can review, diff, and govern. The post also describes safety patterns like independent review, where rejected output cannot be self-corrected by the same agent and fixes are pushed to a different persona. That is an organizational analogue to last week's separate approvals and defense-in-depth posture.
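For the session-forensics idea, the roundup does not give the database location or schema, so a cautious first step is to open the file read-only and list whatever tables and columns it contains before writing any queries. The sketch below does only that; the demo database and its `sessions` table are hypothetical stand-ins created in a temporary directory, not the real Copilot CLI layout.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


def list_tables(db_path: Path) -> dict[str, list[str]]:
    """Map each table name to its column names, opening the file read-only."""
    uri = f"{db_path.resolve().as_uri()}?mode=ro"  # read-only: never mutate history
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }


if __name__ == "__main__":
    # Build a throwaway database with a hypothetical schema for demonstration.
    demo = Path(tempfile.mkdtemp()) / "demo.db"
    with closing(sqlite3.connect(demo)) as conn:
        conn.execute("CREATE TABLE sessions (id INTEGER, prompt TEXT)")
        conn.commit()
    print(list_tables(demo))
```

Once the real table and column names are known, ordinary SELECT statements over the same read-only connection can retrieve prior prompts and outputs.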

Copilot customizations and “skills” as shared team assets

Copilot customization is increasingly treated as reusable tooling, not personal prompt snippets. This builds on last week's shift to repo-discoverable instruction sources (AGENTS.md/CLAUDE.md, .github/instructions, hooks) and the emerging skills ecosystem (dotnet/skills, skill specs, evaluation loops). The common thread is that “how Copilot behaves” is becoming versioned project configuration, not chat lore. The community-driven Awesome GitHub Copilot Customizations project shipped a website and Learning Hub, plus a plugin marketplace-style flow to discover and adopt agents, skills, instructions, and plugins via PR-reviewed, traceable GitHub workflows. The catalog is now large (hundreds of items), and the site adds search, previews, and one-click installs for VS Code/Insiders, plus CLI install patterns like copilot plugin install <plugin>@awesome-copilot. It also reflects an ongoing simplification effort: consolidating formats so skills become the common reusable unit over time, which helps teams package and maintain standard behaviors more easily. That lines up with last week's point that reusable skills tend to produce more reproducible behavior than ad-hoc prompting. A short VS Code intro reinforced the same product direction: Agent Skills bundle instructions and resources into a named capability invoked on demand (for example, /skill-name) instead of expanding system prompts or rewriting checklists. The aim is predictability: teams codify conventions, navigation routines, and run/test steps as repeatable skills discoverable in chat. Paired with last week's instruction discovery and troubleshooting, the pattern is explicit, discoverable, and debuggable customization.

Copilot in real workflows: MAUI repo agents, cross-arch C++ migrations, designers prototyping in code, and hosted agent production paths

A few deeper pieces showed what “Copilot as workflow” looks like when embedded in real contribution pipelines. This extends last week's “portfolio-scale modernization” and “agents as workflow participants” theme, but with concrete layouts, gates, and deployment paths teams can reuse.

The .NET MAUI team described repo-native Copilot CLI agents and composable skills under .github/agents and .github/skills, including a structured pr-review skill with phases that enforce tests-first and multi-platform validation. The “Gate” step blocks progress until tests exist and are shown to fail on the unfixed code; “Try-Fix” runs multiple fix attempts (including multi-model sequences) and records outcomes; a final PR-ready report is produced for humans. The post also details test generation (Appium UI tests, XAML runtime/XamlC/source-gen coverage) and CI debugging (Helix logs), anchoring the flow to real build/test infrastructure rather than pure code generation. It aligns with last week's governance direction by making guardrails and phase gates repo-visible and repeatable.

A migration guide walked through using Copilot to refactor C++ during IBM Power/IBM i big-endian to Azure x86 little-endian ports, where apps can “work” while silently corrupting numeric values. Copilot helps with mechanical refactors (finding unsafe memcpy into structs, generating byte-swap helpers with C++20 std::endian and intrinsics), while the article stresses the parts you cannot outsource (ABI/padding checks via static_assert, EBCDIC conversions, packed decimals). It is the modernization arc from last week, with a reminder that correctness comes from platform constraints, tests, and review, not prompting.

A design-focused discussion described designers using VS Code + Copilot (Chat and CLI) for code-based prototyping in repos, using plan-mode-like structure and parallel Copilot sessions paired with Git worktrees to test UI variants without branch friction. It mirrors last week's multi-surface Copilot story (IDE + CLI + web) and shows parallel sessions as a broader collaboration pattern.

A tutorial also showed moving from prompt prototype to hosted agent using VS Code AI Toolkit + Microsoft Foundry, with Copilot used for model comparison and scaffolding Agent Framework templates (Python/C#), then using Foundry evaluation, LLM-judge scoring, red teaming, and monitoring in production. This continues last week's “Beyond Prompts” message (execution loops, tools, observability), and it ends in deployment monitoring rather than an interactive session.
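The “works while silently corrupting numeric values” failure mode from the big-endian to little-endian migration above is easy to reproduce. The sketch below is in Python rather than the guide's C++, purely for illustration: it writes a 32-bit integer in big-endian byte order, reads it back with the wrong byte order, and then recovers the value with a byte swap. The function names and the sample value are hypothetical.

```python
import struct


def read_u32(raw: bytes, byteorder: str) -> int:
    """Interpret 4 raw bytes as an unsigned 32-bit integer in the given order."""
    fmt = {"big": ">I", "little": "<I"}[byteorder]
    return struct.unpack(fmt, raw)[0]


def byteswap_u32(value: int) -> int:
    """Reverse the byte order of an unsigned 32-bit integer."""
    return int.from_bytes(value.to_bytes(4, "big"), "little")


if __name__ == "__main__":
    original = 1_000_000
    # Bytes as a big-endian host would have written them to a file or socket.
    on_disk = struct.pack(">I", original)

    misread = read_u32(on_disk, "little")  # little-endian host reads naively
    print(misread == original)             # False: no error raised, just a wrong number
    print(byteswap_u32(misread) == original)   # True: swapping recovers the value
    print(read_u32(on_disk, "big") == original)  # True: explicit order avoids the swap
```

Nothing crashes in the misread case, which is why the guide's emphasis on tests and explicit layout checks matters more than the mechanical refactor itself.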

Other GitHub Copilot News

Usage reporting and governance kept catching up to how Copilot is actually used across terminals and model pickers. Building on last week's enterprise-management angle (usage metrics, routing/session controls, and transparency into which model responded), this week closes two reporting gaps: org-wide CLI adoption and the “Auto” model bucket. Org admins can now see Copilot CLI activity at org scope (daily active users, sessions/requests, token totals, average tokens/request) in 1-day and 28-day windows, and via the usage metrics REST API. Separately, usage metrics now resolve “Auto” model selection into the actual model used in dashboard and API breakdowns, removing an ambiguous bucket that made cost and policy analysis harder (though GitHub says it still does not split auto-selected vs manual). This ties back to last week's JetBrains auto-routing story: once routing is automated, reporting needs to support cost and compliance governance.

Copilot usage metrics now includes organization-level GitHub Copilot CLI activity
Copilot usage metrics now resolve auto model selection to actual models

GitHub's planning surface also added workflow changes related to agent delegation. Projects Hierarchy view is GA and enabled by default for new views, which makes sub-issue trees easier to manage. Issue templates can also auto-assign @copilot via YAML (assignees: ["@copilot"]) to route new issues to the coding agent (where permitted). GitHub also adjusted blank-issue behavior to preserve structured intake for contributors while still letting maintainers create blank issues. After last week's “trigger agentic review from CLI” and “agents doing routine triage,” this is a practical intake step: route work into an agent-friendly pipeline up front, not after manual rerouting.

Hierarchy view in GitHub Projects is now generally available

Agent monitoring also showed up in Raycast: the GitHub Copilot Raycast extension now lets you stream agent logs live from “View Tasks,” which mainly reduces context switches and helps spot stalled sessions without living in the web UI. It fits last week's observability trend (debug snapshots, OpenTelemetry traces): as agents run longer, “where do I see what it's doing?” becomes a normal workflow question.

Monitor Copilot coding agent logs live in Raycast

A couple of getting-started items focused on standardizing Copilot's environment. One video covered installing Copilot CLI on Windows (winget) and adding the Work IQ plugin (including EULA acceptance), with notes on interactive vs non-interactive usage, continuing last week's CLI onboarding. Another short reminded maintainers that .github is the repo's workflow control plane (Actions, templates, and a home for Copilot instructions and related guidance), echoing last week's move to repo-visible instruction sources.

Getting Started with GitHub Copilot CLI and Work IQ
The powerful GitHub folder most developers ignore

Microsoft Fabric's extensibility tooling also reached a new baseline with the Fabric Extensibility Toolkit GA, including a “Copilot-optimized” starter kit (repo context under .ai/ plus DevContainer/Codespaces setup) and React/Fluent UI building blocks for Fabric workload items, plus OneLake-backed storage patterns and Entra OBO auth flows. It connects to last week's “structured context + reusable skills” story and this week's Fabric MCP mention: platforms are shipping repo scaffolding that assumes agents are part of the workflow.

Microsoft Fabric Extensibility Toolkit is now Generally Available

For developers building Copilot extensions, a VS Code episode walked through creating a project with the GitHub Copilot SDK from scratch, pointing to the canonical repo for templates and APIs. This continues last week's “Copilot SDK = execution loops and tools” message, but as a more direct starter path for teams productizing internal agents.

‘Let it Cook - GitHub Copilot SDK: Fresh from Scratch’