Agent Governance, Safer Ops, and Platform Modernization
This roundup tracks a clear shift from agent capability to agent governance: more context, more observability, and more policy controls across Copilot, VS Code, and the CLI. On the platform side, Microsoft tightened the path from prototype to production with .NET agent building blocks, Azure AI Foundry deployment patterns, and data governance improvements that make RAG and operations easier to standardize. We also cover the less flashy work that keeps systems dependable at scale, including Fabric and Databricks operational updates, GitHub migration and ruleset changes, and security research that keeps token theft, privilege escalation, and supply chain risk in focus.
This Week's Overview
- GitHub Copilot
- GitHub Copilot in Visual Studio Code (agent context, search, and cost controls)
- Copilot CLI (Rubber Duck multi-model reviews and enterprise-managed plugins)
- Enterprise model governance (upcoming deprecations and model policy actions)
- Agent-driven development hygiene (review checklists, security, and measurement)
- Other GitHub Copilot News
- Artificial Intelligence
- Microsoft Agent Framework in .NET: agent building blocks, orchestration patterns, and a production deployment path
- Azure AI Foundry + Fabric data: governed knowledge sources and real-world operations
- Agentic developer experience: IDE chat sessions, safer terminals, and editors that can opt out of AI
- Applied agent patterns in teams: stand-ups, document pipelines, and production-minded tutorials
- Machine Learning
- Microsoft Fabric: Scaling Spark automation, discoverability, and operational guardrails
- Microsoft Fabric Dataflow Gen2 and Direct Lake: Less rework in data prep, fewer surprises in semantic performance
- Azure Databricks: Disaster recovery planning and a workspace inventory you can query
- Model behavior control: Context engineering, RAG, and when to fine-tune
- DevOps
- GitHub Enterprise migrations and repository governance
- GitHub settings and API changes that affect automation
- VS Code and TypeScript workflows for building and operating projects
- Terraform validation and “shift left” reliability for Azure Functions
- AI-assisted documentation that fits DevOps decision-making
- Azure
- Azure Virtual Machines: new Intel Xeon 6 families and safer paths to higher availability
- Azure cost planning: Reserved VM Instances retirement guidance and what to do before July 2026
- Azure Container Registry: ACR-to-ACR pull-through caching for registry hierarchies
- Azure integration: modernize to Logic Apps Standard with a migration agent and new Oracle in-process connectivity
- Azure AI governance and operations: landing zones for agents and MCP-based access to Azure Resource Manager
- Other Azure News
- .NET
- Security
- Microsoft Defender threat research: token theft, macOS infostealers, and active Linux exploitation
- GitHub + Defender for Cloud: bringing runtime context into code security, and scanning that works with AI agents
- Microsoft identity and passwordless: passkeys progress and recovery changes
- Hardware and data platform controls: open HSM components and OneLake security GA
- Other Security News
GitHub Copilot
GitHub Copilot updates this week leaned into two themes we have been tracking: giving agents more context and giving enterprises tighter controls, while GitHub simultaneously pushes teams to stay current on model availability and review quality as more PRs arrive with AI fingerprints. After last week's mix of “more capability” (agents across IDEs, CLI, MCP tooling) and “more constraint” (individual plan limits, premium multipliers, model availability changes), this week's changes read like the next step: reduce wasted token spend, make sessions more observable, and give admins more levers so Copilot can scale without becoming unpredictable.
GitHub Copilot in Visual Studio Code (agent context, search, and cost controls)
VS Code users saw a steady set of Copilot improvements across the v1.116-v1.119 line, with the most practical changes focused on context gathering and responsiveness. That picks up directly from last week's emphasis on “intentional configuration” (model pickers, autonomy controls, and usage indicators) by making it easier for Copilot to find the right inputs without you manually pasting them. Semantic search now spans your local workspace and GitHub repositories, which matters when you are asking Copilot Chat questions that depend on “project memory” rather than the currently opened file. GitHub also introduced an experimental /chronicle chat-history index, aiming to make prior Copilot conversations usable as retrievable context instead of dead text in a sidebar (a natural extension of last week's push toward more resumable, auditable sessions).
On the performance and cost side, Copilot reduced token usage with prompt caching and deferred tool loading. In practice, that means less repeated context sent on similar requests and fewer tools initialized until the agent actually needs them, which should show up as faster starts and lower consumption in longer sessions. That matters more after last week's individual plan limits and GPT-5.5 premium multipliers made “how much context you send” a real workflow constraint, not an abstract optimization.
Agents themselves got more capable and easier to supervise: inline diffs help you see proposed code edits in-place, terminal access makes it possible for an agent to run commands as part of a workflow, and browser tab sharing expands what an agent can “see” when troubleshooting docs, dashboards, or web UIs. This continues last week's cross-IDE direction (JetBrains inline agent mode, VS Code autonomy/permissions) where the differentiator is not just agent power but how clearly you can see and control what the agent is doing. VS Code 1.119 also highlighted OpenTelemetry tracing for agent sessions, giving teams a way to instrument and diagnose agent activity (with availability shaped by plan and enterprise policy). That fits neatly with last week's theme of traceability (structured debugging output on the web, better session metadata in clients): once agents act across terminals, repos, and browsers, teams need logs and traces that look more like production tooling than chat transcripts.
For organizations standardizing on their own model relationships, BYOK (Bring Your Own Key) model providers continued rolling out for Copilot Business and Enterprise, letting teams route requests through approved providers and keys while keeping policy controls in view. In context, BYOK is becoming the enterprise counterpart to last week's “escape hatch” framing for individuals impacted by plan/model changes: reduce dependency on a single hosted SKU by putting model access behind keys and policies you control.
- GitHub Copilot in Visual Studio Code, April releases
- Visual Studio Code and GitHub Copilot - What's new in 1.117
- Visual Studio Code and GitHub Copilot - What's new in 1.119
Copilot CLI (Rubber Duck multi-model reviews and enterprise-managed plugins)
The Copilot CLI story this week was about making terminal-based Copilot usage more governable for enterprises and more reliable for individuals who want a second set of “AI eyes” on changes. It is a direct continuation of last week's CLI thread (longer-running, tool-loop-heavy workflows, plus BYOK/local-model options) but with more emphasis on structured review and admin control as usage grows.
Rubber Duck in Copilot CLI now supports cross-model “second opinion” flows: a GPT-orchestrated session can hand review to a Claude critic agent, and a Claude-orchestrated session can use GPT-5.5 for review. This builds on the workflow pattern called out last week (separate “builder” and “reviewer” models) and makes it easier to operationalize: instead of ad hoc copy/paste between chats, the CLI can run an explicit critique step that helps catch blind spots before code reaches a PR. That becomes even more relevant alongside this week's agent-PR review guidance (later in this section) because it lets you move some “reviewer mindset” left into the terminal loop before changes ever hit GitHub.
For enterprise rollouts, GitHub put enterprise-managed plugins in Copilot CLI into public preview. Central teams can configure the plugin marketplace, auto-install approved plugins, and enforce baseline standards (including hooks and MCP configuration) via a shared settings.json. This echoes last week's theme that as Copilot spreads into more surfaces, governance needs to follow. Combined with the VS Code note about remote monitoring of Copilot CLI sessions, the direction is clear: more CLI capability, but paired with more administrative guardrails and auditability so the CLI can be used at scale without each developer hand-curating plugins and configs.
- Rubber Duck in GitHub Copilot CLI now supports more models
- Enterprise-managed plugins in GitHub Copilot CLI are now in public preview
Enterprise model governance (upcoming deprecations and model policy actions)
If your organization relies on specific Copilot chat models, this week came with deadlines. This is the follow-through to last week's “model churn needs knobs” storyline (model pickers, admin toggles, and premium multipliers): now the churn has concrete dates that force decisions. GPT-4.1 will be deprecated across GitHub Copilot experiences on June 1, 2026, with GPT-5.5 positioned as the replacement. Separately, Grok Code Fast 1 is scheduled for deprecation on May 15, 2026, and GitHub pointed admins toward alternatives like GPT-5 mini or Claude Haiku 4.5.
The practical implication is policy work, not just awareness. Copilot Enterprise admins may need to update model policies so replacement models actually appear where developers select them (including Copilot Chat in VS Code and on github.com). Treat this like any other dependency change: verify model availability ahead of the cutoff, test critical workflows (code review prompts, refactor tasks, internal coding standards), and communicate which models are approved so developers do not discover missing options mid-sprint. If you enabled GPT-5.5 last week (and accounted for its premium multiplier), this is the week to validate that it is not just available in principle but actually usable in the specific clients and flows your developers rely on.
Agent-driven development hygiene (review checklists, security, and measurement)
As agent-generated pull requests become routine, GitHub published a pragmatic checklist focused on where AI-authored changes tend to go wrong even when tests pass. This complements last week's theme that agents are becoming first-class (JetBrains inline agents, VS Code session controls, CLI tool loops) by addressing the uncomfortable next question: what does “good review” look like when a meaningful chunk of the change came from an agent rather than a human typing line-by-line?
The guidance calls out patterns reviewers should actively look for: CI weakening (for example, disabling or loosening checks to get a green build), duplicated utilities that quietly increase maintenance cost, and subtle logic bugs that slip through because coverage does not hit the right edge cases. It also puts security front and center for LLM-powered GitHub workflows, including prompt injection risks and the common mistake of leaving GITHUB_TOKEN permissions too broad for the job at hand. The takeaway is that “agent PRs” need a different reviewer mindset: you are not only reviewing code correctness, you are reviewing whether the workflow itself was bent to make the code look correct. That ties back to last week's shift toward more explicit guardrails (agent permissions, tool-call controls, structured debugging) because review is where those guardrails get tested in practice.
On the measurement side, Copilot usage metrics got more specific for teams trying to understand whether Copilot code review is helping. Last week, we saw more emphasis on usage signals and governance (warnings for limits, admin controls for models). This week adds finer-grained evidence: the Copilot usage metrics REST API now includes copilot_suggestions_by_comment_type under pull_requests, reporting totals and applied totals per Copilot code review comment type for enterprise and organization reports. With that breakdown, you can start answering practical questions like which comment types developers actually apply, whether certain teams ignore whole categories, and where training or policy tweaks might improve outcomes (for example, if security-related comment types are consistently skipped).
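To make that concrete, here is a minimal sketch of pulling the breakdown with Python and requests, assuming the org-level Copilot metrics endpoint; the exact nesting of the new field may differ from this illustration, so treat the field access as a template rather than a contract.

```python
import os
import requests

ORG = "your-org"  # hypothetical organization slug
resp = requests.get(
    f"https://api.github.com/orgs/{ORG}/copilot/metrics",
    headers={
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    timeout=30,
)
resp.raise_for_status()

# The metrics API returns one object per day; the announcement adds
# copilot_suggestions_by_comment_type under pull_requests (shape assumed here).
for day in resp.json():
    by_type = day.get("pull_requests", {}).get(
        "copilot_suggestions_by_comment_type", [])
    for entry in by_type:
        total = entry.get("total", 0)
        applied = entry.get("applied", 0)
        rate = applied / total if total else 0.0
        print(f"{day.get('date')} {entry.get('comment_type')}: "
              f"{applied}/{total} applied ({rate:.0%})")
```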
- Agent pull requests are everywhere. Here’s how to review them.
- Copilot code review comment types now in usage metrics API
Other GitHub Copilot News
Copilot cloud agent configuration got easier to scale: GitHub added dedicated “Agents” secrets and variables with organization-level configuration and per-repository access controls, which helps when you need consistent settings across many repos without over-sharing credentials. This is a clean continuation of last week's “governance moves closer to the workflow” thread (policies, controls, auditability) because secrets and variables are where many agent experiments fail in practice. Centralizing them in an “Agents” scope makes it easier to standardize cloud-agent behavior across repos while keeping the blast radius smaller than ad hoc per-repo secret sprawl.
Artificial Intelligence
This week, the AI conversation kept shifting from “what can agents do?” to “how do we build and run them safely in real systems?” Building on last week's push to treat agents like standard cloud apps (traceability, identity, hosted execution, and repeatable deployment), Microsoft leaned further into agent-building primitives in .NET and showed a clearer production path through Azure AI Foundry. In parallel, editor makers and teams shared how agent workflows are landing in day-to-day developer experience and operations, where the details (session management, terminal safety, human approval steps, and governed data access) decide whether agents are useful or risky.
Microsoft Agent Framework in .NET: agent building blocks, orchestration patterns, and a production deployment path
The Microsoft Agent Framework (v1.0) continued to take shape as a practical set of .NET building blocks for agentic apps, extending the same “from local to production” story we covered last week into more concrete .NET-first primitives. Jeremy Likness framed it as the third building block alongside the broader Microsoft.Extensions.AI story, focusing on what you need once you move past single-prompt chat: tool calling to let the model take actions, sessions (via AgentSession) to keep multi-turn state coherent, and memory via context providers (AIContextProvider) so the agent can retrieve and inject the right information at the right time. A key theme is that an “agent” is not just a wrapper around an LLM call; it is the combination of tools, state, and structured flow that makes behavior repeatable.
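As a rough illustration of that composition (tools plus session state plus a context provider), here is a deliberately framework-free Python sketch; the class and function names are hypothetical stand-ins for the .NET primitives the post describes (AgentSession, AIContextProvider), not the Agent Framework API.

```python
from dataclasses import dataclass, field

@dataclass
class Session:                          # multi-turn state (cf. AgentSession)
    messages: list = field(default_factory=list)

class ContextProvider:                  # cf. AIContextProvider: retrieve-and-inject memory
    def provide(self, user_input: str) -> str:
        return f"[retrieved notes relevant to: {user_input!r}]"

def get_build_status(pipeline: str) -> str:   # a tool the model may call
    return f"{pipeline}: green"

TOOLS = {"get_build_status": get_build_status}

def run_turn(session: Session, provider: ContextProvider, user_input: str) -> str:
    session.messages.append({"role": "user", "content": user_input})
    context = provider.provide(user_input)    # memory injection, per turn
    # A real agent sends messages + context + tool schemas to the model and
    # loops on tool calls; we fake a single tool call to show the control flow.
    tool_result = TOOLS["get_build_status"]("ci-main")
    answer = f"Given {context}: {tool_result}"
    session.messages.append({"role": "assistant", "content": answer})
    return answer

print(run_turn(Session(), ContextProvider(), "Is the build healthy?"))
```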
That emphasis on structured flow showed up again in Jacob Alber's walkthrough of the Handoff orchestration pattern. It is a practical continuation of last week's focus on multi-agent workflows that can be run and shipped, not just diagrammed. Instead of one agent trying to do everything, you define a bounded graph of agents and let them route control between each other using injected handoff tools, while maintaining a shared transcript so context does not get lost mid-transfer. The post contrasts Handoff with Sequential and explicit conditional workflows, and includes both .NET and Python examples, plus guidance on when to add Human-in-the-Loop (HITL) checkpoints so the system can escalate decisions rather than guess.
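A toy version of the Handoff pattern makes the mechanics visible: agents form a bounded graph, route control by naming a successor, share one transcript, and escalate at a Human-in-the-Loop checkpoint instead of guessing. Everything below is illustrative, not the Agent Framework API.

```python
def last_user(transcript):
    """Shared transcript: find the most recent user turn after any handoffs."""
    return next(m["content"].lower() for m in reversed(transcript)
                if m["role"] == "user")

def triage(transcript):
    if "refund" in last_user(transcript):
        return {"handoff": "billing"}            # injected handoff, conceptually
    return {"answer": "General support can take this one."}

def billing(transcript):
    if "$10k" in last_user(transcript):
        return {"needs_human": True}             # HITL checkpoint: escalate, don't guess
    return {"answer": "Refund initiated."}

AGENTS = {"triage": triage, "billing": billing}  # the bounded agent graph

def run(transcript, start="triage", max_hops=5):
    current = start
    for _ in range(max_hops):
        result = AGENTS[current](transcript)
        if "handoff" in result:                  # route control, keep the transcript
            transcript.append({"role": "system",
                               "content": f"handoff -> {result['handoff']}"})
            current = result["handoff"]
            continue
        if result.get("needs_human"):
            return "escalated to a human reviewer"
        return result["answer"]
    return "hop limit reached"

print(run([{"role": "user", "content": "I need a refund, it's over $10k"}]))
```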
The other half of the story was what happens after your agent works locally. Following last week's introduction of Foundry Hosted Agents as the “where does the code actually run?” answer, Tao Chen and Shawn Henry described how to deploy Agent Framework agents to Foundry Hosted Agents (Foundry Agent Service) in Azure AI Foundry, with concrete steps that mirror real delivery pipelines: container packaging with Azure Developer CLI (azd) and Azure Container Registry (ACR), choosing between protocol styles (/responses vs invocations) depending on how you want clients to interact, and wiring identity through Microsoft Entra ID instead of shipping keys around. They also call out operational basics that agent apps need just as much as APIs do, like versioned rollouts and built-in observability through Application Insights, so you can see tool calls, failures, and latency once the agent is under real load. Read together with last week's OpenTelemetry-first messaging, the direction is consistent: Microsoft wants agent apps to inherit the same deployment and operations discipline teams already apply to services.
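For the client side of that deployment, a minimal sketch of calling a hosted agent's /responses-style endpoint with an Entra ID token (via azure-identity) could look like the following; the endpoint URL and token scope are assumptions for illustration, so check the Foundry docs for your deployment.

```python
import requests
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()     # managed identity, dev CLI login, etc.
# Token scope is an assumption for illustration; use the scope your Foundry
# endpoint actually advertises.
token = credential.get_token("https://ai.azure.com/.default")

resp = requests.post(
    "https://<your-foundry-endpoint>/responses",   # assumed endpoint shape
    headers={"Authorization": f"Bearer {token.token}"},
    json={"input": "Summarize this week's failed tool calls"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```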
- Microsoft Agent Framework – Building Blocks for AI Part 3
- A Tour of Handoff Orchestration Pattern
- From Local to Production: Deploy Your Microsoft Agent Framework Agent with Foundry Hosted Agents
Azure AI Foundry + Fabric data: governed knowledge sources and real-world operations
Azure AI Foundry pushed further into “use your real data, but keep it governed” by making the OneLake catalog natively available in Foundry (now generally available). This is a direct continuation of last week's MCP-through-Fabric storyline, where Fabric started to look like an AI tool surface and operational plane for agents. Here, the focus is more on the RAG setup path and discoverability: instead of switching contexts to hunt for datasets, you can browse OneLake assets directly while creating knowledge sources, which matters when teams want retrieval-augmented generation (RAG) that is both discoverable and compliant. The GA post walks through prerequisites and the setup path for creating a Foundry knowledge base backed by OneLake and Azure AI Search, a common pairing when you need governed storage (Fabric) plus an index optimized for retrieval (AI Search). It also matches the broader Foundry direction from last week, where governance and reuse move “up a layer” so teams can scale beyond one-off demos.
A Microsoft case study made that architecture feel less abstract by showing how Porsche Cup Brasil built AI-assisted race operations on the same stack. Their crash analysis workflow uses Azure AI Foundry with Azure AI Search and Microsoft Fabric, with apps running on Azure Kubernetes Service (AKS) and a human validation step to confirm conclusions before decisions get acted on. That HITL step echoes the practical safety thread running through both weeks: agentic systems need approval paths and auditability when outcomes matter. The broader operations loop pulls real-time telemetry into Fabric and visualizes it with Power BI for anomaly detection during races, illustrating a pattern many teams will recognize: stream data into an analytics layer, let AI help summarize or classify events, then keep a human in the approval path when the cost of a wrong call is high.
- OneLake catalog is now natively available in Foundry (Generally Available)
- Inside Porsche Cup Brasil’s AI-powered race operations
Agentic developer experience: IDE chat sessions, safer terminals, and editors that can opt out of AI
Tooling changes this week reflected a common friction point with agentic workflows: once you have multiple conversations, multiple models, and terminal access, the UI and safety defaults start to matter. That lines up with last week's platform-side emphasis on “safe to run” agents (sandboxing, identity, observability), but viewed from the developer workstation where accidents (like leaking secrets) actually happen. Visual Studio Code 1.120 (Insiders) refined chat and agent workflows with session organization changes (helpful when you are juggling several tasks), a model context size picker (so you can trade off cost/latency vs how much context the model can read), and safer handling of terminal password prompts to reduce the chance of accidentally leaking secrets into an agent session. For extension authors, it also introduced a proposed customDiffEditorProvider API, alongside improvements across GitHub and Copilot CLI-related integrations.
On the editor side, Zed reached version 1.0 and positioned itself clearly as a “traditional editor and AI tool” rather than an AI-first environment. DevClass highlighted its Rust-based implementation, LSP-driven language support, and optional AI features that include Zeta LLM predictions and parallel agents, plus support for Agent Client Protocol (ACP). The practical detail many teams will care about is that AI can be disabled entirely, which complements the broader enterprise control story from last week (governed tools, identity boundaries, and hosted isolation): sometimes the safest AI feature is the one you can reliably turn off where policy demands it.
- Visual Studio Code 1.120
- Zed team releases version 1.0 of Rust-built editor: Traditional editor and AI tool
Applied agent patterns in teams: stand-ups, document pipelines, and production-minded tutorials
Several posts focused less on platforms and more on what teams are actually building with LLMs and agents, reinforcing the same “production reality” theme that ran through last week's Agent Framework and Foundry updates. John Edward's “Daily Stand-Up Agent” design connects to Jira and Azure DevOps, grounds responses on sprint and work item data, and uses LLM summarization to generate stand-up notes, blocker alerts, and sprint health reporting through a conversational interface. It highlights the integration work that tends to dominate these projects (Azure DevOps Work Item APIs, Jira REST API, OAuth, and RBAC), and why grounding plus prompt design matters when summaries must reflect real status rather than plausible text.
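The grounding step is the part most teams underestimate; a small sketch against the Azure DevOps WIQL and Work Items REST APIs (PAT auth, hypothetical org/project names, and a stubbed summarize() in place of the real LLM call) shows the shape of it.

```python
import base64
import os
import requests

ORG, PROJECT = "your-org", "your-project"         # hypothetical names
auth = base64.b64encode(f":{os.environ['ADO_PAT']}".encode()).decode()
headers = {"Authorization": f"Basic {auth}"}

def summarize(facts: str) -> str:                 # stub for the real LLM call
    return "Stand-up notes:\n" + facts

# 1) Query active work items via WIQL.
wiql = {"query": "SELECT [System.Id] FROM WorkItems WHERE [System.State] = 'Active'"}
r = requests.post(
    f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/wit/wiql?api-version=7.1",
    json=wiql, headers=headers, timeout=30)
ids = [str(item["id"]) for item in r.json().get("workItems", [])][:50]

# 2) Fetch details, then 3) ground the summary on real status fields.
if ids:
    items = requests.get(
        f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/wit/workitems"
        f"?ids={','.join(ids)}&api-version=7.1",
        headers=headers, timeout=30).json()["value"]
    facts = "\n".join(
        f"{wi['id']}: {wi['fields']['System.Title']} ({wi['fields']['System.State']})"
        for wi in items)
    print(summarize(facts))
```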
On the data extraction side, tanyabaranwal shared an event-driven pipeline for contract processing that starts with Blob Storage and an Azure Functions Blob trigger, uses Azure AI Document Intelligence to extract layout and tables, normalizes the output into a canonical JSON schema, and stores results in Cosmos DB for downstream use. The post rounds out the “production” story with monitoring in Application Insights and security practices like Key Vault, Managed Identity, and RBAC, which tracks closely with the lifecycle framing from last week (identity, governance, and observability as defaults rather than add-ons). Even when this is not presented as an “agent,” it fits the same operational model: AI components become dependable services only when they ship with the same controls you expect for any other workload.
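A condensed sketch of that pipeline's shape, using the Functions v2 Python model and the Document Analysis SDK, might look like the following; container names, schema fields, and Cosmos resources are illustrative, and the post's recommended Managed Identity auth is swapped for keys here purely for brevity.

```python
import os
import azure.functions as func
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
from azure.cosmos import CosmosClient

app = func.FunctionApp()

@app.blob_trigger(arg_name="blob", path="contracts/{name}",
                  connection="AzureWebJobsStorage")
def process_contract(blob: func.InputStream):
    # Extract layout and tables from the uploaded contract.
    di = DocumentAnalysisClient(os.environ["DI_ENDPOINT"],
                                AzureKeyCredential(os.environ["DI_KEY"]))
    result = di.begin_analyze_document("prebuilt-layout", blob.read()).result()

    # Normalize into a canonical JSON document for downstream consumers.
    doc = {
        "id": blob.name.replace("/", "_"),
        "source": blob.name,
        "pages": len(result.pages),
        "tables": [[cell.content for cell in t.cells] for t in result.tables],
    }

    # Persist for downstream use (container partition key assumed to be /id).
    (CosmosClient(os.environ["COSMOS_ENDPOINT"], os.environ["COSMOS_KEY"])
        .get_database_client("contracts")
        .get_container_client("extracted")
        .upsert_item(doc))
```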
Rounding out the week, GitHub's “Rubber Duck Thursdays” episode walked through building an AI agent app for a fictional company, including what changes when you plan for production deployment rather than a local prototype. That mirrors Microsoft's recent messaging almost point-for-point: the throughline across these examples is that the agent behavior is only half the job; the rest is connectors, identity, monitoring, and a clear strategy for what the model can and cannot do in your workflow.
- The Daily Stand-Up Agent: A Custom Copilot for Summarizing Jira & Azure DevOps Progress
- Building a Scalable Contract Data Extraction Pipeline with Microsoft Foundry and Python
- Rubber Duck Thursdays: Building an AI agent app
Machine Learning
This week in machine learning and analytics tooling was mostly about making day-to-day platform operations less fragile: Fabric pushed several previews that help teams scale Spark automation, find assets across workspaces, and centralize monitoring and cost controls, while Databricks guidance focused on disaster recovery and visibility across sprawling workspaces. Building on last week's Fabric-heavy focus on “operational plumbing” (MLOps boundaries with MLflow, real-time ingestion paths, and secure-by-default architecture choices), the throughline here is similar: once the platform grows beyond a single workspace or a single team, automation, discoverability, and guardrails matter as much as the model code. Alongside the platform work, model-behavior guidance reinforced a practical theme: better outcomes come from better context, not just bigger prompts.
Microsoft Fabric: Scaling Spark automation, discoverability, and operational guardrails
Fabric's Spark automation story moved forward with preview support for High Concurrency sessions in the Fabric Livy API, aimed at teams running many parallel jobs without paying the overhead of constantly spinning up new sessions. Instead of serializing work through a single session, you can run multiple isolated Spark workloads in parallel while still reusing sessions, tagging them with sessionTag so clients can reliably reconnect to the right session and track what is running where. In practice, this changes how you build orchestration: rather than treating a Livy session as a single-threaded bottleneck, you can design job runners that multiplex workloads, isolate failure domains, and get cleaner monitoring and cost attribution because sessions are no longer a black box shared by unrelated jobs. If last week's theme was promoting ML work safely across environments (Dev/Test/Prod boundaries with MLflow), this is the adjacent scaling problem: once you have repeatable pipelines, you need them to run concurrently without turning into a session-management and cost-debugging mess.
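A rough sketch of that job-runner pattern follows, assuming the Fabric Livy endpoint shape and a sessionTag request property as described in the announcement; verify the exact route and property names against the Fabric docs before relying on them.

```python
import os
import requests

# Endpoint shape and sessionTag property assumed from the announcement/docs.
BASE = ("https://api.fabric.microsoft.com/v1/workspaces/<ws-id>"
        "/lakehouses/<lakehouse-id>/livyApi/versions/2023-12-01")
headers = {"Authorization": f"Bearer {os.environ['FABRIC_TOKEN']}"}

# Create (or later reconnect to) a tagged high-concurrency session.
session = requests.post(f"{BASE}/sessions", headers=headers, json={
    "sessionTag": "nightly-etl",   # lets this job runner find its session again
}, timeout=30).json()
# In practice, poll the session until it reports an idle state before submitting.

# Submit independent statements; isolated workloads share the session, so you
# skip per-job cold starts and keep monitoring/cost attribution per tag.
stmt = requests.post(f"{BASE}/sessions/{session['id']}/statements",
                     headers=headers,
                     json={"code": "spark.range(10).count()", "kind": "spark"},
                     timeout=30)
print(stmt.json())
```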
Cross-workspace sprawl is another common pain point in Fabric, and a new preview OneLake Catalog Search REST API targets exactly that. It lets you discover items across workspaces programmatically, which matters once you have dozens of domains, duplicated datasets, and a growing list of notebooks, warehouses, and semantic models. This ties directly back to last week's cross-workspace MLOps story: separating workspaces is only helpful if teams can still find the right datasets, artifacts, and owners across those boundaries without manual inventories. Microsoft also wired this search into the Fabric Core MCP Server (so agent-driven workflows can query the catalog as a tool) and added a new fab find command in the Fabric CLI. If you are building internal tooling or CI checks, the details here are practical: results can be filtered and shaped using JMESPath, making it easier to build repeatable “what do we have and where is it” scripts without maintaining your own inventory tables.
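An inventory script in that spirit might look like this; the search route below is a placeholder (see the announcement for the real path), and the response fields are illustrative, but the JMESPath shaping is the pattern the post highlights.

```python
import os
import requests
import jmespath

resp = requests.post(
    # Placeholder route: substitute the documented OneLake catalog search path.
    "https://api.fabric.microsoft.com/v1/<onelake-catalog-search-route>",
    headers={"Authorization": f"Bearer {os.environ['FABRIC_TOKEN']}"},
    json={"query": "customer", "types": ["Lakehouse", "SemanticModel"]},
    timeout=30,
)
items = resp.json().get("value", [])

# Shape results with JMESPath so the script stays declarative and repeatable.
rows = jmespath.search("[].{name: displayName, workspace: workspaceName}", items)
for row in rows or []:
    print(f"{row['workspace']}/{row['name']}")
```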
Operationally, Fabric also added a preview feature in the monitoring hub that centralizes failure notification management for scheduled items. The new Schedule failures page consolidates the configuration and maintenance of email failure notifications across workspaces, reducing the common drift where some pipelines notify the right owners and others silently fail. This is not a new scheduler, but it is a step toward treating alerting as a platform setting instead of a per-item afterthought. That lands neatly after last week's architecture guidance on designing maintainable lakehouse pipelines (idempotency, retries, observability): you can make the pipeline logic robust, but you still need consistent operational ownership when something breaks at 2am.
On the cost and capacity side, Fabric moved a set of tools to general availability: updates to the capacity metrics app now include a Capacity health page plus timepoint summary and timepoint detail views, which are designed to make throttling and capacity pressure easier to spot and explain at a specific point in time. The Fabric Chargeback app is also now generally available, giving teams a supported path to allocate capacity costs across workspaces and workloads, which pairs naturally with the push toward more parallel Spark usage and better monitoring. In other words, as the platform encourages more automation and concurrency, it is also giving admins and platform teams better levers to answer the inevitable follow-up questions: “what caused the spike?” and “who pays for it?”
Finally, Fabric Data Warehouse got a schema-evolution quality-of-life improvement in preview: T-SQL ALTER TABLE ... ALTER COLUMN support for metadata-only schema changes. For many teams, schema evolution is where pipelines become brittle because type changes trigger rebuilds or force complicated migration playbooks. The pitch here is fewer disruptive rebuilds and fewer downstream breaks when the change can be handled as a metadata update, aligning with Delta Lake patterns like type widening. That also echoes last week's medallion framework guidance on schema evolution and rerun safety: you can design layers well, but you still need the warehouse and lakehouse tooling to make everyday changes survivable for downstream transformations and ML feature tables.
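For teams scripting schema changes, a minimal sketch of issuing the change from Python via pyodbc follows; the connection details are illustrative, and whether a given type change qualifies as metadata-only depends on the preview's supported conversions (widening being the canonical case).

```python
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-fabric-warehouse-endpoint>;"
    "Database=<warehouse>;"
    "Authentication=ActiveDirectoryInteractive;"
)
cur = conn.cursor()
# Widen an INT column to BIGINT; handled as a metadata-only change, so no
# table rebuild (assuming the conversion is in the preview's supported set).
cur.execute("ALTER TABLE dbo.Orders ALTER COLUMN OrderCount BIGINT;")
conn.commit()
```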
- High Concurrency Support for the Fabric Livy API— Scalable Spark Automation (Preview)
- Discover items across workspaces with the OneLake Catalog Search API, MCP and CLI tools (Preview)
- Manage failure notifications from the monitoring hub in Fabric (Preview)
- Providing more insights & tools: Capacity health, timepoint summary, timepoint detail, chargeback now generally available
- Simplify Schema Changes in Fabric Data Warehouse with ALTER COLUMN (Preview)
Microsoft Fabric Dataflow Gen2 and Direct Lake: Less rework in data prep, fewer surprises in semantic performance
Dataflow Gen2 picked up a preview feature that targets a very specific kind of waste: repeated Power Query work copied across pipelines. “My queries” lets you save Power Query (M) queries into a personal library, then import them into other dataflows when you need the same cleaning logic again. That changes the default from copy-paste reuse (which quietly forks logic) to an explicit reuse path, which should help teams standardize transformations like date handling, normalization, and data quality fixes without maintaining separate “template” PBIX files or shared snippets in wikis. It fits with last week's Fabric data engineering push (nested folder-aware lake transformations and dbt orchestration): the common goal is to make the “last mile” of dataset shaping more maintainable, whether that logic lives in lake transformations, dbt models, or Power Query steps.
On the consumption side, guidance for Direct Lake on SQL with Fabric Data Warehouse focused on what drives real performance when semantic models page Delta Parquet data into memory. The practical takeaway is that schema design and cardinality directly affect whether Direct Lake behaves like you expect, and the article calls out how and why models fall back to DirectQuery. That matters because fallback often shows up as “it was fast yesterday, why is it slow today” once a column's cardinality grows or a model change increases memory pressure. The best practices here center on designing for VertiPaq-friendly shapes, watching the common causes of fallback, and using those signals to decide whether to remodel, reduce cardinality, or accept DirectQuery for specific queries. Read next to last week's medallion decision guide and real-time ingestion pattern (SQL change events into Fabric), this is a useful reminder that upstream choices (schema evolution, incremental feeds, partitioning) quickly turn into downstream semantic performance issues if teams do not manage cardinality and model shape intentionally.
- From repetition to reuse: accelerate data prep with My queries in Dataflow Gen2
- Direct Lake on SQL with Fabric Data Warehouse
Azure Databricks: Disaster recovery planning and a workspace inventory you can query
Two Databricks posts landed on the operational end of machine learning platforms: keep the platform recoverable, and make it observable. A disaster recovery strategy write-up laid out a phased, customer-managed approach that forces the usual hard decisions into the open: what RTO/RPO targets are realistic, when active-active is worth the complexity, and when warm standby is the better trade. It also gets concrete about what must be replicated across regions, including Unity Catalog metadata and Delta data, and it frames DR as something you automate with infrastructure as code (IaC) and repeatable pipelines rather than a runbook you hope to never use. The inclusion of patterns like Delta Sharing and Deep Clone highlights a key detail: data and governance metadata need different replication tactics, and your DR plan needs to cover both. That parallels last week's Fabric guidance on planning secure streaming paths and maintainable lakehouse layers early: if you treat networking, governance metadata, and data layout as afterthoughts, they become the hardest things to retrofit when you need higher assurance (whether for private connectivity or for cross-region recovery).
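As one concrete replication tactic from that list, Deep Clone can be wrapped in a scheduled job; here is a sketch under simplifying assumptions (a Databricks notebook where spark is the session global, with illustrative catalog and table names). Unity Catalog metadata still needs its own replication path.

```python
# Assumes a Databricks notebook/job where `spark` is the session global.
TABLES = ["sales.orders", "sales.customers"]      # illustrative DR scope

for t in TABLES:
    # Deep Clone copies data and table metadata; re-running it syncs
    # incrementally, which makes it suitable for a scheduled DR job.
    spark.sql(f"""
        CREATE OR REPLACE TABLE dr_catalog.{t}
        DEEP CLONE prod_catalog.{t}
    """)
```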
The “single pane of glass” post tackled a different, but related, problem: once a Databricks workspace grows, it gets hard to answer basic questions about what exists, who owns it, what it costs, and how it is used. The proposed Discovery utility scans workspace assets (clusters, jobs, warehouses, Delta Live Tables, Unity Catalog objects, security configuration, billing, and utilization), writes the results into Unity Catalog Delta tables, and surfaces them through a Lakeview dashboard. That approach matters because it turns operational visibility into queryable data: you can audit configurations, track utilization patterns, and build internal controls without relying on screenshots or one-off admin scripts. In spirit, it is solving a similar problem to Fabric's new cross-workspace catalog search and centralized monitoring: at scale, governance and operations start with “can we reliably inventory what we run?”
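The same idea in miniature, using the Databricks SDK for Python: enumerate a few asset types and persist them as queryable rows. The target table name is illustrative and attribute coverage varies by asset type, so treat this as a starting skeleton rather than the post's utility.

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()   # picks up workspace/notebook auth automatically

inventory = (
    [("cluster", c.cluster_name or "", c.creator_user_name or "")
     for c in w.clusters.list()]
    + [("job", (j.settings.name if j.settings else "") or str(j.job_id),
        j.creator_user_name or "") for j in w.jobs.list()]
    + [("warehouse", wh.name or "", wh.creator_name or "")
       for wh in w.warehouses.list()]
)

# Persist as queryable rows (`spark` is the notebook global; the Unity Catalog
# table name below is illustrative).
spark.createDataFrame(inventory, "asset_type string, name string, owner string") \
     .write.mode("overwrite").saveAsTable("ops.discovery.workspace_assets")
```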
- Resilient by Design: Azure Databricks Disaster Recovery Strategy
- From Chaos to Clarity: Your Databricks Workspace on a Single Pane of Glass
Model behavior control: Context engineering, RAG, and when to fine-tune
A model-behavior primer this week pulled together the toolbox teams are actually using in production to make models respond the way they need. It starts with prompt-level controls (zero-shot, one-shot, few-shot examples, and system prompts) and then moves into context engineering, where you shape what the model sees and how it sees it, often by structuring inputs rather than just adding more text. Retrieval-augmented generation (RAG) and embeddings show up here as the practical bridge between static model behavior and dynamic, domain-specific answers, especially when you need responses grounded in internal documents.
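A compact illustration of the retrieval step helps make “better context, not bigger prompts” concrete: embed, rank by cosine similarity, then put only the winners into the prompt. The embed() below is a random-vector stand-in for a real embedding model.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: call your real embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=256)

docs = ["Refund policy: 30 days...", "Shipping: 3-5 business days...",
        "Warranty: 1 year on hardware..."]
doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(question: str, k: int = 2) -> list:
    q = embed(question)
    scores = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

context = "\n".join(retrieve("How long do refunds take?"))
prompt = f"Answer using only this context:\n{context}\n\nQ: How long do refunds take?"
# ...send `prompt` to the model; the grounding lives in the retrieved context.
print(prompt)
```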
When prompts and RAG are not enough, the guidance shifts to model adaptation: fine-tuning and LoRA (low-rank adaptation). The useful framing is that fine-tuning is not the first tool you reach for, but it can be the right one when you need consistent style, structured outputs, or domain behaviors that are hard to achieve through context alone. LoRA is called out as a lighter-weight alternative for adaptation, which is often relevant when you want targeted behavior changes without the cost and operational overhead of full fine-tunes. Read alongside last week's Fabric MLOps thread (MLflow tracking across workspaces and secure, repeatable promotion), the “behavior control” takeaway is that whichever lever you choose (RAG, fine-tune, LoRA), you still need the operational backbone to version inputs, track experiments, and promote changes safely, otherwise improvements in model behavior are hard to reproduce and govern.
DevOps
This week in DevOps, the theme was making change safer and easier to operate at scale: GitHub shipped several governance and platform updates aimed at smoother migrations and tighter repo controls, while developer tooling and AI workflows kept pushing more work “left” into PRs, editors, and automation where teams can catch issues earlier. That throughline matches last week's focus on reliability-and-guardrails, where small “plumbing” changes (TLS and token formats) and workflow controls showed how easy it is for automation to fail when platforms evolve underneath it. This week, the story moves up a layer from transport and credentials to the operational mechanics of migrating, governing, and administering large GitHub footprints without pausing delivery.
GitHub Enterprise migrations and repository governance
GitHub Enterprise teams got a clearer path for modernizing their footprint without long freezes. Enterprise Live Migrations (ELM) entered public preview as a migration option from GitHub Enterprise Server (GHES) to GitHub Enterprise Cloud that keeps repositories continuously synced and then supports a fast cutover, which matters if you have active development and cannot afford an extended read-only window. GitHub positioned ELM as complementary to GitHub Enterprise Importer rather than a replacement, so migration planning now includes choosing between (or sequencing) importer-style moves and live sync depending on repo size, activity, and cutover constraints; you will also want to confirm your GHES version supports ELM before you design the runbook. It also pairs naturally with last week's “audit the dependencies you do not think about” message: migration plans tend to surface hidden assumptions in automation (auth flows, endpoint allowlists, TLS stacks, token handling), so having a live-sync option can reduce the operational risk of long freezes while you validate those lower-level prerequisites.
On day-to-day governance, repository rulesets continued to get more practical for real operations. You can now add individual users as bypass actors (via the UI, REST API, and GraphQL), which helps when you need a controlled exception for an on-call engineer, a release manager, or a service account without weakening the ruleset for everyone else. GitHub also unblocked a common admin pain point by allowing branch renames even when a branch is protected by rulesets, as long as the new branch name still falls within all applicable ruleset scopes. That means you can do cleanup like standardizing main/trunk naming or aligning release branch conventions without temporarily dismantling protections. After last week's discussion of scaling governance via APIs (Azure DevOps policy inspection) and tightening quality controls in review workflows, these ruleset changes read like the GitHub-side continuation: more granular exceptions and fewer “turn protections off to do admin work” moments, which is exactly where accidental drift and one-off breakages tend to creep in.
Maintainers also got a broader set of platform improvements framed around reducing noise and improving control, including contribution limits, pull request controls (such as pull request archiving), and improvements to Issues and notifications, plus guidance and community programming for Maintainer Month. For teams maintaining critical internal or public repos, these changes tie back to the same goal as rulesets and migration tooling: keep collaboration open while staying predictable and manageable as activity scales. This also directly follows last week's maintainer-pressure thread around low-quality, AI-generated PRs and the need for practical controls that reduce reviewer load. The continued emphasis on maintainer tooling and workflow control suggests GitHub is treating “manage contribution volume without burning out maintainers” as an ongoing ops problem, not a one-off moderation feature.
- Enterprise Live Migrations is now in public preview
- Repository rulesets: User bypass and branch renaming
- Welcome to Maintainer Month: Celebrating the people behind the code
GitHub settings and API changes that affect automation
A small API change can still break a pipeline, and GitHub flagged one of those this week: the code_scanning_upload field will be removed from the /rate_limit REST API response on May 19, 2026. If you have internal tooling that parses /rate_limit to decide when to throttle code scanning uploads (or if you are just logging that field), you should update now so scripts do not fail when the response shape changes. It is the same category of risk as last week's platform notices (TLS behavior and longer token formats): nothing “breaks” in your YAML, but brittle assumptions around plumbing (response schemas, header sizes, token regexes, legacy clients) can still take down automation in ways that look random until you trace the dependency.
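A defensive pattern worth adopting before the removal date: read /rate_limit with .get() lookups so the disappearing field degrades gracefully instead of raising KeyError. A minimal Python example:

```python
import os
import requests

resp = requests.get(
    "https://api.github.com/rate_limit",
    headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
             "Accept": "application/vnd.github+json"},
    timeout=30,
)
resources = resp.json().get("resources", {})

core = resources.get("core", {})
print(f"core remaining: {core.get('remaining')}")

# Tolerate the schema change: this yields None after May 19, 2026
# instead of crashing the script.
cs_upload = resources.get("code_scanning_upload")
if cs_upload is not None:
    print(f"code scanning upload remaining: {cs_upload.get('remaining')}")
```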
On the collaboration side, GitHub added a user-level setting to enable or disable commit comments by default for repositories owned by a personal account, while still allowing repository-level overrides. Disabling commit comments does more than hide UI affordances: it prevents creating commit comments through both the REST and GraphQL APIs, while keeping existing comments readable. For DevOps automation, that detail matters because bots and integrations that currently write commit comments (for example, posting lint results or deployment annotations) may need to shift to checks, PR comments, or Issues instead when a repo owner turns this off. In a week where GitHub is also leaning into maintainer ergonomics and noise reduction, this setting fits the broader workflow trend from last week: move machine output into the right channels (checks, dashboards, structured status) so humans are not forced to hunt through high-volume comment streams to understand whether a change is safe.
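If a bot currently writes commit comments, a hedged fallback like the sketch below keeps annotations flowing when a repo owner turns the setting off (creation is then rejected via REST and GraphQL); the repo, SHA, and PR number are hypothetical.

```python
import os
import requests

H = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
     "Accept": "application/vnd.github+json"}
repo, sha, pr = "acme/service", "abc123", 42     # hypothetical values

# Try the commit comment first...
r = requests.post(f"https://api.github.com/repos/{repo}/commits/{sha}/comments",
                  headers=H, json={"body": "lint: 0 errors"}, timeout=30)
# ...and fall back to a PR comment if the repo has commit comments disabled.
if r.status_code >= 400:
    requests.post(f"https://api.github.com/repos/{repo}/issues/{pr}/comments",
                  headers=H, json={"body": "lint: 0 errors"}, timeout=30)
```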
- Deprecation notice: code_scanning_upload field will be removed from rate_limit API endpoint
- Disable commit comments on the user level
VS Code and TypeScript workflows for building and operating projects
Visual Studio Code's April 2026 updates continued the trend of pulling more operational and AI-assisted work into the editor. The release highlights introduced the Agents Window, which is designed to make agent-based workflows easier to manage in one place, and an evaluations extension aimed at chat customizations so teams can test and refine assistant behavior rather than treating prompts as one-off experiments. Copilot for CLI picked up more control surface too, including a “thinking effort” setting (useful when you want to trade speed for deeper reasoning on complex tasks) and remote control capabilities that better fit SSH-heavy DevOps workflows where your “real” environment is not local. That builds cleanly on last week's MCP-and-agent thread (Goose, MCPUI, and the push toward agents that act against real systems): the difference this week is less about the connector layer and more about day-to-day operator ergonomics, with editors and CLI tools adding the knobs you need to make agent behavior observable and repeatable.
The VS Code team also recapped the month's changes in a Live session, tying editor updates to practical demos and showing how GitHub Copilot features are landing in real workflows. In parallel, Daniel Rosenwasser and James Montemagno walked through building websites from scratch with TypeScript 7, focusing on how to configure VS Code to stay efficient as the project grows. The key DevOps angle here is the tooling surface: Language Server Protocol (LSP) enhancements and TypeScript 7 integration details translate into faster feedback loops in-editor, which reduces the number of trivial failures that otherwise bounce into CI. That fits the same reliability goal as last week's Git 2.54 “quiet improvements” story: better defaults and better local feedback reduce the need for heavyweight process workarounds later in the pipeline.
- VS Code Release Highlights - April 2026
- VS Code Live: April Releases Recap
- Let it Cook - TypeScript 7 Websites from Scratch
Terraform validation and “shift left” reliability for Azure Functions
A concrete example of shifting reliability left showed up in a case study on Azure Functions deployments driven by Terraform. The core idea was validation-driven Terraform: move common deployment failures out of terraform apply and into pull request checks and plan-time verification by combining PR checks, Azure pre-flight checks, and Terraform-native mechanisms like input validation and preconditions. In practice, this makes deployments more predictable because developers get actionable failures while the change is still under review, not when a release pipeline is already running and time pressure is higher. It is a direct continuation of last week's “make governance enforceable in the PR/release flow” theme (cost gates, drift gates, scalable policy inspection): treat infrastructure rules and operational constraints as something you can test early with automation, not something you rediscover late during a rollout.
AI-assisted documentation that fits DevOps decision-making
AI support was not just for code this week; it also showed up as a way to keep operational decisions documented without turning documentation into a bottleneck. A guide on Architecture Decision Records (ADRs) argued for using tools like GitHub Copilot and ChatGPT to draft ADRs quickly and consistently, so teams can treat documentation as a review-and-correct step rather than a blank-page writing task. It called out common ADR use cases that map directly to DevOps work (infrastructure, database choices, security controls) while warning about the real pitfall: if you accept AI output uncritically, you can end up with polished text that does not match your actual constraints, so the workflow needs explicit review and ownership. This complements last week's recommendation to keep architecture artifacts in the same GitHub PR flow as code: AI can reduce the effort to produce ADR drafts, but the governance value still comes from putting those decisions under the same review, rulesets, and audit trail as the changes they describe.
Azure
Azure updates this week leaned toward two big themes: modernizing long-lived infrastructure without breaking production, and putting clearer guardrails around the way teams build and run AI-powered systems on the platform. Between new VM families, new migration paths into zones and scale sets, cost model changes on reservations, and a growing set of tools for governed integration and agent operations, the message was consistent: keep what works, but make it easier to evolve. That connects directly to last week's throughline of controlled transitions and safer-by-default operations, where Azure kept removing implicit behavior and replacing it with explicit, automatable paths platform teams can standardize.
Azure Virtual Machines: new Intel Xeon 6 families and safer paths to higher availability
Azure Compute had a busy week that combined new capacity with pragmatic migration tooling. Microsoft announced general availability of the Dlsv7, Dsv7, and Esv7 VM series based on Intel Xeon 6 processors, positioning them for general purpose and memory-optimized workloads that need higher scale options alongside improved networking and storage. The announcement calls out Azure Boost as part of the platform improvements and highlights support for modern disk options such as Premium SSD v2 and Ultra Disk, which matters if you are balancing predictable IOPS/throughput with cost. This also lines up with last week's FinOps and performance tuning angle (Cosmos DB cost/perf and Blob smart tier GA): Azure keeps pairing “new capability” with “here is how to run it efficiently.”
On the resiliency side, two public previews focus on reducing the friction of moving off older deployment patterns. The first preview enables migration from Availability Sets to Virtual Machine Scale Sets (Flexible). Instead of requiring a rebuild, you can perform controlled, per-VM moves using the Azure portal or automation paths (CLI, PowerShell, REST), which is useful when you need to shift incrementally while keeping risk contained. That “incremental, reversible modernization” framing is the same operational posture as last week's guidance across networking and identity: make change explicit, testable, and standardizable rather than a big-bang cutover.
The second preview targets one of the most common blockers to adopting Availability Zones: existing regional (non-zonal) VMs. Azure is previewing an in-place migration that moves regional VMs and VMSS Flexible instances into availability zones while preserving resource IDs, names, disks, NICs, and IPs. The workflow is intentionally explicit (deallocate, update, start) so you can plan maintenance windows and validate dependencies, and the post documents preview limitations so you can decide where it is safe to test first. If you are doing this in environments that are simultaneously tightening network posture (like last week's “private subnets by default” change), the practical lesson is to treat zone migration as part of a broader dependency review (egress, DNS, and private endpoint reachability) rather than a pure compute move.
- Announcing General Availability of Azure Dl/D/Esv7-series VMs based on Intel Xeon 6 processors
- Public Preview: Migrate Availability Sets to Virtual Machine Scale Sets
- Public Preview: Migrate your regional virtual machines to availability zones
Azure cost planning: Reserved VM Instances retirement guidance and what to do before July 2026
Alongside feature work, Azure signaled a meaningful purchasing change: Azure Reserved Virtual Machine Instances will no longer be available for new purchase or renewal for select VM series starting July 1, 2026. The transition guide focuses on the practical steps teams need now, including how to identify which existing reservations are affected and how to choose a path based on workload reality. Coming right after last week's run of “small notices become real work” items (like protocol retirement timelines and SDK lifecycle changes), this is another heads-up where the calendar matters as much as the technology.
For some teams, the best move will be switching to the Azure savings plan for compute to retain discount coverage with more flexibility. For others, the guide frames modernization as the moment to reassess the underlying VM series or architecture rather than renewing, while noting that renewal before the cutoff may still be an option depending on your situation. The key takeaway is operational: treat this as a planning item for finance and engineering together, because reservation strategy, VM family choices, and modernization timelines will need to line up well before mid-2026. If you are already planning moves like Availability Set → VMSS Flexible or regional → zonal migrations, this retirement effectively becomes another input into sequencing (when to migrate, when to re-commit spend, and what “steady state” you want to reserve against).
Azure Container Registry: ACR-to-ACR pull-through caching for registry hierarchies
Azure Container Registry expanded Artifact Cache with a capability that fits how many orgs structure environments: you can now use another ACR as an upstream source. That effectively enables ACR-to-ACR pull-through caching, which is useful for image promotion patterns (for example, pulling from a central registry into a regional or environment-specific registry) and for building registry hierarchies where downstream registries can cache what they need without constantly reaching across network boundaries.
The walkthrough goes deep on the real setup details that typically cause friction: supported combinations of networking and authentication, using a user-assigned managed identity, and configuring the right RBAC roles (and where ABAC considerations may apply). It also calls out Private Link scenarios, which is often the deciding factor for enterprises that keep registries off the public internet. That continues last week's secure-by-default identity and networking direction: fewer reusable credentials, more managed identities, and more private connectivity as the baseline. In practice, if you're adopting last week's private subnet defaults (and making egress explicit via NAT Gateway where needed), registry hierarchy and caching can reduce cross-network pulls and help keep “what needs internet access” tightly scoped.
Azure integration: modernize to Logic Apps Standard with a migration agent and new Oracle in-process connectivity
Logic Apps Standard got two updates that complement each other: one aimed at modernization workflows, and another aimed at expanding what those workflows can do once they arrive.
For teams moving from BizTalk Server or other integration platforms, Microsoft introduced the open-source Logic Apps Migration Agent. The workflow is intentionally stage-gated and AI-assisted, but with human review checkpoints so teams can validate mappings and behavior before committing changes. The tooling integrates with VS Code and GitHub Copilot, which matters because modernization projects usually stall on developer ergonomics (how quickly you can iterate, review, and correct large sets of converted artifacts). This lands cleanly after last week's integration change-management note (Service Bus SBMP retirement for BizTalk 2020 customers): instead of treating protocol and adapter changes as one-off fixes, Azure is pushing a more repeatable “move to the supported platform path, then keep iterating” approach.
On the connectivity front, Logic Apps Standard added a public preview Oracle Database built-in connector that runs in-process in single-tenant workflows. The key operational detail is that it removes the need for a gateway when you already have network connectivity, which can simplify deployments in environments using VNET integration or Hybrid Logic Apps patterns. The announcement lays out supported actions, configuration options, current limitations, and troubleshooting guidance so you can decide whether the connector is ready for a given workload. It is also consistent with last week's private-first posture: when teams already invest in private connectivity and managed identity patterns, “in-process + your network” connectors tend to fit better than architectures that depend on extra gateway infrastructure and long-lived shared secrets.
- Bringing all your Integration workloads to Logic Apps Standard
- Announcing the public preview of Oracle Database built-in connector for Azure Logic Apps Standard
Azure AI governance and operations: landing zones for agents and MCP-based access to Azure Resource Manager
Azure Architecture and Azure management updates both tackled the same problem from different angles: teams are building more AI agents, but they need consistent control, policy, and safe operational access. This is a natural continuation of last week's Azure AI architecture coverage, which emphasized that production AI is mostly “platform plumbing” (identity, networking, evaluation, monitoring) wrapped around models. The difference this week is that the spotlight shifts from app reference designs (like the drone inspection pipeline) to governance patterns for agents and agent-driven operations.
A new reference architecture addresses “agent sprawl” with a multi-region AI agent landing zone on Azure. The design layers Azure API Management AI Gateway, Azure AI Foundry Control Plane, and Microsoft Agent 365 so you can centralize oversight across policy, safety controls, and evaluation while still allowing teams to ship. It also brings in identity and operational structure through components like Microsoft Entra Agent ID and uses Azure DevOps pipelines for provisioning, which makes the architecture feel closer to an adoptable platform blueprint than a conceptual diagram. If your platform team has been standardizing private endpoints, managed identities, and policy-as-code the way last week's roundup suggested, this landing zone approach is the AI-agent equivalent of those same paved-path ideas.
In parallel, Azure introduced a public preview Azure Resource Manager MCP Server (Model Context Protocol server). It is positioned as a remote MCP server that gives AI agents tool-based access to ARM operations, including translating natural language into Azure Resource Graph queries and supporting ARM template deployments directly from VS Code. For developers experimenting with agent-driven ops, the practical value is having a defined tool boundary for what an agent can do, plus the ability to align that access with governance mechanisms such as Azure Policy. Read together with last week's messaging about removing implicit behavior and locking down credential sprawl, this is another “make the interface explicit” move: agent actions become tool calls you can scope, audit, and constrain instead of ad-hoc scripts running with broad permissions.
- Governing Agent Sprawl: A Multi-Region AI Agent Landing Zone on Azure (Reference Architecture)
- Introducing the Azure Resource Manager MCP Server!
Other Azure News
Azure High Performance Computing shared a deep dive into how Azure keeps large, synchronous AI training jobs running despite routine network faults, describing the Fairwater AI supercomputer's use of Multipath Reliable Connection (MRC), a two-tier multi-plane topology, and static SRv6 source routing. The post also points to broader ecosystem work through the OCP MRC specification and related open-source libraries and plugins. Coming after last week's focus on private networking reliability (and the idea that networking and DNS are Tier-0 dependencies), this is the same story at a different scale: AI workloads are increasingly network-bound, so Azure is investing in explicit reliability mechanisms in the fabric rather than assuming the network will behave.
.NET
This week in .NET, Microsoft kept pushing the platform in two directions at once: modernizing long-running enterprise workloads (including mainframe connectivity) while tightening the inner loop for web and client apps with better testing, faster WebAssembly, and more pragmatic API design patterns. That maps cleanly to last week's split between “ship-ready platform updates” and “what's taking shape next.” .NET 10 keeps showing up as the standardization point (from Ubuntu 26.04 baselines and container tags to real products migrating runtimes), while .NET 11 previews continue to fill in practical workflow gaps (Blazor UX and testing) before they harden into defaults.
.NET 10 adoption in real products (Host Integration Server and WebAssembly)
Following last week's focus on .NET 10 as a new baseline on Ubuntu 26.04 (plus the matching “resolute” container tags for cleaner Linux standardization), Host Integration Server 2028 preview shows what “adoption” looks like when a long-lived enterprise product moves its core onto .NET 10. Microsoft is modernizing IBM connectivity by pairing that runtime move with new integration surfaces and platform cleanups. The headline for many teams will be expanded REST API surfaces for DB2 and for transaction integration scenarios (CICS and IMS), which shifts common host connectivity patterns closer to the HTTP-first tooling and operational model teams already use for modern services. On the ops and security side, the preview adds Microsoft Entra ID support for identity and Azure Arc support for hybrid management, aiming to make host-connected deployments fit better into current governance and inventory practices rather than living as a separate island. Microsoft is also using the release to deprecate legacy components and remove older dependencies, and it calls out Linux support for non-SNA features as part of the .NET 10-based direction (a practical signal if you are planning mixed OS deployments but still rely on specific protocol stacks). There is even an Azure AI Foundry integration callout, framing host data and transactions as candidates for AI-assisted experiences and workflows, not just back-office plumbing.
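As a rough illustration of that HTTP-first shift, the calling pattern looks like an ordinary bearer-token REST call rather than gateway or SNA plumbing. Everything below is hypothetical (the host name, route, and payload shape are mine, not HIS 2028's documented API); only the pattern is the point.

```python
import requests

def query_db2(token: str, sql: str) -> dict:
    """Run a host-side DB2 query over a REST surface (hypothetical shape).

    The Entra ID bearer token is assumed to have been acquired already,
    which is the identity model the HIS 2028 preview points toward.
    """
    resp = requests.post(
        "https://his.contoso.example/api/db2/query",  # hypothetical endpoint
        headers={"Authorization": f"Bearer {token}"},
        json={"statement": sql},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```

The operational win is that this call is observable, governable, and retryable with the same tooling you already use for any other HTTP service.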
In a very different corner of the stack, Copilot Studio moved its .NET WebAssembly runtime from .NET 8 to .NET 10 and documented what that migration buys in practice: tighter defaults and measurable perf wins when you lean into ahead-of-time (AOT) compilation. This also connects back to last week's Linux baseline story, where Native AOT came up as part of making deployments leaner and more predictable. The post highlights built-in asset fingerprinting (useful for long-lived caching without serving stale bits) and a new default behavior around WasmStripILAfterAOT, which reduces what ships to the client after AOT. Taken together, these updates show .NET 10 maturing as a runtime you can standardize on for both hybrid enterprise integration servers and browser-hosted apps, with improvements landing not just in APIs but in the build and deployment output that affects cold start, download size, and caching behavior. A concept sketch of content fingerprinting follows the links below.
- Announcing Microsoft Host Integration Server 2028: Modern connectivity for IBM Mainframes and Midranges
- Copilot Studio gets faster with .NET 10 on WebAssembly
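For readers new to asset fingerprinting, here is a language-agnostic concept sketch in Python (not Blazor's implementation): embedding a content hash in the filename means every content change produces a new URL, so browsers can cache aggressively without ever serving stale bits.

```python
import hashlib
from pathlib import Path

def fingerprint(asset: Path) -> Path:
    """Rename app.js to app.<hash>.js based on its content."""
    digest = hashlib.sha256(asset.read_bytes()).hexdigest()[:10]
    fingerprinted = asset.with_name(f"{asset.stem}.{digest}{asset.suffix}")
    return asset.rename(fingerprinted)

# Any content change yields a new name, so "cache forever" headers are safe:
# app.3f29ab01cd.js today becomes app.9e00c412aa.js after the next build.
```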
ASP.NET Core and Blazor: testing and versioning getting more concrete
Blazor’s tooling story took a step toward more realistic UI validation with a first look at a new end-to-end (E2E) component testing library, previewed in .NET 11. Coming right after last week's .NET 11 direction-setting around Blazor validation and “move common plumbing into the framework,” this is another push toward making core app workflows (forms, lists, and now testing) less dependent on custom harnesses and one-off test infrastructure. In the Community Standup, Daniel Roth and Javier Calvarro Nelson walked through the motivation and demoed the approach against real apps, positioning it as a way to test components in scenarios closer to how users actually interact with Blazor UIs (instead of stopping at unit tests or framework-specific abstractions). The key point for teams planning upgrades is timing: it is framed as a .NET 11 preview with explicit next steps driven by community feedback, so this is a good moment to watch the API shape and align internal testing patterns before it hardens.
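Since the new library's API is still settling, here is an analogous check written with Playwright for Python rather than the Blazor library itself; the URL, button label, and assertion text are hypothetical. The point is the style of test the .NET 11 preview is aiming to make first-class: drive the rendered UI the way a user would, then assert on what the user actually sees.

```python
from playwright.sync_api import expect, sync_playwright

def test_counter_increments() -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("http://localhost:5000/counter")  # hypothetical Blazor app
        page.get_by_role("button", name="Click me").click()
        # Assert on rendered output, not component internals.
        expect(page.get_by_text("Current count: 1")).to_be_visible()
        browser.close()
```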
On the API side, a practical .NET 10 minimal API tutorial showed how to add versioning with Asp.Versioning.Http, including both query-string versioning and URL-segment versioning. This pairs naturally with last week's standup on aligning API versioning with OpenAPI output in .NET 10: once you start versioning minimal APIs, the next pain point is keeping docs and client generation aligned without duplicating configuration. Beyond the mechanics, it covers how to deprecate an API version, which matters when you are trying to keep minimal APIs minimal without painting yourself into a compatibility corner. The walkthrough uses a Visual Studio 2026 .http file for testing, reinforcing a workflow where versioning behavior is easy to validate in-repo without needing a heavier client setup. The sketch after the links below shows what the two versioning styles (and deprecation reporting) look like from the client side.
- Blazor Community Standup: E2E Component Testing for Blazor
- Add versioning to .NET 10 minimal API using Asp.Versioning.Http
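A quick client's-eye view of the two styles (routes and port are hypothetical; the response headers are the ones the Asp.Versioning packages emit when version reporting is enabled):

```python
import requests

BASE = "http://localhost:5000"  # hypothetical minimal API host

# Query-string versioning: the version travels as a parameter.
r1 = requests.get(f"{BASE}/weather", params={"api-version": "1.0"})

# URL-segment versioning: the version is baked into the route.
r2 = requests.get(f"{BASE}/v2/weather")

# With version reporting enabled, deprecation is advertised in headers,
# giving clients a migration signal before an old version disappears.
print(r1.headers.get("api-supported-versions"))   # e.g. "1.0, 2.0"
print(r1.headers.get("api-deprecated-versions"))  # e.g. "1.0"
```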
.NET MAUI: cross-language UI experiments and on-device processing demos
The latest .NET MAUI Community Standup leaned into experimentation across the UI and native boundary. In context, it feels like the “client app inner loop” thread from last week (where MauiDevFlow focused on faster build-deploy-inspect cycles and better inspection hooks) expanding into “what else can we wire into MAUI to unlock new app patterns.” David Ortinau and Gerald Versluis, joined by Nick Kovalsky, demoed scenarios that mix Rust with .NET MAUI, explored SkiaSharp in broader “everywhere” usage, and discussed a drawn-UI approach where custom rendering can offer tighter control over visuals and performance than standard widget composition. The session also touched on AI/ML live-processing work, pointing to app patterns where you process streams (like camera, audio, or sensor input) continuously rather than treating ML as a batch step. For MAUI teams, the takeaway is less about a single shipped feature and more about where the ecosystem is spending energy: interoperability, custom rendering pipelines, and practical ML workloads inside real client apps.
Other .NET News
Microsoft launched “Quest to Compile”, a new show focused on modern game development in the .NET ecosystem, covering core topics (gameplay programming, debugging) alongside day-to-day workflows like Git and AI-assisted development with GitHub Copilot.
Mark Russinovich explained why Win32 is still treated as a first-class API surface in 2026, tying it directly to Windows' deep Win32 foundations and the practical reality that long-term compatibility supports a large existing app ecosystem.
Amanda Silver revisited the origin of TypeScript, grounding it in the need to make large JavaScript codebases more scalable and maintainable through stronger structure and developer tooling support.
Security
Security updates this week landed in three places developers feel immediately: identity (with more passkey momentum and new token-theft campaign details), software supply chain (with tighter code-to-cloud visibility and new scanning options that work in agent-driven workflows), and infrastructure hardening (from open-sourcing HSM components to active Linux exploitation and stronger data platform controls). Coming right after last week's theme of shrinking ambient privilege and interrupting intrusion chains with automation, this week's items largely zoom in on the same question from different angles: once an attacker gets a foothold (or once risky code ships), how quickly can you detect it, bound it, and prove what happened.
Microsoft Defender threat research: token theft, macOS infostealers, and active Linux exploitation
Microsoft security researchers mapped out multiple active campaigns that target the gaps between “user is authenticated” and “attacker can operate as the user”, with a heavy focus on stealing tokens or escalating privileges after initial access. That builds directly on last week's token-centric identity framing and the Defender XDR incident write-ups: the attacker goal stays the same (operate as a real user, move laterally, exfiltrate), but the tradecraft varies depending on what is easiest to reuse (session artifacts, interactive access, or local privilege escalation).
One investigation broke down a large-scale “code of conduct” themed phishing operation that uses an adversary-in-the-middle (AiTM) flow to capture authentication tokens, which can bypass MFA by replaying tokens and session cookies rather than brute-forcing passwords. If last week showed how hands-on access via “remote help” tools can bypass the phishing-vs-MFA debate entirely, this week is the more classic “steal the session and skip the password” story, with the same operational implication: you need identity telemetry and fast response paths for session abuse, not just better password policy. The write-up pairs the attack chain with practical response material, including Defender detections, Microsoft Defender for Office 365 guidance, Microsoft Entra ID Protection recommendations, Microsoft Defender XDR coverage, and Advanced Hunting queries plus IOCs so security teams can validate whether the campaign reached their tenants.
On endpoints, Microsoft detailed updated ClickFix-style social engineering on macOS where the “payload” starts with the user copying and pasting attacker-provided Terminal commands. The report outlines multiple campaign variants, how persistence is established (including LaunchAgents and LaunchDaemons), how command-and-control infrastructure is discovered, and how infostealers may progress into wallet trojanization. This is the same “attackers win when normal workflows get abused” theme that ran through last week's Quick Assist intrusion chain, just shifted to macOS and developer-style muscle memory (Terminal). For defenders, the value is in the concrete hunting and detection guidance (including Microsoft Defender for Endpoint KQL queries) and the extensive IOC set to speed up triage. A sketch of running that kind of hunt through the Advanced Hunting API follows the links below.
The most urgent infrastructure note was an “active attack” advisory for the “Dirty Frag” Linux local privilege escalation technique, which expands risk after a system is already compromised by giving attackers a way to jump to higher privileges. That complements last week's emphasis on cutting off the middle of the chain (lateral movement and credential abuse) by calling out another mid-chain accelerant: privilege escalation that turns a limited foothold into broader control. Microsoft's coverage calls out affected components (including esp4/esp6 and rxrpc) and provides interim mitigation steps, along with Microsoft Defender detection coverage so teams can both reduce exposure and monitor for exploitation attempts in the wild.
- Breaking the code: Multi-stage ‘code of conduct’ phishing campaign leads to AiTM token compromise
- ClickFix campaign uses fake macOS utilities lures to deliver infostealers
- Active attack: Dirty Frag Linux vulnerability expands post-compromise risk
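If you want to operationalize published hunting guidance like this, the Microsoft 365 Defender Advanced Hunting API accepts KQL over REST. The query below is an illustrative simplification of a ClickFix-style hunt, not Microsoft's published detection, and token acquisition is assumed to have happened already.

```python
import requests

HUNT_URL = "https://api.security.microsoft.com/api/advancedhunting/run"

# Illustrative only: look for shell one-liners that fetch and pipe scripts,
# a pattern common to "paste this into Terminal" lures.
QUERY = """
DeviceProcessEvents
| where Timestamp > ago(7d)
| where ProcessCommandLine has_any ("curl -s", "base64 -d", "| bash", "| sh")
| project Timestamp, DeviceName, AccountName, ProcessCommandLine
| take 100
"""

def run_hunt(token: str) -> list[dict]:
    """POST a KQL query and return the result rows."""
    resp = requests.post(
        HUNT_URL,
        headers={"Authorization": f"Bearer {token}"},
        json={"Query": QUERY},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json().get("Results", [])
```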
GitHub + Defender for Cloud: bringing runtime context into code security, and scanning that works with AI agents
GitHub's security surface continued to shift toward “developer-first, but deployment-aware” workflows, starting with the general availability of code-to-cloud risk visibility via Microsoft Defender for Cloud integration with GitHub Advanced Security. This continues last week's supply chain focus (know what you built, limit blast radius when dependencies go bad) by extending the visibility story past CI and into production reality. The core change is correlation: teams can connect what shipped (deployed container artifacts) with what was known during development, then see runtime risk context directly inside GitHub security views. Practically, that means security and platform teams can triage findings with more signal (what is actually running, where, and with what risk context) instead of treating code findings as isolated from production. The GA update also adds runtime-aware filters and campaign targeting across code scanning and Dependabot, which helps teams focus remediation efforts on what is deployed rather than what is merely present in a repo.
In parallel, GitHub expanded what its MCP Server (Model Context Protocol) can do for security in agent-driven development. This picks up the thread from last week on governing agent tool execution (MCP control planes and per-call policy enforcement) by showing how security checks are moving into the same agent tool boundary where code is increasingly being proposed and edited. Secret scanning via GitHub MCP Server is now generally available, enabling MCP-compatible IDEs and AI coding agents to detect exposed secrets before commits or pull requests. A key detail for teams already using GitHub Advanced Security is consistency: the GA release honors existing push protection customization, so detection rules and bypass behavior remain aligned across standard GitHub workflows and MCP-driven tooling. Alongside that GA, dependency scanning in GitHub MCP Server entered public preview, letting AI coding agents and MCP-compatible IDEs check proposed changes for vulnerable dependencies using the Dependabot toolset and the GitHub Advisory Database before the change becomes a commit or PR. Taken together, these updates push scanning earlier in the loop (inside the editor and agent workflow) while keeping enterprise policies coherent across the “human PR” and “agent-assisted change” paths. A minimal illustration of push-protection-style detection follows the links below.
- Code-to-cloud risk visibility with Microsoft Defender for Cloud is now generally available
- Secret scanning with GitHub MCP Server is now generally available
- Dependency scanning with GitHub MCP Server is in public preview
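To illustrate what “detect before the commit exists” means in practice, here is a minimal stand-in, not the GitHub MCP flow: a push-protection-style check that fails when staged text matches a known token shape. Real secret scanning layers on hundreds of provider-specific patterns plus validity checks and bypass policies; the two patterns below are simplified examples.

```python
import re
import sys

# Simplified token shapes: a GitHub classic PAT and an AWS access key ID.
PATTERNS = {
    "github_pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
}

def scan(text: str) -> list[str]:
    """Return the names of any patterns found in the given text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]

if __name__ == "__main__":
    # Usage: git diff --cached | python scan_secrets.py
    hits = scan(sys.stdin.read())
    if hits:
        sys.exit(f"possible secrets detected: {', '.join(hits)}")
```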
Microsoft identity and passwordless: passkeys progress and recovery changes
Microsoft used World Passkey Day to summarize incremental but meaningful changes across Microsoft Entra ID, Windows, and consumer sign-in as passwordless adoption expands from “sign-in” into “sign-in plus recovery.” This is a clean continuation of last week's identity-first theme (reducing what attackers can steal and reuse) because recovery paths and helpdesk flows are where “passwordless” programs often get undermined in practice. The update highlights general availability improvements to Entra ID account recovery, which targets exactly that weak point. Microsoft also reiterated a notable cleanup item: it plans to remove security questions as a password reset option starting January 2027, reducing reliance on low-signal knowledge-based answers that are frequently guessable, reused, or obtainable through social engineering. For teams rolling out FIDO2/passkeys, the practical takeaway is to treat recovery and helpdesk flows as part of the rollout plan, not as an afterthought.
Hardware and data platform controls: open HSM components and OneLake security GA
On the “trust the platform” side, Azure announced it is open-sourcing Azure Integrated HSM through the Open Compute Project, including firmware and supporting software plus independent validation artifacts. Paired with last week's emphasis on reducing exfil paths (for example, Fabric outbound access protection) and tightening identity boundaries, this is the lower-layer counterpart: if keys anchor your identity, encryption, and signing systems, then assurance in the HSM implementation becomes part of the overall “can we trust the control plane under pressure” story. The goal is verifiable key protection at scale for server-integrated hardware security modules (HSMs) that complement Azure Key Vault and Azure Managed HSM. The post frames this as a transparency and assurance move: by publishing artifacts and aligning with OCP SAFE, Azure enables deeper third-party scrutiny of how keys are protected by hardware-enforced controls, including the kind of assurance customers look for in regulated environments (the post calls out FIPS 140-3 Level 3). For organizations building stronger cryptographic trust chains, this is a reminder that key management is not only about API usage, but about attestation, validation evidence, and the ability to verify the underlying system design.
In Microsoft Fabric, OneLake security reached general availability with default enablement and an automatic upgrade rollout running through May. This follows last week's Fabric security arc (better controls at the boundary and clearer enforcement points for data movement) by tightening governance inside the lake itself: who can see which rows and columns, and how quickly teams can validate and automate those permissions. The GA focuses on making governance usable at scale: UI improvements, inline row-level security (RLS) validation, a role creation wizard that supports RLS and column-level security (CLS) authoring, and more granular REST APIs for role management. For teams using OneLake mirroring or consolidating data access patterns in Fabric, the practical impact is faster iteration on least-privilege role design (via the wizard and validation) and better automation hooks (via the new APIs) to keep permissions consistent across environments. A sketch of reading those roles over REST follows the links below.
- Enforcing trust and transparency: Open-sourcing the Azure Integrated HSM
- OneLake security (Generally Available)
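For the automation hooks, a sketch of listing an item's data access roles over REST. The endpoint shape here is an assumption based on Fabric REST conventions for OneLake data access security, the IDs are placeholders, and a Fabric-scoped Entra ID bearer token is assumed to be in hand.

```python
import requests

FABRIC_BASE = "https://api.fabric.microsoft.com/v1"

def list_data_access_roles(token: str, workspace_id: str, item_id: str) -> list[dict]:
    """Fetch OneLake data access roles for one item (assumed endpoint shape)."""
    url = f"{FABRIC_BASE}/workspaces/{workspace_id}/items/{item_id}/dataAccessRoles"
    resp = requests.get(
        url,
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("value", [])

# A CI job could diff these roles across environments to catch drift in
# RLS/CLS assignments before it becomes an access incident.
```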
Other Security News
Inspektor Gadget published the results of its first independent security audit, patching three vulnerabilities (including CVE-2026-24905 and CVE-2026-25996) and documenting hardening recommendations. Coming right after last week's blend of “supply chain plus operational guardrails”, it is a useful reminder that observability and inspection tooling needs the same scrutiny as the workloads it monitors, especially when it hooks deeply into Linux and Kubernetes through eBPF. For teams running eBPF-based inspection on Kubernetes and Linux hosts, the report doubles as a validation point and a practical checklist for tightening RBAC and deployment posture.