Weekly Machine Learning Roundup: Fabric, Agents, Biomed ML

This week's ML updates focused on more productive workflows and better data quality for large-scale enterprise deployments. Microsoft shipped improvements to ML engineering, pipeline automation, and operational tooling across Spark and agent frameworks.

Microsoft Fabric Ecosystem: Engineering, Data Quality, and Automation

This week's Microsoft Fabric improvements target automation and lower barriers in ML workflows. A step-by-step guide demonstrates how to automate data quality checks at every layer of a Medallion Architecture, using Great Expectations to build reusable, testable pipelines, and explains how to integrate the results with incident response and analytics workflows. The new Forecasting Service enables nearly instant Spark notebook startup, building on the recent focus on serverless infrastructure and cost efficiency. Other articles this week explain dynamic scheduling and predictive scaling using Azure Cosmos DB and Data Explorer. Variable Library is now available for Fabric Notebooks, offering centralized management of secrets and configuration and supporting automation and migration across environments. An update to Fabric Real-Time Intelligence changes Anomaly Detector billing from instance-based to query-based, helping teams monitor usage and control costs more effectively.
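As a rough illustration of the per-layer data quality idea described above, here is a minimal sketch in plain Python. It deliberately does not use the Great Expectations API; the layer names follow the usual bronze/silver/gold Medallion convention, and every rule, column name, and helper (`Expectation`, `validate_layer`, `failed_expectations`) is a hypothetical stand-in rather than anything taken from the guide.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, object]


@dataclass
class Expectation:
    """A named, reusable row-level rule (hypothetical, not a Great Expectations class)."""
    name: str
    check: Callable[[Row], bool]  # returns True when the row passes


# Assumed example rules: checks get stricter as data moves bronze -> silver -> gold.
LAYER_EXPECTATIONS: dict[str, list[Expectation]] = {
    "bronze": [
        Expectation("order_id is present", lambda r: r.get("order_id") is not None),
    ],
    "silver": [
        Expectation(
            "amount is non-negative",
            lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
        ),
    ],
    "gold": [
        Expectation(
            "currency is a 3-letter code",
            lambda r: isinstance(r.get("currency"), str) and len(r["currency"]) == 3,
        ),
    ],
}


def validate_layer(layer: str, rows: list[Row]) -> dict[str, int]:
    """Count failing rows per expectation for one Medallion layer."""
    return {
        exp.name: sum(1 for row in rows if not exp.check(row))
        for exp in LAYER_EXPECTATIONS[layer]
    }


def failed_expectations(report: dict[str, int]) -> list[str]:
    """Names of expectations with at least one failing row, e.g. to raise an incident."""
    return [name for name, count in report.items() if count > 0]
```

A pipeline step could call `validate_layer("silver", rows)` after each load and pass the result of `failed_expectations` to whatever alerting or analytics sink the team already uses. Keeping the rules in one table per layer is what makes them reusable and testable on their own.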

Reinforcement Learning in AI Agents: Agent Lightning Open Source Release

Microsoft Research Asia has open-sourced Agent Lightning, a reinforcement learning (RL) framework that decouples RL training from agent execution. The platform enables workflow optimization for existing LLM agent frameworks, supports hierarchical RL for complex tasks, and allows new RL algorithms to be plugged in flexibly. Agent Lightning streamlines logging, supports both GPU and CPU, and is reported to improve accuracy in scenarios including text-to-SQL, RAG, and multi-agent QA. Continuous learning is planned, and development is underway on better prompt optimization and easier RL integration in live AI applications.
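To make the idea of decoupling training from agent execution more concrete, here is a minimal sketch under stated assumptions. It is not Agent Lightning's API: `TraceStore`, `run_agent`, and `Trainer` are hypothetical names, and the learning rule is a simple running-mean reward estimate per action (a bandit-style stand-in), not a full RL algorithm. The point is only that the agent side records traces and never learns, while the trainer side learns and never runs the agent.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class Transition:
    """One recorded step: which action the agent took and the reward it observed."""
    action: str
    reward: float


class TraceStore:
    """Hypothetical buffer that sits between agent execution and training."""

    def __init__(self) -> None:
        self._queue: deque[Transition] = deque()

    def put(self, transition: Transition) -> None:
        self._queue.append(transition)

    def drain(self) -> list[Transition]:
        """Return all pending transitions and empty the buffer."""
        items = list(self._queue)
        self._queue.clear()
        return items


def run_agent(action: str, reward_fn: Callable[[str], float], store: TraceStore) -> None:
    """Execution side: act, observe a reward, record the trace. No learning happens here."""
    store.put(Transition(action, reward_fn(action)))


class Trainer:
    """Training side: consumes traces and keeps a running mean reward per action."""

    def __init__(self) -> None:
        self.counts: dict[str, int] = {}
        self.values: dict[str, float] = {}

    def update(self, store: TraceStore) -> None:
        for t in store.drain():
            n = self.counts.get(t.action, 0) + 1
            old = self.values.get(t.action, 0.0)
            self.counts[t.action] = n
            self.values[t.action] = old + (t.reward - old) / n  # incremental mean

    def best_action(self) -> str:
        """Action with the highest estimated reward so far."""
        return max(self.values, key=self.values.get)
```

Because the two sides share only the trace buffer, the learning rule inside `Trainer.update` could be swapped without touching the agent code, which is the property the decoupled design is after.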

AI for Biomedical Workflows: GigaTIME Spatial Proteomics Platform

GigaTIME, a spatial proteomics platform, lets scientists apply machine learning to digital slides to measure protein distributions at scale, removing the need for expensive assays. It supports broad analysis and rapid hypothesis generation in cancer research, a practical application of ML to biomedical challenges.