Weekly Machine Learning Roundup: Fabric ETL, AI Data Prep, Experiments

This week’s machine learning updates center on deeper integration with the Microsoft Fabric ecosystem: faster, more affordable data transformations, enhanced multitasking, and streamlined collaboration. Broader connectivity between cloud and analytics platforms and stronger experimentation tools point toward flexible, interactive enterprise workflows.

Dataflow Gen2 in Microsoft Fabric: Performance, Integration, and Developer Experience

Dataflow Gen2 in Microsoft Fabric gains a new pricing model intended to help organizations manage ETL costs for jobs of all sizes. The Modern Query Evaluation Service speeds up parallel queries, which shortens runtimes and lowers cost, and it builds on last week’s troubleshooting features such as the Spark monitoring APIs. Real-time analytics and previews allow faster iteration on transformation logic. Output destinations now include Fabric Lakehouse, Azure Data Lake Storage Gen2, SharePoint (CSV), and Snowflake (preview), alongside OneLake Catalog management, in line with the trend toward multi-environment integration. Copilot now supports natural-language transformation and ingestion, which ties into the collaborative machine learning themes. Dedicated tools support migration from Gen1. Improvements to permission management, schema controls, and hybrid architecture continue the previous focus on operational governance.

AI-Powered Data Transformation and Developer Tools

Fabric Data Wrangler now supports fast AI-driven text summarization, translation, and sentiment analysis through PROSE suggestions and live previews. Copilot prompts generate custom transformation code and feedback, reducing the manual coding needed for complex datasets. Conversion between pandas and PySpark helps projects scale to larger datasets, and documentation and guides support adoption of these new workflows.
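
To make the idea of generated transformation code concrete, here is a minimal sketch of the kind of pandas cleaning function such a tool might produce. It is not actual Data Wrangler or Copilot output: the function name clean_data, the column name review, and the sample rows are all hypothetical and chosen only for illustration.

```python
import pandas as pd


def clean_data(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical text-cleaning steps on an assumed 'review' column."""
    df = df.copy()
    # Normalize free text: trim surrounding whitespace and lowercase it
    df["review"] = df["review"].str.strip().str.lower()
    # Keep only rows whose review is present and non-empty after trimming
    df = df[df["review"].notna() & (df["review"] != "")]
    # Drop exact duplicate reviews, keeping the first occurrence
    df = df.drop_duplicates(subset="review").reset_index(drop=True)
    return df


# Illustrative sample data (made up for this sketch)
sample = pd.DataFrame(
    {"review": ["  Great product ", "great product", None, "   ", "Too slow"]}
)
cleaned = clean_data(sample)
print(cleaned)
```

Each step here is a column-level operation, which is the sort of logic that has a direct counterpart in PySpark (for example, trim and lower in pyspark.sql.functions), so a pandas prototype like this can in principle be translated for larger datasets.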

Multitasking and Workflow Improvements in Microsoft Fabric

Fabric’s updated horizontal tabs let users keep multiple items open at once, and workspace color coding and numbering help prevent errors and reduce context switching. The Object Explorer and higher limits on concurrently open items serve users who need heavier multitasking, building on recent improvements to async processing and VS Code extension integration. These features are specific to Fabric.

Experimentation Analytics with Statsig in Microsoft Fabric

Statsig Experimentation Analytics in Fabric provides tools for running and analyzing A/B/n tests on OneLake data, using frequentist statistics and near real-time metrics via Statsig’s Explorer. Because results are available almost immediately, teams can iterate quickly, and Power BI integration supports visual review of experiments. Structured workflows help teams validate ML models, continuing last week’s focus on practical MLOps processes.
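
For readers unfamiliar with the frequentist approach mentioned above, the following is a minimal sketch of one common frequentist test for a two-variant experiment: a pooled two-proportion z-test. It is a generic textbook calculation, not Statsig’s implementation, and the function name and conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = conv_a / n_a  # conversion rate of control
    p_b = conv_b / n_b  # conversion rate of treatment
    # Pooled rate under the null hypothesis that both variants convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 200 of 2000 control users convert, 250 of 2000 treated
z, p = two_proportion_z_test(200, 2000, 250, 2000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value suggests the observed difference would be unlikely if the variants truly performed the same. An A/B/n test with more than two variants would need additional care, such as a correction for multiple comparisons, which this sketch does not cover.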