Microsoft Fabric Blog showcases how JobInsight empowers developers and data engineers to analyze Spark applications in Microsoft Fabric, offering metrics extraction, log diagnostics, and advanced troubleshooting directly within Fabric Notebooks.

Gain Deeper Insights into Spark Jobs with JobInsight in Microsoft Fabric

JobInsight is a powerful Java-based diagnostic library that streamlines the analysis of completed Spark jobs within Microsoft Fabric. Designed for both developers and data engineers, JobInsight provides direct programmatic access to Spark execution metrics and logs straight from a Fabric Notebook environment.

Key Capabilities

Interactive Spark Job Analysis

  • Access in-depth Spark execution data, including queries, jobs, stages, tasks, and executors.
  • Structured APIs return these objects as Spark Datasets for investigation, visualization, and custom analysis.

Spark Event Log Access

  • Easily copy Spark event logs to a OneLake or ADLS Gen2 directory.
  • Ideal for long-term storage, custom diagnostics, or offline inspection.

How to Use JobInsight

Analyzing Completed Spark Applications:

import com.microsoft.jobinsight.diagnostic.SparkDiagnostic

val jobInsight = SparkDiagnostic.analyze(
  workspaceId,
  artifactId,
  livyId,
  jobType,         // e.g., "sessions" or "batches"
  stateStorePath,  // Output path to store analysis results
  attemptId        // Optional; defaults to 1
)

val queries = jobInsight.queries
val jobs = jobInsight.jobs
val stages = jobInsight.stages
val tasks = jobInsight.tasks
val executors = jobInsight.executors

This enables direct, code-driven performance analysis, anomaly detection, and debugging inside Fabric.
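As an illustration of the kind of code-driven analysis this enables, the sketch below flags a skewed stage by comparing the longest task duration to the median. In a notebook the durations would come from the Dataset returned by `jobInsight.tasks`; the exact column names are library-specific, so the logic here runs on plain Scala values to stay self-contained:

```scala
// Minimal sketch: detect task-duration skew as max / median ratio.
// In practice the durations would be collected from jobInsight.tasks;
// plain values are used here so the logic stands alone.
def skewRatio(durationsMs: Seq[Long]): Double = {
  require(durationsMs.nonEmpty, "need at least one task duration")
  val sorted = durationsMs.sorted
  val median = sorted(sorted.size / 2).toDouble
  sorted.last / median
}

val stageDurations = Seq(1200L, 1350L, 1280L, 9800L) // one straggler task
val ratio = skewRatio(stageDurations)
val isSkewed = ratio > 3.0 // the threshold is a tuning choice, not a library default
println(f"skew ratio = $ratio%.2f, skewed = $isSkewed")
```

The same comparison can be expressed directly on the tasks Dataset with Spark aggregations once the relevant duration column is identified.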

Reloading Previous Analyses:

val jobInsight = SparkDiagnostic.loadJobInsight(stateStorePath)

val queries = jobInsight.queries
val jobs = jobInsight.jobs

This feature makes iterative and historical investigations efficient.

Saving Metrics and Logs to a Lakehouse:

val df = jobInsight.queries

df.write
  .format("delta")
  .mode("overwrite")
  .saveAsTable("sparkdiagnostic_lh.Queries")

Repeat for additional DataFrames (jobs, stages, etc.) as required.
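The repeat-for-each-Dataset step can be wrapped in a small loop. Below is a minimal sketch of that pattern, with the Delta write action abstracted as a function so the looping logic stands alone; in a notebook the action would be the `df.write.format("delta").mode("overwrite").saveAsTable(...)` call shown above, and the table names follow the `sparkdiagnostic_lh.<Name>` convention from that example:

```scala
// Sketch: persist every diagnostic Dataset under one Lakehouse schema.
// `write` stands in for the df.write...saveAsTable call; it is passed in
// so this pattern is independent of the Spark session.
def saveAll[T](datasets: Map[String, T])(write: (T, String) => Unit): Seq[String] =
  datasets.toSeq.map { case (name, ds) =>
    val table = s"sparkdiagnostic_lh.$name"
    write(ds, table)
    table
  }

// In a notebook the map would hold the real Datasets, e.g.
// Map("Queries" -> jobInsight.queries, "Jobs" -> jobInsight.jobs, ...)
val written = saveAll(Map("Queries" -> "queriesDs", "Jobs" -> "jobsDs")) {
  (_, table) => println(s"writing $table")
}
```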

Copying Event Logs for In-Depth Analysis

Move raw Spark event logs to OneLake or ADLS Gen2 for long-term retention or advanced offline debugging:

import com.microsoft.jobinsight.diagnostic.LogUtils

val contentLength = LogUtils.copyEventLog(
  workspaceId,
  artifactId,
  livyId,
  jobType,
  targetDirectory,
  asyncMode = true, // Use async mode for better performance
  attemptId = 1
)

Example Usage:

val lakehouseBaseDir = "abfss://<workspace>@<onelake>/Files/eventlog/0513"
val jobType = "sessions"

LogUtils.copyEventLog(workspaceId, artifactId, livyId, jobType, s"$lakehouseBaseDir/$jobType/async", asyncMode = true, attemptId = 1)
LogUtils.copyEventLog(workspaceId, artifactId, livyId, jobType, s"$lakehouseBaseDir/$jobType/sync", asyncMode = false, attemptId = 1)

Why Use JobInsight?

  • Visualize Spark execution breakdowns by job, stage, or executor
  • Monitor and tune resource utilization
  • Identify and resolve Spark performance bottlenecks
  • Save, reuse, and automate diagnostics workflows
  • Seamless integration with Delta Lake and Lakehouse paradigms

For more details and documentation, refer to the JobInsight diagnostics library (Preview).
