Domain 3: Monitor and Optimize an Analytics Solution

Troubleshoot Pipelines & Dataflows

Identify and resolve common pipeline and Dataflow Gen2 errors: connection failures, timeout issues, activity errors, and data refresh problems.

Troubleshooting mindset

Simple explanation

Think of a detective investigating a crime scene.

You don't guess; you follow the evidence. In Fabric, the evidence is in the error message, the activity run details, and the Monitoring Hub logs. The pattern is always: What failed? → What was it trying to do? → What does the error say? → What changed recently?

Common pipeline errors

Start with the error message: it usually points directly to the cause.

| Error pattern | Typical cause | Resolution |
|---|---|---|
| Connection timeout | Source database is unreachable, firewall blocking, VNet misconfiguration | Check network connectivity, verify firewall rules allow Fabric IP ranges, test the connection in pipeline settings |
| Authentication failure | Expired credentials, rotated keys, service principal permission removed | Update the credentials on the connection the pipeline uses (Fabric pipelines use connections rather than linked services) |
| Copy activity: data type mismatch | Source column type doesn't match destination schema | Map column types explicitly in the Copy activity mapping tab, or transform with a Dataflow or notebook first |
| Activity timeout | Activity exceeded its configured timeout (default varies by type) | Increase the timeout setting, or optimize the activity (reduce data volume, improve the query) |
| Concurrent run conflict | Pipeline is already running and the concurrency limit is reached | Wait for the current run, increase max concurrent runs, or investigate why runs overlap |
| Parameter error | Missing required parameter, wrong data type, expression syntax error | Check parameter definitions and expressions; use the expression builder to validate |

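As a rough illustration of reading the error message first, the sketch below maps message text to the error patterns listed above. The keyword lists and the `classify_error` helper are hypothetical and chosen only for illustration; they are not real Fabric error codes or an official API.

```python
# Hypothetical keyword rules based on the error-pattern table.
# The keywords are illustrative guesses, not real Fabric error codes.
RULES = [
    ("Connection timeout", ("unable to connect", "connection timed out", "login timeout")),
    ("Authentication failure", ("login failed", "unauthorized", "invalid credentials")),
    ("Data type mismatch", ("cannot convert", "type mismatch", "conversion failed")),
    ("Activity timeout", ("activity timed out", "exceeded timeout")),
    ("Concurrent run conflict", ("concurrency limit", "already running")),
    ("Parameter error", ("parameter", "expression")),
]


def classify_error(message: str) -> str:
    """Return the first error pattern whose keywords appear in the message."""
    text = message.lower()
    for pattern, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return pattern
    return "Unclassified"


if __name__ == "__main__":
    print(classify_error("Login failed for user 'etl_svc'."))
    print(classify_error("Missing required parameter 'RunDate'."))
```

A classifier like this only narrows down where to look. The resolution still depends on checking what the activity was trying to do and what changed recently.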
Scenario: Carlos's Monday morning pipeline failure

Carlos arrives Monday to find the overnight pipeline failed. He investigates:

  1. Monitoring Hub: Pipeline "Daily-Production-Load" failed at 1:23 AM
  2. Activity detail: Copy activity "Copy-SAP-Production" failed
  3. Error message: "Unable to connect to server 'sap-db.contoso.com'. Login timeout expired."
  4. Recent changes: The infrastructure team rotated database credentials on Friday

Fix: Update the credentials on the connection the pipeline uses (Fabric pipelines use connections rather than linked services), then re-run the pipeline.
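The "what changed recently?" step in this scenario can be sketched as a small filter over a change log. Everything here is hypothetical: the `changes_before_failure` helper, the log entries, and the dates are invented for illustration and do not come from any Fabric API.

```python
from datetime import datetime, timedelta


def changes_before_failure(failed_at, change_log, window_days=7):
    """Return changes made within window_days before the failure, newest first."""
    window_start = failed_at - timedelta(days=window_days)
    recent = [c for c in change_log if window_start <= c["when"] <= failed_at]
    return sorted(recent, key=lambda c: c["when"], reverse=True)


# Hypothetical data: dates and entries are invented for illustration only.
failed_at = datetime(2024, 6, 10, 1, 23)  # a Monday, 1:23 AM
change_log = [
    {"when": datetime(2024, 6, 7, 16, 0), "what": "Rotated database credentials"},
    {"when": datetime(2024, 5, 20, 9, 0), "what": "Added index on orders table"},
    {"when": datetime(2024, 6, 10, 8, 0), "what": "Updated dashboard theme"},
]

# Look back three days so that Friday's changes are included.
suspects = changes_before_failure(failed_at, change_log, window_days=3)

if __name__ == "__main__":
    for change in suspects:
        print(change["when"], change["what"])
```

Changes made after the failure time are excluded, since they cannot have caused it.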

Common Dataflow Gen2 errors

| Error pattern | Typical cause | Resolution |
|---|---|---|
| Refresh failed: data source error | Source connector authentication expired or source unavailable | Re-authenticate the connector in Dataflow settings |
| Evaluation timeout | Query takes too long (complex M query on large dataset) | Enable staging lakehouse for query folding, simplify transformations |
| Mashup engine memory limit | Dataset too large for in-memory processing | Enable staging lakehouse, filter data earlier, or switch to a notebook |
| Schema mismatch on destination | Dataflow output columns don't match destination table | Update column mapping or evolve destination schema |
| Gateway error | On-premises data gateway is offline or misconfigured | Check gateway status, restart if needed, verify data source settings |

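For the schema mismatch case, it can help to see exactly which columns differ before updating the mapping. The sketch below compares two column-to-type dictionaries; the `diff_schemas` helper, the column names, and the type labels are hypothetical and are not read from a real Dataflow or lakehouse.

```python
def diff_schemas(output_cols: dict, destination_cols: dict) -> dict:
    """Compare column name -> type mappings and report the differences."""
    shared = set(output_cols) & set(destination_cols)
    return {
        # Columns the destination expects but the output does not produce.
        "missing_in_output": sorted(set(destination_cols) - set(output_cols)),
        # Columns the output produces that the destination does not have.
        "extra_in_output": sorted(set(output_cols) - set(destination_cols)),
        # Shared columns whose types disagree: (output type, destination type).
        "type_conflicts": {
            name: (output_cols[name], destination_cols[name])
            for name in sorted(shared)
            if output_cols[name] != destination_cols[name]
        },
    }


# Hypothetical schemas for illustration only.
dataflow_output = {"OrderId": "int", "Amount": "text", "Region": "text"}
destination_table = {"OrderId": "int", "Amount": "decimal", "OrderDate": "date"}

report = diff_schemas(dataflow_output, destination_table)

if __name__ == "__main__":
    print(report)
```

Missing or conflicting columns usually point to a column mapping fix, while extra output columns suggest the destination schema needs to evolve.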
Exam tip: Staging lakehouse solves many Dataflow issues

Many Dataflow Gen2 performance and memory errors trace back to the same root cause: the Power Query mashup engine is trying to process too much data in memory.

Fix pattern: Enable a staging lakehouse. This stages intermediate data as Delta tables in OneLake, enabling the enhanced compute engine to process large datasets efficiently instead of holding everything in the Power Query mashup engine's memory. If the exam describes a Dataflow Gen2 with memory or timeout errors on large datasets, look for "staging lakehouse" in the answers.


Question

What is the first thing to check when a pipeline fails?

Answer

The error message in the activity run details (Monitoring Hub). It usually identifies the exact cause: connection timeout, auth failure, data type mismatch, etc. Then check what changed recently.

Question

A Dataflow Gen2 hits a memory limit on a large dataset. What's the recommended fix?

Answer

Enable a staging lakehouse. This offloads intermediate data to Delta tables and enables query folding, so the mashup engine doesn't need to hold the entire dataset in memory.


Knowledge Check

A Copy activity in Carlos's pipeline fails with 'Unable to connect: login timeout expired.' The source is an Azure SQL Database. What should Carlos check first?

Knowledge Check

A Dataflow Gen2 processing 50 million rows from Salesforce keeps timing out. A staging lakehouse is NOT configured. What is the most likely fix?

Next up: Troubleshoot Notebooks & SQL (resolve Spark job failures, T-SQL errors, and memory issues).