Domain 1 β€” Module 5 of 7 71%
5 of 27 overall
Domain 1: Core Data Concepts Free ⏱ ~10 min read

Analytical Workloads: Finding the Insights

Transactional systems record what happens. Analytical systems answer why it happened and what to do next. Let's explore the world of OLAP, data warehouses, and reporting.

What is an analytical workload?

Simple explanation

Analytical workloads answer big-picture questions using lots of data.

Imagine Priya at FreshMart. The point-of-sale system records every transaction (that’s transactional). But Priya’s job is different β€” she asks questions like β€œWhich stores had the highest sales last quarter?” or β€œAre organic products growing faster than conventional?”

To answer those questions, she doesn’t need individual receipts. She needs summaries, trends, and comparisons across millions of records. That’s an analytical workload.

How analytical workloads differ from transactional

FeatureTransactional (OLTP)Analytical (OLAP)
Question type”What just happened?""What happened over time?”
Data volume per querySingle records or small batchesMillions to billions of records
OperationsINSERT, UPDATE, DELETESELECT with aggregations (SUM, AVG, COUNT)
Data freshnessReal-time (current state)Near-real-time to periodic (often loaded in batches)
UsersOperational staff (cashiers, dispatchers)Analysts, managers, executives
Schema designNormalised (reduce duplication)Denormalised (optimise for reading)
Azure servicesAzure SQL DatabaseMicrosoft Fabric, Azure Databricks, Power BI

The analytical pipeline

Data doesn’t start in an analytical system β€” it flows there from transactional sources through a pipeline:

  1. Source systems β€” Transactional databases, CRM, ERP, IoT devices, logs
  2. Ingestion β€” Extract data from sources (ETL or ELT)
  3. Storage β€” Land data in a data warehouse or data lake
  4. Modelling β€” Organise data into dimensions and measures (star schema)
  5. Visualisation β€” Build dashboards and reports (Power BI)

Priya’s pipeline at FreshMart:

  1. Sales data flows from 50 store POS systems β†’ 2. Azure Data Factory extracts and transforms it nightly β†’ 3. Lands in a Fabric data warehouse β†’ 4. Priya builds a star schema (stores, products, dates, sales amounts) β†’ 5. Power BI dashboards show trends to regional managers
ETL vs ELT β€” what's the difference?

Both are ways to move data from source to analytical storage:

  • ETL (Extract, Transform, Load): Extract data, transform it (clean, reshape, aggregate) in a separate engine, then load the result into the warehouse. Traditional approach.
  • ELT (Extract, Load, Transform): Extract data, load it raw into the data lake or warehouse, then transform it inside the storage system using its compute power. Modern approach β€” especially with Fabric and Databricks.

Key difference: In ELT, the raw data is preserved in the lake, so you can re-transform it later if business requirements change. In ETL, only the transformed result is stored.

Key concepts in analytical workloads

Data warehouses

A data warehouse is a centralised repository of structured, historical data optimised for analytical queries. Data is cleaned, transformed, and organised before loading.

Data lakes

A data lake stores raw data in its original format β€” structured, semi-structured, and unstructured. It’s a landing zone for everything. Analytics tools query the lake directly.

Lakehouses

A lakehouse combines the best of both β€” the raw storage of a data lake with the query performance and structure of a data warehouse. Microsoft Fabric uses this model.

Exam tip: OLAP scenario recognition

Look for these signals in exam questions:

  • β€œManagement wants a report on last quarter’s performance” β†’ Analytical
  • β€œThe dashboard shows sales trends by region” β†’ Analytical
  • β€œData is aggregated from multiple source systems” β†’ Analytical
  • β€œA star schema organises data for reporting” β†’ Analytical

If the scenario mentions β€œrecording a sale,” β€œplacing an order,” or β€œupdating a record” β€” that’s transactional.

Flashcards

Question

What does OLAP stand for?

Click or press Enter to reveal answer

Answer

Online Analytical Processing. OLAP systems process large volumes of historical data to find patterns, trends, and insights β€” answering questions like 'what happened' and 'why.'

Click to flip back

Question

What's the difference between ETL and ELT?

Click or press Enter to reveal answer

Answer

ETL transforms data before loading it into storage. ELT loads raw data first, then transforms it inside the storage system. ELT preserves raw data and leverages modern compute power.

Click to flip back

Question

What is a lakehouse?

Click or press Enter to reveal answer

Answer

A lakehouse combines the raw storage flexibility of a data lake with the structured query performance of a data warehouse. Microsoft Fabric uses this approach with OneLake.

Click to flip back

Question

What is denormalised data, and why is it used in analytics?

Click or press Enter to reveal answer

Answer

Denormalised data combines related information into fewer tables (with some duplication) to reduce the number of joins needed. This speeds up read-heavy analytical queries at the cost of more storage.

Click to flip back

Knowledge check

Knowledge Check

Priya needs to create a monthly dashboard showing total sales by store, product category, and region for the past 12 months. What type of workload is this?

Knowledge Check

A company loads raw sales data into a data lake first, then uses Fabric to clean, transform, and model it for reporting. What is this approach called?

Next up: Data Roles: DBA, Engineer & Analyst β€” meet the people who build and manage these data systems.