Domain 2 β€” Module 10 of 10 100%
18 of 26 overall
Domain 2: Ingest and Transform Data Free ⏱ ~14 min read

Real-Time Intelligence: KQL & Windowing

Query streaming data with KQL in Eventhouses. Choose between native tables and OneLake shortcuts, use query acceleration, and build windowing functions for time-series analytics.

Real-Time Intelligence in Fabric

Simple explanation

Think of a stock market trading floor.

Traders need information in seconds, not hours. They need to see price changes as they happen, calculate moving averages in real time, and spot trends the moment they emerge.

Real-Time Intelligence (RTI) in Fabric is that trading floor for your data. It uses Eventhouses (databases optimised for streaming data) and KQL (a query language built for speed on time-series data). You can query billions of events in seconds, create real-time dashboards, and build windowing functions that aggregate data into time buckets.

Native tables vs OneLake shortcuts

Native tables for real-time speed; shortcuts for reusing lakehouse data
FeatureNative TablesOneLake Shortcuts
Data locationStored in the Eventhouse (columnar format)Stored in OneLake lakehouse (Delta format)
IngestionDirect via Eventstreams, queued ingestion, or streaming ingestionNo ingestion β€” references existing lakehouse data
Query performanceFastest β€” optimised columnar engine with native indexingSlower β€” reads Delta files from OneLake at query time
Data freshnessReal-time (seconds)Depends on lakehouse refresh (batch or streaming)
Storage costEventhouse storage + OneLake copyNo extra storage (uses existing lakehouse data)
Best forHigh-throughput streaming data needing sub-second queriesQuerying existing lakehouse data with KQL without moving it
Exam tip: When to choose each

Use native tables when:

  • Data is streaming in real-time (Eventstreams, IoT Hub)
  • You need sub-second query latency
  • Data doesn’t already exist in a lakehouse

Use OneLake shortcuts when:

  • Data already exists in a lakehouse as Delta tables
  • You want to query it with KQL without duplicating storage
  • Real-time latency isn’t critical (seconds to minutes is acceptable)

Query acceleration for shortcuts

Standard OneLake shortcuts read Delta files directly from the lakehouse at query time β€” which can be slow for large datasets. Query acceleration adds a caching and indexing layer on top of shortcuts.

Standard for simplicity; query-accelerated for performance
FeatureStandard ShortcutQuery-Accelerated Shortcut
How it reads dataReads Delta files from OneLake on every queryCaches and indexes data locally in the Eventhouse
Query speedModerate β€” depends on Delta file size and structureFast β€” near-native table performance after caching
Data freshnessAlways current (reads source directly)Slight delay as cache refreshes
Extra storageNoneYes β€” cache storage in the Eventhouse
Best forSmall/medium datasets, infrequent queriesLarge datasets queried frequently with KQL

KQL windowing functions

Windowing functions aggregate data into time buckets β€” essential for time-series analytics.

Tumbling windows (non-overlapping)

// Orders per 5-minute window
OrderEvents
| where Timestamp > ago(1h)
| summarize
    OrderCount = count(),
    TotalRevenue = sum(Amount)
    by bin(Timestamp, 5m)            // 5-minute buckets, no overlap
| order by Timestamp asc

What’s happening: bin(Timestamp, 5m) rounds each event’s timestamp to its 5-minute bucket. Events at 10:02 and 10:04 both go into the 10:00-10:05 bucket.

Sliding windows (overlapping)

// 1-hour sliding average, calculated every 15 minutes
OrderEvents
| where Timestamp > ago(6h)
| summarize AvgRevenue = avg(Amount) by bin(Timestamp, 15m)
| extend RollingAvg = avg_if(AvgRevenue, Timestamp between (Timestamp - 1h .. Timestamp))

Session windows

// Group user events into sessions (gap of 30 minutes = new session)
ClickEvents
| where Timestamp > ago(24h)
| sort by UserId, Timestamp asc
| extend SessionGap = datetime_diff('minute', Timestamp, prev(Timestamp, 1))
| extend NewSession = iff(SessionGap > 30 or isnull(SessionGap), 1, 0)
| extend SessionId = row_cumsum(NewSession)
| summarize
    SessionStart = min(Timestamp),
    SessionEnd = max(Timestamp),
    EventCount = count(),
    SessionDuration = datetime_diff('minute', max(Timestamp), min(Timestamp))
    by UserId, SessionId
Scenario: Zoe's real-time dashboard

Zoe builds a real-time content performance dashboard for WaveMedia:

// Top 10 videos in the last 30 minutes, with 5-minute trend
PlaybackEvents
| where Timestamp > ago(30m)
| summarize ViewCount = count() by VideoTitle, bin(Timestamp, 5m)
| top-nested 10 of VideoTitle by TotalViews = sum(ViewCount),
  top-nested of bin(Timestamp, 5m) by Views = sum(ViewCount)
| render timechart

Content editors see which videos are trending RIGHT NOW and whether viewership is rising or falling within each 5-minute window.

Materialized views

For queries that run repeatedly (dashboards), you can pre-compute results with materialized views:

// Pre-aggregate hourly metrics (runs automatically on new data)
.create materialized-view HourlyMetrics on table OrderEvents
{
    OrderEvents
    | summarize
        TotalOrders = count(),
        TotalRevenue = sum(Amount),
        AvgAmount = avg(Amount)
        by bin(Timestamp, 1h), Region
}

Materialized views update incrementally as new data arrives β€” queries against the view are instant because results are pre-computed.


Question

What is the difference between native tables and OneLake shortcuts in RTI?

Click or press Enter to reveal answer

Answer

Native tables: data stored in the Eventhouse's columnar engine β€” fastest query performance, real-time ingestion. OneLake shortcuts: virtual references to lakehouse Delta tables β€” no extra storage but slower queries. Use native for streaming; shortcuts for existing lakehouse data.

Click to flip back

Question

What does query acceleration add to OneLake shortcuts?

Click or press Enter to reveal answer

Answer

A caching and indexing layer that stores shortcut data locally in the Eventhouse engine. This gives near-native query performance on lakehouse data, at the cost of extra storage and a slight freshness delay.

Click to flip back

Question

What does bin(Timestamp, 5m) do in KQL?

Click or press Enter to reveal answer

Answer

Rounds each timestamp to the nearest 5-minute bucket, creating tumbling (non-overlapping) windows. Events at 10:02 and 10:04 both map to the 10:00 bucket. Essential for time-series aggregations.

Click to flip back

Question

What is a KQL materialized view?

Click or press Enter to reveal answer

Answer

A pre-computed aggregation that updates incrementally as new data arrives. Queries against the view are instant because results are already calculated. Use for dashboard queries that run repeatedly on the same aggregation pattern.

Click to flip back


Knowledge Check

Zoe's Eventhouse receives 2 million playback events per minute via Eventstreams. She needs sub-second query latency for her real-time dashboard. Should she use native tables or OneLake shortcuts?

Knowledge Check

An Eventhouse already has an OneLake shortcut to a large lakehouse Delta table (500M rows). KQL queries against the shortcut are taking 30+ seconds. The data doesn't need to be real-time β€” it refreshes hourly. What should the engineer do to improve query performance?

Next up: Monitoring & Alerts: Catch Problems Early β€” use the Fabric Monitoring Hub to track ingestion, transformation, and refresh performance.