Domain 4 β€” Module 3 of 4 75%
20 of 28 overall
Domain 4: Optimize Query and Operation Performance Free ⏱ ~12 min read

Integrated Cache and Dedicated Gateway

Implement the Cosmos DB integrated cache with a dedicated gateway cluster to reduce RU consumption for read-heavy workloads β€” including staleness configuration and when to use it.

What is the integrated cache?

Simple explanation

Think of it as a fast shelf next to the warehouse loading dock. Instead of walking to the back of the warehouse every time someone asks for the same product, you keep popular items on the shelf near the door. Much faster, and the warehouse workers aren’t interrupted.

The integrated cache stores frequently-read items in memory at the gateway so repeated reads don’t consume RU/s from the backend.

Amara’s read-heavy dashboard

πŸ“‘ Amara at SensorFlow built a dashboard that shows the latest readings for 10,000 devices. The dashboard auto-refreshes every 30 seconds. That’s 10,000 point reads every 30 seconds β€” about 333 reads/second, costing ~333 RU/s. Most readings haven’t changed since the last refresh.

With the integrated cache, repeat reads of unchanged data are served from cache at zero RU cost.

How it works

Application
    β”‚
    β–Ό
Dedicated Gateway (with integrated cache)
    β”‚
    β”œβ”€β”€ Cache HIT  β†’ return from memory (0 RU)
    β”‚
    └── Cache MISS β†’ fetch from backend (normal RU cost)
                     β†’ store in cache for next request

Setting up the dedicated gateway

Step 1: Provision a dedicated gateway cluster:

az cosmosdb service create \
  --resource-group rg-sensorflow \
  --account-name sensorflow-cosmos \
  --name "SqlDedicatedGateway" \
  --kind "SqlDedicatedGateway" \
  --count 3 \
  --size "Cosmos.D4s"

Step 2: Connect via gateway mode (required for cache):

CosmosClientOptions options = new CosmosClientOptions
{
    // Use the dedicated gateway endpoint, not the standard endpoint
    ConnectionMode = ConnectionMode.Gateway
};

// The dedicated gateway endpoint follows this pattern:
// https://<account-name>.sqlx.cosmos.azure.com
CosmosClient client = new CosmosClient(
    "https://sensorflow-cosmos.sqlx.cosmos.azure.com",
    key,
    options
);
Exam tip: gateway mode is mandatory

The integrated cache only works with ConnectionMode.Gateway via the dedicated gateway endpoint (.sqlx.cosmos.azure.com). If you use ConnectionMode.Direct (the default and recommended for production), the cache is bypassed entirely.

This is a key exam trap: direct mode has lower latency but skips the cache. Gateway mode adds a network hop but enables caching. Choose based on your read pattern.

MaxIntegratedCacheStaleness

Control how stale cached data can be:

// Cache results for up to 2 minutes
FeedIterator<SensorReading> iterator = container.GetItemQueryIterator<SensorReading>(
    query,
    requestOptions: new QueryRequestOptions
    {
        DedicatedGatewayRequestOptions = new DedicatedGatewayRequestOptions
        {
            MaxIntegratedCacheStaleness = TimeSpan.FromMinutes(2)
        }
    }
);
Staleness ValueBehaviour
TimeSpan.ZeroAlways fetch fresh data (bypass cache)
Small (seconds)Near-real-time with minimal caching benefit
Large (minutes)Higher cache hit rate, more stale data
Not setUses the server default

When to use the integrated cache

ScenarioIntegrated Cache?Why
Read-heavy dashboard with repeated queriesβœ… YesSame data read repeatedly β€” high cache hit rate
IoT ingestion (write-heavy)❌ NoWrites don't benefit from read cache
Unique queries (low repetition)❌ NoLow cache hit rate β€” most reads are cache misses
Session-scoped point readsβœ… YesSame session data read multiple times per session
Serverless accounts❌ NoDedicated gateway not supported on serverless
Applications using Direct mode❌ NoCache requires Gateway mode via dedicated gateway

Limitations

  • Serverless: Not supported on serverless accounts
  • Multi-region writes: Cache is per-gateway, region-local
  • Write-through: Writes always go to the backend β€” cache is read-only
  • Size: Cache size depends on the dedicated gateway SKU
  • Cost: Dedicated gateway cluster has its own pricing (separate from RU/s)

🎬 Video walkthrough

Flashcards

Question

What connection mode is required to use the integrated cache?

Click or press Enter to reveal answer

Answer

Gateway mode via the dedicated gateway endpoint (.sqlx.cosmos.azure.com). Direct mode (the default) bypasses the cache entirely. This is a common exam trap.

Click to flip back

Question

Does a cache hit consume provisioned RU/s?

Click or press Enter to reveal answer

Answer

No β€” cache hits are served from in-memory at the dedicated gateway and consume zero RU/s. Only cache misses incur normal RU charges when fetching from the backend.

Click to flip back

Question

What does MaxIntegratedCacheStaleness control?

Click or press Enter to reveal answer

Answer

How long cached data remains valid before a fresh read from the backend is required. A longer staleness window gives higher cache hit rates but older data. TimeSpan.Zero forces fresh reads every time.

Click to flip back

Knowledge Check

Knowledge Check

Amara's dashboard reads the same 10,000 device readings every 30 seconds. She enables the integrated cache with 2-minute staleness. What happens to RU consumption?

Knowledge Check

A developer enables the integrated cache but sees no RU savings. Their connection string uses the standard endpoint. What's wrong?


Next up: Change Feed Patterns β€” materialized views, denormalization propagation, and the change feed estimator for lag monitoring.