Cosmos DB Change Feed: Real-Time Reactions

What the change feed is

Simple explanation

The change feed is a permanent, ordered log of every insert and update inside a Cosmos container. Whenever a document is created or modified, that event lands on the change feed in the partition’s order. You can read the log, react to events, and remember where you stopped — like reading a journal.

For AI apps, this enables patterns the platform makes easy: “every time a new article is inserted, generate an embedding and update the doc”; “every time an order document changes, recompute personalisation tags”; “every audit event triggers a downstream alert.”

You read the change feed in two ways: a processor (push model) that delivers events to your callback continuously, or a pull model where your code asks for the next batch. The exam expects you to know the processor pattern.

The processor pattern

# Python — change feed processor for Cosmos DB
from azure.cosmos.aio import CosmosClient

async def handle_changes(context, changes):
    for doc in changes:
        if doc.get("kind") == "article" and not doc.get("embedding"):
            # New article without embedding — go generate one
            await embed_and_update(doc)

async def main():
    async with CosmosClient(URL, credential=CRED) as client:
        db = client.get_database_client("knowledge")
        feed_container = db.get_container_client("articles")
        lease_container = db.get_container_client("articles-leases")

        processor = await feed_container.get_change_feed_processor(
            processor_name="embedder",
            lease_container=lease_container,
            on_changes_delegate=handle_changes,
        )
        await processor.start()
        try:
            await asyncio.Event().wait()  # run forever
        finally:
            await processor.stop()

Three roles in this code:

Role	Container	Purpose
Monitored container	`articles`	Where the documents live and changes occur
Lease container	`articles-leases`	Stores cursor state per partition; needed for distributed processors
Processor host	Your worker	Multiple instances coordinate via the lease container

Run the processor in three replicas? Cosmos partitions are leased across them automatically. One instance crashes? Its leases time out and rebalance to surviving instances. No code change.

Push vs pull

The processor is the recommended pattern for ongoing reactive work; pull is for explicit batch processing.
Feature	Change feed processor (push)	Change feed pull model
Who drives the loop	The library — your callback receives batches	Your code — you call `query_items_change_feed` explicitly
Lease management	Automatic via lease container	Manual — you store continuation tokens
Best for	Long-lived workers, partition rebalancing across replicas, the default choice	Custom apps that need precise cursor control or batch jobs
When the exam usually expects you to use this	Event-driven workers, RAG embedding pipelines, materialised views	Backfill scripts, snapshot-and-replay tools

Modes — latest version vs all versions and deletes

Mode	What you see	Where it matters
Latest version (default)	The newest version of each changed document; deletes are NOT included	Most replication / projection workflows
All versions and deletes	Every intermediate version + delete tombstones	Audit trails, full event sourcing, compliance use cases

# Configure to include deletes (Python)
processor = await feed_container.get_change_feed_processor(
    processor_name="audit",
    lease_container=lease_container,
    on_changes_delegate=handle_changes,
    mode="AllVersionsAndDeletes",
)

Theo’s clinical assistant uses all versions and deletes so the audit table records every patient-record edit with full history. Mira’s embedding worker uses the default latest version — she only cares about the most recent state.

Common AI patterns

Pattern 1 — async embedding pipeline

Article created in Cosmos
   ↓ (change feed)
Embedder worker picks it up
   ↓
Calls Azure OpenAI for an embedding
   ↓
Updates the doc in place: writes /embedding field

The same container holds source content AND its embeddings. Documents missing /embedding come through the change feed; once the embedding is patched in, the change feed skips it (because kind filter rejects already-processed shape).

Pattern 2 — denormalised materialised views

Order doc updated
   ↓ (change feed)
Projection worker reads the new state
   ↓
Writes one or more rows in a flattened "order_summary" container

The summary container is partitioned for the query workload (e.g., /customerId) while the source is partitioned for the write workload (e.g., /orderId). The change feed keeps them aligned.

Pattern 3 — KEDA scaler triggers Container Apps

The Cosmos DB KEDA scaler watches the change-feed lag and scales replicas accordingly:

rules:
  - name: cosmos-cf-scale
    custom:
      type: azure-cosmosdb
      metadata:
        connectionFromEnv: COSMOS_CONNECTION_STRING
        databaseName: knowledge
        containerName: articles
        leaseContainerName: articles-leases
        eventCount: "100"

Translation: “for every 100 unprocessed change-feed events, run one replica.” When Mira ingests 5,000 new articles, KEDA scales the embedder out, the worker chews through the backlog, and replicas drop back when the feed catches up.

Real-world example: Priya's loyalty pipeline

BeanCraft Coffee writes raw transactions into Cosmos’s transactions container. A change feed processor:

Watches transactions (partitioned by /storeId)
For each new transaction, calls Azure OpenAI to compute personalisation tags
Writes the result into customer_summary (partitioned by /customerId)
Lease container is transactions-leases

KEDA scales the processor on Service Bus during morning rush. The change feed and KEDA pair makes the same workload self-tuning all day.

Operational notes

Concern	What you do
Lease container needs throughput too	Provision RU/s on the lease container — it gets writes for every partition checkpoint
Processor name uniqueness	Two different processors (embedder, auditor) can read the same change feed independently as long as they use different `processor_name` (and ideally different lease containers)
Deletes	Default mode skips deletes. Use all versions and deletes if you need them
At-least-once delivery	Your handler must be idempotent — the same change can replay if a host crashes mid-batch
Order	Guaranteed within a partition, NOT across partitions