Azure Container Apps: Deploy + Revision Management

What Container Apps actually is

Simple explanation

Azure Container Apps is “I want containers, not Kubernetes.” Underneath, it IS Kubernetes — Microsoft runs the cluster — but you never see a node, a pod, a manifest, or kubectl. You give Azure a container image and a few rules (“scale from 0 to 50 when the queue grows”), and Azure runs it.

The defining feature is scale to zero. With the minimum replica count set to 0, an idle AI inference endpoint costs $0 in compute: no idle replicas, no warm pool. When traffic arrives, KEDA starts a replica, typically within seconds, though a large model image takes longer to pull and load.

The other defining feature is revisions. Every deploy creates a new immutable revision. You can split traffic across revisions (10% on the new model, 90% on the old) or roll back instantly by changing the active revision.

The hierarchy

Container Apps Environment (= shared VNet + log analytics)
├── Container App: roo-vision-inference
│   ├── Revision: roo-vision-inference--v3
│   ├── Revision: roo-vision-inference--v3-1 (90% traffic)
│   └── Revision: roo-vision-inference--v3-2 (10% traffic, new model)
├── Container App: roo-orchestrator
└── Container App: roo-dashboard

Every change creates a revision. Whether traffic flows to it depends on your revision mode.
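To see this hierarchy for a real app, listing its revisions is a reasonable first step. The sketch below wraps `az containerapp revision list` in a small helper; the function name and the choice of columns in `--query` are illustrative assumptions, not part of the original material.

```shell
# List every revision of an app, including inactive ones, with its traffic weight.
# list_revisions is a hypothetical helper name; the column selection is one
# possible projection of the revision resource.
list_revisions() {
  app="$1"; rg="$2"
  az containerapp revision list \
    --name "$app" \
    --resource-group "$rg" \
    --all \
    --query "[].{name:name, active:properties.active, traffic:properties.trafficWeight, created:properties.createdTime}" \
    --output table
}
```

Usage: `list_revisions roo-vision roo-prod`. Without `--all`, the command shows only active revisions.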

Revision modes — single vs multiple

Pick single revision unless you specifically need to canary across revisions.
Feature	Single revision mode (default)	Multiple revision mode
What it does	Each new revision becomes the only active one; old revisions deactivate	Multiple revisions can be active simultaneously and serve traffic
Use for	Most production apps — simple, ship-it-and-go	Blue/green, canary, A/B testing across model versions
Switch with	Default — no extra config	`az containerapp revision set-mode --mode multiple`
Traffic control	Always 100% on latest revision	You explicitly assign traffic % per revision

# Switch to multiple revision mode
az containerapp revision set-mode \
  --name roo-vision \
  --resource-group roo-prod \
  --mode multiple

# Send 90% to old, 10% to new
az containerapp ingress traffic set \
  --name roo-vision \
  --resource-group roo-prod \
  --revision-weight roo-vision--v3-1=90 roo-vision--v3-2=10

Real-world example: Mira's canary for a new vision model

Mira is shipping a new YOLO weights file. The old model has 18 months of warehouse data behind it; the new one is freshly fine-tuned. She can’t risk exposing every robot at once.

Switch to multiple revision mode
Deploy the new image (creates roo-vision--v3-2)
Set traffic 95% old / 5% new
Watch error rate, false-negative rate (objects missed) in Log Analytics
Slowly shift traffic — 50/50 by day 3, 100% on new by day 7
Deactivate the old revision once stable

If anything looks off mid-rollout, set old back to 100% — instant rollback, no redeploy, no image change.
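Mira's steps could be scripted roughly as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the old revision is assumed to be roo-vision--v3-1 (matching the hierarchy earlier), and the new image tag in the usage note is made up for illustration.

```shell
# Step 2: deploy a new image as a named revision.
deploy_revision() {
  az containerapp update \
    --name "$1" --resource-group "$2" \
    --image "$3" \
    --revision-suffix "$4"
}

# Steps 3 and 5: split traffic between the old and the new revision.
# $5 is the percentage for the NEW revision; the old one gets the remainder.
set_canary() {
  app="$1"; rg="$2"; old_rev="$3"; new_rev="$4"; new_pct="$5"
  # Reject empty values, non-digits, and leading zeros.
  case "$new_pct" in
    ''|*[!0-9]*|0?*) echo "new_pct must be an integer 0-100" >&2; return 1 ;;
  esac
  if [ "$new_pct" -gt 100 ]; then
    echo "new_pct must be an integer 0-100" >&2
    return 1
  fi
  old_pct=$((100 - new_pct))
  az containerapp ingress traffic set \
    --name "$app" \
    --resource-group "$rg" \
    --revision-weight "$old_rev=$old_pct" "$new_rev=$new_pct"
}

# Step 6: retire the old revision once the new one is stable.
retire_revision() {
  az containerapp revision deactivate \
    --name "$1" --resource-group "$2" --revision "$3"
}
```

A rollout would then be a series of calls, each made after checking the metrics: `deploy_revision roo-vision roo-prod roo.azurecr.io/roo-vision:v3.5.0 v3-2`, then `set_canary roo-vision roo-prod roo-vision--v3-1 roo-vision--v3-2 5`, later `50`, then `100`. Rollback is the same `set_canary` call with `0`, which puts the old revision back at 100%.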

Deploying — the minimal command

az containerapp create \
  --name roo-vision \
  --resource-group roo-prod \
  --environment roo-prod-env \
  --image roo.azurecr.io/roo-vision:v3.4.1 \
  --ingress external --target-port 8000 \
  --registry-server roo.azurecr.io --registry-identity system \
  --min-replicas 0 --max-replicas 30 \
  --env-vars LOG_LEVEL=info MODEL_NAME=phi-4-mini \
  --secrets openai-key=keyvaultref:https://roo-kv.vault.azure.net/secrets/OpenAIKey,identityref:system

That single command creates a public-facing inference endpoint that:

Scales from 0 to 30 replicas based on HTTP traffic
Pulls the image from ACR using the app’s managed identity (no password)
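Defines an openai-key secret that is resolved from Key Vault when a revision starts (no secret value in the manifest); the container only sees it once an env var references it with secretref:openai-key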
Listens on port 8000
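The "0 to 30 replicas based on HTTP traffic" behaviour comes from an HTTP concurrency scale rule. When no rule is given, the platform applies a default HTTP rule (10 concurrent requests per replica at the time of writing; worth checking against current docs). The sketch below makes the rule explicit and adds a back-of-envelope replica estimate. Both function names and the rule name are hypothetical, and the estimate is a planning approximation, not the platform's exact algorithm.

```shell
# Rough replica estimate for an HTTP concurrency rule:
# ceil(concurrent_requests / per_replica_target), clamped to [min, max].
estimate_replicas() {
  concurrent="$1"; target="$2"; min="$3"; max="$4"
  n=$(( (concurrent + target - 1) / target ))
  if [ "$n" -lt "$min" ]; then n="$min"; fi
  if [ "$n" -gt "$max" ]; then n="$max"; fi
  echo "$n"
}

# Set an explicit HTTP concurrency rule on an existing app.
# $3 is the number of concurrent requests one replica should handle.
set_http_scale_rule() {
  az containerapp update \
    --name "$1" --resource-group "$2" \
    --min-replicas 0 --max-replicas 30 \
    --scale-rule-name http-concurrency \
    --scale-rule-type http \
    --scale-rule-http-concurrency "$3"
}
```

For example, `estimate_replicas 95 10 0 30` prints 10, and `estimate_replicas 1000 10 0 30` is capped at 30. A slow vision model probably wants a lower per-replica target than a lightweight API, e.g. `set_http_scale_rule roo-vision roo-prod 5`.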

Ingress — how traffic gets in

Ingress mode	Visibility	Use case
External	Public internet via the environment’s `*.<region>.azurecontainerapps.io` host	Public APIs, webhooks
Internal	Only reachable from inside the environment (or the integrated VNet)	Backend services called by other container apps
Disabled	No HTTP ingress	Worker containers triggered by queue messages, not HTTP

You also configure:

Target port — what your container listens on inside the replica
Transport — auto, HTTP/1.1, HTTP/2, or TCP (yes, Container Apps supports raw TCP for non-HTTP services)
Allow insecure — drop HTTPS-only enforcement (almost never needed; HTTPS is automatic)
Custom domains + managed certs — bring your domain, Azure provisions and renews the cert
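These settings can also be changed after creation. As a minimal sketch, the helpers below switch an app to internal ingress and read back its hostname. The function names are hypothetical, and the port used in the usage note is an assumption, since the original does not say what roo-orchestrator listens on.

```shell
# Enable internal-only ingress on an existing app.
# $3 is the port the container listens on inside the replica.
enable_internal_ingress() {
  az containerapp ingress enable \
    --name "$1" --resource-group "$2" \
    --type internal \
    --target-port "$3" \
    --transport auto
}

# Print the fully qualified domain name assigned to the app's ingress.
show_fqdn() {
  az containerapp show \
    --name "$1" --resource-group "$2" \
    --query properties.configuration.ingress.fqdn \
    --output tsv
}
```

Usage: `enable_internal_ingress roo-orchestrator roo-prod 8080`, then `show_fqdn roo-orchestrator roo-prod` to get the address other apps in the environment would call.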

Secrets — how Container Apps handles them

Container Apps has a first-class secrets concept distinct from environment variables.

# Define secrets at the app level
az containerapp secret set \
  --name roo-vision \
  --resource-group roo-prod \
  --secrets openai-key=keyvaultref:https://roo-kv.vault.azure.net/secrets/OpenAIKey,identityref:system \
            db-password=keyvaultref:https://roo-kv.vault.azure.net/secrets/DbPassword,identityref:system

# Reference secrets as env vars
az containerapp update \
  --name roo-vision --resource-group roo-prod \
  --set-env-vars OPENAI_API_KEY=secretref:openai-key DB_PASSWORD=secretref:db-password

Two secret styles:

Style	Stored where	Best for
Inline value	Encrypted in the Container Apps platform	Small workloads, no Key Vault yet
Key Vault reference (`keyvaultref:`)	Key Vault	Production. Rotation is a Key Vault edit; restart the revision to pick up the new value

Secrets are scoped to the container app. You reference them in env vars (secretref:openai-key) or in registry credentials.
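After rotating a value in Key Vault, the running replicas still hold the old one until they restart. One way to script that restart is sketched below; the helper name is hypothetical, and it assumes that restarting every active revision is acceptable for the workload.

```shell
# Restart every active revision of an app so replicas re-read their secrets.
# `revision list` returns only active revisions unless --all is passed.
restart_active_revisions() {
  app="$1"; rg="$2"
  az containerapp revision list \
    --name "$app" --resource-group "$rg" \
    --query "[].name" --output tsv |
  while IFS= read -r rev; do
    [ -n "$rev" ] || continue
    # </dev/null keeps the inner command from consuming the revision list on stdin.
    az containerapp revision restart \
      --name "$app" --resource-group "$rg" --revision "$rev" </dev/null
  done
}
```

Usage: `restart_active_revisions roo-vision roo-prod`. In multiple revision mode this restarts both sides of a canary, which is usually what you want after a rotation.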

Workload profiles — consumption vs dedicated

Container Apps environments have workload profiles — pools of compute the platform provisions on your behalf:

Profile	Compute	Scale to zero	Best for
Consumption	Multi-tenant serverless	Yes	Bursty workloads, low duty cycle, cost-sensitive
Dedicated D-series	D-series VMs (4-32 vCPU)	No	Predictable load, larger replicas, VM customisation
Dedicated NC-series GPU	NC-series GPU SKUs	No (preview)	GPU inference workloads

A single environment can hold multiple profile types. Container apps choose their profile per app.
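A dedicated profile has to exist in the environment before an app can target it. The sketch below shows one way to add it. The helper names are hypothetical, and the profile type is left as a parameter because the available type names (especially GPU ones) vary by region; `list-supported` shows what a given region offers.

```shell
# Show which workload profile types a region supports.
list_profile_types() {
  az containerapp env workload-profile list-supported \
    --location "$1" --output table
}

# Add a dedicated workload profile to an existing environment.
# $3 = profile name apps will reference, $4 = profile type, $5/$6 = node range.
add_workload_profile() {
  az containerapp env workload-profile add \
    --name "$1" --resource-group "$2" \
    --workload-profile-name "$3" \
    --workload-profile-type "$4" \
    --min-nodes "$5" --max-nodes "$6"
}
```

Usage: `add_workload_profile roo-prod-env roo-prod gpu-pool <type> 1 2`, where `<type>` is one of the names printed by `list_profile_types <region>`. The node counts here are placeholders.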

# Deploy to a GPU profile
az containerapp create \
  --name roo-vision-gpu \
  --resource-group roo-prod \
  --environment roo-prod-env \
  --workload-profile-name gpu-pool \
  --image roo.azurecr.io/roo-vision-cuda:v3.4.1

Logs and observability

By default, a Container Apps environment is wired to a Log Analytics workspace. You query container output, system events, and KEDA scale events with KQL:

ContainerAppConsoleLogs_CL
| where ContainerAppName_s == "roo-vision"
| where TimeGenerated > ago(15m)
| where Log_s contains "ERROR"
| project TimeGenerated, RevisionName_s, Log_s
| order by TimeGenerated desc

The KQL operators you’ll see in Domain 4 (where, project, order by) apply directly when troubleshooting Container Apps.
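The same query can be run from a terminal instead of the portal. The sketch below builds the KQL from an app name and time window, looks up the workspace behind the environment, and runs the query. Both function names are hypothetical, and it assumes the environment logs to Log Analytics.

```shell
# Build the console-log error query for one app and one time window (e.g. 15m, 1h).
build_error_query() {
  app="$1"; window="$2"
  cat <<EOF
ContainerAppConsoleLogs_CL
| where ContainerAppName_s == "$app"
| where TimeGenerated > ago($window)
| where Log_s contains "ERROR"
| project TimeGenerated, RevisionName_s, Log_s
| order by TimeGenerated desc
EOF
}

# Look up the environment's workspace ID, then run the query against it.
run_error_query() {
  env_name="$1"; rg="$2"; app="$3"; window="$4"
  workspace_id=$(az containerapp env show \
    --name "$env_name" --resource-group "$rg" \
    --query properties.appLogsConfiguration.logAnalyticsConfiguration.customerId \
    --output tsv)
  az monitor log-analytics query \
    --workspace "$workspace_id" \
    --analytics-query "$(build_error_query "$app" "$window")" \
    --output table
}
```

Usage: `run_error_query roo-prod-env roo-prod roo-vision 15m`. Because `RevisionName_s` is in the projection, the output also shows which side of a canary is producing the errors.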