Azure Container Apps: Deploy + Revision Management
The serverless container service that scales from zero. Deploying images, managing revisions, traffic-splitting between revisions, secrets, and ingress configuration.
What Container Apps actually is
Azure Container Apps is "I want containers, not Kubernetes." Underneath, it IS Kubernetes (Microsoft runs the cluster), but you never see a node, a pod, a manifest, or kubectl. You give Azure a container image and a few rules ("scale from 0 to 50 when the queue grows"), and Azure runs it.
The defining feature is scale to zero. When nothing's calling your AI inference endpoint, the cost is $0: no idle replicas, no warm pool. When traffic arrives, KEDA wakes a replica in seconds.
The other defining feature is revisions. Every deploy creates a new immutable revision. You can split traffic across revisions (10% on the new model, 90% on the old) or roll back instantly by changing the active revision.
The hierarchy
Container Apps Environment (= shared VNet + log analytics)
├── Container App: roo-vision-inference
│   ├── Revision: roo-vision-inference--v3
│   ├── Revision: roo-vision-inference--v3-1 (90% traffic)
│   └── Revision: roo-vision-inference--v3-2 (10% traffic, new model)
├── Container App: roo-orchestrator
└── Container App: roo-dashboard
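Every change to the app's template (image, environment variables, scale rules) creates a revision; app-level settings such as secrets, ingress, and traffic weights apply without creating one. Whether traffic flows to a new revision depends on your revision mode.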
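Revision names take the form `<app-name>--<suffix>`. The following is a minimal sketch, assuming you supply your own suffix with `--revision-suffix` rather than letting the platform generate one. `revision_name` is a hypothetical helper, the suffix is illustrative, and the `az` command is printed rather than executed.

```shell
# Hypothetical helper: compose the revision name that results from
# an app name plus a custom suffix ("<app>--<suffix>").
revision_name() {
  printf '%s--%s\n' "$1" "$2"
}

app="roo-vision"
suffix="v3-2"   # illustrative suffix, not taken from a real deployment

# Print (do not run) an update that would create a revision with that suffix.
echo "az containerapp update --name ${app} --resource-group roo-prod" \
     "--image roo.azurecr.io/roo-vision:v3.4.1 --revision-suffix ${suffix}"

# The name to use later in traffic-splitting commands.
revision_name "$app" "$suffix"
```

Choosing the suffix yourself keeps revision names predictable, which helps when you later assign traffic weights by name.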
Revision modes: single vs multiple
| Feature | Single revision mode (default) | Multiple revision mode |
|---|---|---|
| What it does | Each new revision becomes the only active one; old revisions deactivate | Multiple revisions can be active simultaneously and serve traffic |
| Use for | Most production apps: simple, ship-it-and-go | Blue/green, canary, A/B testing across model versions |
| Switch with | Default, no extra config | `az containerapp revision set-mode --mode multiple` |
| Traffic control | Always 100% on latest revision | You explicitly assign traffic % per revision |
    # Switch to multiple revision mode
    az containerapp revision set-mode \
      --name roo-vision \
      --resource-group roo-prod \
      --mode multiple

    # Send 90% to old, 10% to new
    az containerapp ingress traffic set \
      --name roo-vision \
      --resource-group roo-prod \
      --revision-weight roo-vision--v3=90 roo-vision--v3-1=10
Real-world example: Mira's canary for a new vision model
Mira is shipping a new YOLO weights file. The old model has 18 months of warehouse data behind it; the new one is freshly fine-tuned. She can't risk exposing every robot at once.
- Switch to multiple revision mode
- Deploy the new image (creates roo-vision--v3-2)
- Set traffic 95% old / 5% new
- Watch error rate, false-negative rate (objects missed) in Log Analytics
- Slowly shift traffic: 50/50 by day 3, 100% on new by day 7
- Deactivate the old revision once stable
If anything looks off mid-rollout, set old back to 100%. That is an instant rollback: no redeploy, no image change.
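The rollout steps and the rollback can be sketched with one small helper that turns "percent on the new revision" into `--revision-weight` arguments. This is an illustration, not Mira's actual tooling: `canary_weights` is a hypothetical function, the old revision name `roo-vision--v3-1` is an assumption, and the script only prints the `az` command unless `DRY_RUN=0` is set.

```shell
# Hypothetical helper: print the --revision-weight arguments for a canary split.
# $1 = old revision, $2 = new revision, $3 = percent of traffic for the new one (0-100).
canary_weights() {
  local old_rev="$1" new_rev="$2" new_pct="$3"
  case "$new_pct" in
    ''|*[!0-9]*) echo "new_pct must be an integer 0-100" >&2; return 1 ;;
  esac
  if [ "$new_pct" -gt 100 ]; then
    echo "new_pct must be an integer 0-100" >&2
    return 1
  fi
  echo "${old_rev}=$((100 - new_pct)) ${new_rev}=${new_pct}"
}

OLD_REV="roo-vision--v3-1"   # assumed name of the current revision
NEW_REV="roo-vision--v3-2"

# Day 1: 5% canary. Use 50 on day 3, 100 on day 7, or 0 to roll back.
weights="$(canary_weights "$OLD_REV" "$NEW_REV" 5)"
cmd="az containerapp ingress traffic set --name roo-vision --resource-group roo-prod --revision-weight ${weights}"

if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$cmd"    # default: show the command without touching Azure
else
  $cmd
fi
```

Because the two weights are derived from a single number, they always sum to 100, and a rollback is the same call with `0` for the new revision.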
Deploying: the minimal command
    az containerapp create \
      --name roo-vision \
      --resource-group roo-prod \
      --environment roo-prod-env \
      --image roo.azurecr.io/roo-vision:v3.4.1 \
      --ingress external --target-port 8000 \
      --registry-server roo.azurecr.io --registry-identity system \
      --min-replicas 0 --max-replicas 30 \
      --env-vars LOG_LEVEL=info MODEL_NAME=phi-4-mini \
      --secrets openai-key=keyvaultref:https://roo-kv.vault.azure.net/secrets/OpenAIKey,identityref:system
That single command creates a public-facing inference endpoint that:
- Scales from 0 to 30 replicas based on HTTP traffic
- Pulls the image from ACR using the app's managed identity (no password)
- Reads the OpenAI key from Key Vault on every restart (no secret in the manifest)
- Listens on port 8000
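As a rough mental model for the HTTP-based scaling in the first bullet: the replica count is approximately the number of concurrent requests divided by a per-replica concurrency target, rounded up and clamped to the min/max bounds. The sketch below is a simplification (the real scaler also applies polling intervals and a cool-down before scaling in), `desired_replicas` is a hypothetical helper, and the target of 10 concurrent requests per replica is an assumption that can be tuned with `--scale-rule-http-concurrency`.

```shell
# Hypothetical helper: estimate replicas for an HTTP concurrency scale rule.
# $1 = concurrent requests, $2 = target concurrency per replica,
# $3 = min replicas, $4 = max replicas.
desired_replicas() {
  local concurrent="$1" target="$2" min="$3" max="$4"
  if [ "$target" -le 0 ]; then
    echo "target concurrency must be positive" >&2
    return 1
  fi
  # Ceiling division: how many replicas are needed to stay at or under the target.
  local n=$(( (concurrent + target - 1) / target ))
  if [ "$n" -lt "$min" ]; then n="$min"; fi
  if [ "$n" -gt "$max" ]; then n="$max"; fi
  echo "$n"
}

# Bounds from the create command (0 to 30), assumed target of 10.
for load in 0 1 95 1000; do
  echo "concurrent=${load} replicas=$(desired_replicas "$load" 10 0 30)"
done
```

With no traffic the estimate is zero replicas, a single request is enough to require one, and heavy load is capped at `--max-replicas`.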
Ingress: how traffic gets in
| Ingress mode | Visibility | Use case |
|---|---|---|
| External | Public internet via the environment's *.<region>.azurecontainerapps.io host | Public APIs, webhooks |
| Internal | Only reachable from inside the environment (or the integrated VNet) | Backend services called by other container apps |
| Disabled | No HTTP ingress | Worker containers triggered by queue messages, not HTTP |
You also configure:
- Target port: what your container listens on inside the replica
- Transport: auto, HTTP/1.1, HTTP/2, or TCP (yes, Container Apps supports raw TCP for non-HTTP services)
- Allow insecure: drop HTTPS-only enforcement (almost never needed; HTTPS is automatic)
- Custom domains + managed certs: bring your domain, Azure provisions and renews the cert
Secrets: how Container Apps handles them
Container Apps has a first-class secrets concept distinct from environment variables.
    # Define secrets at the app level
    az containerapp secret set \
      --name roo-vision \
      --resource-group roo-prod \
      --secrets openai-key=keyvaultref:https://roo-kv.vault.azure.net/secrets/OpenAIKey,identityref:system \
        db-password=keyvaultref:https://roo-kv.vault.azure.net/secrets/DbPassword,identityref:system

    # Reference secrets as env vars
    az containerapp update \
      --name roo-vision --resource-group roo-prod \
      --set-env-vars OPENAI_API_KEY=secretref:openai-key DB_PASSWORD=secretref:db-password
Two secret styles:
| Style | Stored where | Best for |
|---|---|---|
| Inline value | Encrypted in the Container Apps platform | Small workloads, no Key Vault yet |
| Key Vault reference (keyvaultref:) | Key Vault | Production. Rotation is a Key Vault edit; restart the revision to pick up the new value |
Secrets are scoped to the container app. You reference them in env vars (secretref:openai-key) or in registry credentials.
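The two string formats involved (the Key Vault reference that defines a secret, and the `secretref:` mapping that exposes it as an environment variable) are easy to mistype. A small sketch that builds both, assuming the app's system-assigned identity has read access to the vault; `kv_secret_ref` and `secret_env` are hypothetical helpers, not part of the Azure CLI.

```shell
# Hypothetical helper: build one --secrets entry backed by Key Vault.
# $1 = secret name in the container app, $2 = vault name, $3 = Key Vault secret name.
# Assumes the system-assigned identity is used (identityref:system).
kv_secret_ref() {
  printf '%s=keyvaultref:https://%s.vault.azure.net/secrets/%s,identityref:system\n' "$1" "$2" "$3"
}

# Hypothetical helper: build one --set-env-vars entry that points at a secret.
# $1 = environment variable name, $2 = secret name in the container app.
secret_env() {
  printf '%s=secretref:%s\n' "$1" "$2"
}

# Values to pass to `az containerapp secret set --secrets ...`
kv_secret_ref db-password roo-kv DbPassword

# Values to pass to `az containerapp update --set-env-vars ...`
secret_env DB_PASSWORD db-password
```

The container only ever sees `DB_PASSWORD`; the secret name (`db-password`) is the link between the env var and the Key Vault entry.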
Workload profiles: consumption vs dedicated
Container Apps environments have workload profiles, which are pools of compute the platform provisions on your behalf:
| Profile | Compute | Scale to zero | Best for |
|---|---|---|---|
| Consumption | Multi-tenant serverless | Yes | Bursty workloads, low duty cycle, cost-sensitive |
| Dedicated D-series | D-series VMs (4-32 vCPU) | No | Predictable load, larger replicas, VM customisation |
| Dedicated NC-series GPU | NC-series GPU SKUs | No (preview) | GPU inference workloads |
A single environment can hold multiple profile types. Container apps choose their profile per app.
    # Deploy to a GPU profile
    az containerapp create \
      --name roo-vision-gpu \
      --resource-group roo-prod \
      --environment roo-prod-env \
      --workload-profile-name gpu-pool \
      --image roo.azurecr.io/roo-vision-cuda:v3.4.1
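The `gpu-pool` profile named in that command has to exist in the environment before an app can target it. The sketch below prints the command that would add it. The profile type `NC24-A100` and the node counts are assumptions for illustration; check what your region offers with `az containerapp env workload-profile list-supported --location <region>`. `add_profile_cmd` is a hypothetical helper that only builds the command string.

```shell
# Hypothetical helper: print the command that adds a dedicated workload profile.
# $1 = environment, $2 = resource group, $3 = profile name,
# $4 = profile type, $5 = min nodes, $6 = max nodes.
add_profile_cmd() {
  echo "az containerapp env workload-profile add" \
       "--name $1 --resource-group $2" \
       "--workload-profile-name $3 --workload-profile-type $4" \
       "--min-nodes $5 --max-nodes $6"
}

# Assumed values: one to two GPU nodes in the existing environment.
add_profile_cmd roo-prod-env roo-prod gpu-pool NC24-A100 1 2
```

Dedicated profiles are billed per node, so the minimum node count sets the floor on cost even when no requests arrive.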
Logs and observability
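By default, a Container Apps environment is wired to a Log Analytics workspace. You query container output, system events, and KEDA scale events with KQL (console output lands in ContainerAppConsoleLogs_CL, system and scale events in ContainerAppSystemLogs_CL):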
    ContainerAppConsoleLogs_CL
    | where ContainerAppName_s == "roo-vision"
    | where TimeGenerated > ago(15m)
    | where Log_s contains "ERROR"
    | project TimeGenerated, RevisionName_s, Log_s
    | order by TimeGenerated desc
The KQL you'll see in Domain 4 carries over directly to troubleshooting Container Apps.
Knowledge check
Mira wants to ship a new vision model behind 5% canary traffic, then ramp to 100% over a week. Which Container Apps feature does she need?
Theo's clinical container app reads a database password from environment variable `DB_PASSWORD`. The password lives in Key Vault. What's the right Container Apps configuration?
Lin sets `--min-replicas 0 --max-replicas 10` on a Container App. The app idles for hours, then receives a single HTTP request. What happens?