Domain 3: Secure compute, Module 4 of 8

Foundry AI Gateway + Defender for AI Service + Foundry Guardrails

The Azure-side AI security stack on SC-500: AI Gateway in Azure API Management for Microsoft Foundry traffic, Microsoft Defender for AI Service threat protection, agent guardrails in Foundry, and the Data and AI security dashboard in Defender for Cloud.

The Azure-side AI security stack

Simple explanation

The last three modules covered the M365 + identity surfaces (Purview DSPM, Copilot Studio governance, Entra Agent ID). This module covers the Azure-side stack: the controls that protect the Foundry models themselves and the traffic going to them.

Four moving parts:

  • AI Gateway in Azure API Management (APIM): a set of APIM policies applied while APIM acts as a reverse proxy in front of Microsoft Foundry models. It centralises authentication, rate-limits by token usage, logs prompts and completions, and gives you a single chokepoint to enforce policy across multiple AI consumers.
  • Microsoft Defender for AI Service: a Defender for Cloud workload protection plan that monitors Foundry AI workloads for threats such as prompt injection, suspicious model usage, and data exfiltration via model output. Alerts surface in Defender for Cloud and Defender XDR.
  • Foundry agent guardrails: content filters, blocklists, evaluations, and topic-restriction settings applied to Foundry-built agents to prevent unsafe output and contain agent behaviour.
  • Data and AI security dashboard in Defender for Cloud: the executive view of AI security posture, showing which workloads have Defender for AI Service enabled, which agents have guardrails configured, the top threats detected, and compliance posture for AI-specific frameworks.

AI Gateway in Azure API Management

The AI Gateway pattern uses Azure API Management as a reverse proxy in front of Microsoft Foundry model endpoints. Applications are not the only beneficiary: agents that call Foundry models go through the same gateway. Key capabilities:

AI Gateway in Azure API Management: capabilities and what they buy you

| AI Gateway capability | What it does | Why it matters |
| --- | --- | --- |
| Token rate limiting | `azure-openai-token-limit` policy enforces per-key, per-IP, or per-subscription token quotas | Stops runaway clients from burning a whole tenant's Foundry quota |
| Token usage metrics | `azure-openai-emit-token-metric` emits prompt/completion token counts as APIM metrics | Gives the security and finance teams real per-consumer Foundry usage data |
| Semantic caching | `azure-openai-semantic-cache-lookup` + `-store` policies cache responses based on semantic prompt similarity (via embeddings) | Cuts cost and latency for repeated similar queries, and keeps cached PII out of recomputation |
| Load balancing across deployments | APIM backend pools spread requests across multiple Foundry deployments (regions, models, capacities) | High availability and quota multiplexing |
| Centralised auth | Consumers authenticate to APIM using subscription keys, OAuth tokens, or managed identity; APIM uses managed identity to authenticate to Foundry | No Foundry API keys distributed to consumers; one identity surface to govern |
| Prompt and completion logging | APIM diagnostic settings log requests/responses to Application Insights / Log Analytics / Storage | Audit trail for what's being asked and what's being answered, with retention you control |
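
To make the token rate limiting row concrete, here is a minimal Python sketch of a per-consumer token budget. It is not the `azure-openai-token-limit` implementation: the class name `TokenRateLimiter`, the fixed 60-second window, and the consumer keys are all assumptions chosen for illustration.

```python
import time
from collections import defaultdict


class TokenRateLimiter:
    """Fixed-window token budget per consumer key (illustrative only)."""

    def __init__(self, tokens_per_minute, clock=time.monotonic):
        self.tokens_per_minute = tokens_per_minute
        self.clock = clock
        # consumer key -> [window_start_seconds, tokens_used_in_window]
        self.windows = defaultdict(lambda: [None, 0])

    def allow(self, consumer_key, tokens):
        """Return True and record usage if the request fits the budget."""
        now = self.clock()
        window = self.windows[consumer_key]
        # Start a fresh window on first use or once 60 seconds have elapsed.
        if window[0] is None or now - window[0] >= 60:
            window[0], window[1] = now, 0
        if window[1] + tokens > self.tokens_per_minute:
            return False  # a gateway would typically answer HTTP 429 here
        window[1] += tokens
        return True


# Usage with a fake clock so the example is deterministic.
fake_now = [0.0]
limiter = TokenRateLimiter(1000, clock=lambda: fake_now[0])
print(limiter.allow("app-a", 800))  # True: within app-a's budget
print(limiter.allow("app-a", 300))  # False: 800 + 300 exceeds 1000
print(limiter.allow("app-b", 300))  # True: app-b has its own budget
fake_now[0] = 61.0
print(limiter.allow("app-a", 300))  # True: a new window has started
```

Keying the budget on the consumer is what stops one runaway client from exhausting a shared quota: each key is counted separately, so other consumers keep their own allowance.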

A reference architecture sketch

For Ravi at Maple Genomics:

[Foundry-using app]              [Copilot Studio agent]
        │                                  │
        └─────────────► APIM ◄─────────────┘
                         │
                  AI Gateway policies:
                  • token rate limiting per consumer
                  • token metrics emitted
                  • semantic cache lookup/store
                  • managed-identity auth to Foundry
                  • prompt/response logging
                         │
                         ▼
              Microsoft Foundry endpoints
              (multiple deployments load-balanced)

The "consumer" can be a custom app, a Copilot Studio agent, a Foundry-built agent, a Logic App, or anything else that talks HTTP to Foundry. The gateway gives Ravi one place to enforce policy and gather telemetry across all of them.
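
The semantic cache lookup/store step listed among the gateway policies can be illustrated with a small Python sketch. It only shows the general idea (embed the prompt, compare against stored embeddings, return a cached response above a similarity threshold). The `SemanticCache` class, the `toy_embed` function, its vocabulary, and the 0.9 threshold are hypothetical. A real gateway would call an embeddings model and may define its threshold differently.

```python
import math

# Hypothetical toy vocabulary; a real deployment would use an embeddings model.
VOCAB = ["what", "is", "brca1", "gene", "variant", "weather", "today"]


def toy_embed(text):
    """Count vocabulary words in the text (a stand-in for real embeddings)."""
    words = [w.strip("?.,!").lower() for w in text.split()]
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no overlap with the vocabulary at all
    return dot / (norm_a * norm_b)


class SemanticCache:
    def __init__(self, embed, threshold=0.9):
        self.embed = embed
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def lookup(self, prompt):
        """Return the best cached response at or above the threshold, else None."""
        query = self.embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def store(self, prompt, response):
        self.entries.append((self.embed(prompt), response))


cache = SemanticCache(toy_embed)
cache.store("What is the BRCA1 gene?", "cached answer about BRCA1")
print(cache.lookup("what is brca1 gene"))  # cache hit: same meaning, new wording
print(cache.lookup("weather today"))       # None: unrelated prompt goes to the model
```

On a hit the gateway can answer without calling the model at all, which is where the latency and token savings come from. The threshold controls how aggressively differently worded prompts are treated as the same question.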

Microsoft Defender for AI Service

Defender for AI Service is one of the workload protection plans in Microsoft Defender for Cloud. Like Defender for Servers / SQL / Storage / Containers / Key Vault, it is enabled per-subscription, billed per protected resource, and surfaces alerts in Defender for Cloud and through the Defender XDR connector into Microsoft Sentinel.

What it detects

  • Prompt injection: both direct (an attacker controlling the user prompt to override the system prompt) and indirect (an attacker placing injection content in grounding data the agent reads).
  • Jailbreak attempts: patterns matching Microsoft-curated jailbreak prompts.
  • Suspicious model interaction patterns: unusual volume, atypical access patterns, anomalous prompts from a single account.
  • Data exfiltration via model output: output containing detected sensitive content patterns leaving the workload in a way that suggests exfiltration.
  • Anomalous Entra Agent ID activity: correlates with Entra Agent ID signals via Defender XDR.
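
To give a feel for what "unusual volume" from a single account might mean, here is a deliberately simple Python sketch that flags accounts whose request count is far above the median. It is not how Defender for AI Service detects anomalies, since the source does not describe its detection logic. The function name, the factor of 5, and the account names are assumptions.

```python
from statistics import median


def flag_unusual_volume(requests_per_account, factor=5.0):
    """Return accounts whose request count exceeds factor x the median count."""
    if not requests_per_account:
        return []
    baseline = median(requests_per_account.values())
    if baseline <= 0:
        return []  # no meaningful baseline to compare against
    return sorted(
        account
        for account, count in requests_per_account.items()
        if count > factor * baseline
    )


# Hypothetical hourly request counts per calling account.
hourly_requests = {"app-a": 100, "app-b": 120, "app-c": 90, "svc-x": 2400}
print(flag_unusual_volume(hourly_requests))  # ['svc-x']
```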

How it's enabled

In Microsoft Defender for Cloud, navigate to Environment settings → [Subscription] → Defender plans → AI workloads → Status: On. Some agent and gateway integrations require additional configuration (e.g. AI Gateway in APIM forwarding prompts to the Defender for AI inspection endpoint, when configured).

Foundry agent guardrails

Microsoft Foundry agents have a built-in guardrails configuration that applies safety controls at the agent layer. Guardrails are conceptually distinct from APIM policies and Defender alerts: they are agent-developer-time controls embedded in the agent definition itself.

Foundry agent guardrails: agent-developer-time safety controls

| Guardrail | What it does |
| --- | --- |
| Content filters | Classify and block input and output across categories (violence, hate, sexual, self-harm) at thresholds you set (low / medium / high) |
| Blocklists (custom) | Curated term lists you maintain; block input or output containing the terms (e.g. customer names, project codewords, regulated phrases) |
| Topic restrictions | Define the in-scope topics the agent will engage with; out-of-scope requests are politely declined or redirected |
| Evaluations | Automated quality and safety scoring on agent responses; feeds into agent improvement and compliance reporting |
| Prompt shields | Detects and mitigates direct and indirect prompt-injection attempts via dedicated Foundry classifiers |
| Grounding required | Force the agent to ground responses on configured knowledge; refuse to answer if grounding is unavailable |
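
The way content-filter thresholds and custom blocklists combine can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Foundry API: `GuardrailConfig`, `evaluate`, the severity scale, and the codeword "Project Helix" are hypothetical. The per-category severities are assumed to come from a classifier that is not shown. The sketch also assumes that a threshold of "medium" blocks content rated medium or higher.

```python
from dataclasses import dataclass, field

# Assumed ordering of severity labels, from least to most severe.
SEVERITY = {"safe": 0, "low": 1, "medium": 2, "high": 3}


@dataclass
class GuardrailConfig:
    thresholds: dict = field(default_factory=dict)  # category -> threshold label
    blocklist: list = field(default_factory=list)   # terms to block outright


def evaluate(text, category_severity, config):
    """Return (allowed, reasons) for one input or output message.

    category_severity maps each category to the severity label that a
    (hypothetical, not shown) classifier assigned to the text.
    """
    reasons = []
    lowered = text.lower()
    for term in config.blocklist:
        if term.lower() in lowered:
            reasons.append(f"blocklist:{term}")
    for category, severity in category_severity.items():
        threshold = config.thresholds.get(category)
        if threshold is not None and SEVERITY[severity] >= SEVERITY[threshold]:
            reasons.append(f"{category}:{severity}")
    return (not reasons, reasons)


config = GuardrailConfig(
    thresholds={c: "medium" for c in ("violence", "hate", "sexual", "self-harm")},
    blocklist=["Project Helix"],  # hypothetical project codeword
)
print(evaluate("Summarise this paper", {"violence": "low"}, config))
print(evaluate("Tell me about project helix", {"violence": "safe"}, config))
```

The same check would run on both the user input and the model output, which is why the function takes plain text rather than a request object.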

Guardrails complement Defender for AI Service (runtime detection) and the AI Gateway (traffic-level policy). The three together are the defense-in-depth pattern for Foundry agents.

Data and AI security dashboard in Defender for Cloud

The Data and AI security dashboard is the executive view in Microsoft Defender for Cloud aggregating AI-related security posture across the subscriptions:

  • AI workload inventory: which Foundry workloads exist, where, and whether Defender for AI Service is enabled on them
  • Top threats in the period: most-detected Defender for AI alerts and their distribution
  • AI compliance posture: alignment with AI controls in the Microsoft Cloud Security Benchmark and other frameworks
  • Recommended actions: links into Defender recommendations for AI workloads
  • Agent governance signals: coverage of Entra Agent ID conditional access, real-time protection coverage for Copilot Studio agents, integration with Purview DSPM signals

For SC-500, you should know where the dashboard lives (Defender for Cloud > Data and AI security dashboard) and what it aggregates (Defender for AI alerts, plan coverage, compliance posture, recommendations).
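
Plan coverage is simple arithmetic: covered workloads divided by total workloads. The Python sketch below computes it from a small inventory. The inventory format, workload names, and the `plan_coverage` function are assumptions for illustration and do not reflect how the dashboard stores or exposes its data.

```python
def plan_coverage(workloads):
    """Return (coverage_percent, names_of_uncovered_workloads)."""
    total = len(workloads)
    if total == 0:
        return 0.0, []
    uncovered = [w["name"] for w in workloads if not w["defender_enabled"]]
    coverage = 100.0 * (total - len(uncovered)) / total
    return coverage, uncovered


# Hypothetical inventory of AI workloads across subscriptions.
inventory = [
    {"name": "foundry-east", "subscription": "sub-1", "defender_enabled": True},
    {"name": "foundry-west", "subscription": "sub-2", "defender_enabled": True},
    {"name": "foundry-lab", "subscription": "sub-3", "defender_enabled": False},
]
coverage, gaps = plan_coverage(inventory)
print(f"{coverage:.1f}% covered; gaps: {gaps}")
```

Listing the uncovered workloads alongside the percentage matters more than the number itself, because the list is what turns a posture metric into a remediation task.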

Scenario: Ravi assembles the full Azure-side AI security stack

Maple Genomics has 3 Microsoft Foundry deployments serving 12 internal apps and 4 Copilot Studio agents. Ravi assembles the Azure-side controls:

  1. AI Gateway in APIM

    • Single APIM instance fronts all 3 Foundry deployments via backend pools (load balancing across deployments for resilience).
    • All consumers (apps + Copilot Studio agents) authenticate to APIM via managed identity; APIM authenticates to Foundry via APIM's own managed identity.
    • Token rate limit policy: 50K tokens/min per consumer; emit token metrics.
    • Semantic caching policy on the genomics-Q&A agent's endpoint: ~30% of queries are semantically similar repeats, so caching cuts latency and Foundry token spend for those repeats.
    • Diagnostic settings: log all prompts and completions to a dedicated Log Analytics workspace with a 90-day retention.
  2. Defender for AI Service enabled on the 3 subscriptions hosting Foundry. Alerts route to the same Sentinel workspace as the rest of Maple Genomics' Defender stack.

  3. Foundry agent guardrails on the 4 Copilot Studio agents and on Foundry-built custom agents:

    • Content filters: medium across all four categories (violence, hate, sexual, self-harm); blocks input + output that exceeds threshold.
    • Blocklists: customer names, project codewords, regulated phrases (HIPAA categories).
    • Topic restriction: each agent has its in-scope topic list; out-of-scope queries politely declined.
    • Prompt shields: on (mitigates direct and indirect injection).
  4. Data and AI security dashboard in Defender for Cloud: reviewed weekly in the security ops sync. Tracks Defender for AI alert volume, plan coverage (target 100%), and AI compliance posture.

  5. Integration with the M365-side stack (previous modules): Purview DSPM for AI continues to monitor Copilot Studio agents; Entra Agent ID CA policies gate invocation; Microsoft 365 admin center governs agent lifecycle. Defender XDR ties M365 alerts and Azure Defender for AI alerts into one incident graph.

End result: prompts and completions are logged centrally; per-consumer quotas are enforced; semantic caching cuts cost; threats against Foundry are detected; agents have guardrails; the dashboard surfaces gaps. Defense-in-depth across identity, runtime, traffic, posture, and detection.
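
Step 1 of the scenario puts three Foundry deployments behind one backend pool. The Python sketch below shows one simple way such a pool could behave: round-robin selection that skips backends marked unhealthy. The `BackendPool` class and the deployment names are hypothetical, and the sketch does not model APIM's actual backend pool or circuit-breaker configuration.

```python
class BackendPool:
    """Round-robin over backends, skipping any marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.unhealthy = set()
        self._next = 0  # index of the next backend to try

    def mark_unhealthy(self, name):
        self.unhealthy.add(name)

    def mark_healthy(self, name):
        self.unhealthy.discard(name)

    def pick(self):
        """Return the next healthy backend, or raise if none is available."""
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next % len(self.backends)]
            self._next += 1
            if candidate not in self.unhealthy:
                return candidate
        raise RuntimeError("no healthy backend available")


# Hypothetical deployment names for the three Foundry deployments.
pool = BackendPool(["foundry-east", "foundry-west", "foundry-north"])
print([pool.pick() for _ in range(4)])
# ['foundry-east', 'foundry-west', 'foundry-north', 'foundry-east']

pool.mark_unhealthy("foundry-west")  # e.g. after repeated throttling responses
print(pool.pick())  # 'foundry-north': the unhealthy backend is skipped
```

Because consumers only ever see the gateway address, a deployment can be drained or throttled without any client-side change, which is the resilience benefit the scenario relies on.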

Key terms

Question

What is AI Gateway in Azure API Management (APIM)?

Answer

A pattern using Azure API Management as a reverse proxy in front of Microsoft Foundry model endpoints. Key APIM policies (`azure-openai-token-limit`, `azure-openai-emit-token-metric`, `azure-openai-semantic-cache-lookup`/`-store`, and their `llm-*` equivalents) provide token rate limiting, usage metrics, semantic caching, load balancing, centralised authentication (managed identity), and prompt/completion logging.

Question

What is Microsoft Defender for AI Service?

Answer

A Microsoft Defender for Cloud workload protection plan covering Microsoft Foundry AI workloads. Detects prompt injection (direct and indirect), jailbreak attempts, suspicious model interaction patterns, data exfiltration via model output, and anomalous Entra Agent ID activity. Surfaces alerts in Defender for Cloud and Defender XDR; routes to Microsoft Sentinel via the Defender XDR connector.

Question

What are Microsoft Foundry agent guardrails?

Answer

Agent-developer-time safety controls embedded in the agent definition: content filters (violence/hate/sexual/self-harm classification with thresholds), custom blocklists (term lists for input/output), topic restrictions (in-scope subject list), evaluations (automated quality/safety scoring), prompt shields (injection mitigation), and grounding-required (refuse if grounding unavailable).

Question

What is the Data and AI security dashboard in Defender for Cloud?

Answer

An executive-facing dashboard in Microsoft Defender for Cloud aggregating AI security posture across subscriptions: AI workload inventory, Defender for AI plan coverage, top AI-related threats, AI compliance posture against frameworks like MCSB, and recommended actions. The unified view for security leaders on AI workload risk.

Question

How do APIM AI Gateway, Defender for AI Service, and Foundry guardrails complement each other?

Answer

Three layers, different timings: Foundry guardrails are agent-build-time and inline runtime (content filters, prompt shields). AI Gateway in APIM is in-band at the API traffic layer (rate limiting, caching, logging, auth). Defender for AI Service is detection (alerting on injection, jailbreak, exfiltration, anomalies). Together they form defense-in-depth for Foundry-hosted AI.

Knowledge check

Ravi at Maple Genomics has 16 different applications and agents calling Microsoft Foundry. He wants ONE place to enforce per-consumer token rate limits, log every prompt and response, and ensure no Foundry API keys are distributed to consumers. Which pattern fits best?

Esme at Northwind Bank is enabling threat protection for the Foundry workloads in the bank's data subscription. Which Microsoft Defender for Cloud plan should she enable?

Asha at Aurora Health Service wants the executive view of AI security posture across all 47 subscriptions: which workloads have Defender for AI enabled, what the top threats were in the period, compliance posture against the Microsoft Cloud Security Benchmark for AI, and recommended actions. Where does she look?

Domain 3 AI security wrap-up

You've covered the four AI security modules, the genuine differentiator on SC-500 vs AZ-500:

  • Discovery + posture: Microsoft Purview DSPM for AI
  • M365-side governance: Copilot Studio real-time protection + Microsoft 365 admin center agent management
  • Identity + access: Microsoft Entra Agent ID + Defender XDR blast radius
  • Azure-side runtime + traffic: AI Gateway in APIM + Defender for AI Service + Foundry agent guardrails + Data and AI security dashboard

The next four modules in Domain 3 cover the remaining compute-security surfaces: VM and server hardening, Defender for Servers across hybrid/multicloud (via Azure Arc), Defender for Containers (AKS/ACR/ACI/Container Apps), and application platform security (App Service, Functions, Logic Apps, WAF, APIM API protection).