Domain 2: Integrate and Extend Agents in Copilot Studio (Module 12 of 12)

Foundry Model Catalog and Application Insights

Choose the right AI model from the Foundry model catalog for custom prompts, and monitor your agents with Application Insights telemetry.

Part 1: The Foundry Model Catalog

Simple explanation

Think of the model catalog like a car dealership.

You would not buy a sports car to deliver furniture, and you would not buy a delivery van for a race. AI models are the same: some are fast and cheap (great for simple FAQs), others are powerful and expensive (needed for complex reasoning). The Foundry model catalog is your dealership: hundreds of models from Microsoft, OpenAI, Meta, and others, each with different strengths.

For Copilot Studio, the model catalog matters when you create custom prompts, which are special AI instructions that run inside your agent’s topics. Instead of using the default model, you pick the model that best fits each task.

Application Insights is your agent’s dashboard: it tracks every conversation, measures response times, catches errors, and shows you how your agent performs in the real world.

Choosing the right model

Not all AI models are equal. The exam expects you to match model characteristics to use cases.

Foundry model catalog: key models compared

| Model | Cost | Speed | Capability | Best for |
| GPT-4o | Higher: premium per-token pricing | Moderate: 10-30 seconds for complex reasoning | Highest: complex reasoning, nuanced understanding, multi-step analysis, medical/legal/financial domains | High-stakes tasks: clinical decision support, contract analysis, complex troubleshooting |
| GPT-4o mini | Low: fraction of GPT-4o cost | Fast: typically under 5 seconds | Good: handles most business tasks well, strong at summarisation and classification | High-volume tasks: FAQ answers, ticket classification, simple summarisation, routing logic |
| Phi (small language model) | Lowest: designed for cost efficiency at scale | Fastest: sub-second for simple tasks | Moderate: strong for structured tasks, weaker on open-ended reasoning | Edge deployment, high-throughput classification, cost-sensitive scenarios with thousands of daily calls |
| Llama (Meta) | Varies by size: competitive with GPT-4o mini | Varies: depends on model size (8B, 70B, 405B) | Strong: open-source, good at general reasoning and code generation | Teams preferring open-source models, specific compliance requirements, code-heavy tasks |

Custom prompts with model selection

A custom prompt in Copilot Studio is a prompt node within a topic that sends a specific instruction to an AI model and returns the result. By connecting to the Foundry model catalog, you can choose which model processes each prompt.

Configuration steps:

  1. Connect your Copilot Studio environment to Foundry. This enables model catalog access.
  2. In a topic, add a Prompt node (also called a “Create prompt” or “AI Builder prompt” action).
  3. Select the Foundry model. Choose from the catalog based on your task requirements.
  4. Write the prompt instruction: what the model should do with the input (e.g., “Classify this ticket as urgent, normal, or low priority”).
  5. Map input variables. Pass conversation variables into the prompt.
  6. Map output variables. Capture the model’s response for use in subsequent nodes.
  7. Test with sample inputs. Verify the model produces expected outputs.
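
Steps 4 to 7 can be pictured as a small function: inputs go into a prompt template, a model returns text, and the result is captured in an output variable. The sketch below is only an illustration of that flow, not Copilot Studio's API. `stub_model`, `PROMPT_TEMPLATE`, and `run_prompt_node` are hypothetical names, and the keyword-based stub stands in for whichever Foundry model you would select in step 3.

```python
from typing import Callable, Dict

ALLOWED_LABELS = {"urgent", "normal", "low"}

# Step 4: the prompt instruction, with a placeholder for the input variable.
PROMPT_TEMPLATE = (
    "Classify this ticket as urgent, normal, or low priority.\n"
    "Ticket: {ticket_text}"
)


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call so the example runs offline."""
    # Only look at the ticket text, not the instruction wording.
    ticket = prompt.split("Ticket:", 1)[-1].lower()
    if "outage" in ticket or "down" in ticket:
        return "urgent"
    if "question" in ticket:
        return "low"
    return "normal"


def run_prompt_node(
    inputs: Dict[str, str],
    model: Callable[[str], str] = stub_model,
) -> Dict[str, str]:
    """Fill the template (step 5), call the model, capture the output (step 6)."""
    prompt = PROMPT_TEMPLATE.format(**inputs)
    label = model(prompt).strip().lower()
    # Guard against unexpected model output before later nodes rely on it.
    if label not in ALLOWED_LABELS:
        label = "normal"
    return {"priority": label}


if __name__ == "__main__":
    # Step 7: test with sample inputs.
    for text in ["The payment system is down", "I have a question about my invoice"]:
        print(text, "->", run_prompt_node({"ticket_text": text}))
```

The fallback to "normal" is a design choice for the sketch: downstream nodes usually branch on the output variable, so constraining it to known values keeps the topic predictable.
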
Why not just use the default model for everything?

Custom prompts with specific models give three advantages: cost optimization (GPT-4o mini for simple tasks and GPT-4o only for complex reasoning, which can cut AI costs by as much as 60-80% when most calls are simple), task-specific accuracy (some models excel at certain tasks), and compliance (choose models deployed in specific Azure regions for data residency requirements).
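
To see where a saving of that size could come from, here is a rough worked calculation. The per-call prices and traffic mix are made-up illustration values, not real Foundry pricing, and `blended_cost` and `savings_percent` are hypothetical helper names.

```python
def blended_cost(total_calls: int, simple_fraction: float,
                 premium_cost: float, budget_cost: float) -> float:
    """Cost when simple calls use the budget model and the rest use the premium one."""
    simple_calls = total_calls * simple_fraction
    complex_calls = total_calls - simple_calls
    return simple_calls * budget_cost + complex_calls * premium_cost


def savings_percent(total_calls: int, simple_fraction: float,
                    premium_cost: float, budget_cost: float) -> float:
    """Percentage saved compared with sending every call to the premium model."""
    baseline = total_calls * premium_cost
    routed = blended_cost(total_calls, simple_fraction, premium_cost, budget_cost)
    return 100 * (baseline - routed) / baseline


if __name__ == "__main__":
    # Assumed values: 10,000 calls, 80% simple, premium model 10x the budget price.
    print(round(savings_percent(10_000, 0.80, premium_cost=0.010, budget_cost=0.001), 1))
```

With these assumed numbers the baseline is 100 units, the routed cost is 28 units, and the saving is 72%. The result depends heavily on the share of simple traffic and on the price gap, so treat it as a way to reason about the trade-off, not a forecast.
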

Part 2: Monitoring with Application Insights

Building an agent is half the job. The other half is knowing whether it actually works in production. Application Insights provides the observability layer.

Connecting Application Insights to your agent:

  1. Create an Application Insights resource in Azure (or use an existing one)
  2. In Copilot Studio, go to Settings then Agent settings then Application Insights
  3. Paste the connection string from your Application Insights resource
  4. Save and publish. Telemetry starts flowing within minutes.
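
Before pasting the connection string in step 3, it can help to sanity-check that you copied the whole value. The sketch below assumes the common `Key=Value;Key=Value` layout with an `InstrumentationKey` field; the sample value is a placeholder, and `parse_connection_string` and `looks_usable` are hypothetical helper names, not part of any Azure SDK.

```python
from typing import Dict

# Placeholder value for illustration only; a real string comes from the Azure portal.
SAMPLE = (
    "InstrumentationKey=00000000-0000-0000-0000-000000000000;"
    "IngestionEndpoint=https://example.invalid/"
)


def parse_connection_string(conn: str) -> Dict[str, str]:
    """Split a 'Key=Value;Key=Value' string into a dictionary."""
    parts: Dict[str, str] = {}
    for segment in conn.split(";"):
        segment = segment.strip()
        if not segment:
            continue  # tolerate a trailing semicolon
        key, sep, value = segment.partition("=")
        if not sep or not key.strip() or not value.strip():
            raise ValueError(f"Malformed segment: {segment!r}")
        parts[key.strip()] = value.strip()
    return parts


def looks_usable(conn: str) -> bool:
    """Assumption: a usable string parses cleanly and carries an InstrumentationKey."""
    try:
        parts = parse_connection_string(conn)
    except ValueError:
        return False
    return "InstrumentationKey" in parts


if __name__ == "__main__":
    print(looks_usable(SAMPLE))
    print(looks_usable("not a connection string"))
```
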

Key telemetry captured:

| Metric | What it tells you | Why it matters |
| Session count | How many conversations happen per day/week | Adoption tracking: is the agent being used? |
| Topic completion rate | Percentage of topic starts that reach the end node | Quality signal: incomplete topics suggest confusion or errors |
| Resolution rate | Percentage of sessions resolved without human escalation | Effectiveness: the agent’s core success metric |
| Escalation rate | How often conversations transfer to a human | Capacity planning: high escalation means the agent needs improvement |
| Average response time | How long the agent takes to respond | User experience: slow responses increase abandonment |
| Error rate | Failed connector calls, timeout errors, unhandled exceptions | Reliability: errors need immediate investigation |

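The definitions in the table can be made concrete with a small calculation over session records. This is a minimal sketch on invented sample data: the `SessionRecord` fields are hypothetical and do not mirror the actual telemetry schema, and error rate is assumed here to mean the share of sessions with at least one error.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SessionRecord:
    """Hypothetical per-session summary; not the real Application Insights schema."""
    topics_started: int
    topics_completed: int
    resolved: bool
    escalated: bool
    response_ms: List[int] = field(default_factory=list)
    had_error: bool = False


def ratio(numerator: float, denominator: float) -> float:
    """Safe division that returns 0.0 when there is no data."""
    return numerator / denominator if denominator else 0.0


def summarize(sessions: List[SessionRecord]) -> Dict[str, float]:
    total = len(sessions)
    latencies = [ms for s in sessions for ms in s.response_ms]
    return {
        "session_count": total,
        "topic_completion_rate": ratio(
            sum(s.topics_completed for s in sessions),
            sum(s.topics_started for s in sessions),
        ),
        "resolution_rate": ratio(sum(s.resolved for s in sessions), total),
        "escalation_rate": ratio(sum(s.escalated for s in sessions), total),
        "avg_response_ms": ratio(sum(latencies), len(latencies)),
        "error_rate": ratio(sum(s.had_error for s in sessions), total),
    }


SAMPLE_SESSIONS = [
    SessionRecord(2, 2, resolved=True, escalated=False, response_ms=[800, 1200]),
    SessionRecord(1, 0, resolved=False, escalated=True, response_ms=[3000], had_error=True),
    SessionRecord(1, 1, resolved=True, escalated=False, response_ms=[1000]),
    SessionRecord(1, 1, resolved=True, escalated=False, response_ms=[1000]),
]

if __name__ == "__main__":
    for name, value in summarize(SAMPLE_SESSIONS).items():
        print(f"{name}: {value}")
```
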
KQL query examples for agent monitoring

Application Insights data is queried using KQL (Kusto Query Language). Common queries include sessions per day (summarize dcount(session_Id) by bin(timestamp, 1d)), top escalated topics, and average response latency.

You do not need to memorise KQL syntax for the exam, but knowing that Application Insights enables query-driven monitoring is testable.
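
If KQL is unfamiliar, it may help to see what the sessions-per-day query computes. The Python sketch below mirrors the idea of `summarize dcount(session_Id) by bin(timestamp, 1d)`: group events into one-day buckets and count distinct session IDs in each. The event tuples are invented sample data and `sessions_per_day` is a hypothetical helper. Note that KQL's `dcount` returns an estimate, while this version counts exactly.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, Set, Tuple


def sessions_per_day(events: Iterable[Tuple[datetime, str]]) -> Dict[date, int]:
    """Count distinct session IDs per calendar day."""
    seen: Dict[date, Set[str]] = defaultdict(set)
    for timestamp, session_id in events:
        # timestamp.date() plays the role of bin(timestamp, 1d)
        seen[timestamp.date()].add(session_id)
    return {day: len(ids) for day, ids in sorted(seen.items())}


if __name__ == "__main__":
    sample_events = [
        (datetime(2025, 1, 1, 9, 0), "a"),
        (datetime(2025, 1, 1, 9, 5), "a"),   # same session, counted once
        (datetime(2025, 1, 1, 10, 0), "b"),
        (datetime(2025, 1, 2, 8, 0), "a"),
    ]
    for day, count in sessions_per_day(sample_events).items():
        print(day, count)
```
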

Scenario: Lena picks models and sets up monitoring

Lena’s hospital agent handles two workloads: clinical decision support (complex medical questions requiring accuracy) and general FAQ (parking, cafeteria, IT password resets, where speed and cost matter more).

For clinical custom prompts she selects GPT-4o because its reasoning accuracy on medical terminology is worth the premium. For FAQ prompts she picks GPT-4o mini: simple Q&A at a fraction of the cost, in under 3 seconds.

She connects Application Insights and builds dashboards tracking clinical accuracy, daily usage, and cost per model. After the first week, data reveals 40% of “clinical” queries are simple medication lookups. She adds a classification prompt (GPT-4o mini) that routes simple lookups to the cheaper model. Cost drops 35% with no accuracy impact. Data-driven model optimization in action.
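
Lena's routing step can be sketched as a two-stage flow: a cheap classifier labels the query, then a lookup table picks the model. Everything here is illustrative. The keyword classifier stands in for her GPT-4o mini classification prompt, and the hint list, labels, and model identifiers are assumptions, not values from Copilot Studio or Foundry.

```python
# Hypothetical hints that mark a query as a simple medication lookup.
SIMPLE_LOOKUP_HINTS = ("dose", "dosage", "side effect", "generic name")

# Hypothetical model identifiers for each route.
ROUTES = {
    "simple_lookup": "gpt-4o-mini",
    "complex_clinical": "gpt-4o",
}


def classify_query(query: str) -> str:
    """Stub for the classification prompt; a real one would call a model."""
    text = query.lower()
    if any(hint in text for hint in SIMPLE_LOOKUP_HINTS):
        return "simple_lookup"
    return "complex_clinical"


def route(query: str) -> str:
    """Return the model that should handle this query."""
    return ROUTES[classify_query(query)]


if __name__ == "__main__":
    print(route("What is the usual dosage of ibuprofen?"))
    print(route("Patient with chest pain and renal impairment, which anticoagulant is appropriate?"))
```

Defaulting unknown queries to the more capable model is a deliberate choice in this sketch: in a clinical setting, a misrouted complex question is the more costly mistake.
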

Exam tip: model selection is about matching cost and capability to the task

The exam will describe scenarios and ask which model to choose. The decision framework is simple:

  • Complex reasoning, high stakes → GPT-4o (or the most capable model available)
  • Simple tasks, high volume → GPT-4o mini (good balance of cost and capability)
  • Maximum cost efficiency, structured tasks → Phi (smallest, cheapest, fastest)
  • Open-source requirement → Llama

If the scenario mentions “thousands of daily requests” and “simple classification,” the answer is almost always GPT-4o mini or Phi, not GPT-4o.
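
One way to rehearse the framework is to write it as a function. The sketch below encodes the four bullets above; the order in which the rules are checked is an assumption made for illustration (the source does not rank them), and `choose_model` is a hypothetical name.

```python
def choose_model(
    complex_reasoning: bool = False,
    high_stakes: bool = False,
    open_source_required: bool = False,
    structured_task: bool = False,
    max_cost_efficiency: bool = False,
) -> str:
    """Map scenario traits to a model family, following the exam-tip bullets."""
    # Assumed priority: a hard open-source requirement overrides everything else.
    if open_source_required:
        return "Llama"
    if complex_reasoning or high_stakes:
        return "GPT-4o"
    if max_cost_efficiency and structured_task:
        return "Phi"
    # Default for simple, high-volume work.
    return "GPT-4o mini"


if __name__ == "__main__":
    print(choose_model(complex_reasoning=True, high_stakes=True))
    print(choose_model(structured_task=True, max_cost_efficiency=True))
    print(choose_model())
```
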

Question

What is the Foundry model catalog?

Answer

A library of hundreds of AI models from multiple providers (OpenAI, Microsoft, Meta, Mistral, etc.) available in Microsoft Foundry. Copilot Studio developers use it to select specific models for custom prompts based on cost, speed, and capability requirements.

Question

When should you use GPT-4o vs GPT-4o mini in custom prompts?

Answer

GPT-4o: complex reasoning, high-stakes tasks (medical, legal, financial) where accuracy is critical. GPT-4o mini: high-volume, simpler tasks (FAQ, classification, summarisation) where cost and speed matter more than reasoning depth.

Question

How do you connect Application Insights to a Copilot Studio agent?

Answer

Create an Application Insights resource in Azure → In Copilot Studio go to Settings → Agent settings → Application Insights → paste the connection string → Save and publish. Telemetry starts flowing within minutes.

Question

Name four key metrics captured by Application Insights for Copilot Studio agents.

Answer

1) Session count (adoption). 2) Topic completion rate (quality). 3) Resolution rate (effectiveness). 4) Escalation rate (agent capability gaps). Also: response time and error rate.

Question

What query language is used to analyse Application Insights data?

Answer

KQL (Kusto Query Language). It enables queries against agent telemetry: session counts, topic performance, error rates, response latency, and custom events.

Knowledge Check

Lena's agent handles thousands of simple FAQ questions daily and a few dozen complex clinical queries. How should she configure model selection?

Knowledge Check

After connecting Application Insights, which metric best indicates that the agent is failing to help users?

Knowledge Check

What is the primary benefit of using the Foundry model catalog with custom prompts instead of the default Copilot Studio model?