Domain 1 β€” Module 1 of 8 13%
1 of 27 overall
Domain 1: Plan and Manage an Azure AI Solution Free ⏱ ~12 min read

Choosing the Right AI Model

Not all AI models are created equal. Learn how to pick the right model for each task β€” LLMs, SLMs, multimodal models, and Foundry Tools β€” so you don't overspend or underperform.

Why model selection matters

Simple explanation

Picking an AI model is like hiring for a job β€” you wouldn’t hire a brain surgeon to stack shelves, and you wouldn’t hire a shelf-stacker to do surgery.

Large language models (LLMs) like GPT-4o are powerful but expensive. Small language models (SLMs) like Phi-4 are cheaper and faster but less capable. Multimodal models handle images, audio, and video β€” not just text. And Foundry Tools are pre-built AI services you don’t need to train at all.

The exam tests whether you can match the right model to the right task β€” balancing cost, speed, accuracy, and capability.

The four model categories

The four model categories in Microsoft Foundry
FeatureLLMsSLMsMultimodalFoundry Tools
What they doComplex reasoning, generation, analysisSimpler tasks, fast inferenceProcess text + images + audio + videoPre-built AI capabilities (search, OCR, speech)
ExamplesGPT-4o, GPT-4.1, Llama 3.3Phi-4, Phi-4-mini, Mistral SmallGPT-4o (vision), Llama 4Azure AI Search, Content Understanding, Translator
CostHigher (more tokens, more compute)Lower (smaller, faster)Medium-high (depends on modalities)Pay-per-use (no model hosting)
Best forAgents, RAG, complex workflowsEdge devices, high-volume simple tasksApps that need to see, hear, and readStructured tasks: search, OCR, translation
DeploymentCloud (Foundry hosted or serverless)Cloud or edgeCloudManaged service (no deployment needed)

When to use what β€” decision framework

The exam loves β€œwhich model should you use?” questions. Here’s the decision tree:

ScenarioBest ChoiceWhy
Complex multi-step reasoning with toolsLLM (GPT-4o, GPT-4.1)Needs strong reasoning and function-calling
Summarising thousands of support ticketsSLM (Phi-4)Simple task at high volume β€” cost matters
Analysing medical images alongside patient notesMultimodal (GPT-4o vision)Needs to process both text and images
Extracting invoice fields from scanned PDFsFoundry Tool (Content Understanding)Purpose-built for document extraction
Real-time speech transcription in a call centreFoundry Tool (Azure Speech)Dedicated speech service, optimised for streaming
Building a chatbot that searches company docsLLM + Foundry Tool (GPT-4o + Azure AI Search)Combine reasoning with retrieval
Exam tip: The 'cheapest correct option' trap

The exam often presents scenarios where multiple models could work. The correct answer is usually the one that meets the requirements at the lowest cost and complexity.

For example: β€œA company needs to classify customer emails as positive, negative, or neutral.” You might think GPT-4o β€” but Phi-4 or even a Foundry sentiment analysis tool would be cheaper and sufficient. The exam rewards right-sizing, not over-engineering.

Meet the characters

Throughout this course, you’ll follow four teams building AI solutions:

CharacterWho They AreAI Use Cases
πŸ₯ NeuralMedHealth-tech startup, 25 engineersAI diagnostic assistants, medical record extraction, patient chatbots
🏦 Atlas FinancialEnterprise bank, 3000 employeesCompliance agents, fraud detection, customer service bots
πŸš€ MediaForgeContent operations platform, 40 developersImage/video generation, marketing content pipelines, prompt optimisation
πŸ‘¨β€πŸ’» KaiAI engineer at a logistics companyInfrastructure decisions, CI/CD for AI, deployment troubleshooting
Real-world example: Kai's model selection

Kai needs to build three features for the logistics platform:

  1. Package label OCR β€” reads shipping labels from photos β†’ Content Understanding (Foundry Tool β€” purpose-built, no model hosting)
  2. Route optimisation chatbot β€” answers complex questions about delivery routes β†’ GPT-4o (LLM β€” needs reasoning over structured data)
  3. Automated status updates β€” generates short β€œyour package is on its way” messages β†’ Phi-4-mini (SLM β€” simple generation, high volume, low cost)

Three features, three different model choices. That’s model selection in practice.

Foundry Tools vs models

A common exam confusion: Foundry Tools are not models you deploy β€” they’re managed services you call.

Foundry ToolWhat It DoesWhen to Use Instead of a Model
Azure AI SearchSemantic, vector, and hybrid searchWhen you need retrieval/grounding for RAG
Content UnderstandingOCR, layout analysis, field extraction from documentsWhen extracting structured data from PDFs, forms, images
Azure SpeechSpeech-to-text, text-to-speechWhen you need dedicated speech processing
Azure TranslatorText and document translationWhen you need reliable multilingual translation
Exam tip: Foundry Tools vs prompting an LLM

The exam tests whether you know when to use a dedicated Foundry Tool versus prompting an LLM to do the same task. Key rule: if a Foundry Tool exists for the task, it’s usually the correct answer β€” it’s cheaper, more reliable, and purpose-built.

Example: β€œTranslate a 500-page legal document from English to Japanese.” Answer: Azure Translator (Foundry Tool), NOT β€œprompt GPT-4o to translate.”

The model catalog and Model Router

Microsoft Foundry’s model catalog gives you access to 11,000+ models from OpenAI, Meta, Mistral, Anthropic, and more. You don’t have to use only Microsoft models.

Model Router is a deployable model in the Foundry catalog β€” you deploy it like any other model and call it via the Chat Completions API. It automatically selects the best underlying model for each request based on cost-performance trade-offs. Think of it as β€œauto-scaling for model intelligence” β€” simple requests get routed to cheaper models, complex ones to more capable models.

Key terms

Question

What is a Large Language Model (LLM)?

Click or press Enter to reveal answer

Answer

A neural network trained on vast text data that can understand and generate human language. Examples: GPT-4o, GPT-4.1, Llama 3.3. Best for complex reasoning, agents, and multi-step tasks.

Click to flip back

Question

What is a Small Language Model (SLM)?

Click or press Enter to reveal answer

Answer

A compact language model optimised for speed and cost over maximum capability. Examples: Phi-4, Phi-4-mini, Mistral Small. Best for high-volume simple tasks and edge deployment.

Click to flip back

Question

What is a multimodal model?

Click or press Enter to reveal answer

Answer

A model that can process multiple input types β€” text, images, audio, and video β€” in a single interaction. Example: GPT-4o with vision can analyse photos while answering questions about them.

Click to flip back

Question

What are Foundry Tools?

Click or press Enter to reveal answer

Answer

Pre-built Azure AI services (Search, Content Understanding, Speech, Translator) that you call as APIs. They don't require model deployment β€” you pay per use and they're optimised for specific tasks.

Click to flip back

Question

What is Model Router in Microsoft Foundry?

Click or press Enter to reveal answer

Answer

A deployable model in the Foundry catalog that you deploy and call like any other model. It automatically routes each request to the most cost-effective underlying model that can handle the task. Simple requests go to cheaper models; complex ones go to more capable models.

Click to flip back

Knowledge check

Knowledge Check

NeuralMed needs to extract patient names, dates of birth, and medication lists from scanned handwritten prescriptions. Which approach should they use?

Knowledge Check

Atlas Financial processes 50,000 customer emails daily and needs to classify each as 'complaint', 'enquiry', or 'compliment'. Which model type is most cost-effective?

Knowledge Check

MediaForge is building a content review tool that analyses both marketing images and their accompanying ad copy together. Which model type should they choose?