Domain 3 β€” Module 2 of 7 29%
23 of 28 overall
Domain 3: Implement AI Capabilities in Database Solutions Free ⏱ ~14 min read

Embeddings: Design, Chunk, and Generate

Turn your data into vectors that AI can search and compare β€” choose which columns to embed, design chunking strategies, select maintenance methods, and generate embeddings at scale.

Turning text into searchable numbers

Simple explanation

Imagine every product in a store as a point on a map.

Similar products are close together: all the red shoes are in one area, all the running shoes in another, and red running shoes are where those areas overlap. That map is what embeddings create β€” except instead of physical location, each product is represented as a list of numbers (a vector) where similar items have similar numbers.

To create this map, you need to: (1) decide what information matters (product name? description? reviews?), (2) break long text into manageable pieces (chunks), and (3) send each piece to an AI model that converts it into a vector. This module covers all three steps.

Which columns to embed

Not every column should be embedded. Choose columns that contain semantic meaning users would search for:

Column TypeEmbed?Why
Product descriptionYesRich semantic content users search with natural language
Customer review textYesContains opinions and experiences users want to find
Technical documentationYesLong-form content with searchable concepts
Product nameMaybeShort text β€” useful if combined with description
Price, quantityNoNumeric β€” use standard SQL comparison operators
Status codes, IDsNoCategorical β€” use exact match, not semantic search
TimestampsNoTemporal β€” use date range queries

Combine related columns for richer embeddings:

-- Concatenate meaningful columns before embedding
SELECT
    ProductId,
    Name + ' | ' + Category + ' | ' + Description AS EmbeddingInput
FROM Products;

Chunking: breaking long text into pieces

Embedding models have token limits (typically 512-8,192 tokens). Long documents must be split into chunks.

Chunking strategies

StrategyHow It WorksProsCons
Fixed-sizeSplit every N characters/tokensSimple, predictableMay split mid-sentence
Sentence-basedSplit on sentence boundariesPreserves meaningVariable chunk sizes
Paragraph-basedSplit on paragraph breaksNatural boundariesParagraphs vary in length
SemanticSplit where topics changeBest meaning preservationMost complex to implement
OverlappingEach chunk includes some text from adjacent chunksPreserves cross-boundary contextMore chunks to store and search

SQL Server 2025: AI_GENERATE_CHUNKS

-- Built-in chunking function
SELECT chunk, chunk_order, chunk_offset, chunk_length
FROM AI_GENERATE_CHUNKS(
    SOURCE = @text,
    CHUNK_TYPE = FIXED,
    CHUNK_SIZE = 500        -- characters (not tokens)
);
Exam tip: Chunk size affects search quality

Smaller chunks (256-512 characters) give more precise search results but lose broader context. Larger chunks (1024-2048 characters) preserve context but may dilute the relevance signal. The sweet spot depends on your content:

  • FAQ entries, short descriptions: no chunking needed (already small)
  • Product documentation: 500-character chunks with overlap
  • Legal documents: paragraph-based chunking to preserve clause boundaries
  • Conversations: message-level chunking (each message = one chunk)

The exam may present a scenario where search results are too broad β€” the fix is usually smaller chunks.

Generating embeddings

Using AI_GENERATE_EMBEDDINGS (SQL Server 2025)

-- Generate embeddings using a registered model
SELECT
    ProductId,
    AI_GENERATE_EMBEDDINGS(Description USE MODEL EmbeddingModel) AS DescriptionVector
FROM Products;

-- Store embeddings in a vector column
ALTER TABLE Products ADD DescriptionEmbedding VECTOR(1536);

UPDATE Products
SET DescriptionEmbedding = AI_GENERATE_EMBEDDINGS(Description USE MODEL EmbeddingModel);

Maintaining embeddings as data changes

When source data changes, embeddings become stale. Choose a maintenance method based on your data change patterns:

Embedding maintenance methods β€” choose based on volume, latency, and architecture
MethodHow It Detects ChangesLatencyBest For
Table triggersAFTER INSERT/UPDATE trigger regenerates embeddingImmediate (synchronous)Low-volume tables where latency matters
Change TrackingLightweight change flags on rowsNear-real-time (polling)Medium-volume, polling-based sync
Azure Functions SQL triggerServerless function triggered by row changesSeconds to minutesServerless architecture, event-driven
CDC (Change Data Capture)Full change history in change tablesNear-real-timeWhen you need the full change history, not just current state
CES (Change Event Streaming)Streams changes to Event HubsReal-time streamingHigh-volume, event-driven architectures
Azure Logic AppsWorkflow triggered by schedule or eventMinutes to hoursLow-code workflow, scheduled batch updates
Microsoft FoundryAI pipeline orchestrationConfigurableComplex AI pipelines with multiple models
Scenario: Leo's embedding pipeline at SearchWave

SearchWave’s product catalog has 500,000 products. About 1,000 products change daily (new listings, updated descriptions). Leo designs the embedding maintenance strategy:

  • Initial load: batch process all 500K products overnight using a SQL Agent job
  • Ongoing updates: Azure Functions with SQL trigger binding β€” when a product’s description changes, the function automatically regenerates its embedding
  • Why not triggers? Calling an external AI model synchronously inside a trigger would block the INSERT/UPDATE transaction. Azure Functions processes the embedding asynchronously.
Question

Why should you not embed numeric columns like Price or Quantity?

Click or press Enter to reveal answer

Answer

Embeddings capture semantic meaning in text. Numeric values like prices do not have semantic meaning β€” 29.99 and 30.00 are semantically identical but would produce different vectors. Use standard SQL comparison operators (=, >, <, BETWEEN) for numeric filtering.

Click to flip back

Question

What is the purpose of overlapping chunks?

Click or press Enter to reveal answer

Answer

Overlapping chunks include some text from adjacent chunks at the boundaries. This preserves context that would be lost if a concept spans two chunks. For example, with 50-token overlap, the last 50 tokens of chunk 1 are also the first 50 tokens of chunk 2.

Click to flip back

Question

Why should you avoid using triggers for embedding maintenance on high-volume tables?

Click or press Enter to reveal answer

Answer

Triggers run synchronously inside the transaction. Calling an external AI model from a trigger blocks the INSERT/UPDATE until the model responds β€” adding seconds of latency to every write. For high-volume tables, use async methods (Azure Functions, CES, CDC) instead.

Click to flip back

Knowledge Check

Priya at Vault Bank has a customer support database with 10 million tickets. Each ticket has a subject line and a 2,000-word description. She wants to enable semantic search. How should she prepare the data for embeddings?

Next up: Choose Your Search Strategy β€” compare full-text, vector, and hybrid search to pick the right approach for your data.