PostgreSQL on Azure: Connect, Model, Index for AI
Azure Database for PostgreSQL Flexible Server is the AI-ready relational option. Connection patterns with managed identity, schema modelling for AI workloads, and indexing strategies for low-latency reads.
Why PostgreSQL keeps showing up in AI architectures
Azure Database for PostgreSQL Flexible Server is the relational database with vector and AI built in. Cosmos is great when your data is JSON-y; PostgreSQL is great when you want SQL, foreign keys, joins, transactions, AND AI features (pgvector, pg_search, Azure AI extension).
For AI-200, the exam tests you on three things: connecting from your container app (managed identity beats passwords), modelling AI-relevant tables (chat history, RAG sources, audit), and indexing so reads stay fast as data grows.
The pgvector extension turns Postgres into a real vector database. It lives in the same SQL transaction as everything else β thatβs the killer feature.
Connecting β managed identity is the right answer
# Python with psycopg + Microsoft Entra token
import psycopg
from azure.identity import DefaultAzureCredential
cred = DefaultAzureCredential()
token = cred.get_token("https://ossrdbms-aad.database.windows.net/.default").token
conn = psycopg.connect(
host="roo-pg.postgres.database.azure.com",
user="roo-app-identity", # Entra group or user-assigned managed identity name
password=token, # the Entra token IS the password
dbname="roo",
sslmode="require",
)
Three things to know:
| Item | Detail |
|---|---|
| The tokenβs audience | Always https://ossrdbms-aad.database.windows.net/.default |
| Token lifetime | ~1 hour. Refresh on reconnect; long-lived connections may need re-auth |
| User name | The Entra principal must be added to PostgreSQL with CREATE USER ... WITH LOGIN and granted permissions |
-- One-time admin setup: tell PostgreSQL who its Entra users are
SELECT * FROM pgaadauth_create_principal('roo-app-identity', false, false);
GRANT CONNECT ON DATABASE roo TO "roo-app-identity";
GRANT USAGE ON SCHEMA app TO "roo-app-identity";
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA app TO "roo-app-identity";
Exam tip: 'I'm getting authentication errors with PostgreSQL'
Common causes:
- The tokenβs audience is wrong (must be
ossrdbms-aad.database.windows.net) - The Entra principal hasnβt been added with
pgaadauth_create_principal - The user name in the connection string doesnβt match the Entra display name
- The token has expired (1 hour lifetime)
When the question is βX service canβt connect to PostgreSQLβ, first check the role assignment, then the principal registration, then the token audience.
Modelling for AI workloads
A common AI app needs three table groups: source content, embeddings, and conversation/audit logs.
-- Source content
CREATE TABLE knowledge_articles (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id text NOT NULL,
title text NOT NULL,
body text NOT NULL,
language text NOT NULL DEFAULT 'en',
metadata jsonb, -- anything tenant-specific
created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now()
);
-- Embeddings (pgvector β covered in next module)
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE article_embeddings (
article_id uuid PRIMARY KEY REFERENCES knowledge_articles(id) ON DELETE CASCADE,
embedding vector(1536) NOT NULL,
model text NOT NULL, -- track which model produced it
embedded_at timestamptz NOT NULL DEFAULT now()
);
-- Chat history
CREATE TABLE chat_messages (
id bigserial PRIMARY KEY,
session_id uuid NOT NULL,
tenant_id text NOT NULL,
role text NOT NULL CHECK (role IN ('user','assistant','tool','system')),
content text NOT NULL,
tokens_in int,
tokens_out int,
created_at timestamptz NOT NULL DEFAULT now()
);
Three modelling patterns worth knowing:
| Feature | text vs varchar | jsonb for variable shape | Composite primary keys |
|---|---|---|---|
| Use | Always `text` (or `text` with `CHECK length(...) <= N`) | When metadata varies per tenant or document type | Multi-tenant tables β `(tenant_id, id)` keeps rows together physically |
| Why | `varchar(N)` and `text` are equivalent in performance; `text` is more flexible | `jsonb` is binary, indexable with GIN, queryable with `->`, `->>`, `@>` | Improves index locality, simplifies row-level security |
| Trap | Premature length limits cause migrations later | JSONB hides schema drift β be deliberate | Composite keys with auto-generated columns need careful sequence management |
Indexing β the cheapest speed-up youβll ever buy
Postgres provides several index types; choose based on the query shape.
| Index | What it accelerates | Best for |
|---|---|---|
| B-tree (default) | Equality, ranges, ORDER BY | Most queries on scalar columns |
| Hash | Equality only | Rare; B-tree usually wins |
| GIN | Containment in jsonb / arrays / full-text | metadata @> '{"tier":"gold"}', tsvector text search |
| GiST | Geometric, range, full-text, vector (with extensions) | Geo, ranges, fuzzy text |
| BRIN | Block range index β tiny, for huge tables ordered by time | Append-only audit / log tables |
| HNSW / IVFFlat (pgvector) | Approximate nearest-neighbour vector search | Embedding similarity (next module) |
Common AI-200 indexing patterns:
-- Frequent equality + ORDER BY on chat history
CREATE INDEX ON chat_messages (session_id, created_at DESC);
-- jsonb metadata containment query (e.g., WHERE metadata @> '{"docType":"policy"}')
CREATE INDEX ON knowledge_articles USING gin (metadata jsonb_path_ops);
-- Audit log: massive table, queries usually by time range
CREATE INDEX ON audit_events USING brin (created_at);
-- Multi-tenant lookup
CREATE INDEX ON knowledge_articles (tenant_id, language, updated_at DESC);
Real-world example: Theo's index sweep on go-live
Tidewaterβs clinical assistant launched fast β and slow. P95 read latency was 1.4 s. Theo ran EXPLAIN ANALYZE on the four worst queries and found three sequential scans. The fixes:
- Composite index on
(tenant_id, updated_at DESC)for the recent-articles query β 1100 ms β 15 ms. - GIN index on
metadatafor the jsonb containment filter β 800 ms β 8 ms. - BRIN index on
audit_events.created_atβ keeps the index cheap on a 200M-row table.
No SDK changes. No schema changes. Three CREATE INDEX statements. P95 dropped to 60 ms.
Reading EXPLAIN ANALYZE β the troubleshooting toolkit
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM knowledge_articles
WHERE tenant_id = $1 AND language = 'en'
ORDER BY updated_at DESC
LIMIT 20;
What to look for:
| Sign | Meaning |
|---|---|
Seq Scan on a large table | Missing index β top suspect |
Index Scan followed by huge Filter | Index covers the wrong columns or order |
Bitmap Heap Scan with Heap Blocks: lossy | Bitmap was too big β boost work_mem |
Hash Join with high actual rows Γ loops | Statistics may be stale β ANALYZE the_table; |
Two SQL one-liners that pay for themselves repeatedly:
-- Live blocked-statement view
SELECT pid, query, state, wait_event_type, wait_event, query_start
FROM pg_stat_activity
WHERE state != 'idle' AND wait_event IS NOT NULL;
-- Cumulative slow-query digest (requires pg_stat_statements)
SELECT round(total_exec_time::numeric, 1) AS total_ms,
calls, query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
Key terms
Knowledge check
Mira's Container App connects to Azure Database for PostgreSQL with a stored password. Security wants the password removed. What's the recommended replacement?
Theo's chat-message table is a huge time-ordered append-only table. The most common query is `WHERE created_at > $1 AND created_at < $2`. Which index type is the most space-efficient choice?
Lin sees `Seq Scan on knowledge_articles ... rows=2_400_000` in EXPLAIN ANALYZE. The query is `WHERE tenant_id = $1 AND language = $2 ORDER BY updated_at DESC LIMIT 20`. There's no index on these columns. What's the right index to add?