PostgreSQL on Azure: Connect, Model, Index for AI

Why PostgreSQL keeps showing up in AI architectures

Simple explanation

Azure Database for PostgreSQL Flexible Server is the relational database with vector and AI built in. Cosmos is great when your data is JSON-y; PostgreSQL is great when you want SQL, foreign keys, joins, transactions, AND AI features (pgvector, pg_search, Azure AI extension).

For AI-200, the exam tests you on three things: connecting from your container app (managed identity beats passwords), modelling AI-relevant tables (chat history, RAG sources, audit), and indexing so reads stay fast as data grows.

The pgvector extension turns Postgres into a real vector database. It lives in the same SQL transaction as everything else — that’s the killer feature.

Connecting — managed identity is the right answer

# Python with psycopg + Microsoft Entra token
import psycopg
from azure.identity import DefaultAzureCredential

cred = DefaultAzureCredential()
token = cred.get_token("https://ossrdbms-aad.database.windows.net/.default").token

conn = psycopg.connect(
    host="roo-pg.postgres.database.azure.com",
    user="roo-app-identity",       # Entra group or user-assigned managed identity name
    password=token,                # the Entra token IS the password
    dbname="roo",
    sslmode="require",
)

Three things to know:

Item	Detail
The token’s audience	Always `https://ossrdbms-aad.database.windows.net/.default`
Token lifetime	~1 hour. Refresh on reconnect; long-lived connections may need re-auth
User name	The Entra principal must be added to PostgreSQL with `CREATE USER ... WITH LOGIN` and granted permissions

-- One-time admin setup: tell PostgreSQL who its Entra users are
SELECT * FROM pgaadauth_create_principal('roo-app-identity', false, false);
GRANT CONNECT ON DATABASE roo TO "roo-app-identity";
GRANT USAGE ON SCHEMA app TO "roo-app-identity";
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA app TO "roo-app-identity";

Exam tip: 'I'm getting authentication errors with PostgreSQL'

Common causes:

The token’s audience is wrong (must be ossrdbms-aad.database.windows.net)
The Entra principal hasn’t been added with pgaadauth_create_principal
The user name in the connection string doesn’t match the Entra display name
The token has expired (1 hour lifetime)

When the question is “X service can’t connect to PostgreSQL”, first check the role assignment, then the principal registration, then the token audience.

Modelling for AI workloads

A common AI app needs three table groups: source content, embeddings, and conversation/audit logs.

-- Source content
CREATE TABLE knowledge_articles (
  id           uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id    text NOT NULL,
  title        text NOT NULL,
  body         text NOT NULL,
  language     text NOT NULL DEFAULT 'en',
  metadata     jsonb,                   -- anything tenant-specific
  created_at   timestamptz NOT NULL DEFAULT now(),
  updated_at   timestamptz NOT NULL DEFAULT now()
);

-- Embeddings (pgvector — covered in next module)
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE article_embeddings (
  article_id   uuid PRIMARY KEY REFERENCES knowledge_articles(id) ON DELETE CASCADE,
  embedding    vector(1536) NOT NULL,
  model        text NOT NULL,         -- track which model produced it
  embedded_at  timestamptz NOT NULL DEFAULT now()
);

-- Chat history
CREATE TABLE chat_messages (
  id            bigserial PRIMARY KEY,
  session_id    uuid NOT NULL,
  tenant_id     text NOT NULL,
  role          text NOT NULL CHECK (role IN ('user','assistant','tool','system')),
  content       text NOT NULL,
  tokens_in     int,
  tokens_out    int,
  created_at    timestamptz NOT NULL DEFAULT now()
);

Three modelling patterns worth knowing:

Three table-design choices that come up regularly in AI-200 questions.
Feature	text vs varchar	jsonb for variable shape	Composite primary keys
Use	Always `text` (or `text` with `CHECK length(...) <= N`)	When metadata varies per tenant or document type	Multi-tenant tables — `(tenant_id, id)` keeps rows together physically
Why	`varchar(N)` and `text` are equivalent in performance; `text` is more flexible	`jsonb` is binary, indexable with GIN, queryable with `->`, `->>`, `@>`	Improves index locality, simplifies row-level security
Trap	Premature length limits cause migrations later	JSONB hides schema drift — be deliberate	Composite keys with auto-generated columns need careful sequence management

Indexing — the cheapest speed-up you’ll ever buy

Postgres provides several index types; choose based on the query shape.

Index	What it accelerates	Best for
B-tree (default)	Equality, ranges, ORDER BY	Most queries on scalar columns
Hash	Equality only	Rare; B-tree usually wins
GIN	Containment in jsonb / arrays / full-text	`metadata @> '{"tier":"gold"}'`, `tsvector` text search
GiST	Geometric, range, full-text, vector (with extensions)	Geo, ranges, fuzzy text
BRIN	Block range index — tiny, for huge tables ordered by time	Append-only audit / log tables
HNSW / IVFFlat (pgvector)	Approximate nearest-neighbour vector search	Embedding similarity (next module)

Common AI-200 indexing patterns:

-- Frequent equality + ORDER BY on chat history
CREATE INDEX ON chat_messages (session_id, created_at DESC);

-- jsonb metadata containment query (e.g., WHERE metadata @> '{"docType":"policy"}')
CREATE INDEX ON knowledge_articles USING gin (metadata jsonb_path_ops);

-- Audit log: massive table, queries usually by time range
CREATE INDEX ON audit_events USING brin (created_at);

-- Multi-tenant lookup
CREATE INDEX ON knowledge_articles (tenant_id, language, updated_at DESC);

Real-world example: Theo's index sweep on go-live

Tidewater’s clinical assistant launched fast — and slow. P95 read latency was 1.4 s. Theo ran EXPLAIN ANALYZE on the four worst queries and found three sequential scans. The fixes:

Composite index on (tenant_id, updated_at DESC) for the recent-articles query — 1100 ms → 15 ms.
GIN index on metadata for the jsonb containment filter — 800 ms → 8 ms.
BRIN index on audit_events.created_at — keeps the index cheap on a 200M-row table.

No SDK changes. No schema changes. Three CREATE INDEX statements. P95 dropped to 60 ms.

Reading EXPLAIN ANALYZE — the troubleshooting toolkit

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM knowledge_articles
WHERE tenant_id = $1 AND language = 'en'
ORDER BY updated_at DESC
LIMIT 20;

What to look for:

Sign	Meaning
`Seq Scan` on a large table	Missing index — top suspect
`Index Scan` followed by huge `Filter`	Index covers the wrong columns or order
`Bitmap Heap Scan` with `Heap Blocks: lossy`	Bitmap was too big — boost `work_mem`
`Hash Join` with high `actual rows` × `loops`	Statistics may be stale — `ANALYZE the_table;`

Two SQL one-liners that pay for themselves repeatedly:

-- Live blocked-statement view
SELECT pid, query, state, wait_event_type, wait_event, query_start
FROM pg_stat_activity
WHERE state != 'idle' AND wait_event IS NOT NULL;

-- Cumulative slow-query digest (requires pg_stat_statements)
SELECT round(total_exec_time::numeric, 1) AS total_ms,
       calls, query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

Key terms

Question

What's the recommended way to authenticate to Azure Database for PostgreSQL Flexible Server from a Container App?