Domain 1 — Module 4 of 5 80%
4 of 28 overall
Domain 1: Set Up and Configure an Azure Databricks Environment Free ⏱ ~14 min read

Unity Catalog: The Three-Level Namespace

Unity Catalog organises every data asset in a three-level hierarchy: catalog > schema > object. Master the naming conventions, structure, and governance that underpins every other domain in DP-750.

What is Unity Catalog?

Simple explanation

Unity Catalog is the filing system for your entire lakehouse.

Imagine a massive office building. Unity Catalog is the building directory. It tells you:

  • Which floor (catalog) — e.g., “Sales Department,” “Finance Department”
  • Which room (schema) — e.g., “Sales Raw Data,” “Sales Reports”
  • Which file cabinet (table, view, volume) — the actual data

Without Unity Catalog, data is scattered across clusters with no central directory. With it, every table, view, and file has a registered address that anyone authorised can find.

The three-level namespace

Every data object in Unity Catalog has a fully qualified name:

catalog.schema.object
LevelPurposeAnalogyExample
CatalogTop-level container — typically one per business domain or environmentFloor in a buildingsales, finance, dev_sandbox
Schema (database)Groups related objects within a catalogRoom on a floorsales.raw, sales.curated, sales.reports
ObjectThe actual data — tables, views, volumes, functionsFile cabinet in a roomsales.curated.daily_revenue
-- Fully qualified reference
SELECT * FROM sales.curated.daily_revenue;

-- Set default catalog and schema to avoid typing the full path
USE CATALOG sales;
USE SCHEMA curated;
SELECT * FROM daily_revenue;  -- now just the table name

Naming conventions

The exam tests your ability to design naming conventions for different requirements:

By environment (isolation)

PatternExampleUse Case
Separate catalogs per environmentdev_sales, staging_sales, prod_salesFull data isolation between dev/staging/prod
Separate schemas per environmentsales.dev_raw, sales.staging_raw, sales.prod_rawShared catalog, environment isolation at schema level
Separate catalogs per teamdata_engineering, data_science, analyticsTeam-based isolation

Dr. Sarah Okafor at Athena Group chooses separate catalogs per environment (dev, staging, prod) because her security team requires complete isolation — developers must never accidentally query production data.

By external sharing

When sharing data with external partners, create a dedicated sharing catalog:

-- Catalog specifically for data shared via Delta Sharing
CREATE CATALOG IF NOT EXISTS shared_external;
CREATE SCHEMA IF NOT EXISTS shared_external.partner_freshmart;

Exam tip: Naming conventions are tested via scenarios. If the question mentions “isolation” or “prevent accidental access” — think separate catalogs. If it mentions “sharing” — think dedicated sharing catalog.

Real-world naming patterns

Common patterns seen in production:

PatternStructureProsCons
env_domainprod_sales.raw.ordersClear environment + domainLots of catalogs
domain with env schemassales.prod_raw.ordersFewer catalogsLess isolation
domain with env catalogsprod.sales.ordersClean, hierarchicalRequires strict permissions

Most enterprise teams use environment-first (prod_sales, dev_sales) because Unity Catalog supports hierarchical inheritance. However, object-level privileges (USE CATALOG, USE SCHEMA, SELECT) are still required for data access — inheritance simplifies management but doesn’t grant blanket access without explicit permissions.

Creating catalogs

-- Create a catalog
CREATE CATALOG IF NOT EXISTS prod_sales
  COMMENT 'Production sales data for all regions';

-- View all catalogs
SHOW CATALOGS;

-- Describe a catalog
DESCRIBE CATALOG prod_sales;

A catalog is bound to a storage location — where its managed tables physically store data. By default, catalogs use the metastore’s root storage. You can override this:

-- Catalog with custom storage
CREATE CATALOG prod_sales
  MANAGED LOCATION 'abfss://sales-container@adlsaccount.dfs.core.windows.net/prod';

Creating schemas

-- Create a schema within a catalog
CREATE SCHEMA IF NOT EXISTS prod_sales.raw
  COMMENT 'Raw ingested data before any transformation';

CREATE SCHEMA IF NOT EXISTS prod_sales.curated
  COMMENT 'Cleaned, validated, and conformed data';

CREATE SCHEMA IF NOT EXISTS prod_sales.aggregated
  COMMENT 'Business-level aggregates for reporting';

Schemas can also have a managed location that overrides the catalog’s default:

CREATE SCHEMA prod_sales.sensitive
  MANAGED LOCATION 'abfss://secure-container@adlsaccount.dfs.core.windows.net/sensitive';

Volumes: the file storage layer

Volumes are Unity Catalog’s way of managing files (not tables). Think of volumes as governed file directories:

Volume TypeDescriptionUse Case
Managed volumeFiles stored in Unity Catalog’s managed storageInternal data files, staging area
External volumePoints to existing ADLS/S3 locationLanding zone for incoming data files
-- Create a managed volume
CREATE VOLUME IF NOT EXISTS prod_sales.raw.landing_files
  COMMENT 'Landing zone for CSV/JSON files from partners';

-- Create an external volume pointing to existing storage
CREATE EXTERNAL VOLUME prod_sales.raw.partner_uploads
  LOCATION 'abfss://partner-landing@adlsaccount.dfs.core.windows.net/uploads';

Ravi uses volumes at DataPulse to manage the CSV files that clients upload before they’re ingested into Delta tables.

Volumes vs. tables: when to use which
UseTablesVolumes
Structured data (rows and columns)
Files (CSV, JSON, images, PDFs)
Queryable with SQLRead with read_files() or COPY INTO
Schema enforcement✅ (Delta)❌ (files are as-is)
Governed by Unity Catalog

Exam tip: Volumes are for files, tables are for data. If the question mentions “landing zone for raw files” or “file-based ingestion” — think volumes.

Question

What are the three levels of the Unity Catalog namespace?

Click or press Enter to reveal answer

Answer

Catalog > Schema > Object. Fully qualified name: catalog.schema.table_name. Catalogs group by domain/environment, schemas group related objects, objects are the actual tables/views/volumes.

Click to flip back

Question

When should you use separate catalogs per environment vs. separate schemas?

Click or press Enter to reveal answer

Answer

Separate catalogs (dev_sales, prod_sales) for full isolation — permissions inherit at catalog level and prevent accidental cross-environment access. Separate schemas (sales.dev_raw, sales.prod_raw) when teams need shared catalog-level resources but environment separation at the data level.

Click to flip back

Question

What is a volume in Unity Catalog?

Click or press Enter to reveal answer

Answer

A governed file storage object. Managed volumes store files in UC's managed storage. External volumes point to existing ADLS locations. Use volumes for files (CSV, JSON, images) — use tables for structured data.

Click to flip back

Question

What is the naming convention for data shared externally via Delta Sharing?

Click or press Enter to reveal answer

Answer

Create a dedicated sharing catalog (e.g., shared_external) with schemas per partner. This isolates shared data from internal catalogs and makes permission management clear.

Click to flip back

Knowledge check

Knowledge Check

Dr. Sarah Okafor is designing Athena Group's Unity Catalog structure. She needs to ensure developers can never accidentally query or modify production data. Which naming strategy should she recommend?

Knowledge Check

Mei Lin needs to set up a landing zone at Freshmart where external suppliers upload CSV files before they're ingested into Delta tables. The files should be governed by Unity Catalog. What should she create?

Knowledge Check

Tomás is creating the data catalog structure for NovaPay's fraud detection system. He has three data layers: raw transactions, enriched data, and fraud alerts. Which Unity Catalog structure follows best practices?


Next up: Tables, Views & External Catalogs — managed vs external tables, views, materialized views, foreign catalogs, DDL, and AI/BI Genie.