Supercharge your dbt utils Workflow
dbt utils is a collection of reusable macros, tests, and functions that simplify dbt development. It eliminates repetitive SQL, improves data quality, and promotes consistent modeling practices. By using dbt utils, teams can build reliable, scalable, and maintainable data pipelines faster and with greater confidence.
name: 'my_project'
version: '1.0.0'
config-version: 2
profile: 'my_project'
model-paths: ["models"]
seed-paths: ["seeds"]
test-paths: ["tests"]
dbt utils Integration
packages:
- package: dbt-labs/dbt_utils
version: 1.1.0
Why dbt utils ?
dbt utils is the most widely downloaded package in the dbt ecosystem, serving as the go-to standard library for dbt projects. It provides reliable, battle-tested solutions for SQL patterns that are often complex, repetitive, or error-prone, helping analysts and engineers write cleaner, more maintainable data transformations.
With dbt utils, you can easily perform tasks such as generating a complete date dimension, pivoting tables, validating referential integrity across your data warehouse, and more.
Key Features
dbt utils provides everything you need to build robust, efficient, and reliable data pipelines effortlessly.
Data Transformation Macros
dbt utils provides powerful SQL generating macros that simplify complex transformations. Instead of manually writing repetitive SQL for reshaping datasets, you can use reusable building blocks that adapt to your data structure. These macros are especially helpful when working with wide tables, inconsistent schemas, or dynamic reporting needs. They reduce human error, improve readability, and make transformations easier to maintain over time.
- Key macros include:
- union_relations
- pivot
- unpivot
- star
- group_by
- safe_add
- safe_subtract
Generic Data Tests
Data quality is a core part of analytics engineering, and dbt utils extends dbt’s testing capabilities with more advanced and flexible tests. These tests help validate business logic, enforce data integrity rules, and catch unexpected changes early in your pipeline. They are configurable and reusable across models, making it easy to apply consistent quality standards throughout your warehouse.
- Common tests include:
- unique_combination_of_columns
- not_null_proportion
- accepted_range
- expression_is_true
- equal_rowcount
- fewer_rows_than
- at_least_one
DRY (Don’t Repeat Yourself) SQL
dbt utils encourages modular, reusable SQL patterns. Instead of copying and pasting logic across models, you can centralize common operations into macros. This keeps your codebase cleaner, reduces duplication, and makes future updates much easier. It also improves collaboration since teams can rely on shared transformation standards.
- Macros that support DRY principles:
- star
- group_by
- generate_surrogate_key
- union_relations
Surrogate Key Generation
Creating reliable unique identifiers is essential when source systems don’t provide clean primary keys. dbt utils offers a standardized way to generate hashed surrogate keys using multiple columns. This ensures consistency across models and avoids issues caused by null values, spacing differences, or concatenation mistakes. The macro works across different databases, so your logic stays portable.
- Main macro:
- generate_surrogate_key
Date & Series Generation
Time-based analysis often requires a continuous date or timestamp series even when your raw data has gaps. dbt utils helps generate these sequences using simple macros, making it easy to build date dimensions, fill missing periods, or standardize reporting timeframes. This is especially valuable for dashboards, retention analysis, and forecasting.
- Key macro:
- date_spine
Cross-Database Compatibility
Different data warehouses have slightly different SQL syntax, which can make dbt projects harder to maintain when switching platforms. dbt utils smooths over these differences by abstracting warehouse specific logic into macros. This allows teams to write transformations once and run them anywhere, improving portability and reducing vendor lock-in.
- Examples of cross-db helpers:
- type_string
- type_int
- current_timestamp
- dateadd
- datediff
Most commonly used macros
Here are some of the most loved macros that data teams use every day. Each one solves a common problem elegantly.
surrogate_key
Create stable hashed keys from multiple columns.
{{ dbt_utils.surrogate_key(['user_id', 'order_date']) }}
Very common in fact tables.
generate_surrogate_key
Newer + preferred over surrogate_key)
{{ dbt_utils.generate_surrogate_key(['user_id', 'order_date']) }}
Handles nulls more safely.
star
Select all columns except a few (life saver).
select
{{ dbt_utils.star(from=ref('raw_users'), except=['password']) }}
from {{ ref('raw_users') }}
union_relations
Union multiple models with different schemas.
select
{{ dbt_utils.union_relations(
relations=[
ref('events_2023'),
ref('events_2024')
]
) }}
Automatically aligns columns.
pivot
Turn rows into columns.
{{ dbt_utils.pivot(
'status',
['active', 'inactive', 'banned']
) }}
group_by
Dynamic GROUP BY without counting columns manually.
select
country,
sum(revenue)
from {{ ref('sales') }}
{{ dbt_utils.group_by(1) }}
Generic tests (huge deal)
Unique combination test
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- user_id
- order_date
tests:
- dbt_utils.not_null_proportion:
at_least: 0.99
Installation of dbt Utils
Installing dbt utils is quick and works the same across all supported data warehouses.
Prerequisites
Before installing dbt utils, make sure you have:
- dbt Core installed (v1.0+ recommended)
- A working dbt project
- Access to one of the supported warehouses (Snowflake, BigQuery, Redshift, Postgres, Databricks, etc.)
You can verify dbt is installed by running:
dbt --version;
Step 1: Add dbt utils to packages.yml
Inside your dbt project, open or create a file called packages.yml in the root directory.
Add the following:
packages:
- package: dbt-labs/dbt_utils
version: ">=1.0.0"
Versioning tip
Use >=1.0.0 to get recent features
Pin an exact version for production stability:
version: "1.1.1"
Step 2: Install the Package
Run this command in your project directory:
dbt deps
This will:
- Download dbt utils
- Store it in the dbt_packages/ directory
- Make all macros available to your project
Step 3: Verify Installation
You can confirm dbt utils is installed by running:
dbt debug
Or by using a macro in a model:
select
{{ dbt_utils.generate_surrogate_key(['id', 'created_at']) }} as surrogate_key
from {{ ref('raw_orders') }}
Common Installation Issues
Package not found
- Check YAML indentation
- Ensure packages.yml is in the project root
Version conflicts
Align dbt Core and dbt utils versions
Clear packages and reinstall:
rm -rf dbt_packages package-lock.yml
dbt deps
What’s Next?
After installing dbt utils, explore popular macros like:
- generate_surrogate_key
- star
- union_relations
- pivot
- date_spine
Real-World Use Cases of dbt utils
From startups to large enterprises, data teams use dbt utils to solve common, real-world data modeling and analytics challenges. Below are practical ways you can apply dbt utils in your own projects.
Analytics Engineering
Analytics engineers use dbt utils to build scalable, maintainable analytics models faster.
- Date spines (date_spine) help generate complete time-series tables, even when source data is missing dates.
- Pivots (pivot) convert metric rows into clean, analysis-ready columns.
- Surrogate keys (generate_surrogate_key) enable consistent dimensional modeling across facts and dimensions.
- Result: Cleaner models, less repetitive SQL, and faster iteration cycles.
Customer 360 Views
Creating a unified customer view often means working with multiple data sources that don’t align perfectly.
- Use union_relations to combine customer data from CRMs, marketing platforms, and support tools.
- Create stable, unique customer identifiers using generate_surrogate_key across email, user IDs, or device IDs.
- Result: A single, trustworthy customer record that supports segmentation, lifecycle analysis, and personalization.
E-commerce Reporting
E-commerce analytics requires flexible reporting across products, orders, and time.
- Build order cohorts using date logic and surrogate keys.
- Calculate running totals and rolling metrics with cleaner GROUP BY logic (group_by).
- Create product category pivots with pivot to analyze sales performance across dimensions.
- Result: Faster insights into revenue, customer behavior, and product performance.
Historical Data Analysis
Missing dates can break trend analysis and visualizations.
- Use date_spine to generate continuous date ranges.
- Join fact data to the spine to fill gaps where no events occurred.
- Analyze metrics across complete and consistent timelines.
- Result:Accurate trend analysis, better forecasting, and cleaner dashboards.
Financial Modeling
Finance teams rely on precise calculations and strict data quality.
- Build custom fiscal calendars using date_spine.
- Calculate cumulative metrics such as YTD revenue and expenses.
- Enforce data integrity with dbt utils testing macros like unique combinations and not-null thresholds.
- Result:Reliable financial models that stakeholders can trust.
Data Warehouse Design
dbt utils is commonly used to design robust warehouse architectures.
- Implement slowly changing dimensions (SCDs) using surrogate keys and audit columns.
- Add standardized audit fields (created_at, updated_at).
- Maintain referential integrity between fact and dimension tables with testing macros.
- Result:A scalable warehouse that supports growth, governance, and long-term analytics.
Real-World Examples
These examples show how dbt utils is applied in real dbt projects to solve common data modeling problems. Each scenario reflects challenges faced by analytics engineers and how dbt utils provides clean, scalable solutions.
-- Create a unique key from multiple columns
SELECT
{{ dbt_utils.generate_surrogate_key([
'customer_id',
'order_date',
'product_id'
]) }} as order_key,
customer_id,
order_date,
product_id,
quantity,
total_amount
FROM {{ ref('stg_orders') }}
-- Generate a complete calendar table
WITH date_spine AS (
{{ dbt_utils.date_spine(
datepart="day",
start_date="cast('2020-01-01' as date)",
end_date="current_date"
) }}
)
SELECT
date_day,
EXTRACT(YEAR FROM date_day) as year,
EXTRACT(MONTH FROM date_day) as month,
EXTRACT(DOW FROM date_day) as day_of_week
FROM date_spine
-- Turn row values into columns
SELECT
customer_id,
{{ dbt_utils.pivot(
'payment_method',
dbt_utils.get_column_values(
ref('stg_payments'),
'payment_method'
),
agg='sum',
then_value='amount'
) }}
FROM {{ ref('stg_payments') }}
GROUP BY customer_id
-- Combine tables with same structure
{{ dbt_utils.union_relations(
relations=[
ref('events_web'),
ref('events_mobile'),
ref('events_api')
],
include=['event_id', 'user_id', 'event_name', 'created_at'],
source_column_name='event_source'
) }}
Fact Table Modeling
Problem
When building fact tables, teams often need a single, stable primary key.
However:
- Source systems rarely provide one
- Keys may be spread across multiple columns
- Null values can cause inconsistent joins
- This makes downstream joins unreliable and breaks dimensional modeling best practices.
Solution: Using generate_surrogate_key
The generate_surrogate_key macro creates a hashed, deterministic key from one or more columns.
It:
- Handles null values safely
- Produces consistent keys across models
- Works across all supported warehouses
- This makes downstream joins unreliable and breaks dimensional modeling best practices.
Final SQL Example
select
{{ dbt_utils.generate_surrogate_key([
'order_id',
'order_date',
'customer_id'
]) }} as order_key,
customer_id,
order_date,
revenue
from {{ ref('stg_orders') }}
Result
A clean, analytics-ready fact table with a reliable primary key that supports joins, snapshots, and slowly changing dimensions.
Unioning Multiple Sources
Problem
Data often arrives from multiple systems or time-partitioned tables, such as:
- Historical and current data sources
- Region-specific event tables
- Platform-specific schemas
- Manually unioning these tables is error-prone, especially when schemas differ.
Solution: Using union_relations
The union_relations macro automatically:
- Aligns column names
- Handles missing columns
- Applies consistent data types
- Manually unioning these tables is error-prone, especially when schemas differ.
Final SQL Example
{{ dbt_utils.union_relations(
relations=[
ref('events_web'),
ref('events_mobile'),
ref('events_legacy')
]
) }}
Result
A single, unified dataset that can be modeled, tested, and queried consistently—without manual schema management.
Best Practices
Make the most of dbt utils by following these guidelines from experienced data engineers.
Do This
- Use surrogate_key for dimension tables instead of natural keys
- Leverage date_spine for time-series analysis and reporting
- Store get_column_values results in variables for reuse
- Use star() with except to avoid selecting unnecessary columns
- Pin your dbt_utils version in packages.yml for stability
Avoid This
- Don't call get_column_values multiple times for the same column
- Avoid using pivot on high-cardinality columns
- Don't override default macro behaviors unless necessary
- Skip union_relations for tables with different schemas
- Don't hardcode values that could be dynamic
Works With Your Stack
dbt utils is database-agnostic. It automatically adapts to your warehouse, so you can switch platforms without changing your code.
Snowflake
BigQuery
Redshift
Databricks
PostgreSQL
DuckDB
Spark
Trino
Database Adapters
Macros detect your adapter and generate the right SQL syntax automatically.
Cloud Native
Optimized for cloud data warehouses with parallel processing support.
dbt Core & Cloud
Works seamlessly with both dbt Core and dbt Cloud environments.
Frequently Asked Question
What is dbt-utils?
dbt-utils is an open-source package that provides reusable macros and tests to simplify common SQL patterns in dbt projects.
Why should I use dbt-utils?
It saves time, reduces repetitive SQL, improves consistency, and adds powerful testing and modeling capabilities without reinventing the wheel.
How do I install dbt-utils?
Add it to your packages.yml file and run dbt deps to install the package.
Is dbt-utils compatible with all databases?
Most macros are adapter-aware and work across major warehouses like Snowflake, BigQuery, Redshift, and Postgres, though some macros are adapter-specific.
How do I check the version of dbt-utils I’m using?
The version is defined in your packages.yml file and can be confirmed in the dbt_packages directory after running dbt deps.
Is dbt-utils officially supported by dbt Labs?
Yes, dbt-utils is maintained by dbt Labs and the dbt community.
What are macros in dbt-utils?
Macros are reusable SQL snippets written in Jinja that help generate dynamic and consistent SQL logic.
What is the surrogate_key macro used for?
It generates a hashed surrogate key from one or more columns, useful for dimensional models.
How does union_relations work?
It unions multiple relations together while automatically aligning column names and data types.
What does get_column_values do?
Yes, you can override macros by redefining them in your own project’s macros directory.
Can I customize dbt-utils macros?
The version is defined in your packages.yml file and can be confirmed in the dbt_packages directory after running dbt deps.
Are dbt-utils macros safe to use in production?
Yes, they are widely used in production environments, but you should test them with your warehouse and data volumes.
What kind of tests does dbt-utils provide?
dbt-utils offers advanced schema tests like unique_combination_of_columns, accepted_range, and not_null_proportion.
How is unique_combination_of_columns different from unique?
It ensures that a combination of columns is unique, rather than testing uniqueness on a single column.
What is the accepted_values test used for?
It checks whether a column contains only a predefined set of allowed values.
Can dbt-utils tests be used in schema.yml files?
Yes, dbt-utils tests are designed to be declared directly in schema.yml files.
Are dbt-utils tests warehouse-optimized?
Most tests are written to be efficient, but performance depends on table size and warehouse configuration.
Can I create my own custom tests using dbt-utils patterns?
Absolutely — dbt-utils provides great reference patterns for building custom generic tests.
Does dbt-utils impact query performance?
The macros themselves don’t run queries, but the SQL they generate can affect performance depending on how they’re used.
When should I avoid using dbt-utils macros?
Avoid them if a macro adds unnecessary complexity or hides logic that needs to be explicit for your use case.
How do I keep dbt-utils up to date?
Regularly update the version in packages.yml and review the changelog for breaking changes.
Can dbt-utils help with cross-database compatibility?
Yes, many macros abstract adapter-specific SQL, making models more portable across warehouses.
What’s the best way to learn dbt-utils?
Start with the official documentation, then explore the macro source code to understand how each one works.
Is dbt-utils suitable for large-scale analytics projects?
Yes, it’s commonly used in enterprise-scale dbt projects to enforce standards and speed up development.