Modern analytics teams thrive on accurate, timely data. But one of the biggest challenges is tracking how data changes over time. Enter snapshot tables in dbt, a powerful feature that lets you maintain historical records of your data with minimal hassle. In this post, we’ll explore why snapshot tables are useful, how they work, and how to set them up in dbt.
What Are Snapshot Tables?
A snapshot table is essentially a historical record of your data. Unlike standard tables that only reflect the latest state, a snapshot table captures changes over time. This is particularly useful for:
- Tracking slowly changing dimensions (SCDs) like customer addresses, subscription status, or pricing tiers.
- Auditing changes in key business data for compliance or internal reporting.
- Building trend analysis or cohort analysis dashboards.
Think of a snapshot as a “time machine” for your data—you can see not just the current state, but every meaningful change that happened along the way.
Why Snapshot Tables Are Useful
- Historical Insights
Without snapshots, your tables often only reflect the latest data, making it hard to see trends over time. Snapshots give you full historical context, allowing for better decision-making. - Auditability
In regulated industries or teams that require data verification, snapshots provide an audit trail of how data has changed, making it easier to answer questions like: “When did this customer change plans?” or “What was the previous price?” - Simplifying Slowly Changing Dimensions
Many analytical tasks require knowing past states of records. Snapshots automate this tracking, so you don’t need complex ETL pipelines to handle historical versions of your data. - Data Reliability
By storing changes systematically, snapshots reduce the risk of overwriting or losing critical historical data.
How Snapshots Work in dbt
dbt supports snapshot tables natively using the snapshots feature. Here’s the high-level idea:
- Identify a table you want to track.
- Specify a unique key (like
user_idororder_id) to identify records. - Specify which columns to track for changes.
- dbt will store a new record each time a tracked column changes, along with timestamps for when the record was valid.
This creates a table where each row includes:
- The record’s data at a point in time.
dbt_valid_from– when this version became valid.dbt_valid_to– when it was superseded.
Setting Up Snapshot Tables in dbt
Here’s a step-by-step guide:
1. Create a Snapshot File
Inside your snapshots/ directory, create a file like customer_snapshots.sql.
{% snapshot customer_snapshots %}
{{
config( target_schema='analytics', unique_key='customer_id', strategy='timestamp', updated_at='updated_at'
)
}} SELECT
customer_id,
email,
status,
updated_at FROM {{ source('raw', 'customers') }}{% endsnapshot %}
Key Config Options:
unique_key: Column that uniquely identifies the record.strategy: Can betimestamp(track changes based on a timestamp column) orcheck(compare values to detect changes).updated_at: The column used to detect changes when using thetimestampstrategy.
2. Run the Snapshot
Once the snapshot file is ready, run:
dbt snapshot
dbt will create a snapshot table and populate it with the current state of your data. On subsequent runs, it will add new rows only when data changes, preserving history.
3. Querying Your Snapshot
Once snapshots are running, you can query historical data easily. For example, to see all versions of a customer record:
SELECT * analytics.customer_snapshots
FROMWHERE customer_id = 123 dbt_valid_from;
ORDER BY
You can also query state at a specific point in time:
SELECT * analytics.customer_snapshots
FROMWHERE '2026-01-01' BETWEEN dbt_valid_from AND dbt_valid_to;
Best Practices
- Pick the right strategy: Use
timestampif your source has anupdated_atcolumn. Usecheckif you need to track changes across multiple columns without a reliable timestamp. - Limit tracked columns: Only track columns you care about; fewer columns mean faster snapshots and smaller tables.
- Schedule snapshots regularly: For high-frequency data changes, consider running snapshots daily or hourly.
- Monitor table growth: Snapshots can grow large over time. Consider partitioning or archiving old data if your warehouse supports it.
Conclusion
Snapshot tables in dbt are a game-changer for analytics teams that need historical insight, auditability, and accurate slowly changing dimensions. With minimal setup, you can start tracking changes in your data and unlock richer, more actionable insights.
Once you start using snapshots, you’ll wonder how you ever did analytics without them!
