The Fundamentals of dbt

This week we are learning dbt, and this blog shares what I picked up on day 1.

What is dbt?

dbt (data build tool) helps data analysts and engineers transform raw data into usable data. It is primarily used for data transformation, and it lets users build, manage and document their data models in SQL.

SQL Similarities

SQL is used for querying, updating and managing data.

dbt is a tool built on top of SQL, specifically designed for transforming raw data into data models. It helps automate and manage the process of writing and running SQL transformations. dbt can be seen as a framework that helps users organise and automate SQL queries in a more structured way.
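
To make the relationship with SQL concrete, here is a minimal sketch of what a dbt model can look like: a single SELECT statement saved as a .sql file in the project. The file name, the raw.customers table and its columns are assumptions made up for illustration.

```sql
-- models/customers_clean.sql (hypothetical file name)
-- A dbt model is one SELECT statement. When `dbt run` executes, dbt wraps
-- this query in the statement that creates the view or table for you.
select
    customer_id,
    lower(email) as email,
    created_at
from raw.customers          -- hypothetical raw table already in the warehouse
where customer_id is not null
```

The point of the sketch is that you only write the query; creating and replacing the resulting object in the warehouse is left to dbt.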

Key Features

  • Modularisation is a key aspect of dbt. It allows users to break down complex SQL transformations into smaller, reusable pieces.
  • Documentation for the models is easy to generate from the project itself.

dbt Project Structure

A typical project contains the following main folders and files:

  • models/: SQL transformation files
    • You can organise your models into different layers.
    • Staging models (e.g. stg_): These models typically read the raw data that has been loaded into the warehouse and apply basic cleaning (e.g. removing nulls, formatting columns).
    • Intermediate models (e.g. int_): These models perform more complex transformations, like aggregating data or joining tables.
    • Final models (e.g. fact_ or dim_): These models often create fact tables (e.g. sales) or dimension tables (e.g. customers) for analytical use.
  • macros/: custom SQL macros (reusable functions)
  • tests/: contains tests for data integrity and quality
  • dbt_project.yml: configuration file that defines project settings
  • profiles.yml: database connection configurations (often kept in ~/.dbt/ rather than inside the project folder)
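
The layering described above can be sketched with one final model that builds on two staging models. This is only an illustration: the model names (stg_orders, stg_customers, fact_sales), the folder and the columns are hypothetical, and the staging models are assumed to exist already.

```sql
-- models/marts/fact_sales.sql (hypothetical path and model name)
-- Final model: joins two staging models into a fact table for analysis.
with orders as (
    select * from {{ ref('stg_orders') }}      -- assumed staging model
),

customers as (
    select * from {{ ref('stg_customers') }}   -- assumed staging model
)

select
    orders.order_id,
    orders.customer_id,
    customers.country,
    orders.order_date,
    orders.amount
from orders
left join customers
    on orders.customer_id = customers.customer_id
```

Because the final model points at the staging models with ref rather than hard-coded table names, dbt can work out that the staging models have to be built first.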

dbt Commands

  • dbt run: runs the models in the project, executing the transformations
  • dbt test: runs the tests defined for the project
  • dbt seed: loads CSV files into the data warehouse
  • dbt docs generate: generates documentation for your models

Key dbt Syntax

  • {{ ref('model_name') }}: References another model in the project
  • {{ config(materialized='table') }}: Sets how the model is materialised in the warehouse (for example as a table or a view); it is written at the top of the model's SQL file
  • {{ macro_name() }}: Calls a defined macro
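
As a rough sketch of how these three pieces fit together, the first block below defines a macro and the second uses it in a model alongside config and ref. The macro name pence_to_pounds, the model names and the amount_pence column are all invented for this example, and the two blocks would live in separate files.

```sql
-- macros/pence_to_pounds.sql (hypothetical macro file)
-- Takes a column name and returns a SQL expression converting pence to pounds.
{% macro pence_to_pounds(column_name) %}
    round({{ column_name }} / 100.0, 2)
{% endmacro %}
```

```sql
-- models/int_order_amounts.sql (hypothetical model file)
-- Materialise this model as a table instead of the default view.
{{ config(materialized='table') }}

select
    order_id,
    {{ pence_to_pounds('amount_pence') }} as amount_pounds
from {{ ref('stg_orders') }}   -- assumed staging model
```

When the model is compiled, the macro call is replaced by the round(...) expression, so the same conversion logic can be reused in other models without copying the SQL.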
Author:
Priya Kondola
Powered by The Information Lab
© 2025 The Information Lab