dbt (data build tool) lets you approach analytics more like playing with LEGO blocks and less like pulling teeth. Even so, a few tricks, such as deferral in dbt Core, can save you loads of time, especially when you move between production and development environments. With deferral you can streamline your workflow: focus on new models while using production data, without the hassle of rerunning every dependency.
The Production-to-Dev Struggle
Picture this: you're testing out a new model, but instead of waiting for all those annoying slow dependencies to rerun, you wish you could just rely on what's already in production, test your stuff, and move on with your life. Lucky for us, deferral does exactly that.
So, what is deferral? Think of it as politely telling dbt, "Hey, I've got my development model here, but for everything else, could you just grab it from production so I don't have to rerun the entire data universe?"
Put more formally, defer is a feature that lets you run a subset of models or tests without first rebuilding all of their upstream dependencies.
How Defer Works in dbt Core
- Manifest File: Defer relies on a manifest.json file generated from a production run. This file contains metadata about all models in your project.
- Reference Resolution: When using defer, dbt resolves {{ ref() }} functions differently. Instead of requiring you to build upstream models, it uses the production metadata to locate the required data.
- Command Line Usage: In dbt Core, you activate defer using command line flags. Note that --state takes the directory that contains manifest.json, not the file itself:
dbt run --defer --state /path/to/production
- Selective Model Running: You can combine defer with model selection to run only specific models (--select is the current name of the flag; --models is the older name):
dbt run --select my_model+ --defer --state /path/to/production
This runs my_model and its downstream models, deferring references to everything else to production.
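The resolution rule can be pictured with a toy shell function. This only illustrates the behavior described in the dbt documentation; it is not dbt's actual implementation. The function name resolve_ref, its arguments, and the schema names dev_schema and analytics are made up for the example. One detail worth knowing: by default dbt falls back to production only when the referenced model is neither selected in the current run nor already built in your development schema. The --favor-state flag makes it prefer production even when a dev copy exists.

```shell
#!/bin/sh
# Toy model of how a ref() is resolved when --defer is active.
# Hypothetical names: resolve_ref, dev_schema, analytics.
# Arguments:
#   $1  model name
#   $2  "yes" if the model is selected in the current run
#   $3  "yes" if the model already exists in the dev schema
#   $4  "yes" if --favor-state was passed (optional, default "no")
resolve_ref() {
    model="$1"; selected="$2"; exists_in_dev="$3"; favor_state="${4:-no}"

    if [ "$selected" = "yes" ]; then
        # Selected models are built in this run, so refs point at dev.
        echo "dev_schema.$model"
    elif [ "$exists_in_dev" = "yes" ] && [ "$favor_state" != "yes" ]; then
        # An existing dev copy wins unless --favor-state is set.
        echo "dev_schema.$model"
    else
        # Otherwise the location recorded in the production manifest is used.
        echo "analytics.$model"
    fi
}

resolve_ref my_model yes no        # prints dev_schema.my_model
resolve_ref stg_orders no no       # prints analytics.stg_orders
resolve_ref stg_orders no yes      # prints dev_schema.stg_orders
resolve_ref stg_orders no yes yes  # prints analytics.stg_orders
```

The third call is the one that tends to surprise people: a stale copy of an upstream model in your dev schema is used in preference to production unless you ask otherwise.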
Step 1: Generate a Production Manifest
First things first, you need to compile your project in production to create a manifest.json. This file is basically a map of your production models, your trusty sidekick when deferring.
dbt compile --target prod
This command creates a manifest.json file in the target/ directory. Now we have our production map, and the magic can begin.
But first, copy the directory under the name target_prod. Later dbt commands overwrite the contents of target/, so the copy keeps the production map available.
cp -r target target_prod
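These two commands are easy to wrap in a small helper. The following is a minimal sketch, assuming dbt is on your PATH and a target named prod exists in your profiles.yml; the function name snapshot_prod_state is hypothetical. It refuses to copy anything if the compile step did not leave a manifest behind.

```shell
#!/bin/sh
# Compile against prod, then keep a copy of the artifacts in target_prod/.
# snapshot_prod_state is a hypothetical helper name, not a dbt command.
snapshot_prod_state() {
    dbt compile --target prod || return 1

    # dbt writes its artifacts, including manifest.json, to target/.
    if [ ! -f target/manifest.json ]; then
        echo "target/manifest.json not found; nothing to copy" >&2
        return 1
    fi

    # Remove any earlier copy first: if target_prod/ already exists,
    # cp -r would nest the new copy inside it as target_prod/target/.
    rm -rf target_prod
    cp -r target target_prod
}
```

Run snapshot_prod_state from the project root whenever production has changed enough that the old map is out of date.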
Step 2: Time for Deferral!
In your development environment, the fun starts with the --defer and --state flags. There's no need to sprinkle in any special macros or go tinkering in your dbt_project.yml file. Nope, just keep your model SQL files as-is and defer to production like a pro.
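dbt run --defer --state target_prod
With this command, we're saying to dbt, "I'm cool with running my local models, but for everything else, let's just pull the production versions, okay?" Two things to keep in mind: --state takes the directory that holds manifest.json rather than the file itself, and without a selector dbt run still builds every model in the project, so deferral only pays off once you narrow the run, as in Step 3.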
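Because --state has to point at a directory that actually contains manifest.json, a thin wrapper can catch a wrong path before dbt starts. This is a sketch only: dbt_deferred and the PROD_STATE_DIR variable are hypothetical names chosen for the example, with target_prod as the assumed default.

```shell
#!/bin/sh
# Run any dbt subcommand with deferral, after checking the state directory.
# dbt_deferred and PROD_STATE_DIR are hypothetical names for this sketch.
dbt_deferred() {
    state_dir="${PROD_STATE_DIR:-target_prod}"

    # --state expects a directory; the manifest must sit directly inside it.
    if [ ! -f "$state_dir/manifest.json" ]; then
        echo "no manifest.json in '$state_dir'; create the production state first" >&2
        return 1
    fi

    # Forward all arguments unchanged, then append the deferral flags.
    dbt "$@" --defer --state "$state_dir"
}
```

With this in place, dbt_deferred run --select my_model+ and dbt_deferred test --select my_model+ both pick up the same production state without retyping the flags.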
Step 3: Get Selective with Model Runs
Want to test specific models while deferring the rest to production? No problem. Just use the --select flag (--models is the older name for the same flag) along with deferral:
dbt run --select my_model+ --defer --state target_prod
Now, only my_model and its downstream friends will run locally, while references to the rest will be deferred to production.
Step 4: Testing Without the Wait
We can also test our models against production data without rerunning everything.
dbt test --defer --state target_prod
This allows you to focus your energy on testing the data that matters while skipping the stuff that's already been blessed by production.
Why Deferral is a Game Changer
You know what happens when you stop rerunning your entire pipeline for every tiny change? You gain time: precious, precious time. Instead of waiting for the data gods to smile upon you and recompile everything, you get to work smarter, faster, and more efficiently. Deferring your models means quicker iteration, shorter testing cycles, and fewer "Did this seriously take 45 minutes to run?" moments.
Deferring in dbt Power User Extension for VSCode
If you use the dbt Power User extension for VS Code, you can benefit from its defer feature as well. When you enable 'Defer to Production':
- The extension references upstream models from the production environment
- Only the models you're actively working on are built in the development environment
- This creates a hybrid environment where some models come from production and others from development
Usage in the Extension
To use the defer functionality in the dbt Power User extension:
- Open the Actions panel in VSCode
- Find the slider to enable 'Defer to Production'
- When enabled, the status bar will display 'DEFER'
More about the feature and its uses can be found in this video.
Ready to Defer the Workload?
From a business perspective, the ability to defer in dbt Core isn't just about speeding up development; it's about increasing operational efficiency. By minimizing the time spent on repetitive tasks like rerunning dependencies, your team can focus on delivering insights faster, ultimately accelerating decision-making across the organization.
Deferral in dbt Core is a simple yet powerful tool that can save you hours and keep your data workflow humming along. We think you can get it up and running in no time, leaving you to be the data hero your company needs.
So why not give it a go? You've got nothing to lose except those long, frustrating waits.