Some people consider data modeling a mysterious art, and for good reason. It practically requires you to know what questions you want answered from your data beforehand. You need some subject matter expertise to understand what pieces of data will help you in your inquiry. DBT (Data Build Tool), is one tool that has made modeling easier. In fact, you could say that enabling business analytics with DBT supercharges your data team.
You will find dimension and fact tables in a Data Warehouse. Facts track the actual measures and dimensions bring context to the measures. This allows you to slice the data many different ways. When you add dimensions to your data, what you’re really doing is devising meaningful ways to organize your data. Well organized data translates to a more performant database, enabling you to retrieve information and build reports more efficiently.
DBT led the way as companies started to move from using ETL (Extract Transform Load) to ELT (Extract Load Transform). Basically, this change in paradigm allows you to extract data from any number of useful sources, load that raw data to the Data Warehouse, and then transform as needed. This approach is very versatile and accounts for much of DBT’s popularity. It connects to the raw data stored in a Data Warehouse and then builds facts and dimensions. The key here is that the origin and destination of that data likely expects the data to have a certain “shape” and this is where data modeling, in transform tools like DBT, is an essential tool for data analytics engineers.
Empower Analysts with Self-Serve Data
Because DBT uses SQL Select statements to build data models, it is an easy to learn tool for analysts. In fact, DBT has had such an impact in the Data Analytics industry that it basically spawned the “Analyst Engineer” job title. An Analyst can transform the data independently now with no need to depend on a Data Engineer, thanks to the tooling provided by DBT. Because Analysts work closely with key stakeholders in the business, they can quickly implement models to bring insights to their peers using self-serve data access. DBT takes care of all the complexity involved with data transformations and modeling so that you can focus on solving business problems instead of letting technical issues getting in the way. It does this by providing you with an easy-to-use interface for defining your data modeling needs.
Example Use Case for DBT in Analytics
Here’s an example of a very simple use case for business analytics with DBT. The first step is extraction. A business might be using several tools that can serve as data sources. For instance, three sources could be a Postgres database that stores data associated with sales from your e-commerce platform, a CRM, and additional data provided by a third party. The data team wants to extract all of that data and combine it in one place: a data warehouse. Then, the analysts can formulate a question about what demographic that marketing would like to target with their next campaign. Data analysts review the dimensions present in the warehouse and decide which ones can help answer that question. Now, with DBT, you can query your data using SQL SELECT statements. Then you can build tables to use as the basis for creating graphs and metrics that answer your question.
So, you transform the data in a way that makes sense for the BI tool. You could, alternatively, transform the data to prepare it for warehousing, or to store it in a Data Lake. DBT lets you use raw data, pulled from operational sources and stored in a Warehouse to transform as needed.
To make sense of how data transformation and adding dimensions work hand in hand, know that sometimes the destination requires a dimension not present in the original data set. Analysts can often address this problem by using calculated values.
Where do new dimensions come from?
DBT is a tool that helps companies create meaning out of their business and data analytics. It allows you to integrate diverse sets of data. Use that integrated data in a BI tool to present this information through dashboards, reports or other visualizations. Many companies aren’t pulling all of their data sources together. That means they might be missing out on some valuable insights. Let’s go back to that e-commerce company’s example. They had data from a CRM, some operational data, and some data from a third party. Someone on the team remembers that there is also an extensive survey that went out last quarter. They think there might be useful information in that survey to further inform marketing strategy.
Data modeling is a process for creating a logical structure of the data. It defines the way your company organizes its data, including where it resides and how it can be accessed. Data models are typically created without considering how they will be used by end users—who might not understand them at all. However, with a tool like DBT, it’s possible to enable self service data use by those who need it, using the dimensions they see as the most informative.
Creating a model for your data means taking time to think about what type of content you have and how that content is organized into categories and relationships with other pieces of information. This approach requires consideration from many different stakeholders—developers, product managers, designers and business analysts—to ensure you’re all on the same page when building out new features or products that require access to this information in order to function properly.The two main approaches to thinking about data modeling for warehousing are the Kimball methodology and the Inmon methodology. There is no definite answer as to which one is better, but each approach has its own set of benefits and drawbacks. In fact, sometimes companies will adopt a hybrid of these two approaches. Ultimately, businesses want to think of the warehouse as the place with the answers.
DBT is a very powerful and versatile tool and it might be just right for you
Avoid pitfalls and harness this powerful tool to modernize the way you use data at your company. We have worked with recent clients to use GitHub actions triggering DBT for Singlestore and now they’re set up to transform data they need for daily, hourly, and even minute-by-minute metrics.
The examples we demonstrated in this blog post should give you a good idea of the power of dbt. We’ve chosen to focus on data modeling, but there are plenty more examples out there where this tool shines. Your team could be leveraging basic SQL skills to perform powerful data transformations.
DBT is also open source, which means you can use if for free. When considering this tool to perform business analytics with DBT there are important trade-offs between the paid and free version. In summary, you should know that DBT is easy to use and navigate for analysts in the paid, cloud version. It also provides data lineage in the cloud offering. This is a type of data tracking, which is very useful especially when your company compiles data from multiple sources. DBT enables reusable macros, basically a code template that you can reuse again and again, further enabling self-serve data access. DBT has built-in testing for all your models. When you use DBT you can orchestrate either using DBT Cloud or triggers in Airflow. That’s a lot of analytical power packed into one tool. We are here to help with an expert consultant who understands how your company can best use DBT.
For more context on where DBT is used by businesses in a modern data ecosystem, read our blog post about the modern data stack.
Published On : 10/13/2022