Anyone who has been keeping up with the modern data tech stack and working in the data engineering or analytics engineering professions should know Dbt Labs (formerly Fishtown Analytics) – the creators of dbt (data build tool).
Dbt (data build tool) is free, open-source software that helps analysts and data engineers streamline their data pipelines and transform data in the warehouse more effectively. For more information, see https://docs.getdbt.com/docs/introduction.
What is DBT Coalesce?
Every year, Dbt Labs organizes its premier analytics engineering conference, “Dbt Coalesce”, the biggest gathering of the Dbt community worldwide. Participants can attend in person or online. Dbt Coalesce 2023 took place over four days in San Diego, California, with many more attendees joining virtually. One-day events were also held in London and Sydney. The in-person event in San Diego was filled with hackathons, workshops, happy hours, and informative sessions.
In this DBT Coalesce 2023 Recap post, we summarize the key takeaways and valuable insights from the four-day conference in San Diego and the relevant sessions available to watch on demand.
Tristan Handy, founder and CEO of Dbt Labs, kicked off the conference with a keynote unveiling a range of new and upcoming Dbt Cloud features. These revolved around the growing complexity of data models and the challenges data teams face as they scale.
DBT (data build tool) 2023 New Features
Dbt Labs has introduced a framework called dbt Mesh, a multi-project dbt architecture that lets users reference models across projects in dbt Cloud. The idea is to break down a complex project into smaller components, empowering smaller teams to operate independently.
I found the following sessions to be the most helpful on this topic.
• 5 steps to Data Mesh Nirvana
• Scaling Collaboration With Dbt Cloud
• The more, the merrier: Managing a dynamic, expanding, self-service dbt project
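To make the cross-project idea concrete, here is a minimal Python sketch of how a two-argument reference (written in dbt as `{{ ref('finance', 'fct_revenue') }}`) might be resolved. It is a toy model, not how dbt implements it: the project names, model names, and the `resolve_ref` helper are all hypothetical, and it ignores groups and assumes only that models marked `public` can be referenced from another project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    project: str
    name: str
    access: str = "protected"  # dbt access levels: private, protected, public


class CrossProjectRefError(Exception):
    """Raised when a project references a model it is not allowed to see."""


def resolve_ref(models, current_project, target_project, target_name):
    """Resolve a two-argument ref; other projects may only use public models."""
    for model in models:
        if model.project == target_project and model.name == target_name:
            is_cross_project = model.project != current_project
            if is_cross_project and model.access != "public":
                raise CrossProjectRefError(
                    f"{target_project}.{target_name} is {model.access}, not public"
                )
            return f"{model.project}.{model.name}"
    raise KeyError(f"{target_project}.{target_name} not found")


# Hypothetical projects and models, for illustration only.
MODELS = [
    Model("finance", "fct_revenue", access="public"),
    Model("finance", "stg_invoices", access="protected"),
    Model("marketing", "fct_campaigns", access="public"),
]

# The marketing project can reference the finance team's public model.
print(resolve_ref(MODELS, "marketing", "finance", "fct_revenue"))
```

Referencing `stg_invoices` from the marketing project would raise `CrossProjectRefError`, which mirrors the point of the architecture: each team publishes a small, stable interface and keeps its staging models to itself.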
While working with dbt, you have most likely created documentation using “dbt docs”. Dbt Explorer is a new interactive documentation interface providing a 360-degree view of dbt assets. It is much more powerful than its predecessor and, more importantly, is compatible with dbt Mesh architecture.
Dbt Semantic Layer
Dbt Semantic Layer was introduced a few years ago and maintains consistent definitions of business concepts across various teams and analytics tools. This year, Dbt Labs made substantial strides forward in making the feature generally available with the most requested integrations: Tableau and Google Sheets. MetricFlow powers this new version of Semantic Layer and can calculate complex metric definitions efficiently.
Dbt Cloud CLI
If you are like me, you are a big fan of Microsoft Visual Studio Code. With the dbt Cloud CLI, we can now use the important features of dbt Cloud while enjoying the familiarity of our preferred editor (such as VS Code).
Another very useful addition to dbt Cloud is a dedicated job type for continuous integration, with best-practice configurations pre-applied. Gone are the days of writing custom scripts and configuring GitHub Actions to create CI/CD pipelines!
Watch the following sessions to see how companies are using CI/CD:
• Better CI for better data quality
• Data and monolith: Scaling a computationally slim 1500+ model beast
• Hands-on tips to get started with CI in dbt Cloud
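CI jobs in dbt are commonly configured to build only the models changed in a pull request plus everything downstream of them (the `state:modified+` selector). The Python sketch below illustrates that selection logic on a toy dependency graph. The model names and the `modified_plus` helper are hypothetical, and the real selector works from dbt's manifest rather than a hand-written dictionary.

```python
from collections import deque


def modified_plus(dag, modified):
    """Return the modified models plus all of their downstream dependents."""
    selected = set(modified)
    queue = deque(modified)
    while queue:
        node = queue.popleft()
        for child in dag.get(node, []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


# Hypothetical DAG, mapping each model to the models that depend on it.
DAG = {
    "stg_orders": ["int_orders"],
    "stg_customers": ["int_customers"],
    "int_orders": ["fct_orders"],
    "int_customers": ["dim_customers"],
    "fct_orders": [],
    "dim_customers": [],
}

# Only the orders branch is rebuilt when stg_orders changes.
print(sorted(modified_plus(DAG, ["stg_orders"])))
```

Because the customers branch is untouched, it is skipped entirely, which is why this style of CI stays fast even in projects with hundreds of models.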
Snowflake Cost Optimization
With ever-growing data volumes, companies are driven to reduce their data infrastructure costs. This year, quite a few sessions focused on this theme. Companies like SELECT.dev are dedicated to helping Snowflake customers automate savings, quickly identify and implement optimization opportunities, and easily control their usage. You can also check out one of our recent posts about Snowflake SQL tips.
Optimizing Your Dbt Project
The session “From slow to swift: Proven methods for optimizing your dbt project”, by the founders of SELECT.dev, gave an excellent overview of what happens behind the Snowflake query engine, how billing works, and how small steps to improve query performance can lead to lower costs.
Tip: I highly recommend downloading the free, open-source snowflake-cost-monitoring dbt package to understand Snowflake costs over time across various practical dimensions (service, warehouse, database, etc.).
In addition to the above, there were a couple of useful sessions on how Dbt Labs' internal teams optimize their data costs using strategies very similar to those adopted by the snowflake-cost-monitoring dbt package. You can watch the recordings here:
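For a feel of the billing mechanics, here is a rough Python sketch of estimating what a warehouse run costs. It relies on Snowflake's published model: compute is billed per second in credits, with a 60-second minimum each time a warehouse resumes, and each size step doubles the credits per hour. The price of 3.00 per credit is a placeholder assumption, since actual prices vary by edition, region, and contract, and the `estimate_cost` helper is hypothetical.

```python
# Credits consumed per hour for the smaller standard warehouse sizes.
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

# A warehouse is billed for at least 60 seconds each time it resumes.
MINIMUM_BILLED_SECONDS = 60


def estimate_cost(size, running_seconds, price_per_credit=3.00):
    """Estimate the cost of one warehouse run (price_per_credit is assumed)."""
    billed_seconds = max(running_seconds, MINIMUM_BILLED_SECONDS)
    credits = CREDITS_PER_HOUR[size] * billed_seconds / 3600
    return round(credits * price_per_credit, 4)


# A 10-second query on a freshly resumed warehouse is billed as 60 seconds.
print(estimate_cost("XSMALL", 10))

# Doubling the warehouse size costs the same if the work finishes twice as fast.
print(estimate_cost("XSMALL", 1200), estimate_cost("SMALL", 600))
```

The second comparison is the intuition behind many tuning decisions: a larger warehouse is not more expensive if runtime shrinks proportionally, while short, frequent resumes can cost more than expected because of the minimum.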
• How Dbt Labs tunes model performance and optimizes cloud data platform costs with Dbt
• Need for speed (and less spending): The story of finance data at Snowflake
Data Democratization was one of the key themes during the four-day conference. Data Democratization is the movement to make data more approachable and usable for everyone, technical or not. Data cataloging and data governance are essential aspects of democratizing data. One of the most talked-about features during the vendor conversations was column-level lineage, which could save data engineers and analysts hundreds of hours by sparing them from chasing down what caused downstream models and reports to break.
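Column-level lineage is essentially a graph walk from a broken report column back to its sources. The Python sketch below shows the idea on a toy lineage map. The column names and the `upstream_columns` helper are hypothetical, and real tools derive this mapping by parsing SQL rather than from a hand-written dictionary.

```python
def upstream_columns(lineage, column):
    """Return every column that feeds, directly or indirectly, into `column`."""
    seen = set()
    stack = [column]
    while stack:
        current = stack.pop()
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical lineage, mapping each column to the columns it is derived from.
LINEAGE = {
    "report.revenue": ["fct_orders.amount_usd"],
    "fct_orders.amount_usd": ["stg_orders.amount", "stg_fx.rate"],
    "stg_orders.amount": ["raw.orders.amount"],
    "stg_fx.rate": ["raw.fx.rate"],
}

# Everything that could have caused report.revenue to break.
print(sorted(upstream_columns(LINEAGE, "report.revenue")))
```

With a map like this, a broken revenue figure narrows immediately to five candidate columns instead of every model in the project.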
Final Thoughts About Dbt Coalesce 2023
Dbt Coalesce 2023 surpassed previous years in both size and quality. While the emphasis was on enhancing the paid product, dbt Cloud, the free dbt Core is still available for data teams that need essential data transformation capabilities. Moreover, the dbt community continues to make valuable contributions, such as the VS Code extension dbt-power-user, making the open-source product a budget-friendly choice for startup data teams.
Where Can I Watch All The Coalesce 2023 Sessions?
You can watch all the Coalesce 2023 sessions on YouTube.
We also invite you to register early for Coalesce 2024, which will take place in Las Vegas and online from October 7-10, 2024.
Data-Sleek collaborates with dbt (data build tool) to assist our clients in constructing optimal data warehouse solutions.
Thank you to Ovais Siddiqui for this insightful recap. Dive deeper into optimal data solutions with Data-Sleek and DBT!
Published On : 12/13/2023