Real-Time Analytics with dbt + Rockset

July 15, 2021

Rockset was founded to make it easy for developers and data teams to go from real-time data to actionable insights. We designed Rockset to remove many of the barriers teams face while building with real-time data, including data preparation, performance tuning and infrastructure management. We also built Rockset from the ground up to support full SQL (including joins and aggregations), the most common query language for analytics.

That’s why we’re excited to bring the power of dbt's data transformation framework to real-time analytics with our new dbt-Rockset adapter. dbt is an open-source tool that lets data teams collaborate on transforming data in their database to ship higher quality data sets, faster. It does this by enabling them to use software development best practices like modularity, version control, testing and documentation. To execute transformations in dbt, users only need to define logic in SQL using SELECT statements, and dbt takes care of the DDL/DML and defining the order of execution. All of this reduces the need for expensive and time-consuming engineering work.

dbt Labs, the company behind dbt, believes in many of the same principles that we believe in here at Rockset. Both products support transformations within the data system to avoid creating and maintaining brittle pipelines. Both respect SQL as the lingua franca of data analysis and make it more easily available to all. And both enable teams to create shared “building blocks” of data for broad use across the entire organization.

We believe these core principles are even more important in the world of real-time analytics where transformations must happen on the fly so that new data is queryable the second it is generated.

We’re excited to make it easy for data teams to analyze real-time data and unlock new use cases including:

  • Real-time customer 360s: A centralized, real-time view of customer activity enables teams to respond to events as they happen and create a seamless customer experience.
  • Real-time personalization: Create custom user experiences based on each user's latest interactions to increase engagement and grow revenue.
  • Real-time business reporting: Live dashboards enable operations and business teams to monitor and respond to time-critical events.
  • Real-time embedded dashboards: Embedded dashboards are real-time visualizations that are embedded in user-facing SaaS applications.

How the dbt-Rockset adapter works

Rockset ingests and indexes all kinds of data (structured, semi-structured, geo or time-series) for millisecond-latency queries on the latest data, with less than 1 second of data latency.

There are four simple steps to go from real-time data to insights in Rockset:

  1. Connect to your data source: Set up secure integrations with transactional databases, event streams, data lakes or warehouses using built-in data connectors. These integrations give Rockset read-only access to your data.
  2. Create a collection: Collections are Rockset's equivalent of tables in a relational database.
  3. Run SQL queries: Run sub-second SQL queries across any collection.
  4. Create data APIs: Query Rockset directly from your favorite visualization tool or application using Query Lambdas. Query Lambdas are named, parameterized SQL queries that can be executed from a dedicated REST endpoint.
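
As a rough illustration of step 4, the sketch below builds (but does not send) an HTTP request that would execute a Query Lambda from an application. The API server address, workspace, Query Lambda name, tag and parameter are all assumptions made up for this example; check the Rockset API reference for the exact endpoint for your account and region.

```python
import json
import urllib.request

# Assumed values for illustration only; substitute your own.
API_SERVER = "https://api.rs2.usw2.rockset.com"  # assumed regional API server
WORKSPACE = "commons"                            # hypothetical workspace
QUERY_LAMBDA = "recent_orders"                   # hypothetical Query Lambda name
TAG = "latest"                                   # assumed tag for the version to run


def build_query_lambda_request(api_key, parameters):
    """Build (but do not send) a POST request that executes a Query Lambda."""
    url = (
        f"{API_SERVER}/v1/orgs/self/ws/{WORKSPACE}"
        f"/lambdas/{QUERY_LAMBDA}/tags/{TAG}"
    )
    # Query Lambda parameters are passed as a list in the JSON body.
    body = json.dumps({"parameters": parameters}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_query_lambda_request(
    api_key="<my_api_key>",
    parameters=[{"name": "customer_id", "type": "string", "value": "42"}],
)
print(request.get_method(), request.full_url)
# To actually run the query, pass the request to urllib.request.urlopen().
```

Because the SQL lives in the named Query Lambda rather than in the application, the client only supplies parameter values, which is what makes it usable as a data API.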

With the new dbt-Rockset adapter, you can load data into Rockset and create collections by writing SQL SELECT statements in dbt. Collections can be built on top of one another to support highly complex queries with many dependency edges.
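
To make the idea of collections built on top of one another more concrete, here is a minimal sketch of how `ref()` calls in model SQL imply a build order. It is not how dbt is implemented internally. The three models, their columns and the `commons.raw_orders` collection are hypothetical, and the dependency resolution uses Python's standard `graphlib` module (Python 3.9+) purely for illustration.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical dbt models: each one is a SELECT statement, and
# {{ ref('...') }} names another model (collection) that it reads from.
MODELS = {
    "orders_cleaned": (
        "SELECT order_id, customer_id, amount FROM commons.raw_orders"
    ),
    "customer_totals": (
        "SELECT customer_id, SUM(amount) AS total_spent "
        "FROM {{ ref('orders_cleaned') }} GROUP BY customer_id"
    ),
    "top_customers": (
        "SELECT customer_id, total_spent FROM {{ ref('customer_totals') }} "
        "ORDER BY total_spent DESC LIMIT 10"
    ),
}

# Matches {{ ref('model_name') }} and captures the model name.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def build_order(models):
    """Return model names so that each model follows the models it refs."""
    graph = {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


print(build_order(MODELS))
# → ['orders_cleaned', 'customer_totals', 'top_customers']
```

In a real project each SELECT statement would live in its own `.sql` file under the project's models directory, and dbt would work out the order from the `ref()` calls in the same spirit.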

Here’s how you can get up and running with dbt and the dbt-Rockset adapter:

  1. First, if you have never worked with dbt before, we recommend following their getting started guide. This will walk you through downloading dbt, connecting it with an external data source and running a few basic models. Because the dbt-Rockset adapter is not available on dbt Cloud, you will need to use the dbt CLI for this tutorial.

  2. Download the dbt-Rockset adapter from PyPI (for example, with pip install dbt-rockset). dbt is built on the idea of modularized plugins that can be quickly incorporated into any dbt project, and the dbt-Rockset adapter is installed in this standard way.

  3. Configure a dbt profile (typically in ~/.dbt/profiles.yml) to connect with your Rockset account. Enter the workspace in which you’d like your dbt collections to be created, and a Rockset API key. The database field is required by dbt but unused in Rockset.

rockset:
  outputs:
    dev:
      type: rockset
      threads: 1
      database: N/A
      workspace: <my_workspace>
      api_key: <my_api_key>
  target: dev

  4. Finally, update the dbt project that you created in step 1 to use the Rockset dbt profile that you created in step 3. You can switch profiles by setting the profile key in your project's dbt_project.yml file (here, profile: rockset).

We’ve open-sourced the first release of the dbt-Rockset adapter, and would love your input and feedback. You can find us on the dbt Slack or in the Rockset community.

This is just the first of several exciting releases. Hint hint: full-fledged streaming ELT workflows with views. Our goal is to make real-time analytics possible and easy for data teams. Please join us on this journey!

Learn more about how Rockset is creating a world where data is always fresh, queries run in 1ms and analytics engineers build web-scale, real-time data apps. Listen to Rockset CEO and co-founder Venkat Venkataramani on The Analytics Engineering Podcast sponsored by dbt Labs.