SQL Search and Analytics for Your Data Lake
Real-time APIs and dashboards on Amazon S3 and Google Cloud Storage

Analyze Raw Data as It Lands in Your Data Lake

Rockset is a fully managed service that enables real-time search and analytics on raw data from Amazon S3 and Google Cloud Storage, with full-featured SQL. Rockset takes an entirely new approach to loading, analyzing and serving data so that you can run powerful SQL analytics on data from your data lake without ETL.

Use Cases
Real-time dashboards
To make live dashboards possible, Rockset reflects new data in seconds and delivers query responses in milliseconds. To bring interactive dashboards and drilldowns to life, Rockset supports high concurrency and thousands of queries per second.
Data as an API
APIs are a critical part of the strategy for fast-moving developers building on a global scale. Rockset uses an API-first approach so you can deliver your data as an API and allow your developers to build applications faster.
Analytics on primary database
Stream data from any transactional database to your data lake, and Rockset will automatically pick up new data as it lands and deliver real-time SQL analytics on your latest business data.

Operational Analytics on Data in Your Data Lake

Run fast SQL analytics on data collected in your data lake without requiring upfront schema definition or ETL. As new data lands in your data lake, it becomes queryable in seconds in Rockset. Queries return in milliseconds, so performance is no longer a limiting factor.

Load continuously from your data lake
Rockset delivers low data latency through native integrations with S3 and GCS. Rockset initially batch loads data from S3 or GCS, then switches to continuous ingest to stay in sync, with no more than a few seconds of delay. It automatically monitors consistency between S3/GCS and Rockset and purges old data using time-based retention policies. No ETL tools such as AWS Glue are required.
Explore unstructured data as SQL tables
Rockset enables millisecond-latency SQL, including joins, filters, aggregates and full-text search. It schemalessly ingests raw data, including nested JSON, and represents it as SQL tables that automatically adapt to any changes in the data from your data lake. Join, filter and aggregate across datasets without upfront schema definitions.
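
To make the idea of schemaless ingest concrete, here is a minimal sketch in plain Python. It is not Rockset's implementation; the helper names `flatten` and `infer_columns` are hypothetical and exist only for illustration. It shows how nested JSON documents can be mapped to dotted field paths, and how the set of "columns" grows as documents with new fields arrive, without any schema declared up front.

```python
import json


def flatten(doc, prefix=""):
    """Map a nested JSON object to {dotted_path: value} pairs (hypothetical helper)."""
    fields = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted path.
            fields.update(flatten(value, path))
        else:
            fields[path] = value
    return fields


def infer_columns(docs):
    """Collect every field path seen so far, with the value types observed for it."""
    columns = {}
    for doc in docs:
        for path, value in flatten(doc).items():
            columns.setdefault(path, set()).add(type(value).__name__)
    return columns


# Two raw documents with different shapes; the second adds new fields.
raw = [
    '{"name": "Jim Gray", "address": {"city": "San Francisco"}}',
    '{"name": "Ada", "age": 36, "address": {"city": "London", "zip": "N1"}}',
]
docs = [json.loads(line) for line in raw]
columns = infer_columns(docs)

for path in sorted(columns):
    print(path, sorted(columns[path]))
```

The second document introduces `age` and `address.zip`, and both simply appear as additional field paths; nothing about the first document has to be redefined.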
Deliver real-time APIs & dashboards
Rockset delivers low query latency with cloud-native auto-scaling and performance isolation. Real-time applications can programmatically access Rockset via Python, Java, JavaScript, Go or REST APIs, and live dashboards can use SQL clients like Tableau, Redash, Apache Superset or others via the JDBC connector. Deploy operational analytics in production without staging in temporary databases.

How It Works

Rockset continuously ingests data from S3 or GCS so you don't have to manage ETL pipelines. It uses converged indexing to deliver real-time SQL over REST with serverless auto-scaling under the hood.

Visualization Tools

Enable your business teams to visualize real-time event streams using dashboarding tools that they already know and love, using Rockset's JDBC connector and native support for standard SQL-based visualization tools.

Grafana

Grafana is an open observability platform for analytics and monitoring. Grafana requires a SQL backend and cannot query Kafka directly. Use Rockset to visualize Kafka events in Grafana.

Learn more

API Access

Insert, update and query data programmatically from custom application code using Rockset's client libraries, which wrap the REST API.

Python

# connect to Rockset
from rockset import Client, Q, F
rs = Client()

# build a query object
q = Q('hello_world').where(F['name'] == 'Jim Gray')
results = rs.sql(q)

Rockset’s Python package is called rockset and the entire API is contained within a single Python module called rockset. APIs defined in the rockset module allow you to securely connect to the Rockset service, create or manage collections and query Rockset.
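
Because the client libraries wrap REST, the same query can in principle be issued as a plain HTTP request. The sketch below only builds the request with the Python standard library and does not send it. The API server URL, the `/v1/orgs/self/queries` path, the `ApiKey` authorization scheme and the payload shape are assumptions that should be checked against the Rockset REST API documentation; `build_query_request` is a hypothetical helper and `YOUR_API_KEY` is a placeholder.

```python
import json
import urllib.request

# Assumptions: the API server is region-specific, and the key is a placeholder.
API_SERVER = "https://api.rs2.usw2.rockset.com"
API_KEY = "YOUR_API_KEY"


def build_query_request(sql, parameters=None):
    """Build (but do not send) a POST request carrying a SQL query as JSON."""
    body = {"sql": {"query": sql, "parameters": parameters or []}}
    return urllib.request.Request(
        url=f"{API_SERVER}/v1/orgs/self/queries",  # assumed endpoint path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"ApiKey {API_KEY}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same lookup as the client-library example, expressed as parameterized SQL.
req = build_query_request(
    "SELECT * FROM hello_world WHERE name = :name",
    [{"name": "name", "type": "string", "value": "Jim Gray"}],
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To actually run the query, the request would be passed to `urllib.request.urlopen(req)` with a valid API key, and the JSON response parsed with `json.loads`.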

Learn more

Powerful Analytics on Your Data Lake Without Any ETL

Resources



Related Blog

How to Do Data Science Using SQL on Raw JSON

Learn how an investment management firm used Rockset to analyze complex, third-party data sets from Amazon S3.

Read more
Related Blog

From Schemaless Ingest to Smart Schema: Enabling SQL on Raw Data

Rockset allows schemaless ingest from data lakes, and turns raw data into SQL tables for fast analytics.

Read more
Related Blog

Serverless Data Management: A SQL Search and Analytics Engine

Use Rockset in conjunction with cloud object stores like Amazon S3 and Google Cloud Storage.

Read more
How-To Guide

Building a Serverless Microservice Using Rockset and AWS Lambda

Use SQL to join and query JSON and CSV data from Amazon S3 and Kinesis to build a serverless microservice.

Read more
Documentation

How to use Amazon S3 as a data source

Learn how to securely connect Amazon S3 with Rockset and create collections that sync your data in real time.

Read more
Documentation

How to use Google Cloud Storage as a data source

Learn how to securely connect Google Cloud Storage with Rockset and create collections that sync your data in real time.

Read more

Our Customers

See how the most innovative companies do more with their data, faster.

"Building our dashboard on Rockset was the easiest way to analyze our call data in DynamoDB and get real-time insights on the metrics we care about."

-Naresh Talluri, product manager at FULL Creative

Read more

Try SQL on S3 or GCS now