What is Rockset?
Rockset is a real-time analytics solution that enables low-latency searches and aggregations. Rockset automatically indexes structured, semi-structured, geo, and time-series data for real-time search and analytics at scale.
Use Rockset to create personalized user experiences, build real-time decision systems, serve IoT applications, and more, with a real-time indexing database that can power sub-second queries on a massive scale.
Why Rockset?
Rockset provides the following key features and benefits:
Built-in Connectors
Rockset has pre-built integrations for:
- Amazon DynamoDB
- Amazon Kinesis
- Amazon S3
- Amazon MSK
- Apache Kafka
- Azure Blob Storage
- Azure Event Hubs
- Azure Service Bus
- Google Cloud Storage
- Microsoft SQL Server
- MongoDB
- MySQL
- Oracle
- PostgreSQL
- Snowflake
When you follow our step-by-step tutorials, Rockset automatically loads your data within seconds so you can begin running SQL queries immediately. New data becomes queryable with a p95 (95th percentile) latency of two seconds. Rockset first loads the data in bulk, then continuously ingests millions of events per second to stay in sync with your data source. No ETL tools are required for this process.
Note: If your data source is not currently supported, or if you prefer to build your own custom connector, you can use the Write API to manually load your data.
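As a rough illustration of loading data through the Write API, the sketch below builds (but does not send) an HTTP request that adds documents to a collection. The endpoint path, the `ApiKey` authorization scheme, and the `{"data": [...]}` body reflect our understanding of the Rockset REST API and should be checked against the API reference. The API server URL, workspace, collection, key, and the `build_add_documents_request` helper are hypothetical placeholders.

```python
import json
import urllib.request


def build_add_documents_request(api_server, api_key, workspace, collection, docs):
    """Build a POST request that adds `docs` to a collection.

    Assumption: the path and body shape follow the Add Documents endpoint
    of the Rockset REST API; verify against the API reference before use.
    """
    url = f"{api_server}/v1/orgs/self/ws/{workspace}/collections/{collection}/docs"
    body = json.dumps({"data": docs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values, for illustration only.
request = build_add_documents_request(
    api_server="https://api.example-region.rockset.com",  # hypothetical host
    api_key="YOUR_API_KEY",
    workspace="commons",
    collection="events",
    docs=[{"user_id": "u-123", "action": "click", "ts": "2023-01-01T00:00:00Z"}],
)

# To actually send the request, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the URL and payload before any data leaves your machine.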
Learn more about how real-time updates are architected in Rockset.
Smart Schemas
Rockset ingests your data without the need for a pre-built schema. Smart schemas are then automatically generated based on the exact fields and types present in the ingested data. The schema represents and enables SQL queries for semi-structured data, nested objects and arrays, mixed types and nulls. You can also define your own ingest transformation to be applied as documents are ingested into Rockset to create new fields or manipulate existing ones from your data source.
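To make the idea of a schema derived from the data itself more concrete, here is a toy sketch that walks a few documents and records which types appear at each field path. It is a conceptual illustration only, not Rockset's implementation; the `infer_schema` and `type_name` helpers and the `[*]` path notation are assumptions made for this example.

```python
from collections import defaultdict


def type_name(value):
    """Map a Python value to a simple type label."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


def infer_schema(docs):
    """Return {field_path: sorted list of type labels seen at that path}."""
    schema = defaultdict(set)

    def walk(value, path):
        if path:
            schema[path].add(type_name(value))
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            for child in value:
                walk(child, f"{path}[*]")

    for doc in docs:
        walk(doc, "")
    return {path: sorted(types) for path, types in schema.items()}


docs = [
    {"id": 1, "user": {"name": "Ada", "tags": ["a", "b"]}, "score": 9.5},
    {"id": "2", "user": {"name": None}, "score": 7},
]
schema = infer_schema(docs)
print(schema["id"])         # mixed types: ['int', 'string']
print(schema["user.name"])  # nullable field: ['null', 'string']
```

Note how the second document changes the inferred types for `id`, `user.name`, and `score` without any schema having been declared up front.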
Learn more about how smart schemas are generated in Rockset.
Full SQL Support
Rockset supports full SQL including:
- aggregations
- filtering
- windowing
- joins
These operations work over all types of fields (including heavily nested objects and arrays) on any semi-structured data. This gives you the flexibility of SQL queries over data from supported data sources, even if those sources don't natively support SQL.
See our SQL Reference for the full list of all functions available for writing SQL queries in Rockset.
Query Lambdas
Query Lambdas are named, parameterized SQL queries stored in Rockset that can be executed from a dedicated REST endpoint. With Query Lambdas, you can save and enforce version control for your SQL queries and integrate them into your CI/CD workflows.
Use the Rockset CLI to create, manage, and deploy your Query Lambdas directly from your local computer. Query Lambdas are also fully supported in Rockset’s official client libraries and the Rockset API.
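The sketch below shows what calling a Query Lambda's REST endpoint might look like. It builds (but does not send) the request. The endpoint path, the `latest` tag, and the `parameters` body shape reflect our understanding of the Rockset REST API and should be verified against the API reference. The API server URL, workspace, Query Lambda name, parameter, and the `build_query_lambda_request` helper are hypothetical.

```python
import json
import urllib.request


def build_query_lambda_request(api_server, api_key, workspace, name, tag, parameters):
    """Build a POST request that executes a tagged Query Lambda version.

    Assumption: the path and body shape follow the Query Lambda execution
    endpoint of the Rockset REST API; verify against the API reference.
    """
    url = f"{api_server}/v1/orgs/self/ws/{workspace}/lambdas/{name}/tags/{tag}"
    body = json.dumps({"parameters": parameters}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values, for illustration only.
lambda_request = build_query_lambda_request(
    api_server="https://api.example-region.rockset.com",  # hypothetical host
    api_key="YOUR_API_KEY",
    workspace="commons",
    name="orders_by_city",  # hypothetical Query Lambda
    tag="latest",
    parameters=[{"name": "city", "type": "string", "value": "Oslo"}],
)

# To actually execute the Query Lambda, uncomment:
# with urllib.request.urlopen(lambda_request) as response:
#     print(json.load(response))
```

Because the SQL text lives in Rockset rather than in the application, the client only supplies parameter values, so the query can be versioned and updated without redeploying the caller.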
Watch our tutorial on how to build applications using Query Lambdas.
How does it work?
The following subsections describe key aspects of how Rockset works:
- Converged Index™
- Scale Compute and Storage Independently
- Serverless Auto-Scaling in the Cloud
- Enterprise-Grade Security
Converged Index™
All fields, including deeply nested fields, are automatically indexed in a Converged Index™ as each record is ingested. A Converged Index™ combines three indexes:
- Inverted index
- Columnar index
- Row index
A Converged Index™ allows analytical queries on large datasets to return in milliseconds. Using Rockset, you never have to manually define, create, or update indexes. You can also customize Rockset for efficient, cost-optimized, massive-scale applications.
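To illustrate why keeping three index shapes over the same data can help, here is a toy in-memory sketch. It is a conceptual model only, not how Rockset stores or indexes data; the `build_indexes` helper and the flat sample documents are assumptions made for this example.

```python
from collections import defaultdict


def build_indexes(docs):
    """Index every field of every document three ways.

    docs: {doc_id: {field: hashable value}} (flat documents, for simplicity).
    """
    row_index = {}                      # doc_id -> full document
    columnar_index = defaultdict(dict)  # field -> {doc_id: value}
    inverted_index = defaultdict(set)   # (field, value) -> {doc_id, ...}

    for doc_id, doc in docs.items():
        row_index[doc_id] = doc
        for field, value in doc.items():
            columnar_index[field][doc_id] = value
            inverted_index[(field, value)].add(doc_id)
    return row_index, columnar_index, inverted_index


docs = {
    "d1": {"city": "Oslo", "amount": 10},
    "d2": {"city": "Lima", "amount": 25},
    "d3": {"city": "Oslo", "amount": 5},
}
row_index, columnar_index, inverted_index = build_indexes(docs)

# Selective filter: the inverted index finds matching documents directly.
oslo_ids = inverted_index[("city", "Oslo")]

# Aggregation: the columnar index scans one field without touching others.
total_amount = sum(columnar_index["amount"].values())

# Record retrieval: the row index returns a whole document by ID.
lima_doc = row_index["d2"]
```

In this model, each query shape reads only the structure that suits it: a selective filter uses the inverted index, an aggregation scans a single column, and a lookup by ID reads one row.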
Read more about how Rockset builds a Converged Index™ and other design concepts in Rockset’s Architecture Whitepaper.
Scale Compute and Storage Independently
Using Rockset, you can scale compute and storage resources independently for the best balance of price and performance. As your data size grows, you can choose the right amount of compute for the query performance you need at any given time. Hot storage and ingest are charged at a fixed rate, while compute costs depend on your Virtual Instance Type.
See Rockset’s full pricing model.
Serverless Auto-Scaling in the Cloud
Rockset uses a modern, cloud-native architecture that auto-scales in the cloud and automates cluster provisioning and index management. This significantly reduces operational overhead, because you never need to provision capacity or manage servers.
For more information about Rockset's architecture and its performance benchmarks, see the Evaluating Data Latency for Real-Time Databases white paper.
Enterprise-Grade Security
Stored data is encrypted using AES-256, and data in transit is protected with SSL. In addition, you can mask sensitive information using an ingest transformation. Read more about our security features, including SAML, OAuth, and Okta for single sign-on, in the Security section of our documentation. See our Data Privacy Addendum for additional information.
Next steps
To get started, create a Rockset account. See our Quick Start guide to try out Rockset by running queries on some sample data. Or, learn how to start loading your data by connecting your data source!