Real-time Indexing Database in the Cloud

Rockset is a real-time indexing database service for serving low-latency, high-concurrency analytical queries at scale. It builds a Converged Index™ on structured and semi-structured data from OLTP databases, streams and lakes in real time and exposes a RESTful SQL interface.

Use Rockset

  • Step 1
    Create an account and log in to Rockset
    Rockset is a fully managed cloud service
  • Step 2
    Connect to your data source
    Rockset builds a Converged Index™ for you
  • Step 3
    Save your SQL statement as a Query Lambda
    You get a REST endpoint for your data API
  • Step 4
    Hit the REST endpoint from your application code
    Get results in milliseconds
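
Step 4 can be sketched in a few lines of Python. This is a minimal illustration, not a definitive client: the server URL, workspace, Query Lambda name, tag, parameter and the exact URL path and `ApiKey` header format are assumptions modeled on a typical REST call, so check them against your Rockset console before use. The request is only built here, not sent.

```python
import json
import urllib.request


def build_query_lambda_request(api_server, workspace, name, tag, api_key, parameters):
    """Build (but do not send) a POST request that executes a saved Query Lambda."""
    # Assumed path layout: workspace, lambda name and tag identify the endpoint.
    url = f"{api_server}/v1/orgs/self/ws/{workspace}/lambdas/{name}/tags/{tag}"
    body = json.dumps({"parameters": parameters}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"ApiKey {api_key}",  # assumed auth header format
            "Content-Type": "application/json",
        },
    )


request = build_query_lambda_request(
    api_server="https://api.example.invalid",  # hypothetical placeholder host
    workspace="commons",                       # hypothetical workspace
    name="clickthrough_by_promotion",          # hypothetical Query Lambda name
    tag="latest",                              # hypothetical tag
    api_key="YOUR_API_KEY",
    parameters=[{"name": "min_visits", "type": "int", "value": "100"}],
)
# To execute for real, pass `request` to urllib.request.urlopen(...)
# and parse the JSON response body.
```

Keeping the request construction separate from sending it makes the endpoint easy to unit test from application code.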

Features

Built-in Connectors

Securely ingest data using native connectors for MongoDB, DynamoDB, Kafka, Kinesis, S3 and GCS. Rockset initially bulk loads data and then switches to continuous ingest to stay in sync with your source. Ingest millions of events per second. New data becomes queryable with a p95 latency of 2 seconds. No ETL tools required.
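
The bulk-load-then-continuous-ingest pattern can be pictured with a toy Python sketch. It is not how the connectors are implemented; the event shape (`op`, `doc`, `_id`) and the sample data are hypothetical and exist only to show a snapshot being loaded once and then kept in sync by a stream of changes.

```python
def bulk_load(snapshot):
    """Initial load: index every document from a point-in-time snapshot by its id."""
    return {doc["_id"]: doc for doc in snapshot}


def apply_change(collection, event):
    """Continuous ingest: apply one change event so the copy tracks the source."""
    if event["op"] == "delete":
        collection.pop(event["_id"], None)
    else:
        # "insert" and "update" both upsert the full document.
        collection[event["doc"]["_id"]] = event["doc"]


# Hypothetical source snapshot and change stream.
snapshot = [{"_id": 1, "status": "new"}, {"_id": 2, "status": "new"}]
changes = [
    {"op": "update", "doc": {"_id": 1, "status": "shipped"}},
    {"op": "insert", "doc": {"_id": 3, "status": "new"}},
    {"op": "delete", "_id": 2},
]

collection = bulk_load(snapshot)
for event in changes:
    apply_change(collection, event)
```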

Smart Schemas

Smart schemas are automatically generated schemas based on the exact fields and types present in the ingested data. The schema represents semi-structured data, nested objects and arrays, mixed types and nulls, enabling relational SQL queries over all these constructs. No more schema drift.

{"user_id": "caligralist", ...},
{"user_id": null, ...},
{"user_id": "hacker123", ...},
{"user_id": 413452322, ...},
...
$ DESCRIBE typing_system_demo
FIELD         OCCURRENCES   TYPE
['user_id']   4232 / 8312   string
['user_id']   3267 / 8312   int
['user_id']    813 / 8312   null_type
['...']               ...   ...
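
The idea behind the DESCRIBE output above can be approximated locally. The following Python sketch is an illustration only, not Rockset's implementation: it counts how often each field appears with each type across a handful of documents. The function name, the type-name mapping and the sample documents are assumptions.

```python
from collections import Counter

# Assumed mapping from Python types to the type names used in the listing above.
TYPE_NAMES = {str: "string", int: "int", float: "float", bool: "bool"}


def describe(documents):
    """Count occurrences of each (field, type) pair across the documents."""
    counts = Counter()
    for doc in documents:
        for field, value in doc.items():
            if value is None:
                type_name = "null_type"
            else:
                type_name = TYPE_NAMES.get(type(value), type(value).__name__)
            counts[(field, type_name)] += 1
    return counts


docs = [
    {"user_id": "caligralist"},
    {"user_id": None},
    {"user_id": "hacker123"},
    {"user_id": 413452322},
]
summary = describe(docs)
for (field, type_name), occurrences in sorted(summary.items()):
    print(f"['{field}']   {occurrences} / {len(docs)}   {type_name}")
```

Because the same field is tracked once per type, mixed-type fields do not need to be coerced before they can be queried.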

Converged Index™

All fields, including deeply nested fields, are automatically indexed in a Converged Index™ which includes an inverted index, columnar index and row index. Combining these three indexes lets analytical queries on large datasets return in milliseconds.
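
To make the three index types concrete, here is a toy Python sketch. It is a simplified model, not Rockset's storage format: it assumes values are scalars or nested objects (no arrays), and the function names and sample documents are hypothetical.

```python
from collections import defaultdict


def flatten(doc, prefix=""):
    """Yield (dotted_path, value) for every scalar leaf, descending into nested objects."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def build_indexes(documents):
    """Index every field three ways: by row, by column and by value."""
    row = {}                      # doc_id -> {path: value}
    columnar = defaultdict(dict)  # path -> {doc_id: value}
    inverted = defaultdict(set)   # (path, value) -> {doc_id, ...}
    for doc_id, doc in documents.items():
        row[doc_id] = dict(flatten(doc))
        for path, value in row[doc_id].items():
            columnar[path][doc_id] = value
            inverted[(path, value)].add(doc_id)
    return row, columnar, inverted


documents = {
    1: {"user": {"city": "Oslo"}, "amount": 40},
    2: {"user": {"city": "Lima"}, "amount": 25},
    3: {"user": {"city": "Oslo"}, "amount": 10},
}
row, columnar, inverted = build_indexes(documents)

oslo_ids = inverted[("user.city", "Oslo")]  # selective filter: inverted index
total = sum(columnar["amount"].values())    # aggregation scan: columnar index
record = row[2]                             # whole-document fetch: row index
```

Each access pattern reads from the index that suits it best, which is the intuition behind choosing an index per query.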

Full SQL

Run standard SQL queries, including filters, sorts, aggregations, inner and outer joins, directly on semi-structured data. This gives you greater flexibility to query constantly changing, semi-structured and heavily nested data.

Query ran in ms.
promotion          clickthrough
"yeti_tocayo"      0.00129
"bmw_530i_sedan"   0.01857

Query Lambdas

Query Lambdas are named, parameterized SQL queries stored in Rockset that can be executed from a dedicated REST endpoint. With Query Lambdas, you can enforce version control and integrate into your CI/CD workflows. Or use our Node.js, Java, Go and Python client libraries.

$ rock sql "SELECT 
        visits.promotion AS promotion, 
        sum(visits.converted)/count(visits.converted) AS clickthrough 
    FROM visits 
    GROUP BY visits.promotion"
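
A per-promotion clickthrough aggregation of this kind can be tried locally before saving it as a Query Lambda. The sketch below uses Python's built-in sqlite3 module as a stand-in for Rockset, which is an assumption made purely for illustration; the rows are invented. It groups by the promotion column and multiplies by 1.0 so that SQLite does not perform integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (promotion TEXT, converted INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("yeti_tocayo", 0),
        ("yeti_tocayo", 1),
        ("yeti_tocayo", 0),
        ("yeti_tocayo", 0),
        ("bmw_530i_sedan", 1),
        ("bmw_530i_sedan", 1),
    ],
)

# One output row per promotion: converted visits divided by all visits.
rows = conn.execute(
    """
    SELECT visits.promotion AS promotion,
           1.0 * sum(visits.converted) / count(visits.converted) AS clickthrough
    FROM visits
    GROUP BY visits.promotion
    ORDER BY promotion
    """
).fetchall()

for promotion, clickthrough in rows:
    print(promotion, clickthrough)
```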

Separation of Compute and Storage

Scale compute and storage resources independently for the best price-performance. As your data size grows, you can use exactly the right amount of compute for the query performance you need at any given time.

Serverless Auto-Scaling in the Cloud

Rockset uses a modern, cloud-native Aggregator Leaf Tailer (ALT) architecture which auto-scales in the cloud and automates cluster provisioning and index management. Optimize costs while minimizing operational overhead with serverless auto-scaling.

Enterprise-Grade Security

Data is encrypted at rest and protected by SSL in transit. Mask sensitive information using field mappings at the time of ingest. SAML, OAuth and Okta for single sign-on. Optional support for AWS VPC deployments.
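
The idea of masking a field at ingest time can be sketched in Python. In Rockset itself, field mappings are configured on the collection; the code below is only a conceptual illustration, and the function name, field names and the choice of SHA-256 are assumptions.

```python
import hashlib


def mask_fields(document, sensitive_fields):
    """Return a copy of the document with sensitive fields replaced by SHA-256 digests."""
    masked = dict(document)
    for field in sensitive_fields:
        if masked.get(field) is not None:
            raw = str(masked[field]).encode("utf-8")
            masked[field] = hashlib.sha256(raw).hexdigest()
    return masked


# Hypothetical incoming document.
incoming = {"email": "jane@example.com", "plan": "pro"}
stored = mask_fields(incoming, sensitive_fields=["email"])
```

Hashing is deterministic, so equality filters and joins on the masked field keep working while the raw value is never stored.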

Real-Time Analytics At Lightning Speed

See Rockset in action

Sample APIs

Explore sample APIs for read-intensive applications like recommendation engines, personalization features, geo-tracking services and more. Using Query Lambdas you can save your SQL query as a dedicated endpoint and turn it into an API.

  • {Leaderboard API}
    source: gamer, tournaments and scores tables from DynamoDB
  • {Item Tracking API}
    source: store_items and location Kafka topics
  • {Blockchain Search API}
    source: blockchain table from DynamoDB
  • {Shopping Recommendation API}
    source: shopping_cart, orders, lineitems collections from DynamoDB
  • {Location Search API}
    source: water polygon data loaded from a geopanda script
  • {Customer 360 API}
    source: login-activity and clickstream Kafka topics and orders and shopping-cart collections from MongoDB
  • {Product Recommendation API}
    source: recommendations bucket from S3 and orders and lineitems collections from MongoDB
  • {Connected Car API}
    source: vehicle_sensor Kafka topic

Optimized for Speed
  • A Fully Mutable Converged Index™

    A Converged Index™ stores each individual field of the document as an independently addressable key in an inverted, columnar and row index. It is fully mutable at the field level, which means Rockset can keep up with a high rate of inserts, updates and deletes by updating a single key without having to re-index the entire document.
  • Massive Write Rate

    Use of RocksDB's LSM trees, an in-memory buffer to cache incoming writes and a lockless protocol makes writes visible to existing queries as soon as they happen. Remote compaction speeds up indexing even in the face of bursty writes.
  • Microsharding

    Your index is document sharded for low latency. It is organized in the form of thousands of micro-shards, to eliminate the need for re-indexing. This is a key enabler for indexing massive cloud-scale data sets. Rockset automatically rebalances and distributes shards across a cluster.
  • Distributed Query Execution

    A cost-based optimizer selects the optimal indexes and a distributed query engine executes each portion of the query with shard-level parallelism. A query hits all shards in the index, processes the query in parallel and returns results faster, unlike Cassandra, HBase, Aurora and Citus which are term-sharded.
  • RocksDB on SSD

    Rockset stores all indexes on RocksDB using SSD for hot storage, backed by S3 for durable storage. Built and open-sourced by the Rockset founding team, RocksDB is a high-performance embedded storage engine used by other modern datastores like CockroachDB, Kafka and Flink.
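
Document sharding with shard-level parallelism, as described in the list above, can be illustrated with a toy scatter-gather in Python. This is a sketch under stated assumptions, not Rockset's engine: the shard count, the CRC32 hash, the thread pool and the sample documents are all hypothetical choices.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

NUM_MICROSHARDS = 8  # tiny for illustration; the text above mentions thousands


def shard_for(doc_id):
    """Document sharding: each whole document maps to one micro-shard by id."""
    return zlib.crc32(str(doc_id).encode("utf-8")) % NUM_MICROSHARDS


def build_shards(documents):
    shards = [[] for _ in range(NUM_MICROSHARDS)]
    for doc in documents:
        shards[shard_for(doc["id"])].append(doc)
    return shards


def partial_count(shard):
    """Leaf step: aggregate only the documents held by one shard."""
    return Counter(doc["promotion"] for doc in shard)


def count_by_promotion(shards):
    """Scatter the query to every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_count, shards))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


documents = [
    {"id": i, "promotion": "yeti_tocayo" if i % 3 else "bmw_530i_sedan"}
    for i in range(30)
]
shards = build_shards(documents)
result = count_by_promotion(shards)
```

Because every shard holds complete documents, each one can answer its part of the query independently and only small partial results need to be merged.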

Read the Rockset whitepaper

Rockset Concepts, Design & Architecture

Learn how Rockset's architecture enables highly parallelized execution of complex queries across diverse data sets.

download

Try Rockset Now