- Rockset uses its Converged Index™, which combines a columnar store, search index and row index to deliver low-latency search and analytics in a compute-efficient way. In contrast, ClickHouse uses only a column-oriented data store.
- Rockset and ClickHouse both adopt SQL as their query language. But ClickHouse enforces a strict schema whereas Rockset has schemaless ingest and dynamically typed SQL for greater flexibility.
- Rockset has been built in the cloud, putting real-time analytics within reach of even lean engineering teams. ClickHouse was built at Yandex where a team of engineers managed the distributed data system. This required a lot of operational expertise.
How Rockset Addresses Challenges with ClickHouse
ClickHouse Challenge #1
Heavy Operational Burden
ClickHouse requires a lot of operational expertise, which results in a higher total cost of ownership. ClickHouse is a complex distributed system that requires users to configure, scale and capacity-plan servers and clusters. For example, there are a lot of knobs to turn when upgrading a cluster because there is no distributed upgrade across all nodes. You have to upgrade each node manually, and during this time your cluster runs at reduced capacity. We find that ClickHouse users see great single-node performance, but that a multi-node system becomes increasingly challenging to manage.
Rockset is a cloud-native, horizontally scalable analytics database. It uses microsharding to index massive data sets in the cloud. Rockset automatically rebalances and distributes shards across a cluster. This makes it easy for teams to scale their production application without any manual intervention. If you need to reduce latency, move to a higher Virtual Instance with the click of a button or an API call.
ClickHouse Challenge #2
Inflexible Data Model
ClickHouse requires time-consuming data preparation, making it ill-suited for use cases where the data model is in constant flux. For example, while ClickHouse supports JSON, any new fields added will need to perfectly match the schema in the table. ClickHouse also recommends working with a flat data structure, so teams will need to transform nested JSON data as it arrives in the system. Denormalizing data requires additional engineering overhead and can be challenging to manage when data is constantly changing.
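To give a sense of the transformation a flat data structure implies, here is a minimal Python sketch that flattens a nested JSON document into dotted column names before loading. The `flatten` helper and the sample event are hypothetical illustrations, not part of ClickHouse or Rockset.

```python
import json


def flatten(doc, prefix=""):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the path so far.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical incoming event with two levels of nesting.
event = json.loads('{"user": {"id": 7, "geo": {"city": "Oslo"}}, "action": "click"}')
row = flatten(event)
print(row)
```

Every new nested field produces a new flattened column, so a step like this has to be kept in sync with the table schema as the data changes.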
Rockset embraces a flexible data model, enabling new data to be schemalessly ingested and indexed for out-of-the-box analytics. At ingest time, Rockset indexes every field of the data, even deeply nested data, for sub-second SQL analytics. We’ve added custom extensions to our SQL interface for querying nested documents and arrays. Rockset has achieved sub-second queries on the Star Schema Benchmark (SSB) and can do so in real-life scenarios where the data is messy.
ClickHouse Challenge #3
Manual Performance Tuning
When executing certain queries, including search and filter queries, ClickHouse requires you to configure and manage indexes to achieve good performance. This requires a deep understanding of the system, including how each index behaves and when to use which index. Furthermore, ClickHouse does not have a query optimizer, so you are required to dictate the best path for query execution. This can be both time-consuming and lead to compute inefficiency.
Rockset's Converged Index™ builds a columnar index, search index and row index on all data at ingest time. There's no need to create or manage indexes. Rockset's query optimizer is designed for SQL on semi-structured data. It can perform all SQL joins (hash joins, nested loop joins, broadcast joins and lookup joins) and takes filtering into account before joining data for greater compute efficiency. With Rockset, you can get sub-second search, aggregations and joins.
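The idea of filtering before joining can be sketched in a few lines of Python. This is a generic, in-memory illustration of a hash join with the filter applied first. It is not Rockset's optimizer, and the `hash_join` helper, table names and rows are all hypothetical.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Join two lists of dicts on `key` using an in-memory hash table."""
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)  # build side
    # Probe side: emit one merged row per matching pair.
    return [{**l, **r} for l in left for r in table.get(l[key], [])]


orders = [
    {"order_id": 1, "customer_id": 10, "amount": 250},
    {"order_id": 2, "customer_id": 11, "amount": 40},
    {"order_id": 3, "customer_id": 10, "amount": 15},
]
customers = [
    {"customer_id": 10, "region": "EU"},
    {"customer_id": 11, "region": "US"},
]

# Apply the filter first so fewer rows are probed against the hash table.
large_orders = [o for o in orders if o["amount"] > 100]
result = hash_join(large_orders, customers, "customer_id")
print(result)
```

Joining first and filtering afterwards returns the same rows, but it probes all three orders instead of one. That extra work is what an optimizer avoids by pushing the filter ahead of the join.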
Rockset Beats ClickHouse and Druid on the Star Schema Benchmark
If you want flexible and easy real-time analytics, check out Rockset.
Here are four reasons why:
- Creation of search, columnar and row indexes at ingest time
- SQL search, aggregations and joins on semi-structured data
- Efficient inserts, updates and deletes
- Independent scaling of storage and compute in the cloud
See Why Companies Choose Rockset for Real-Time Analytics
Modern companies are building real-time logistics tracking, security analytics, predictive maintenance and more in record time.
“Rockset fits all the requirements that we have for a new kind of database. It's serverless, real-time, provides a common API like SQL, and is able to ingest event data easily via a Kafka connector.”
Ralph Debusmann, IOT Solution Architect at Bosch
“Over 80% of North America’s concrete delivery tickets are generated from our systems. We track millions of material and haul tickets on any given day and being able to search, analyze and act on this data in real-time is mission critical for us. We have embraced a modern serverless stack, and we chose Rockset for our application.”
Doug Moore, VP of Cloud at Command Alkon
Compare Rockset and ClickHouse
Connect with our solutions team to dig deeper into the architecture, indexing, data ingestion and query processing.