Snowflake vs SingleStore

Compare and contrast Snowflake and SingleStore by architecture, ingestion, queries, performance, and scalability.

Snowflake vs SingleStore Architecture

Deployment model
Snowflake: SaaS - infrastructure, software and cluster operations managed by the service provider
SingleStore: Self-managed and SaaS deployment options

Use of storage hierarchy
Snowflake: Cloud object storage for shared data accessible from any virtual warehouse
SingleStore:
• Memory - for data requiring the highest performance
• High-performance block storage for persistent cache - the working dataset should fit within the persistent cache
• Cloud object storage for long-term retention

Isolation of ingest and query
Snowflake: Yes - separate virtual warehouses for batch data loading, ELT jobs and queries
SingleStore: No - databases share ingest and queries

Separation of compute and storage
Snowflake: Yes
SingleStore: Yes - SingleStore Cloud uses cloud object storage for separation of compute and storage

Isolation for multiple applications
Snowflake: Yes - separate virtual warehouses for each workload

Snowflake is a data warehouse built for the cloud. It is well known for separating storage and compute for better price performance. With Snowflake, multiple virtual warehouses can be spun up or down for batch data loading, transformations and queries, all on the same shared data.

SingleStore is a proprietary distributed relational database that handles both transactional and analytical workloads. It relies on memory and a persistent cache to deliver low-latency queries. For longer-term data retention, SingleStore Cloud separates compute from cloud object storage. SingleStore Cloud pricing is based on compute and storage usage.

Snowflake vs SingleStore Ingestion

Data sources
Snowflake:
• Third-party ETL tools to ingest data into Snowflake, including Fivetran, Hevo or Striim
• Bulk loading from S3, GCS, Azure Blob Storage
• Sink Connector for Apache Kafka in Confluent Cloud
SingleStore: Integrations with Amazon S3, Apache Beam, GCS, HDFS, Kafka, Spark, Qlik Replicate, HVR

Semi-structured data
Snowflake: Ingests JSON and XML as a VARIANT data type
SingleStore: Ingests JSON as a JSON column type

Transformations and rollups
Snowflake:
• Third-party ELT/ETL tools like dbt
• Simple COPY commands at data loading for column reordering, omission and casts
SingleStore: SingleStore pipelines do common data shaping, including normalizing and denormalizing data, adding computed columns, filtering data, mapping data, and splitting records into multiple destination tables

Snowflake is a data warehouse with immutable storage that is built for batch ingestion and relies heavily on the modern data stack ecosystem for data connectors and transformations. Snowflake integrates with a number of ETL and ELT solutions, including Fivetran, Hevo, Striim and dbt. While Snowflake supports semi-structured data through its VARIANT type, it is best to structure the data for optimal query performance.
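The simple load-time transformations mentioned above (column reordering, omission and casts) are written in Snowflake as a COPY INTO statement that selects from staged files. The following is a minimal sketch that only assembles the SQL text in Python. The table, stage and column names are hypothetical, and build_copy_statement is an illustrative helper, not part of any Snowflake library.

```python
def build_copy_statement(table, stage, columns):
    """Assemble a COPY INTO statement that reorders, omits and casts columns.

    columns is a list of (target_column, source_position, cast_type_or_None).
    Source positions not listed are simply omitted from the load.
    """
    targets = ", ".join(name for name, _, _ in columns)
    selects = []
    for _, position, cast in columns:
        expr = f"t.${position}"          # $N refers to the Nth field of the staged file
        if cast:
            expr += f"::{cast}"          # cast applied while loading
        selects.append(expr)
    return (
        f"COPY INTO {table} ({targets})\n"
        f"FROM (SELECT {', '.join(selects)} FROM @{stage} t)\n"
        f"FILE_FORMAT = (TYPE = 'CSV')"
    )


# Hypothetical example: the file has 3 fields; load field 2 as a DATE first,
# then field 1, and skip field 3 entirely.
statement = build_copy_statement(
    table="sales",
    stage="my_stage",
    columns=[("sale_date", 2, "DATE"), ("city", 1, None)],
)
print(statement)
```

Anything beyond these simple reshaping steps is usually delegated to an ELT tool such as dbt after the data has landed.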

SingleStore has integrations to common data lakes and streams. With SingleStore pipelines, users can perform common data transformations during the ingestion process. SingleStore provides limited support for semi-structured data with its JSON column type. Many users structure data prior to ingestion for optimal query performance.
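The kinds of shaping a pipeline performs during ingestion (filtering, mapping, adding computed columns, splitting records across destination tables) can be illustrated with a small conceptual sketch. This is plain Python, not SingleStore pipeline syntax, and the field and table names are hypothetical.

```python
def shape(records):
    """Route raw records into two hypothetical destination tables."""
    orders, rejects = [], []
    for rec in records:
        if rec.get("amount") is None:
            # Filtering: incomplete records go to a separate table.
            rejects.append(rec)
            continue
        orders.append({
            "order_id": rec["id"],                        # mapping: rename a field
            "amount_usd": rec["amount"],
            "amount_cents": round(rec["amount"] * 100),   # computed column
        })
    return {"orders": orders, "rejects": rejects}


tables = shape([
    {"id": 1, "amount": 2.5},
    {"id": 2, "amount": None},
])
print(tables["orders"])
print(tables["rejects"])
```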

Snowflake vs SingleStore Performance

Updates
Snowflake: Data warehouse with immutable storage. Updates rewrite and merge entire partitions
SingleStore: SingleStore columnar store/universal storage is immutable. Updates are fast when the data still resides in memory

Indexing
SingleStore: Indexes can be manually configured: skiplist index, hash index, full-text index, geospatial index

Query latency
Snowflake: Seconds to minutes on petabytes of data
SingleStore: 50-1000 ms queries when the working set is contained in memory

Storage format
Snowflake: Compressed columnar format stored in cloud object storage
SingleStore: Two table formats: rowstore or columnstore/universal storage

Streaming ingest
Snowflake:
• Ingests on a batch basis
• Snowpipe typically ingests in minutes
SingleStore:
• Columnar store/universal storage ingests on a batch basis
• Data latency is typically seconds by relying on memory

Snowflake is designed for batch analytics with analysts and data scientists infrequently accessing large-scale data for trend analysis. Snowflake, like many data warehouses, is immutable and does not support frequently changing data efficiently. Snowflake uses a columnar store to return aggregations and metrics efficiently, often with query response times in the seconds to minutes on petabytes of data.
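A toy model helps show why immutable storage handles frequent updates poorly. This is a conceptual sketch only, not Snowflake's actual storage engine: tuples stand in for immutable partition files, and the partition size and function names are illustrative.

```python
def build_partitions(rows, partition_size):
    """Group rows into fixed-size immutable partitions (tuples)."""
    return [
        tuple(rows[i:i + partition_size])
        for i in range(0, len(rows), partition_size)
    ]


def update_row(partitions, key, new_value):
    """Change one row; return the new partition list and rows rewritten."""
    rewritten = 0
    result = []
    for part in partitions:
        if any(row["id"] == key for row in part):
            # The partition cannot be edited in place, so every row in it
            # is copied into a replacement partition.
            new_part = tuple(
                {**row, "value": new_value} if row["id"] == key else row
                for row in part
            )
            rewritten += len(new_part)
            result.append(new_part)
        else:
            result.append(part)  # untouched partitions are reused as is
    return result, rewritten


rows = [{"id": i, "value": 0} for i in range(10)]
partitions = build_partitions(rows, partition_size=5)
partitions, rewritten = update_row(partitions, key=7, new_value=42)
print(rewritten)  # 5 rows rewritten to change a single row
```

With realistic partition sizes the same effect is much larger, which is why frequently changing data is a poor fit for this design.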

SingleStore has two storage formats: a rowstore and a columnar store referred to as universal storage. The columnar store is used for analytical workloads, loading data in batches and relying on memory to achieve data latency of seconds. The columnar store can also execute queries in 50-1000 ms when the working set is contained in memory. SingleStore provides the ability to configure and manage additional indexes on the data for faster performance.

Snowflake vs SingleStore Queries

Joins
Snowflake and SingleStore: Yes

Query language
Snowflake and SingleStore: SQL

Developer tooling
Snowflake:
• SQL APIs - make SQL calls to Snowflake programmatically
• UDFs for JavaScript, Python, Java and SQL functions
• Go, JDBC, .NET, Node.js, ODBC, PHP, Python drivers
SingleStore:
• API for querying data via POST command
• JDBC driver, Python client
• Compatibility with MySQL and MariaDB to support additional drivers

Visualization tools
Snowflake: Integrations with QuickSight, Chartio, Domo, Looker, Power BI, Mode, Qlik, Sigma, Sisense, Tableau, ThoughtSpot and more
SingleStore: Integrations with Cognos Analytics, Dremio, Looker, MicroStrategy, Power BI, Sisense, Tableau and TIBCO Spotfire

Snowflake supports SQL as its native query language and can perform SQL joins. Snowflake for developers introduced a number of developer tools including SQL APIs, UDFs and drivers to support application development. As Snowflake was originally built for business intelligence workloads, it integrates with a number of visualization tools for trend analysis.

SingleStore supports SQL as its native query language and can perform SQL joins. It is designed for querying structured data with static schemas. Users can create data APIs to execute SQL statements against the database over an HTTP connection. Common SingleStore use cases include business intelligence and analytics, and the database offers a number of integrations to visualization tools.
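The HTTP access described above can be sketched as a POST request carrying a SQL statement as JSON. The example below only builds the request and does not send it. The host, credentials, database and table are hypothetical, and the endpoint path and payload field names reflect my understanding of the SingleStore Data API, so treat them as assumptions to verify against the official documentation.

```python
import base64
import json
from urllib.request import Request


def build_query_request(host, user, password, database, sql, args=()):
    """Build (but do not send) a POST request that runs a SQL query."""
    payload = json.dumps(
        {"sql": sql, "args": list(args), "database": database}
    ).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        url=f"https://{host}/api/v2/query/rows",  # assumed endpoint path
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


request = build_query_request(
    host="example-host.invalid",   # hypothetical host
    user="app_user",
    password="secret",
    database="analytics",
    sql="SELECT city, COUNT(*) FROM orders WHERE amount > ? GROUP BY city",
    args=[100],
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would execute it against a real endpoint. Using placeholder arguments rather than string formatting keeps user input out of the SQL text.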

Snowflake vs SingleStore Scalability

Vertical scaling
Snowflake: Resize virtual warehouses via the web interface or using DDL commands for warehouses
SingleStore:
• Cloud offering: Resize compute workspaces in the UI or using the Management API
• Self-managed offering: Change cluster configuration by updating command-line arguments or by changing the cluster directly

Horizontal scaling
Snowflake:
• Multi-cluster warehouses allocate additional clusters for higher concurrency workloads
• Auto-scaling policies can be set
SingleStore: Self-managed offering: Increase or decrease the number of nodes in the cluster. Rebalancing required

Snowflake virtual warehouses can be scaled up for faster queries or scaled out using multi-cluster warehouses to support higher concurrency workloads. Snowflake has shared blob storage that scales automatically and independently.
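Scaling up and scaling out map to two different ALTER WAREHOUSE settings. The sketch below generates the DDL text in Python. The warehouse name is hypothetical, the helper functions are illustrative, and the list of sizes is a subset rather than the full set Snowflake offers.

```python
# A subset of warehouse sizes; not exhaustive.
VALID_SIZES = {"XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE", "XXLARGE"}


def resize_warehouse_sql(name, size):
    """Scale up or down: change the size of a single warehouse."""
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unsupported size in this sketch: {size}")
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{size}';"


def multi_cluster_sql(name, min_clusters, max_clusters, policy="STANDARD"):
    """Scale out: let a multi-cluster warehouse add clusters under load."""
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    return (
        f"ALTER WAREHOUSE {name} SET "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters} "
        f"SCALING_POLICY = '{policy}';"
    )


print(resize_warehouse_sql("reporting_wh", "large"))
print(multi_cluster_sql("reporting_wh", 1, 4))
```

A larger size targets faster individual queries, while a higher maximum cluster count targets more concurrent queries.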

SingleStore Cloud can be sized up or down using the UI or the Management API. The cloud offering provides no way to scale out by increasing or decreasing the number of leaf and aggregator nodes. In the self-managed offering, horizontal and vertical scaling can be done by updating command-line arguments or by changing the cluster directly. Horizontal scaling does require rebalancing.
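Why adding nodes requires rebalancing can be seen in a toy model of partition placement. This is an illustrative sketch, not SingleStore's actual rebalancing algorithm, and the node and partition names are hypothetical: a newly added node holds no data until partitions are moved onto it.

```python
from collections import Counter


def rebalance(assignment, nodes):
    """Even out partitions across nodes; return (new_assignment, moved_count)."""
    per_node = {node: [] for node in nodes}
    for partition, node in sorted(assignment.items()):
        per_node[node].append(partition)

    base, extra = divmod(len(assignment), len(nodes))
    # Fuller nodes keep the leftover partitions so less data has to move.
    ordered = sorted(nodes, key=lambda n: -len(per_node[n]))
    target = {n: base + (1 if i < extra else 0) for i, n in enumerate(ordered)}

    surplus = []
    for node in nodes:
        while len(per_node[node]) > target[node]:
            surplus.append(per_node[node].pop())
    moved = len(surplus)
    for node in nodes:
        while len(per_node[node]) < target[node]:
            per_node[node].append(surplus.pop())

    new_assignment = {p: n for n, parts in per_node.items() for p in parts}
    return new_assignment, moved


# Eight partitions on two leaf nodes, then a third leaf is added.
before = {p: ("leaf1" if p < 4 else "leaf2") for p in range(8)}
after, moved = rebalance(before, ["leaf1", "leaf2", "leaf3"])
print(Counter(after.values()), moved)
```

Until the two moved partitions land on the new node, it contributes no capacity to queries, which is the cost the rebalancing step pays for.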