Weaviate vs Elasticsearch

Compare and contrast Weaviate and Elasticsearch by architecture, ingestion, queries, performance, and scalability.


Weaviate vs Elasticsearch Ingestion

Streaming and bulk ingestion
Weaviate recommends batching vector embeddings in sizes of 100-300. For large files, it recommends breaking them up and ingesting them with libraries like ijson for JSON files and pandas for CSV files. Some manual flushing of batches may be required.
Elasticsearch supports updates and bulk ingestion. It recommends bulk ingesting into fewer, larger segments, since each segment has its own HNSW graph that must be searched and more segments mean higher query latency.

Index updates
Weaviate can update the values of an existing property or an entire object in a schema. Because Weaviate uses the HNSW index type, adding or updating vectors is comparatively costly. Weaviate does not support adding properties to or deleting properties from an existing schema.
In Elasticsearch, index updates happen through expensive merge operations and reindexing. It is recommended to avoid heavy indexing during approximate kNN search and to reindex new documents into a separate index rather than update them in place.

Embedding generation
Weaviate supports API calls to OpenAI, Cohere and Hugging Face to index embeddings.
Elasticsearch introduced its own Elastic Learned Sparse EncodeR (ELSER) model that can be used for embedding generation. To use a third-party model with Elasticsearch, you must import and deploy the model, then create an ingest pipeline with an inference processor to perform the data transformation.

Size of vectors and metadata
Weaviate supports embeddings with up to 65,535 dimensions.
Elasticsearch supports vectors with up to 2,048 dimensions. For approximate kNN search to be efficient, all vector data must fit in the node's page cache.

Versioning
There does not appear to be a way to version data in Weaviate.
Elasticsearch uses index aliases to reindex data with no downtime.

Weaviate supports batch insertion of vectors as well as in-place updates of vectors and metadata. Weaviate supports search across high-dimensional vector embeddings.
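The batch-sizing recommendation above can be sketched in Python. The chunking helper is plain standard library; the commented-out calls show roughly how the batches would be handed to the weaviate-client library (class and object names are illustrative, not from the source):

```python
# Chunk a list of objects into batches of 100-300 before sending them to
# Weaviate's batch endpoint, per the sizing recommendation above.

def chunk(items, batch_size=100):
    """Yield successive batches of at most `batch_size` items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

objects = [{"title": f"doc-{i}"} for i in range(250)]
batches = list(chunk(objects, batch_size=100))  # sizes: 100, 100, 50

# Hypothetical hand-off to the Python client (not run here):
#   client.batch.configure(batch_size=100)
#   with client.batch as batch:        # batches flush on exit; call
#       for obj, vec in pairs:         # batch.flush() manually if needed
#           batch.add_data_object(data_object=obj,
#                                 class_name="Document", vector=vec)
```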

Elasticsearch supports both streaming and bulk ingestion. It recommends using fewer Lucene segments and avoiding updates and reindexing to save on compute costs. Elasticsearch supports searches across large-scale data, including vector embeddings and metadata.
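Elasticsearch's `_bulk` endpoint takes an NDJSON body of alternating action and document-source lines. A minimal sketch of building that payload with the standard library (index and field names are illustrative; the official Python client's `helpers.bulk` handles this for you):

```python
import json

def build_bulk_body(index, docs):
    """Build the NDJSON payload for Elasticsearch's _bulk API: one action
    line followed by one document-source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # _bulk requires a trailing newline

docs = [
    {"title": "a", "vec": [0.1, 0.2]},
    {"title": "b", "vec": [0.3, 0.4]},
]
body = build_bulk_body("articles", docs)
# POST `body` to /_bulk with Content-Type: application/x-ndjson.
```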


Weaviate vs Elasticsearch Indexing

kNN and ANN
Weaviate supports kNN and ANN search using HNSW.
Elasticsearch supports kNN and ANN search. ANN search uses HNSW, a graph-based algorithm that only works efficiently when most vector data is held in memory.

Additional indexes
Weaviate has an inverted index that can be used for filters, hybrid search and BM25 search.
Elasticsearch includes an inverted index for text search, BKD trees for geolocation search and ANN indexes.

Vectorization
Weaviate supports vectorization to speed up query execution.
Elasticsearch added vectorization in version 8.9.0 to speed up query execution.

Index management
Weaviate users are responsible for configuring and managing indexes and product quantization.
Elasticsearch users are responsible for index maintenance, including the number of index segments and the reindexing of data.

Weaviate supports KNN and ANN search using HNSW indexing algorithms. Weaviate provides inverted indexes and vector search indexes and uses vectorization to speed up query execution. Users are responsible for index maintenance.

Elasticsearch supports KNN and ANN search using HNSW indexing algorithms. Elasticsearch provides inverted indexes and vector search indexes and uses vectorization to speed up query execution. Users are responsible for index maintenance.
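The ANN index described above is configured per field at index-creation time. A minimal sketch of an Elasticsearch 8.x mapping that enables approximate kNN on a `dense_vector` field (index and field names are illustrative):

```python
# Mapping for an index with a kNN-searchable vector field plus metadata.
mapping = {
    "properties": {
        "title": {"type": "text"},        # inverted index for text search
        "category": {"type": "keyword"},  # exact-match metadata field
        "vec": {
            "type": "dense_vector",
            "dims": 768,
            "index": True,                # build an HNSW graph for ANN search
            "similarity": "cosine",
        },
    }
}
# Created via the Python client, roughly:
#   es.indices.create(index="articles", mappings=mapping)
```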

Weaviate vs Elasticsearch Querying

Metadata filtering
Weaviate supports metadata filtering and hybrid search. Weaviate pre-filters the data and runs an ANN search only if the filter returns more than a threshold number of records (40,000 by default); otherwise it uses a brute-force exact search. Weaviate uses a strict schema system in which all fields and their types are specified before data is indexed.
Elasticsearch supports metadata filtering and hybrid search. Elasticsearch filters are applied during the approximate kNN search.

Multi-modal models
Weaviate supports multi-modal modules such as CLIP.
Elasticsearch enables searches across multiple kNN fields to support multi-modal models.

API (SQL, REST, etc.)
Weaviate has RESTful APIs for database management and CRUD operations, and a GraphQL API for accessing data objects and search.
Elasticsearch exposes REST APIs that can be called directly to configure and access its features.

Weaviate pre-filters data before an approximate kNN search. Weaviate supports a GraphQL API for search.
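A sketch of what Weaviate's GraphQL search API looks like when a metadata filter is combined with a vector search (class and property names are illustrative):

```python
# GraphQL query combining a `where` metadata filter with `nearVector` search.
query = """
{
  Get {
    Article(
      where: {path: ["category"], operator: Equal, valueText: "news"}
      nearVector: {vector: [0.1, 0.2, 0.3]}
      limit: 5
    ) {
      title
    }
  }
}
"""
# With the Python client this is roughly:
#   client.query.get("Article", ["title"]) \
#       .with_where({...}).with_near_vector({...}).with_limit(5).do()
```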

Elasticsearch supports REST APIs.
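On the Elasticsearch side, the filter rides along inside the approximate kNN search request itself, so it is applied during graph traversal rather than before or after it. A hedged sketch of the search body (field names are illustrative):

```python
# Approximate kNN search with a metadata filter applied during traversal.
search_body = {
    "knn": {
        "field": "vec",
        "query_vector": [0.1, 0.2, 0.3],
        "k": 5,                 # nearest neighbors to return
        "num_candidates": 50,   # per-shard candidates considered
        "filter": {"term": {"category": "news"}},
    },
    "_source": ["title"],
}
# POST to /articles/_search, or via the Python client, roughly:
#   es.search(index="articles", knn=search_body["knn"], source=["title"])
```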


Weaviate vs Elasticsearch Ecosystem

Integrations (Hugging Face, LangChain, etc.)
Weaviate supports API calls to OpenAI, Cohere and Hugging Face to insert and index embeddings. Weaviate has integrations with LangChain and LlamaIndex.
To use a third-party model, such as one from Hugging Face, with Elasticsearch, you must import and deploy the model, then create an ingest pipeline with an inference processor to perform the data transformation. Elasticsearch has an integration with LangChain.
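The ingest-pipeline-plus-inference-processor setup described above can be sketched as a pipeline definition (the pipeline ID and model ID are placeholders for whatever model you have imported and deployed):

```python
# Ingest pipeline that runs each incoming document through a deployed model,
# writing the inference result to a target field before indexing.
pipeline = {
    "description": "Embed documents with a deployed third-party model",
    "processors": [
        {
            "inference": {
                "model_id": "my-imported-model",  # placeholder model ID
                "target_field": "ml",
            }
        }
    ],
}
# Registered via the Python client, roughly:
#   es.ingest.put_pipeline(id="embed-docs", **pipeline)
```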

Weaviate vs Elasticsearch Architecture

Cloud architecture
Weaviate was built for on-prem and has recently introduced a managed offering. Weaviate has a tightly coupled architecture in which CPU, RAM and SSD scale together for ingestion and queries. Weaviate stores its object store and inverted index within the same shard and places its vector index next to the object store. Users are responsible for clusters, shards and indexes, and resharding is an expensive operation.
Elasticsearch was built for on-prem. Indexing and search run on the same instances, which can cause compute contention. Users are responsible for clusters, shards and indexes.

Scalability
Weaviate scales horizontally for ingestion and queries. Replicas to support high-QPS use cases are still in development. Dynamically scaling a cluster is not fully supported: nodes cannot be removed while data is present. Because ingestion and queries share the same CPU and memory resources, there is no resource isolation, allowing for potential resource contention.
Elasticsearch requires deep expertise around servers, clusters, nodes, indexes and shards to operate at scale. For vector search on Elasticsearch, users may face scaling challenges: indexing and search run on the same instance, all vector data must fit into the page cache, and each index segment has an HNSW graph that must be searched, which contributes to latency.

Enterprise readiness
Weaviate does not appear to have published case studies of enterprises using its product in production.
Elasticsearch is used by enterprises at scale, including Booking.com and Cisco.

Weaviate is built for on-prem with a tightly coupled architecture. Scaling Weaviate requires data and infrastructure expertise and management.

Elasticsearch is built for on-prem with a tightly coupled architecture. Scaling Elasticsearch requires data and infrastructure expertise and management. Elasticsearch is used by enterprises including Booking.com and Cisco.