Real-time AI: Live Recommendations Using Confluent and Rockset

September 26, 2023


Real-time AI is the future, and AI models have demonstrated incredible potential for predicting and generating media in various business domains. For the best results, these models must be informed by relevant data. AI-powered applications almost always need access to real-time data to deliver accurate results in a responsive user experience that the market has come to expect. Stale and siloed data can limit the potential value of AI to your customers and your business.

Confluent and Rockset power a critical architecture pattern for real-time AI. In this post, we’ll discuss why Confluent Cloud’s data streaming platform and Rockset’s vector search capabilities work so well to enable real-time AI app development and explore how an e-commerce innovator is using this pattern.

Understanding real-time AI application design

AI application designers follow one of two patterns when they need to contextualize models:

  • Extending models with real-time data: Many AI models, such as the deep neural networks behind generative AI applications like ChatGPT, are expensive to train with the current state of the art. Domain-specific applications often work well enough when their models are only periodically retrained. More generally applicable models, such as the Large Language Models (LLMs) powering ChatGPT-like applications, work better when given relevant information that was unavailable at training time. As smart as ChatGPT appears to be, it can’t summarize current events accurately if it was last trained a year ago and isn’t told what’s happening now. Since new information is generated constantly, application developers can’t retrain models to keep up. Instead, they enrich model inputs at query time with a finite context window of the most relevant information.
  • Feeding models with real-time data: Other models can be dynamically retrained as new information is introduced, with real-time data refining either the query or the model itself. Whatever the algorithm, your favorite music streaming service can only give the best recommendations if it knows your recent listening history and what everyone else has been playing as it generalizes patterns of consumption.
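The first pattern above, enriching model inputs at query time, can be sketched in a few lines. This is a toy illustration, not any specific product's API: the keyword-overlap retriever and prompt format are placeholder stand-ins for a real retrieval pipeline and embedding-based ranking.

```python
# Sketch of pattern one: enrich a model's input at query time with a finite
# context window of fresh, relevant information. The retriever and prompt
# format are hypothetical placeholders, not a specific API.

def retrieve_recent(query: str, documents: list[dict], k: int = 3) -> list[str]:
    """Rank documents by naive keyword overlap with the query, breaking ties
    by recency (timestamp), and return the text of the top-k matches."""
    terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: (len(terms & set(d["text"].lower().split())), d["ts"]),
        reverse=True,
    )
    return [d["text"] for d in scored[:k]]

def build_prompt(query: str, context: list[str]) -> str:
    """Prepend the retrieved context so the model sees information that
    did not exist when it was trained."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{ctx}\n\nQuestion: {query}"

docs = [
    {"ts": 1, "text": "seller A went live with a sneaker auction"},
    {"ts": 2, "text": "seller B ended their livestream"},
    {"ts": 3, "text": "seller A started a giveaway in the sneaker auction"},
]
prompt = build_prompt(
    "what is happening in seller A's auction?",
    retrieve_recent("seller A auction", docs, k=2),
)
```

A production system would replace the keyword retriever with vector similarity over embeddings, but the shape is the same: select the few most relevant, most recent facts and put them in the model's context window.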

The challenge is that no matter what type of AI model you are working with, it can only produce output relevant to this moment in time if it knows the relevant state of the world at this moment in time. Models may need to know about events, computed metrics, and embeddings based on locality. The goal is to feed these diverse inputs into a model coherently, with low latency and without a complex architecture. Traditional approaches rely on cascading batch-oriented data pipelines, meaning data takes hours or even days to flow through the enterprise. As a result, the data made available is stale and of low fidelity.

Whatnot is an organization that faced this challenge. Whatnot is a social marketplace that connects sellers with buyers via live auctions. At the heart of their product lies their home feed where users see recommendations for livestreams. As Whatnot states, "What makes our discovery problem unique is that livestreams are ephemeral content — We can’t recommend yesterday’s livestreams to today’s users and we need fresh signals."

Ensuring that recommendations are based on real-time livestream data is critical for this product. The recommendation engine needs user, seller, livestream, computed metrics, and embeddings as a diverse set of real-time inputs.

"First and foremost, we need to know what is happening in the livestreams — livestream status changed, new auctions started, engaged chats and giveaways in the show, etc. Those things are happening fast and at a massive scale."

Whatnot chose a real-time stack based on Confluent and Rockset to handle this challenge. Using Confluent and Rockset together provides reliable infrastructure that delivers low data latency, assuring data generated from anywhere in the enterprise can be rapidly available to contextualize machine learning applications.

Confluent is a data streaming platform enabling real-time data movement across the enterprise at any arbitrary scale, forming a central nervous system of data to fuel AI applications. Rockset is a search and analytics database capable of low-latency, high-concurrency queries on heterogeneous data supplied by Confluent to inform AI algorithms.

High-value, trusted AI applications require real-time data from Confluent Cloud

With Confluent, businesses can break down data silos, promote data reusability, improve engineering agility, and foster greater trust in data. Altogether, this allows more teams to securely and confidently unlock the full potential of all their data to power AI applications. Confluent enables organizations to make real-time contextual inferences on an astonishing amount of data by bringing well curated, trustworthy streaming data to Rockset, the search and analytics database built for the cloud.

With easy access to data streams through Rockset’s integration with Confluent Cloud, businesses can:

  • Create a real-time knowledge base for AI applications: Build a shared source of real-time truth for all your operational and analytical data, no matter where it lives for sophisticated model building and fine-tuning.
  • Bring real-time context at query time: Convert raw data into meaningful chunks with real-time enrichment and continually update your vector embeddings for GenAI use cases.
  • Build governed, secured, and trusted AI: Establish data lineage, quality and traceability, providing all your teams with a clear understanding of data origin, movement, transformations and usage.
  • Experiment, scale and innovate faster: Reduce innovation friction as new AI apps and models become available. Decouple data from your data science tools and production AI apps to test and build faster.
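The "convert raw data into meaningful chunks" step mentioned above can be illustrated with a minimal sketch. The overlapping-window chunker is a common approach; the embedding function here is a deterministic stand-in, since a real pipeline would call an embedding model.

```python
# Hedged sketch of a "chunk then embed" step for keeping vector embeddings
# current as raw data arrives. The chunker uses overlapping character
# windows; fake_embed is a placeholder, NOT a real embedding model.

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` chars, each overlapping the
    previous by `overlap` chars so context isn't cut mid-thought."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def fake_embed(chunk_text: str) -> list[float]:
    # Placeholder: deterministic toy 2-d vector standing in for a model call.
    return [len(chunk_text) / 100.0, sum(map(ord, chunk_text)) % 97 / 97.0]

text = "A new livestream just started selling vintage jackets to collectors."
embeddings = {c: fake_embed(c) for c in chunk(text)}
```

As new events stream in, each one is chunked and embedded the same way, and the resulting vectors upserted into the serving index so queries always see current embeddings.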

Rockset has built an integration that offers native support for Confluent Cloud and Apache Kafka®, making it simple and fast to ingest real-time streaming data for AI applications. The integration frees users from having to build, deploy, or operate any infrastructure on the Kafka side. It is continuous, so any new data in a Kafka topic is automatically indexed in Rockset, and pull-based, so data can be reliably ingested even in the face of bursty writes.

The Rockset console, where you can set up the Confluent Cloud integration

Real-time updates and metadata filtering in Rockset

While Confluent delivers the real-time data for AI applications, the other half of the AI equation is a serving layer capable of handling stringent latency and scale requirements. In applications powered by real-time AI, two performance metrics are top of mind:

  • Data latency measures the time from when data is generated to when it is queryable. In other words, how fresh is the data on which the model is operating? For a recommendations example, this could manifest in how quickly vector embeddings for newly added content can be added to the index or whether the most recent user activity can be incorporated into recommendations.
  • Query latency is the time taken to execute a query. In the recommendations example, we are running an ML model to generate user recommendations, so the ability to return results in milliseconds under heavy load is essential to a positive user experience.
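The distinction between the two metrics is easy to make concrete with a toy in-memory index. This is purely illustrative; the event names and the dictionary "index" are made up for the example.

```python
import time

# Illustrative distinction between the two latency metrics, using a toy
# in-memory index. Data latency = time from event generation to the event
# being queryable; query latency = time to execute one lookup.

index = {}

def ingest(event_id: str, payload: dict, generated_at: float) -> float:
    """Index an event and return its data latency in seconds."""
    index[event_id] = payload
    return time.monotonic() - generated_at

def query(event_id: str):
    """Look up an event and return (result, query_latency_seconds)."""
    start = time.monotonic()
    result = index.get(event_id)
    return result, time.monotonic() - start

t0 = time.monotonic()  # the moment the event was generated upstream
data_latency = ingest("evt-1", {"type": "auction_started"}, t0)
result, query_latency = query("evt-1")
```

In a real system, data latency accumulates across the producer, the streaming platform, and the database's ingest path, while query latency is dominated by the serving layer, which is why the two must be measured and optimized separately.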

With these considerations in mind, what makes Rockset an ideal complement to Confluent Cloud for real-time AI? Rockset offers vector search capabilities that open up the use of streaming data inputs for semantic search and generative AI. Rockset users already implement ML applications such as real-time personalization and chatbots, and while vector search is a necessary component of these, it is by no means sufficient on its own.

Beyond support for vectors, Rockset retains the core performance characteristics of a search and analytics database, providing a solution to some of the hardest challenges of running real-time AI at scale:

  • Real-time updates are what enable low data latency, so that ML models can use the most up-to-date embeddings and metadata. Data freshness is typically a problem because most analytical databases do not handle incremental updates efficiently, often requiring batched writes or occasional reindexing. Rockset supports efficient upserts because it is mutable at the field level, making it well suited to ingesting streaming data, CDC from operational databases, and other constantly changing data.
  • Metadata filtering is a useful, perhaps even essential, companion to vector search that restricts nearest-neighbor matches based on specific criteria. Commonly used strategies, such as pre-filtering and post-filtering, have their respective drawbacks. In contrast, Rockset’s Converged Index accelerates many types of queries, regardless of the query pattern or shape of the data, so vector search and filtering can run efficiently in combination on Rockset.
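To make the filtering idea concrete, here is a brute-force sketch of vector search with a metadata predicate. The catalog, fields, and similarity function are invented for illustration; a real engine like Rockset evaluates predicates against its indexes rather than scanning lists in Python.

```python
import math

# Brute-force sketch of vector search combined with metadata filtering.
# Here the predicate restricts candidates before the nearest-neighbor
# ranking (pre-filtering); post-filtering would instead rank everything
# and filter afterward, risking too few surviving results.

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def filtered_knn(query_vec, items, k, predicate):
    """Apply the metadata predicate, then rank survivors by similarity."""
    candidates = [it for it in items if predicate(it)]
    return sorted(candidates, key=lambda it: cosine(query_vec, it["vec"]), reverse=True)[:k]

catalog = [
    {"id": "live-1", "vec": [0.9, 0.1], "status": "live"},
    {"id": "ended-1", "vec": [1.0, 0.0], "status": "ended"},
    {"id": "live-2", "vec": [0.2, 0.8], "status": "live"},
]
# Nearest "live" stream to the query vector; ended-1 is closest overall
# but is excluded by the metadata filter.
top = filtered_knn([1.0, 0.0], catalog, k=1, predicate=lambda it: it["status"] == "live")
```

This is exactly the Whatnot-style constraint from earlier in the post: yesterday's (ended) livestreams must never be recommended, however similar their embeddings are to the user's taste.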

Rockset’s cloud architecture, with compute-compute separation, also enables streaming ingest to be isolated from queries along with seamless concurrency scaling, without replicating or moving data.

How Whatnot is innovating in e-commerce using Confluent Cloud with Rockset

Let’s dig deeper into Whatnot’s story featuring both products.

Whatnot is a fast-growing e-commerce startup innovating in the livestream shopping market, which is estimated to reach $32B in the US in 2023 and double over the next 3 years. They’ve built a live-video marketplace for collectors, fashion enthusiasts, and superfans that allows sellers to go live and sell products directly to buyers through their video auction platform.

Whatnot’s success depends on effectively connecting buyers and sellers through their auction platform for a positive experience. It gathers intent signals in real-time from its audience: the videos they watch, the comments and social interactions they leave, and the products they buy. Whatnot uses this data in their ML models to rank the most popular and relevant videos, which they then present to users in the Whatnot product home feed.

To further drive growth, they needed to personalize their suggestions in real time to ensure users see interesting and relevant content. This evolution of their personalization engine required significant use of streaming data and buyer and seller embeddings, as well as the ability to deliver sub-second analytical queries across sources. With plans to grow usage 4x in a year, Whatnot required a real-time architecture that could scale efficiently with their business.

Whatnot uses Confluent as the backbone of their real-time stack, where streaming data from multiple backend services is centralized and processed before being consumed by downstream analytical and ML applications. After evaluating various Kafka solutions, Whatnot chose Confluent Cloud for its low management overhead, ability to use Terraform to manage its infrastructure, ease of integration with other systems, and robust support.

High performance, efficiency, and developer productivity are why Whatnot selected Rockset for its serving infrastructure. Whatnot’s previous data stack, which included AWS-hosted Elasticsearch for retrieval and ranking of features, required time-consuming index updates and rebuilds to handle constant upserts to existing tables and the introduction of new signals. In the current real-time stack, Rockset indexes all ingested data without manual intervention, and stores and serves the events, features, and embeddings used by Whatnot’s recommendation service, which runs vector search queries with metadata filtering on Rockset. That frees up developer time and ensures users have an engaging experience, whether buying or selling.

The data stack with Confluent Cloud and Rockset for personalized recommendations at Whatnot

With Rockset’s real-time update and indexing capabilities, Whatnot achieved the data and query latency needed to power real-time home feed recommendations.

“Rockset delivered true real-time ingestion and queries, with sub-50 millisecond end-to-end latency…at much lower operational effort and cost,” said Emmanuel Fuentes, head of machine learning and data platforms at Whatnot.

Confluent Cloud and Rockset enable simple, efficient development of real-time AI applications

Confluent and Rockset are helping more and more customers deliver on the potential of real-time AI on streaming data with a joint solution that’s easy to use yet performs well at scale. You can learn more about vector search on real-time data streaming in the webinar and live demo Deliver Better Product Recommendations with Real-Time AI and Vector Search.

If you’re looking for the most efficient end-to-end solution for real-time AI and analytics without any compromises on performance or usability, we hope you’ll start free trials of both Confluent Cloud and Rockset.

About the Authors

Andrew Sellers leads Confluent’s Technology Strategy Group, which supports strategy development, competitive analysis, and thought leadership.

Kevin Leong is Sr. Director of Product Marketing at Rockset, where he works closely with Rockset's product team and partners to help users realize the value of real-time analytics. He has been around data and analytics for the last decade, holding product management and marketing roles at SAP, VMware, and MarkLogic.