JetBlue Scales Real-Time AI on Rockset
October 26, 2023
JetBlue is the data leader in the airline industry, using data to offer industry-leading customer experiences and disruptive low fares to popular destinations around the world. The customer experience that drives JetBlue’s strong loyalty depends on staying efficient even when operating in the most congested airspaces in the world, a feat that would be unattainable without real-time analytics and AI.
JetBlue optimizes for the high utilization of aircraft and crew by acquiring a deep understanding of global airline operations, the relationship between aircraft, customers and crew, delay drivers, and potential cascading effects from delays that can lead to further disruptions.
Getting to this level of insight requires making sense of large volumes and varieties of data, from operations data to weather data to airline traffic data and more. The complexity of the data and the situation can be hard to comprehend and act on quickly without the assistance of machine learning.
That’s why JetBlue innovates with real-time analytics and AI, using over 15 machine learning applications in production today for dynamic pricing, customer personalization, alerting applications, chatbots and more. These machine learning applications give JetBlue a competitive advantage by enhancing their commercial and operational capabilities.
In this blog, we’ll discuss how JetBlue built an in-house machine learning platform, BlueML, that enables teams to quickly productionize new machine learning applications using a common library and configuration. BlueML has been central to supporting LLM-based applications and JetBlue’s AI & ML real-time products.
Data and AI at JetBlue
BlueML Feature Store
JetBlue adopts a lakehouse architecture using Databricks Delta Live Tables to support data from a variety of sources and formats, making it easy for data scientists and engineers to iterate on their applications. In the lakehouse, data is processed and enriched following the medallion framework to create batch, near real-time and real-time features and predictions for the BlueML feature store. Rockset acts as the online feature store for BlueML, persisting features for low-latency queries during inference.
The BlueML feature store has accelerated ML application development at JetBlue, enabling data scientists and engineers to focus on modeling and reusable feature engineering rather than on complex code and ML operations. As a result, teams can productionize new features and models with minimal engineering lift.
A core enabler of the speed of ML development with BlueML is the flexibility of the underlying database system. Rockset has a flexible schema and query model, making it possible to easily add new data or alter features and predictions. With Rockset’s Converged Indexing technology, data is indexed in a search index, columnar store, ANN index and row store for millisecond-latency analytics across a wide range of query patterns. Rockset provides the speed and scale required of ML applications accessed daily by over 2,000 employees at JetBlue.
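To make the idea of indexing the same data several ways more concrete, here is a toy sketch in Python. It is not Rockset’s implementation: the class name `ToyConvergedIndex`, its methods and the sample flight records are all hypothetical, and the ANN index is left out. It only illustrates how keeping a row store, a columnar store and a search index side by side lets lookups, filters and aggregations each take a short path, and how a new field can appear without a schema change.

```python
from collections import defaultdict


class ToyConvergedIndex:
    """Toy illustration only: every document is stored three ways."""

    def __init__(self):
        self.rows = {}                     # row store: doc_id -> full document
        self.columns = defaultdict(dict)   # columnar store: field -> {doc_id: value}
        self.inverted = defaultdict(set)   # search index: (field, value) -> doc_ids

    def add(self, doc_id, doc):
        # No schema is declared up front; new fields are indexed as they appear.
        self.rows[doc_id] = doc
        for field, value in doc.items():
            self.columns[field][doc_id] = value
            self.inverted[(field, value)].add(doc_id)

    def lookup(self, doc_id):
        # Point lookup served from the row store.
        return self.rows[doc_id]

    def search(self, field, value):
        # Equality filter served from the search index.
        return sorted(self.inverted.get((field, value), set()))

    def average(self, field):
        # Aggregation served from the columnar store.
        values = list(self.columns[field].values())
        return sum(values) / len(values)


# Made-up sample records; the third one introduces a new "gate" field.
index = ToyConvergedIndex()
index.add("f1", {"flight": "B6 123", "origin": "JFK", "delay_min": 30})
index.add("f2", {"flight": "B6 456", "origin": "BOS", "delay_min": 10})
index.add("f3", {"flight": "B6 789", "origin": "JFK", "delay_min": 20, "gate": "A4"})

print(index.search("origin", "JFK"))   # ['f1', 'f3']
print(index.average("delay_min"))      # 20.0
```

The trade-off in a design like this is extra storage and write work in exchange for not having to pick one access pattern in advance.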
Vector Database for Chatbots
JetBlue also uses Rockset as its vector database for storing and indexing high-dimensional vectors generated from Large Language Models (LLMs) to enable efficient search for chatbot applications. With the recent enhancements and availability of LLMs, JetBlue is working quickly to make it easier for internal teams to access data using natural language: to find the status of flights, answer general FAQs, analyze customer sentiment, and understand the reasons for any delays and the impact of delays on customers and crews.
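As a rough illustration of what vector search does for a chatbot, the sketch below ranks a few stored snippets by cosine similarity to a query vector. Everything in it is an assumption made for the example: the three-dimensional vectors are hand-written stand-ins for real LLM embeddings, the snippet labels are invented, and the search is an exact brute-force scan rather than the approximate nearest-neighbor index a vector database would use at scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the k document labels most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), label) for label, vec in documents.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [label for _, label in scored[:k]]


# Hypothetical snippets with hand-made "embeddings" (not produced by an LLM).
DOCS = {
    "flight status": [0.9, 0.1, 0.0],
    "delay reasons": [0.1, 0.9, 0.1],
    "baggage FAQ": [0.0, 0.2, 0.9],
}

# A query vector that points mostly along the first dimension.
print(top_k([0.8, 0.3, 0.0], DOCS))  # ['flight status', 'delay reasons']
```

In a chatbot, the retrieved snippets would typically be passed to the LLM as context for answering the user’s question.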
Real-time semantic layer for AI & ML applications
In addition to the BlueML initiative, JetBlue has also leveraged the lakehouse architecture for its AI & ML products requiring a real-time semantic layer. The Data Science, Data Engineering and AI & ML teams at JetBlue have been able to rapidly connect streaming pipelines to Rockset collections and launch query lambda APIs. These REST API endpoints are integrated directly into the front-end applications, resulting in a seamless and efficient product go-to-market strategy without the need for large software engineering teams.
Because the semantic layers are scalable, support high QPS and have a low barrier to entry, users of the real-time AI & ML products can work with the embedded LLMs, simulation capabilities and other advanced functionality directly in the products. These products range from revenue forecasting and ancillary dynamic pricing to operational digital twins and decision recommendation engines.
Requirements for online feature store and vector database
Rockset is used across the data science team at JetBlue for serving internal products including recommendations, marketing promotions and the operational digital twins. JetBlue evaluated Rockset based on the following requirements:
- Millisecond-latency queries: Internal teams want instant experiences so that they can respond quickly to changing conditions in the air and on the ground. That’s why chat experiences like “how long is my flight delayed by” need to generate responses in under a second.
- High concurrency: The database supports high-concurrency applications leveraged by over 10,000 employees on a daily basis.
- Real-time data: JetBlue operates in the most congested airspaces and delays around the world can impact operations. All operational AI & ML products should support millisecond data latency so that teams can take immediate action on the most up-to-date data.
- Scalable architecture: JetBlue requires a scalable cloud architecture that separates compute from storage as there are a number of applications that need to access the same features and datasets. With a cloud architecture, each application has its own isolated compute cluster to eliminate resource contention across applications and save on storage costs.
In addition to evaluating Rockset, the data science team also looked at several point solutions, including feature stores, vector databases and data warehouses. With Rockset, they were able to consolidate 3-4 databases into a single solution and minimize operational overhead.
“Iteration and speed of new ML products was the most important to us,” says Sai Ravuru, Senior Manager of Data Science and Analytics at JetBlue. “We saw the immense power of real-time analytics and AI to transform JetBlue’s real-time decision augmentation & automation since stitching together 3-4 database solutions would have slowed down application development. With Rockset, we found a database that could keep up with the fast pace of innovation at JetBlue.”
Benefits of Rockset for AI at JetBlue
The JetBlue data team embraced Rockset as its online feature store and vector search database. Core Rockset features enable the data team to move faster on application development while achieving consistently fast performance:
- Converged Index: The Converged Index delivers millisecond-latency query performance across lookups, vector search, aggregations and joins with minimal performance tuning. With the out-of-the-box performance advantage from Rockset, the team at JetBlue could quickly release new features or applications.
- Flexible data model: The large-scale, heavily nested data could be easily queried using SQL. Furthermore, Rockset’s dynamic schema management removed the data science team’s reliance on engineering for feature modifications. As a result of Rockset’s flexible data model, the team saw a 30% decrease in the time to market of new ML features.
- SQL APIs: Rockset also takes an API-first approach and stores named, parameterized SQL queries that can be executed from a dedicated REST endpoint. These query lambdas accelerate application development because data teams no longer need to build dedicated APIs, removing a development step that could previously take up to a week. “It would have taken us another 3-6 months to get AI & ML products off the ground if it weren’t for query lambdas,” says Sai Ravuru. “Rockset took that time down to days due to the ease of converting a SQL query into a REST API.”
- Cloud-native architecture: The scalability of Rockset enables JetBlue to support high-concurrency applications without worrying about a sizable increase in their compute bill. As Rockset is purpose-built for search and analytical applications in the cloud, it provides better price-performance than lakehouse and data warehouse solutions and is already generating compute savings for JetBlue. One of the benefits of Rockset’s architecture is its ability to separate compute from storage and compute from compute, delivering consistently performant applications built on high-velocity streaming data.
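The idea behind the SQL APIs bullet above, a named, parameterized SQL query that callers run by name instead of writing SQL, can be sketched locally with Python’s built-in `sqlite3` module. This is an analogy only and does not use Rockset’s API: the registry `QUERY_LAMBDAS`, the function `execute_lambda`, the `flights` table and its rows are all hypothetical. In a hosted service, an HTTP endpoint would play the role of `execute_lambda`.

```python
import sqlite3

# Hypothetical registry: query name -> parameterized SQL text.
QUERY_LAMBDAS = {
    "delays_by_origin": (
        "SELECT flight, delay_min FROM flights "
        "WHERE origin = :origin AND delay_min >= :min_delay "
        "ORDER BY delay_min DESC"
    ),
}


def execute_lambda(conn, name, params):
    """Run a stored query by name; values are bound as parameters, not spliced into SQL."""
    cursor = conn.execute(QUERY_LAMBDAS[name], params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def demo_connection():
    """Build an in-memory table with made-up rows for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flights (flight TEXT, origin TEXT, delay_min INTEGER)")
    conn.executemany(
        "INSERT INTO flights VALUES (?, ?, ?)",
        [("B6 123", "JFK", 30), ("B6 456", "BOS", 10), ("B6 789", "JFK", 5)],
    )
    return conn


conn = demo_connection()
result = execute_lambda(conn, "delays_by_origin", {"origin": "JFK", "min_delay": 10})
print(result)  # [{'flight': 'B6 123', 'delay_min': 30}]
```

Because the application only sends a query name and parameter values, the SQL can be revised in one place without changing the front-end code that calls it.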
The Future of AI in the Sky
AI is only starting to take flight and is already benefiting JetBlue and the roughly 40 million travelers it carries each year. The speed of innovation at JetBlue is enabled by the ease-of-use of the underlying data stack.
“We’re at 15+ ML applications in production and I see that number exponentially growing over the next year,” says Sai Ravuru. “It goes back to our investment in BlueML as a centralized, self-service platform for AI and ML where real-time data and predictions can be accessed across the organization to enhance the customer experience,” continues Ravuru. “We’ve built the foundation to enable innovation through AI and I can’t wait to see the transformative impact it has on our customers’ experience booking, flying, and interacting with JetBlue’s digital channels. Up next, is taking many of the insights served to internal teams and infusing them into the website and JetBlue applications. There’s still a lot more to come.”