Scaling Real-time Gaming Leaderboards for Millions of Players

Real-time analytics is integral to leaderboards, personalized experiences, and fraud detection in gaming and social applications, but can be challenging because of the need for complex queries at scale.

In this talk, we will discuss the growing need for real-time leaderboards in gaming and gamification scenarios, and examine various requirements when building leaderboards:

real-time aggregations over user activity
joins across data sets
scalability to thousands of concurrent users
minimal operational overhead

We will show how Amazon DynamoDB and Rockset can be used in conjunction to handle the massive scale of user activity and real-time aggregations and joins required by leaderboards.

Speakers

Kehinde Otubamowo is a Database Partner Solutions Architect at Amazon Web Services (AWS). Prior to AWS, he held various database administrator / engineering roles at leading Financial Services Institutions. When he is not tinkering with databases, Kehinde enjoys investigating the global economy, distributed systems and artificial intelligence.

Kevin Leong is Director of Product Marketing at Rockset, where he works closely with Rockset's product team and partners to help users realize the value of real-time analytics. He has been around data and analytics for the last decade, holding various product management roles at SAP, VMware, and MarkLogic.

Show Notes

Kevin Leong:

Very happy to be here today along with Kehinde to share with you about building leaderboards, and especially what DynamoDB and Rockset can offer in that regard. This is the outline for what we'll cover today. We'll give a brief intro about real-time analytics and gaming in general. And then we'll talk a little bit about the requirements and challenges around building leaderboards. And we'll show what it would take to build scalable leaderboards, putting DynamoDB and Rockset together. And we'll show a demo of how the two databases can work together to build a sample leaderboard. And we'll take some questions and answers at the end. So hope you can stay with us for the webinar.

Kevin Leong:

So first off, real-time analytics in gaming. So, gaming is an industry where there's lots of potential to use data and analytics to further business objectives. A good-sized gaming company could have on the order of millions of players and capture billions of events per day. Popular titles on their own can account for up to five terabytes of data per day. And so a game publisher with several studios and several titles could be generating on the order of 20 to 50 terabytes of data a day.

Kevin Leong:

So, is this data being well used? We've had conversations with a number of companies. We understand that there are certainly many different kinds of real-time analytics being performed on gaming data today. And yet, there's probably still a good opportunity to use a lot of the data, some of which may find its way into the data lake, for real-time purposes as well. And Kehinde, any thoughts on what you're seeing with gaming and data?

Kehinde Otubamowo:

Yeah, absolutely. Thanks Kevin. Today, games are generating more data than ever, and a lot of games themselves have become data-intensive applications. Most games that we see today usually come with social feeds, and you have the ability to store your game state and play across many devices. And all these things naturally lead to a lot of data sitting around. For example, Epic Games, the company that built Fortnite. A lot of you attendees will probably have heard about Fortnite or played Fortnite. They are an AWS customer. And the reason I mention Fortnite is that they process terabytes of data each month, which is used to deliver insights for continual improvement of the game.

Kehinde Otubamowo:

So before I turn it back to Kevin, I'd like to give you a little bit of an overview of analytics in gaming as well. With analytics, you can turn application data into valuable, actionable insights about ongoing trends and player behavior, so you are able to meet your players' expectations for amazing games.

Kehinde Otubamowo:

So now, while there are many reasons for analyzing data, analyzing it can be very difficult, and it could require a large amount of compute capacity. Also, the size and the amount of data you're analyzing could be very large. So these types of workloads benefit from the pay-as-you-go cloud computing model, as well as from cloud-based managed services. The pay-as-you-go model allows applications to scale up and down based on demand, and you only pay for what you use. So, cloud-based managed services like DynamoDB and Rockset provide easier, faster and more economical ways to process and analyze your gaming data, which is a very important use case. So with that, I'll turn it back to you, Kevin.

Kevin Leong:

Hey, thanks Kehinde. And so to briefly illustrate the point that we were making on the value of real-time analytics and how DynamoDB and Rockset can help, I'll talk through the example of eGoGames. They're an eSports platform for mobile in Europe. So they are a platform, right. So they're getting data from multiple games at a time, writing that to DynamoDB, and also pulling in other data, like customer retention and acquisition data that's going into S3, and all of that is being pulled into Rockset to fuel various types of real-time analytics.

Kevin Leong:

Some of the things that they do are in the area of customer experience. The ops team will attempt to promote certain games and load balance users across various games to ensure that there's a critical number of players on each game. And this helps with matchmaking in multiplayer games so that players aren't waiting too long to start a game. So that improves customer experience there. Also, fraud detection, which is vital for any gaming company. You definitely want to prevent things like score manipulation through unfair means, through hacking, and you want to prevent match fixing. That's especially important when money is involved, as it is in the case of eGoGames, along with things like identity theft and anything that's payment related. And then finally, they also track real-time metrics, things like active usage, session length and conversions, on an ongoing basis.

Kevin Leong:

So that's really a brief intro to areas where real-time analytics can play in the gaming context. For today, we're focusing on just one of these aspects that is building leaderboards. And so we'll talk about what leaderboards are, what some of the requirements, design considerations should be, and some challenges when it comes to implementing leaderboards.

Kevin Leong:

Okay, so what are leaderboards? You're probably acquainted with them on some level, right? It's a ranking of players and users based on scores in the game. And you'll see leaderboards over various timeframes, like daily, weekly, monthly. There could be global public leaderboards of everybody playing the game. Or it could simply be a leaderboard among a group of friends, where you have set up a league, for example, or it could be a leaderboard among a larger group, perhaps everyone in your geographic region. So there are many different permutations for leaderboards. And it's an essential feature for a lot of games. The idea here is that leaderboards promote healthy competition and they enhance the social nature of certain games. And ultimately, they motivate more gameplay as players try to make progress in the ranking.

Kevin Leong:

But just a brief detour here. Leaderboards actually go beyond the gaming context with much the same goals, right: promoting healthy competition, motivating users, but in some instances really toward real-life goals. Several examples might be in the area of fitness. One of our customers, Rumble, uses gamification with daily exercise routines, such as counting the number of steps you take each day, offering rewards based on meeting certain thresholds, and creating leaderboards out of that, for maybe everyone in your company, right, if that's the group of people that you're competing against, if you will. We have examples in education and training. Think of companies like Duolingo, with leaderboards there. Loyalty programs, and community participation, where you may want to recognize the most active or expert members of a community who share the most knowledge, and so on. So, I just want to make the point that leaderboards do have wide application outside of the gaming vertical.

Kevin Leong:

But what are some things to look at when you're building a leaderboard? Leaderboards, definitely conceptually seem quite simple. We're totaling up scores over a certain time period, and then presenting rankings to a user. So it can be fairly straightforward to start. But the challenges come when you need to scale a leaderboard. So some of the things that are essential to building a leaderboard include real-time player activity, being able to capture and analyze that. Being able to do low-latency aggregations and joins. Being able to scale to lots of players, lots of concurrent users. And then, really minimal operational overhead. You don't want to have to spend a lot of time managing infrastructure for your leaderboard. So we'll talk about each of these in turn and how they can be challenging in certain ways.

Kevin Leong:

So real-time player activity, that's key for a leaderboard. The leaderboard has to have the most current data. And our shorthand for this is really to call it data latency, right. We want the data to be reflected in the leaderboard as soon as possible. So one way of ensuring this might be to build a leaderboard on the primary transactional system. But that runs the risk of having a performance impact on the game. So it's not really advisable for that reason. Most organizations will use a secondary system to build a leaderboard. But then you have the challenge of keeping the secondary system in sync, and also up to date, with the primary system. You can't be minutes off from the state of the game in the primary system. So those are some of the challenges around being able to work on the real-time player activity that you're capturing.

Kevin Leong:

So that's one aspect. The next aspect is really around low-latency aggregations and joins. So not only is data latency important, query latency is going to be important for the class of analytical queries that are needed to power the leaderboard. So think about what's needed to calculate players' scores and rank them. This might be the type of query that you do. We'll need operations like sum, group by, order by, and where, with the timeframe being today or this week. Those would be the SQL operations, or you'd need the equivalent if you're not querying in SQL. But basically, these are the type of aggregations that you need to be able to return quickly in order to build leaderboards.
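
As a rough illustration of the kind of query described here, the following sketch runs a sum / group by / order by / where query using Python's built-in sqlite3 module. The table name `score_events`, its columns, and the sample rows are hypothetical and chosen only for this example; they are not the schema from the talk, and SQLite stands in for whatever analytical backend is actually used.

```python
import sqlite3

# Hypothetical schema: one row per scoring event (all names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE score_events (player_id TEXT, points INTEGER, event_date TEXT)"
)
conn.executemany(
    "INSERT INTO score_events VALUES (?, ?, ?)",
    [
        ("alice", 30, "2020-06-01"),
        ("bob", 50, "2020-06-01"),
        ("alice", 40, "2020-06-01"),
        ("carol", 90, "2020-05-31"),  # outside the requested day, so excluded
    ],
)


def daily_leaderboard(conn, day, limit=10):
    """Total each player's points for one day and rank them, highest first."""
    return conn.execute(
        """
        SELECT player_id, SUM(points) AS total
        FROM score_events
        WHERE event_date = ?
        GROUP BY player_id
        ORDER BY total DESC, player_id
        LIMIT ?
        """,
        (day, limit),
    ).fetchall()


leaderboard = daily_leaderboard(conn, "2020-06-01")
print(leaderboard)
```

A weekly or monthly leaderboard would only change the `WHERE` clause to cover a date range; the aggregation and ranking steps stay the same.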

Kevin Leong:

And then furthermore, it's possible that we will need to join different tables based on player ID. So perhaps we'll need to display player data alongside their scores. We may want to display player info like the country they're from or any team affiliations that they have, or we may want to join data from different games. Imagine you have some type of casino game or application. In order to create a leaderboard, you might want to aggregate scores across individual games like Blackjack, Roulette and Poker for a total score. Or you may need to join scores across different events, different competitions, different tournaments to come up with a total score. So joins are definitely a useful thing to have in these situations. The challenge here is that transactional systems that are scalable and that can accommodate high write rates, which are typically NoSQL systems, are not always great at these types of aggregations and join queries.
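
To make the join scenario concrete, here is a minimal sketch, again using Python's built-in sqlite3 module, that joins per-game scores to player profile data on player ID and totals the scores across games. The `players` and `scores` tables and their contents are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: player profiles, and per-game scores keyed by player_id.
conn.executescript(
    """
    CREATE TABLE players (player_id TEXT PRIMARY KEY, country TEXT);
    CREATE TABLE scores (player_id TEXT, game TEXT, points INTEGER);
    """
)
conn.executemany(
    "INSERT INTO players VALUES (?, ?)",
    [("p1", "ES"), ("p2", "NG"), ("p3", "US")],
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("p1", "blackjack", 120),
        ("p1", "roulette", 30),
        ("p2", "poker", 200),
        ("p2", "blackjack", 10),
        ("p3", "roulette", 75),
    ],
)

# Join scores to profiles on player_id, then total across all games per player.
combined = conn.execute(
    """
    SELECT p.player_id, p.country, SUM(s.points) AS total
    FROM scores AS s
    JOIN players AS p ON p.player_id = s.player_id
    GROUP BY p.player_id, p.country
    ORDER BY total DESC
    """
).fetchall()
print(combined)
```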

Kevin Leong:

Another thing to look at is scalability. Leaderboards definitely need to scale as the game grows, and that's usually a steadily increasing user growth. But perhaps harder to handle may be the spikes in usage, the periods of peak usage that you have to handle as well. These peaks might be geographically based, if the game has a geographic concentration at some place. And then in the daytime, usage might be higher. For instance, if you have a mobile game, it might have peaks during commute times, when people are traveling. In the case of the fitness example, with Rumble, if you're counting steps, the peaks tend to be at the end of the day, when people are trying to look at the progress that they made during that day. And then of course, if you're doing things like fantasy sports that take data from real sporting events, you can expect to have peaks maybe on the weekends when those sporting events are going on, in the case of something like fantasy football, for instance.

Kevin Leong:

So some have tried building leaderboards on databases that don't scale out. You can think of things like Postgres, which doesn't support scale-out quite as well. So scalability is something to plan for when building leaderboards.

Kevin Leong:

And then lastly, operational overhead, something not to be overlooked. Leaderboards are great for driving player engagement. But ideally, you don't want to spend a lot of human effort managing this feature, so that you can focus your energies on other aspects of game development. In many cases, database operations will require significant effort to configure correctly, to performance engineer, to scale up and scale out as the case may be, or to do regular upgrades. So that's definitely something to keep in mind from the get-go when implementing a leaderboard, to ensure that you're not expending too much effort on the operations that go into supporting it.

Kevin Leong:

So, those are the requirements, right. And then, we actually worked with the DynamoDB team and Kehinde to come together to provide an example architecture that's scalable for building leaderboards, using DynamoDB for the write path of your gaming application, and using Rockset for read-heavy functionality like a leaderboard.

Kevin Leong:

So how do DynamoDB and Rockset actually work together? This is a basic block diagram that depicts the flow of data. Rockset has a built-in connector to DynamoDB where all you need to provide is a pointer to your DynamoDB instance and the table you wish to ingest data from, and Rockset will ingest from DynamoDB Streams so that Rockset and DynamoDB are kept in sync. So now users would use DynamoDB as the primary database on which the operational application, or the game in this case, would run. And then Rockset would act as the analytical backend on which real-time analytics can be performed on the gaming data.

Kevin Leong:

So we'll come back to talk in more detail about how we would build leaderboards using this integration. And we'll show that in the demo. But we'll first discuss the characteristics of the DynamoDB and Rockset components to see what makes them suitable for the real-time applications that we're talking about, such as leaderboards. So, Kehinde, maybe you want to tell us a bit about DynamoDB and its benefits.

Kehinde Otubamowo:

Yeah, great. A lot of customers with gaming use case prefer to use Amazon DynamoDB. So as you can see out there, we're having a couple of names; FanDuel, Epic Games, Electronic Arts, Riot Games, Pokemon. These are all AWS customers that use DynamoDB to store some parts of their game data. And, in some cases, social and community data related to the game.

Kehinde Otubamowo:

You can also use DynamoDB to store performance data like win times, delays, session data, application logs, and other metrics related to your game, so that you can track where the issues are with the game. In the demo that we present in this webinar, we'll use the gameplay data stored in DynamoDB to simulate a leaderboard for a fantasy soccer scenario. Later on, we'll talk more about fantasy soccer and why we chose that particular model.

Kehinde Otubamowo:

For the sake of participants that are not so familiar with Amazon DynamoDB, it's basically a fully managed, serverless, key-value and document database that delivers single digit millisecond latency performance at any scale. So gaming customers use DynamoDB across all capabilities of game platforms, including leaderboards, achievements, user management, inventory, and in-game stores. If you have a shopping cart as part of your game, you can use DynamoDB for that as well. So the main benefits customers get from using DynamoDB are its ability to reliably scale to millions of concurrent user requests while ensuring consistently low latency.

Kehinde Otubamowo:

DynamoDB is also a fully managed service, and it has no operational overhead. Think about having to manage the availability and durability of your database by yourself; it can get very difficult. So developers are able to focus on their games rather than managing databases. Game developers are also looking to expand to multiple regions. So they will need databases like Amazon DynamoDB that come with features like global tables, or multi-region, multi-active replication of data, to support their gaming use case.

Kehinde Otubamowo:

Kevin, could you move to the next slide, please? So what really is Amazon DynamoDB? This is really not meant to be a deep dive about DynamoDB where we take the architecture apart. So I'm just going to give a little bit of history of DynamoDB and explain what it really is about as a database.

Kehinde Otubamowo:

So in the simplest terms, I think maybe the best phrase that summarizes Amazon DynamoDB is that it is essentially a NoSQL key-value database. And Amazon DynamoDB that is the web service, the NoSQL key-value database, is based on design principles presented in amazon.com's Dynamo white paper written by Dr. Werner Vogels and other AWS engineers in 2007. The white paper was kind of like the precursor to the NoSQL database.

Kehinde Otubamowo:

So at its core, Dynamo is a highly available key-value data storage system. And it was built to power some of Amazon's core services that need to be available 24 by 7. Really, it all began when some of the Oracle databases that were used at amazon.com at the time were not able to meet the scalability requirements of the data. And what did we do differently in DynamoDB? We made use of extensive object versioning, application-assisted conflict resolution, and a novel interface for developers to use. That is, it is connectionless. The developers don't... It works very well with stateless applications. Think of internet-scale applications that require like Kafka. They really don't work very well with a relational database that requires a persistent connection to the database. So DynamoDB is API-based, and it works very well with a lot of internet-scale applications and gaming application use cases.

Kehinde Otubamowo:

So when we launched DynamoDB, it was designed from the ground up to support extreme scale, the same extreme scale that is required by Amazon, with the same security and availability requirements that are needed to run mission-critical workloads. And today, DynamoDB powers a lot of internet-scale applications for many customers, including some of the use cases that we mentioned earlier. So many of the world's largest businesses, internet-scale businesses like Lyft, Tinder and Redfin, as well as gaming companies like Epic Games and FanDuel, all use DynamoDB, and I'll discuss more about one of the use cases on the next slide.

Kehinde Otubamowo:

So basically, DynamoDB is able to support that kind of scale and performance. If you really want to learn about the architecture of DynamoDB, we have a lot of talks available on YouTube and in our public documentation, where you can get a deep dive on Amazon DynamoDB. Next slide, please.

Kehinde Otubamowo:

So what really makes Amazon DynamoDB great for the write path of your gaming application? When you develop your game and your game becomes a hit, DynamoDB is able to scale to meet the volume and velocity of data generated, all while you pay only for what you use. So game developers build on Amazon DynamoDB for its scalability, durability, and consistency. It's also a multi-region database, like I mentioned earlier, with built-in security, backup and restore functionality, and in-memory caching as well with DynamoDB Accelerator, or DAX for short. So with that said, let me talk about some of the benefits of Amazon DynamoDB.

Kehinde Otubamowo:

So when we talk about the benefits of DynamoDB, we categorize them under three pillars: performance and scale, the fact that you don't have any servers to manage, and the fact that it ships with advanced features that make it enterprise ready.

Kehinde Otubamowo:

So talking about performance and scale, DynamoDB is able to deliver consistent single digit millisecond response times at any scale. And you can build apps with virtually unlimited DynamoDB throughput and storage, and add a cache on top of it to get better performance and move your response times from milliseconds to microseconds. Just as important, you are also able to scale up and scale down in minutes with auto scaling. So this helps customers future-proof for application growth when building with DynamoDB, and support both types of workloads.

Kehinde Otubamowo:

So, with DynamoDB, like I mentioned, developers get that single digit millisecond latency performance at any scale, and that is regardless of whether you have a 10TB table or a 1GB table. That said, data modeling in DynamoDB requires some forethought. To be able to get the best performance, you need to model your data and your tables properly, around the access patterns of your use case. For example, when retrieving game session information, it is best practice to retrieve start times, end times and other user properties in a single query.

Kehinde Otubamowo:

Talking about the fact that DynamoDB is serverless, that is, you don't have any servers to manage: DynamoDB is a fully managed service. With DynamoDB, all you do is just create the table. You don't have to provision a server or tune CPU or select a particular storage type. So all those complexities are kind of abstracted away for you when you are using DynamoDB. There are no servers to patch. You don't have to provision or manage servers. Also, you don't have to install software or maintain any operating system. It automatically scales up and down and adjusts capacity to maintain performance, depending on the capacity mode you're using.

Kehinde Otubamowo:

By the way, DynamoDB offers two capacity modes, which are the on-demand and provisioned capacity modes. Additionally, availability and fault tolerance are built into the service, eliminating the need to architect your applications for these capabilities. Think of it: if you have to manage NoSQL on your own, and you have to implement sharding on some other open source NoSQL database, it gets very, very difficult. So with DynamoDB, with the on-demand and provisioned capacity modes, you're able to optimize costs by specifying capacity per workload, or by only paying for the resources that you use in the on-demand mode. This simplicity is really what customers like about DynamoDB, because it helps them reduce risk, move faster and save money by reducing the size of the operations teams that they need.

Kehinde Otubamowo:

The last pillar for the benefits of DynamoDB is the fact that it is enterprise ready. So DynamoDB really comes with advanced database features like ACID transactions, encryption at rest, and on-demand backup and restore. To add to that, we also have features like I mentioned earlier, like global tables. And Amazon DynamoDB supports a lot of critical infrastructure at AWS. So we classify it as a tier-one service, that is, we can use it to build critical applications or to store critical data. So it is enterprise ready; it can support any type of application, really. Next slide please, Kevin.

Kehinde Otubamowo:

So I'm going to move on to talk about FanDuel, which is one of the gaming customers that store data on DynamoDB. FanDuel really is a fantasy sports platform. They're currently one of the largest in the world, with over 1 million active paying users. And what FanDuel has done is, they've taken the traditional concept of season-long fantasy games and turned them into daily fantasy games. So with that kind of daily fantasy game, the explosion of data is really immense, and you have more data to process and a lot of users changing fantasy teams. I will expand more on fantasy sports later on in the talk. But for now let's focus on FanDuel.

Kehinde Otubamowo:

So since their launch in 2009, they've really grown immensely. The popularity of the games that they offer has really increased. And DynamoDB has been able to scale to meet their demands. According to the head of infrastructure at FanDuel, with DynamoDB they don't have to worry about latency. It retrieves all the information that is stored in there extremely quickly. They also have a lot of critical peak periods when they need the database to be highly available, and DynamoDB is able to support all these use cases for them.

Kehinde Otubamowo:

With that, I'm going to turn it back to Kevin to discuss the data model we are looking at and the architecture of how we really hook up DynamoDB and Rockset.

Kevin Leong:

Hey, thanks Kehinde for giving us the rundown on DynamoDB. And then on the Rockset side, if you're not familiar, Rockset is a real-time indexing database. It's meant for analytical workloads, so for building the real-time analytics that we talked about, or live dashboards, on data from OLTP databases like DynamoDB. Some of the key Rockset features include seamless ingest, so you have the flexibility to ingest data as is, with no predefined schema required. This is compatible with NoSQL systems like DynamoDB, in that same vein.

Kevin Leong:

Then converged indexing is another feature. This is where Rockset acts as an external indexing layer to DynamoDB. In this case, converged indexing means that Rockset automatically indexes all the data ingested, every field, in multiple ways. Rockset will build a search index, a column index and a row index, and at query time, the Rockset query engine will decide the best index to use to execute the query.
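
The idea of indexing every field in multiple ways can be sketched with a toy in-memory structure. This is only an illustration of the concept under simplifying assumptions, not Rockset's implementation: the class name `ToyConvergedIndex` and its methods are hypothetical, and the choice of index is hard-coded per method here rather than made by a query optimizer.

```python
from collections import defaultdict


class ToyConvergedIndex:
    """Stores every document three ways (row, column, search). Toy example only."""

    def __init__(self):
        # Row index: doc id -> full document.
        self.rows = {}
        # Column index: field -> {doc id: value}.
        self.columns = defaultdict(dict)
        # Search (inverted) index: field -> value -> set of doc ids.
        self.search = defaultdict(lambda: defaultdict(set))

    def add(self, doc_id, doc):
        """Index a document in all three structures at ingest time."""
        self.rows[doc_id] = doc
        for field_name, value in doc.items():
            self.columns[field_name][doc_id] = value
            self.search[field_name][value].add(doc_id)

    def find(self, field_name, value):
        """Selective lookup: use the search index instead of scanning every document."""
        return [self.rows[i] for i in sorted(self.search[field_name][value])]

    def total(self, field_name):
        """Aggregation over one field: read only that field via the column index."""
        return sum(self.columns[field_name].values())


index = ToyConvergedIndex()
index.add(1, {"player": "alice", "country": "ES", "points": 70})
index.add(2, {"player": "bob", "country": "NG", "points": 50})
index.add(3, {"player": "carol", "country": "ES", "points": 20})

spanish_players = index.find("country", "ES")
all_points = index.total("points")
print(spanish_players, all_points)
```

The trade-off the sketch makes visible is that each write touches several structures, in exchange for both selective lookups and wide aggregations being cheap at read time.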

Kevin Leong:

Then distributed SQL. Rockset distributes queries across multiple nodes for efficient execution, and Rockset's query language is full SQL, with support for filtering, aggregations and joins, some of the things that we talked about as being useful for analytics.

Kevin Leong:

And then Query Lambdas are a way to save and store SQL queries in Rockset, where you can simply execute these queries after the fact by hitting a REST endpoint. These are some of the core functionality around Rockset. And so, you might be wondering, okay, so how does this happen under the hood? I won't give a deep dive here, but just a brief overview, so you can understand what the technical underpinnings are and why it's important to have a cloud-native architecture for this, in the same way that DynamoDB does.

Kevin Leong:

Rockset has three tiers at a high level. We have tailers on the left, which ingest data from various sources. We have the leaves, which store that data. We have aggregators, which execute queries. And all three tiers scale independently and can scale to handle different types of workloads. If you have heavy ingest going on, that can be accommodated by increasing the ingest tier. If you have growth in data volume, Rockset can scale the storage tier. If there are large numbers of users or queries going on at the same time, the query tier formed by the aggregators can be scaled out.

Kevin Leong:

Putting it all together is how Rockset achieves real-time SQL. We spoke earlier about the design considerations for leaderboards in terms of being able to analyze real-time player activity: data shows up within seconds of it being generated. And complex analytical queries are returned quickly as well with this architecture. Then for accommodating scale, this cloud scalability is how Rockset can provide an analytical backend for leaderboards that can scale and can handle high concurrency when needed. Then overall, in the same philosophy that DynamoDB brings to the table, this is a serverless model, a no-ops experience, right.

Kevin Leong:

Now that we've given you an overview of the DynamoDB and Rockset components, we can move on to a short demo. This is what we'll be looking at today. DynamoDB is the primary OLTP database for the gaming application, and it can handle large amounts of writes from games and can scale seamlessly to do so, as we heard Kehinde sharing. Then on the read side, for analytic applications, we connect Rockset to DynamoDB. This allows us to use SQL and more flexible indexing to perform analytic-type queries while, at the same time, not interfering with the primary database for the game.

Kevin Leong:

When we make the connection between Rockset and DynamoDB, there's a one-time scan of what's in the DynamoDB table that gets written into Rockset. Then in steady state, the connector picks up updates from DynamoDB Streams to provide continuous ingest into Rockset, with the DynamoDB data being queryable in Rockset within a couple of seconds, as we mentioned previously. Then we run our leaderboard queries on Rockset. These are analytic queries, and in this setup they won't impact DynamoDB on the write path. For our example, we are looking at a fantasy soccer type game. Fantasy football, if you're not American. And Kehinde can walk us through more of the data that's being stored in DynamoDB in this case.
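
The two phases described here, a one-time scan followed by continuous application of change records, can be sketched in plain Python. This is a simplified simulation, not the actual connector: the event names mirror the INSERT, MODIFY and REMOVE event types carried by DynamoDB Streams records, but the record layout (`key`, `newImage`), the `pk` attribute, and both function names are assumptions made for this example.

```python
def initial_scan(source_items):
    """One-time copy of every item currently in the source table, keyed by primary key."""
    return {item["pk"]: dict(item) for item in source_items}


def apply_stream_records(replica, records):
    """Apply change records in order so the replica keeps tracking the source table."""
    for record in records:
        key = record["key"]
        if record["eventName"] in ("INSERT", "MODIFY"):
            # New and changed items overwrite whatever the replica holds for that key.
            replica[key] = dict(record["newImage"])
        elif record["eventName"] == "REMOVE":
            # Deletions drop the item; a missing key is ignored.
            replica.pop(key, None)
    return replica


# Phase 1: snapshot of the table as it exists when the connection is made.
source = [{"pk": "g1", "points": 10}, {"pk": "g2", "points": 25}]
replica = initial_scan(source)

# Phase 2: steady-state changes arriving from the stream.
apply_stream_records(
    replica,
    [
        {"eventName": "MODIFY", "key": "g1", "newImage": {"pk": "g1", "points": 18}},
        {"eventName": "INSERT", "key": "g3", "newImage": {"pk": "g3", "points": 5}},
        {"eventName": "REMOVE", "key": "g2"},
    ],
)
print(replica)
```

Because leaderboard queries run against the replica, they read a copy that lags the source only by the time it takes to apply the pending records, and they place no load on the table serving the game.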

Kehinde Otubamowo:

Yeah. Thank you, Kevin. Like Kevin mentioned, for this demo, we created a mini fantasy soccer league game with limited functionality, where we include some features like weekly team selection, live point scoring, a leaderboard, etc. But first off, let's take a step back and talk about why we chose fantasy sports. Some of you might wonder, "why this particular type of game?"

Kehinde Otubamowo:

The reason we did this is because a fantasy sports game is typically a data-intensive game. Think about it. It's often played by hundreds of thousands to millions of participants. These participants are distributed globally as well. And typically most of them are doing similar types of operations at the same time, and these operations are somewhat transactional in nature because you're essentially spending money. It's not real money at the end of the day, but you're essentially spending money to buy players. These operations look like transactions, and they need to be handled at scale with very, very low latency.

Kehinde Otubamowo:

So, a brief overview of fantasy games. There are many ways to play a fantasy game, but in the basic version of the game, you create a virtual fantasy team made up of real professional players in sports like hockey, soccer or football, depending on which side of the Atlantic Ocean you are on, and other sports like cricket and basketball. In most versions of the game, participants assemble imaginary teams, or really virtual teams. These are virtual teams of real players in professional sports. And these virtual teams compete based on the statistical performance of the real players in the actual games. To calculate the points, computers are used to track and compare the actual results of real games and the statistical performance of those professional players on a particular game day or in a game week.

Kehinde Otubamowo:

In most fantasy sports, team owners draft, trade, cut and drop players, similar to how it is done in real sports. That is an overview of the game. And then, like I mentioned earlier, usually most of these games have players distributed globally. We mentioned the example of FanDuel earlier. They have more than 1 million users. For example, the Fantasy Premier League had a record-breaking 7.6 million registered players worldwide for the 2019/2020 season. That is why we chose this particular model, and we built a simple data set to represent what a fantasy soccer game would look like in DynamoDB. Next slide.

Kehinde Otubamowo:

To describe the data model, we built an entity relationship diagram made of four tables. When you are working with DynamoDB or building an application on DynamoDB, it's very important to start with the entity relationship diagram and understand what the access patterns are. There are many ways of modeling your data in DynamoDB. You can obviously take this particular entity relationship diagram, model it as a single denormalized table and store it in DynamoDB. For the purpose of this demo, we are leaving these entities in the data model in separate tables, just for simplicity and also to be able to emphasize and really show the value proposition of Rockset. Because if you're working with a complex application, it's not always possible to put all your application data in a single table, and sometimes you might need to join separate tables. That is really where Rockset comes in, and that is the value proposition of Rockset, who happens to be a select AWS partner.

Kehinde Otubamowo:

For this particular application, we have the gamer's table or the gamer's entity that stores information about gamers playing the game. We also have the soccer players table, which contains the information about soccer players, real professional players that can be selected by the gamers each game week or game day and other attributes like weekly score. For this demo, we populate this table by randomly assigning points to players each game week. We are not going to be storing actual game data.

Kehinde Otubamowo:

For the soccer player game week stats, we store players' weekly performance or game statistics. Finally, the gamers team table stores the team selected by each gamer for the particular game week. Like the diagram that we have up here, you can see a particular team selection there. That particular team selection that is on the picture on the left-hand side of the screen is what we modeled in the gamers table. The gamers table has gamers weekly table. The gamers table has a one-to-one relationship with soccer players table and then all the other tables are... The soccer player game week stats also has a many-to-many relationship with the soccer players table.
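
As a rough illustration of the four entities described above, the sketch below models them as Python dataclasses. All class and field names are hypothetical: the talk names the tables and mentions attributes such as country, age, position, weekly score, captain and goalkeeper, but does not give the actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Gamer:
    """A person playing the fantasy game."""
    gamer_id: str
    country: str
    age: int


@dataclass
class SoccerPlayer:
    """A real professional player that gamers can select."""
    player_id: str
    name: str
    position: str
    weekly_score: Dict[int, int] = field(default_factory=dict)  # game week -> points


@dataclass
class SoccerPlayerGameWeekStats:
    """Per-week performance statistics for one player."""
    player_id: str
    game_week: int
    stats: Dict[str, int] = field(default_factory=dict)  # e.g. {"tackles": 5}


@dataclass
class GamerTeam:
    """The team one gamer selected for one game week."""
    gamer_id: str
    game_week: int
    player_ids: List[str]
    captain_id: str
    goalkeeper_id: str


def team_score(team: GamerTeam, players: Dict[str, SoccerPlayer]) -> int:
    """Sum the selected players' points for the team's game week."""
    return sum(
        players[pid].weekly_score.get(team.game_week, 0) for pid in team.player_ids
    )


# Made-up sample data.
players = {
    "p1": SoccerPlayer("p1", "Keeper One", "goalkeeper", {1: 9}),
    "p2": SoccerPlayer("p2", "Defender Two", "defender", {1: 4}),
}
team = GamerTeam("g1", 1, ["p1", "p2"], captain_id="p2", goalkeeper_id="p1")
print(team_score(team, players))
```

Keeping the entities separate, as the demo does, means a team score needs a lookup across two of them, which is the kind of join the leaderboard queries later rely on.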

Kehinde Otubamowo:

Like I mentioned, there are other alternative ways to model this in DynamoDB, but we are going with this particular model. And I will turn it back to Kevin now to describe how we hooked up Rockset with the data that we stored in DynamoDB, to be able to power a real-time leaderboard in Rockset.

Kevin Leong:

Okay. Well, thanks Kehinde. Let me switch over here. This is the Rockset console. So we'll start showing a demo here. The data that Kehinde talked about, that's all loaded up in DynamoDB, and you can imagine that can be live data that can be changing as gamers change up the teams or the soccer players that they're actually mirroring score a goal or something like that. Those stats may change in DynamoDB and then Rockset will pick up all those changes.

Kevin Leong:

So how do we go about syncing Rockset with the data in DynamoDB? Just to show you, this is where you'd start in the console. You go ahead and create a collection from DynamoDB. Then the first thing you want to do is create an integration. You can give it a name. You have to go through several steps to configure the right permissions for Rockset to access your DynamoDB table and DynamoDB streams. Once that's been created, we have an integration here. You can go ahead and create a collection. A Rockset collection will have a one-to-one correspondence with your DynamoDB table.
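
To give a sense of what the permissions step might involve, here is a hedged sketch that builds a read-only IAM policy document covering a table and its streams. The action names are real DynamoDB IAM actions, but the exact set Rockset requires is an assumption here and should be taken from Rockset's own setup instructions. The region, account ID and table name are placeholders.

```python
import json
from typing import Any, Dict


def read_only_policy(region: str, account_id: str, table_name: str) -> Dict[str, Any]:
    """Build an IAM policy allowing a table scan and stream reads (assumed action set)."""
    table_arn = f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Needed for the one-time scan of the table.
                "Effect": "Allow",
                "Action": ["dynamodb:Scan", "dynamodb:DescribeTable"],
                "Resource": [table_arn],
            },
            {
                # Needed to read ongoing changes from the table's stream.
                "Effect": "Allow",
                "Action": [
                    "dynamodb:DescribeStream",
                    "dynamodb:GetRecords",
                    "dynamodb:GetShardIterator",
                ],
                "Resource": [f"{table_arn}/stream/*"],
            },
        ],
    }


# Placeholder values, not taken from the demo.
policy = read_only_policy("us-west-2", "123456789012", "soccer_gamers")
print(json.dumps(policy, indent=2))
```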

Kevin Leong:

Again, you can go ahead and name your collection, and then we'll just start off saying that's for the purpose of a demo. Then you specify integration. And then you give the name of the table that you want to pull into Rockset. So this is the demo game as Kehinde was talking about earlier. Notice when I entered the name of the DynamoDB table, it actually went to pull a preview of the data that's in the table so that you can take a look and make sure it's what you expect.

Kevin Leong:

This is a fairly straightforward table. You have different gamer IDs, you have some information on each gamer, and in this case some demographic information: what country they're from and their age. So that's one of the tables that Kehinde was talking about in the game. And then another table... So those were the gamers.

Kevin Leong:

Then we have soccer players as well. And this is again the preview of the data that's in the soccer players' table. You'll see more stats around each player; what position the player plays; defenders, some attributes such as physical attributes, such as speed, stamina, and so on. Also, associated with the player scores for each game week. In game week one, this player scored nine points, perhaps for making this number of tackles or keeping a clean sheet, if that player is a defender and so on. This is typically how fantasy games would work.

Kevin Leong:

Again, this is just to give you a sense for how you would set up a connection from Rockset to the data that's in DynamoDB here. Let's say we've created these collections and we have a number of them: soccer gamers, which I showed earlier, soccer players, which I showed as well, and then soccer gamer teams. Let's look at one of them and see how we can actually query from them.

Kevin Leong:

We can query from this collection here, soccer gamer teams. Go ahead and run that. We can go ahead and look at what's returned just from selecting 10 of the records that are in this collection that's been pulled over from DynamoDB. You will see it's somewhat complex data; we have nested fields and so on. What this is saying is that for some combination of gamer and game week, so in game week four, this particular gamer picked this team: this number of players, who's the goalkeeper, who's the captain, and so on. So the set of data in this table defines who the gamer picked for his or her team in a certain game week.

Kevin Leong:

So what actually happens when Rockset ingests all this data is that Rockset indexes every field, even nested fields, in order to speed up queries on the data. Then if there's an update in DynamoDB, like if the gamer changes the makeup of their team, this will be reflected in the Rockset collection within seconds, because both are kept in sync using the connector.
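
The idea of indexing every field, including nested ones, can be sketched with a toy inverted index in Python. This is only an illustration of the concept and not Rockset's implementation; the document shape, the path notation (`team.players[]`) and the function names are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Dict, Iterator, Set, Tuple


def flatten(value: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (field_path, scalar) for every leaf, descending into dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for child in value:
            yield from flatten(child, f"{prefix}[]")
    else:
        yield prefix, value


def build_index(docs: Dict[str, Dict[str, Any]]) -> Dict[Tuple[str, Any], Set[str]]:
    """Map every (field_path, value) pair to the ids of documents containing it."""
    index: Dict[Tuple[str, Any], Set[str]] = defaultdict(set)
    for doc_id, doc in docs.items():
        for path, scalar in flatten(doc):
            index[(path, scalar)].add(doc_id)
    return index


def lookup(index: Dict[Tuple[str, Any], Set[str]], path: str, value: Any) -> Set[str]:
    """Return the ids of documents where the field at `path` equals `value`."""
    return set(index.get((path, value), set()))


# Made-up team selections with nested fields.
docs = {
    "t1": {"gamer_id": "g1", "game_week": 4,
           "team": {"captain": "p7", "players": ["p1", "p7"]}},
    "t2": {"gamer_id": "g2", "game_week": 5,
           "team": {"captain": "p9", "players": ["p7", "p9"]}},
}
index = build_index(docs)
print(sorted(lookup(index, "team.players[]", "p7")))
```

With an index like this, a filter on a nested field becomes a dictionary lookup rather than a scan over every document, which is the intuition behind the query speed-up mentioned here.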

Kevin Leong:

With that, let's look at some queries that we can use to build a leaderboard. One of these queries is this, which is... And you can see, we're getting fairly complex here, where we are calculating the score for all the gamers teams for a specific week. When you construct these queries, you have all the SQL operations at your disposal, whether it's using joins or group by, or where clauses, for example.

Kevin Leong:

What this is essentially doing is that for a particular week, and in this case we can give that as a parameter, week, so for week six, it's going to calculate all the scores for your entire gaming universe. For each gamer, for the team the gamer picked in game week six, these are the scores totaled across all the players in the team. That's certainly one of the queries you would probably need in order to build a leaderboard. You need to figure out how many points each player scored for each week. Another possible query might be just calculating the score for a particular gamer. This query actually totals up the scores across all weeks for a single gamer. This is something that the game can possibly display whenever the gamer logs into the game. In this case, for this particular gamer, for all their teams across all game weeks, the total score was 597. Again, you can use fairly complex SQL to define your query.
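
The demo's actual SQL is not reproduced in the transcript, so the following is only a sketch of the shape such a weekly-score query might take. It uses Python's built-in sqlite3 module as a stand-in for Rockset, with hypothetical table and column names and the nested team data flattened to one row per selected player.

```python
import sqlite3
from typing import List, Tuple

WEEKLY_SCORES_SQL = """
SELECT t.gamer_id, SUM(s.points) AS total_points
FROM soccer_gamer_teams AS t
JOIN soccer_player_scores AS s
  ON s.player_id = t.player_id AND s.game_week = t.game_week
WHERE t.game_week = :week
GROUP BY t.gamer_id
ORDER BY total_points DESC, t.gamer_id
"""


def weekly_scores(conn: sqlite3.Connection, week: int) -> List[Tuple[str, int]]:
    """Total each gamer's team score for one game week (join + group by)."""
    return conn.execute(WEEKLY_SCORES_SQL, {"week": week}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE soccer_gamer_teams (gamer_id TEXT, game_week INTEGER, player_id TEXT)")
conn.execute("CREATE TABLE soccer_player_scores (player_id TEXT, game_week INTEGER, points INTEGER)")
# Made-up selections and scores.
conn.executemany(
    "INSERT INTO soccer_gamer_teams VALUES (?, ?, ?)",
    [("g1", 6, "p1"), ("g1", 6, "p2"), ("g2", 6, "p2"), ("g2", 6, "p3"), ("g1", 5, "p3")],
)
conn.executemany(
    "INSERT INTO soccer_player_scores VALUES (?, ?, ?)",
    [("p1", 6, 9), ("p2", 6, 4), ("p3", 6, 2), ("p3", 5, 7)],
)
week_six = weekly_scores(conn, 6)
print(week_six)
```

The week is passed as a named parameter, mirroring the way the demo query takes the game week as an input.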

Kevin Leong:

And then when it comes to building a leaderboard, this is probably the money query that powers an overall leaderboard, which is, for each gamer you want to total up all their scores across all weeks and then sort them in descending order. And this is what this query will get you. Then you can parameterize this query to come up with a daily, weekly, or monthly leaderboard as you desire. What I'll do here is... Let's come up with a leaderboard query that comes up with the top, say, N gamers overall. In this case, I've added a parameter, for which I'm just using 10 as the default here.
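
A top-N overall leaderboard of the kind described here might look like the sketch below. Again this uses sqlite3 as a stand-in and is not the query from the demo: the table `gamer_week_scores` is a hypothetical, already-aggregated input, and `n` defaults to 10 to match the default mentioned in the talk.

```python
import sqlite3
from typing import List, Tuple

TOP_GAMERS_SQL = """
SELECT gamer_id, SUM(points) AS total_points
FROM gamer_week_scores
GROUP BY gamer_id
ORDER BY total_points DESC, gamer_id
LIMIT :n
"""


def top_gamers(conn: sqlite3.Connection, n: int = 10) -> List[Tuple[str, int]]:
    """Return the n gamers with the highest total score across all weeks."""
    return conn.execute(TOP_GAMERS_SQL, {"n": n}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gamer_week_scores (gamer_id TEXT, game_week INTEGER, points INTEGER)")
# Made-up weekly totals per gamer.
conn.executemany(
    "INSERT INTO gamer_week_scores VALUES (?, ?, ?)",
    [("g1", 5, 7), ("g1", 6, 13), ("g2", 5, 30), ("g2", 6, 6), ("g3", 6, 5)],
)
top_two = top_gamers(conn, 2)
print(top_two)
```

Adding a `WHERE game_week = :week` clause, or a date range, would turn the same query into the weekly or monthly variants mentioned above.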

Kevin Leong:

What I can do is create a Query Lambda out of this, which is... And I'll update the Query Lambda here, right. And this actually gives me a way to call this query by hitting a REST endpoint. So for this query that powers my leaderboard, I can simply copy this example here. If I go to my terminal and copy that in, what I've essentially done is execute this saved query from a REST endpoint with curl here. That's a simple way of productionizing your leaderboard query. In this case, we're just returning the top 10, and you can see the query giving that back to you. You can build leaderboards, and you can build other real-time applications, easily in this manner.
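
The curl command from the demo is not captured in the transcript. As a rough equivalent, the sketch below builds (but does not send) an HTTP request for a saved query using Python's standard library. The URL path, the `ApiKey` authorization scheme and the parameter payload shape are written as assumptions about Rockset's REST API and should be checked against the example the console generates; the host, key, workspace and Query Lambda names are placeholders.

```python
import json
import urllib.request


def build_query_lambda_request(
    api_server: str, api_key: str, workspace: str, query_lambda: str, tag: str, n: int
) -> urllib.request.Request:
    """Build a POST request that executes a saved query with parameter n."""
    # Assumed endpoint layout; verify against the console's generated example.
    url = f"{api_server}/v1/orgs/self/ws/{workspace}/lambdas/{query_lambda}/tags/{tag}"
    body = {"parameters": [{"name": "n", "type": "int", "value": str(n)}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; nothing is sent over the network here.
request = build_query_lambda_request(
    "https://api.example.invalid", "test-key", "demo", "top_gamers", "latest", 10
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` and decoding the JSON response.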

Kevin Leong:

With that, I hope that I've given you a flavor of Rockset and how it integrates with DynamoDB and how you can build real-time analytics and things like leaderboards on top of it. Let's go back to our slide deck here and I'll turn it over to Kehinde for a while, see if you want to sum up anything that we've talked about across the course of our webinar, Kehinde.

Kehinde Otubamowo:

Yeah, great. Thank you, Kevin. The really nice thing about the demo was how quick those queries were running and how fast it's able to aggregate data across many different DynamoDB tables. That's a really neat feature there by Rockset. Basically, like we mentioned earlier in the talk, analytics in gaming is similar to analytics in any other application or in any other industry. Basically, you measure and track elements of your data to monitor performance and understand customer satisfaction, monitor sales or purchases on your teams and so on.

Kehinde Otubamowo:

So the basic premise is common. Analytics in gaming is not different from analytics in any other application or use case in another industry. Basically, we are trying to achieve similar goals like improving the product and analyzing telemetry, all of those things. The demo that we showed here, though we used a very specific use case, can really be applied to any particular application as long as you're storing the data in DynamoDB and you want to be able to query across many different tables or do some aggregations on the data that is stored in DynamoDB.

Kehinde Otubamowo:

Back to gaming, depending on the size and maturity of your gaming company, analytics might not be one of the core competencies that you have. And that is where a software-as-a-service solution like Rockset and a fully managed database service like DynamoDB come together to power real-time analytics for your game and give you a competitive advantage. Analytics is very important in gaming to be able to make well-informed decisions, and to make really fast decisions as well, to improve gameplay and the gamer experience, and ultimately grow the revenue of your game. With that, I'm going to turn you back to Kevin to wrap it all up.

Kevin Leong:

Hey, yeah, thanks Kehinde. Of course, feel free to reach out to us; our contact information is on this slide. And if you have any questions, feel free to send them in the webinar tool and then Julie can help read those off. I'll also point out as you're doing that, that we are offering... If you attended this webinar and if what you see is interesting, do check us out on rockset.com. You can of course sign up for an account and try the tool out for yourself. But if you would be interested in a guided trial of Rockset, you can start that this week, and we are offering $500 in free credits for doing that. The link for doing that is on the slide. So please avail yourself of that opportunity if you're so inclined. Okay. With that, Julie, anything on your end?

Julie:

Yes. Thanks, Kevin and Kehinde. We have a couple of questions coming in. First one is for you, Kevin. What are the advantages of using Rockset over using a leaderboard built with Redis' ordered Sets?

Kevin Leong:

Yeah. I think we've seen here that one of the considerations to look at is in the area of operations, and I think that's one area where Rockset will shine. It shares the same service philosophy that DynamoDB does, if you want a hands-off, no-ops approach to doing that. And that's one of the benefits, I think. Another of the benefits is the flexibility that SQL gives you in terms of writing analytic queries on top of your gaming data. Those are definitely some of the things that I would look at when comparing those.

Julie:

Great. Next question for you again, Kevin. Let us say that I have a webpage displaying the leaderboards, the webpage would run the query to display the kind of top players. Could you elaborate a little bit more on how this is updated in real-time as new data arrives into DynamoDB?

Kevin Leong:

Again, as new data arrives into DynamoDB, it is continuously ingested into Rockset. So Rockset is going to have the data coming from DynamoDB with maybe a couple of seconds of lag. You saw that Rockset can expose APIs through which you can call saved queries. So what you can do is have your web app call this API and get the latest information as you desire.
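
One way a web app might apply this is to refresh the leaderboard from the API on a short interval rather than on every page view. The sketch below is a small time-based cache in Python; the class name, the two-second default and the injected `fetch` function (standing in for the call to the saved-query endpoint) are all assumptions for illustration.

```python
import time
from typing import Callable, List, Optional, Tuple

Leaderboard = List[Tuple[str, int]]


class LeaderboardCache:
    """Serve a cached leaderboard, refetching once it is older than ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[], Leaderboard],
        ttl_seconds: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[Leaderboard] = None
        self._fetched_at = 0.0

    def get(self) -> Leaderboard:
        now = self._clock()
        if self._value is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()  # e.g. call the saved-query endpoint
            self._fetched_at = now
        return self._value


# Demonstration with a fake clock and a fake fetch function.
calls: List[int] = []
fake_now = [0.0]


def fake_fetch() -> Leaderboard:
    calls.append(1)
    return [("g2", 36 + len(calls))]


cache = LeaderboardCache(fake_fetch, ttl_seconds=2.0, clock=lambda: fake_now[0])
first = cache.get()   # triggers the first fetch
second = cache.get()  # served from cache
fake_now[0] = 2.5
third = cache.get()   # cache expired, triggers a second fetch
print(first, second, third)
```

A refresh interval of a couple of seconds roughly matches the ingest lag mentioned here, so polling much faster than that would be unlikely to show newer data.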

Julie:

Great. Another one is, I have considered using Athena or Elasticsearch for these types of analytical queries. How does Rockset compare to these other solutions?

Kevin Leong:

Okay. A couple of things there. We spoke about some of the core capabilities of Rockset. What's core to that, compared to, let's say, Athena, is that Rockset indexes every field, even nested fields, and in multiple ways. So that's, I think, one of the core differences. I would say there's a right tool for every use case, and Athena would be a great fit for certain use cases. But for real-time analytics, if you want the query latency that you really need for real-time analytics, you probably want your data indexed, and that's how Rockset will give you low-latency aggregations, joins and other analytic queries. So that's on the Athena side.

Kevin Leong:

I talked quite a bit about indexing. And so, as it relates to Elasticsearch, actually Rockset and Elasticsearch share the same type of indexing philosophy, where we try to index everything. As for the difference between Rockset and Elasticsearch, again, there is a right tool for everything. I think Elasticsearch would be great for text and log analytics. I think if you're doing things like aggregations on metrics and such, Rockset is probably a better fit there. Then again, not to overlook the operational burden of one versus the other, a lot of our customers actually come to Rockset after having tried some of these other solutions, mainly because of the operational overhead that something like Elasticsearch [crosstalk 00:58:16].

Kehinde Otubamowo:

Yeah, to add to that, really Rockset, like Kevin mentioned, is really purpose built for log analytics and analyzing mostly unstructured data and being able to use SQL queries on top of that. It depends on your use case, basically. Whatever your use case is should determine the particular database solution that you'll use. There's really no one-size-fits-all database or one-size-fits-all query engine. So that's an important thing to keep in mind.

Kevin Leong:

Yeah.

Julie:

Great, with that, I'm going to have us conclude our tech talk for today. Thank you everyone for joining. Thank you Kehinde for joining us today as well. We'll be following up with sending you all the recording via email. Enjoy the rest of your day. Take care.

Kevin Leong:

Thank you.

Kehinde Otubamowo:

Bye-
