In this post we show how to create an interactive gamification feature that scales. We demonstrate that it is feasible and cost-effective to build this kind of interactive feature using MigratoryData KE (Kafka Edition) to engage in realtime with millions of users. To this end, we present a benchmark study showing that, in the given scenario, MigratoryData KE scales vertically up to one million concurrent users on a single instance and scales horizontally in a linear fashion, thanks to its shared-nothing clustering architecture, by simply adding more instances to the cluster.

Realtime Interactions with Millions of Sports Fans

MigratoryData is a publish/subscribe web messaging solution with over a decade of experience in delivering realtime data to large audiences across various industries, including sports. In a previous article, we explained how one of our customers has been using MigratoryData to deliver realtime scores and odds to more than 100 million sports fans, of whom more than one million are connected simultaneously. As soon as a live score or odds update is published to the sports platform on a certain subject, MigratoryData broadcasts it in realtime over WebSockets to the fans subscribed to that subject. So this is a subscriber-dominated use case.

In this post, we’ll look at a different sports use case, one that also involves realtime data and a large number of sports fans, but where subscribers and publishers are balanced: sports fans not only receive realtime data from the sports platform, but also send realtime data back to it. We take a look at the OTT platform Disney+ Hotstar and its interactive game Watch’N Play. During the last edition of the Cricket World Cup, Disney+ Hotstar counted 25.3 million simultaneous viewers, a world record in terms of audience. The platform has been using gamification to engage with the large number of cricket fans in India through Watch’N Play, where viewers of a live cricket match answer live questions in realtime about what happens on the next ball in order to win points and redeem prizes.

Watch’N Play has been built around Kafka. We were curious to see how MigratoryData KE, an off-the-shelf Kafka-native solution for realtime web messaging at scale, would address such massive scalability needs. Therefore, we’ve built a live demo inspired by Watch’N Play and performed a benchmark study for this use case, as detailed below.

Live Gamification Demo

We’ve built a live gamification demo that asks a live question every 20 seconds, each with a 10-second deadline. A player therefore has 10 seconds to answer a live question to win points. The demo also displays a leaderboard with the ten highest-scoring players. Ten bot players are permanently connected to the demo and answer the live questions randomly to generate some activity.

The demo consists of the following components besides MigratoryData KE and Apache Kafka:

Component Name | Description
UI | A realtime web application (MigratoryData subscriber and publisher)
Questions Generator | Generates live questions (Kafka producer)
Answers Processor | Processes the answer received from a player (Kafka consumer) and publishes back the new points won by that player (Kafka producer)
Leaderboard Processor | Updates the leaderboard with the new points won by a player (Kafka consumer), receives leaderboard requests (Kafka consumer), and publishes the current leaderboard in response to a request (Kafka producer)

MigratoryData KE should be configured to consume the Kafka topics question, result, and top that are explained below. Thus, the configuration file kafka/consumer.properties of MigratoryData KE should contain the following configuration:

topics = question,result,top

The UI of the demo consists of two views: a Play View and a Leaderboard View.

Play View

The following MigratoryData subjects and Kafka topics and keys are used to model the realtime interactions of the Play View:

Kafka (Topic, Key) | MigratoryData Subject | Description
(question, null) | /question | Used by the Questions Generator to generate live questions
(answer, x) | /answer/x | Used by the Player with the id x to deliver the answer to a live question
(result, x) | /result/x | Used by the Answers Processor to let the Player with the id x, as well as the Leaderboard Processor, know how many points the Player won after answering a live question

The mapping between MigratoryData subjects and Kafka topics and keys is automatic, with no configuration needed.
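The convention visible in the table can be sketched in a few lines of Python. This is only an illustration of the naming pattern shown above (first subject segment as the topic, the remainder as the key); the function names are hypothetical and this is not MigratoryData KE's actual implementation.

```python
from typing import Optional, Tuple


def subject_to_topic_key(subject: str) -> Tuple[str, Optional[str]]:
    """Split a subject such as '/answer/42' into a (topic, key) pair."""
    if not subject.startswith("/"):
        raise ValueError(f"not a valid subject: {subject!r}")
    # The first segment is the topic; whatever follows (if anything) is the key.
    topic, _, key = subject[1:].partition("/")
    if not topic:
        raise ValueError(f"subject has no topic segment: {subject!r}")
    return topic, (key or None)


def topic_key_to_subject(topic: str, key: Optional[str]) -> str:
    """Inverse direction: ('result', '42') becomes '/result/42'."""
    return f"/{topic}" if key is None else f"/{topic}/{key}"


print(subject_to_topic_key("/question"))     # ('question', None)
print(subject_to_topic_key("/answer/42"))    # ('answer', '42')
print(topic_key_to_subject("result", "42"))  # /result/42
```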

The realtime information flow between the components of the demo using these subjects, topics, and keys is depicted in the following diagram:

A Player x that opens the Play View connects to MigratoryData KE over a persistent WebSocket connection and subscribes to the subjects /question and /result/x. The Questions Generator generates a live question every 20 seconds by publishing a message on the topic question ❶. As MigratoryData KE is configured to consume the topic question, it gets each live question ❷ and streams it to the Player on the subject /question ❸, to which the Player is subscribed. The Player has up to 10 seconds to answer the live question by publishing a message on the subject /answer/x to MigratoryData KE ❹, which in turn sends it to Kafka on the topic answer with the key x ❺. The Answers Processor consumes the message from the topic answer ❻, checks whether the answer provided by the Player x is correct, and sends back the points won by the Player by publishing to Kafka a message on the topic result with the key x ❼. This message is consumed by the Leaderboard Processor to update its list of top Players ❽. The message is also consumed by MigratoryData KE ❽, which is configured to consume the topic result and in turn sends it to the Player on the subject /result/x ❾, to which the Player is subscribed.
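As a rough illustration of steps ❻ and ❼, the core of an Answers Processor could look like the sketch below. It deliberately leaves out the Kafka consumer and producer and only shows the transformation from an answer record to a result record. The question bank, the JSON field names (question_id, option, points), and the function name are assumptions made for this example, not taken from the demo's source code, and the 10-second deadline check is omitted.

```python
import json
from typing import Dict, Tuple

# Hypothetical question bank: question id -> (correct option, points to win).
QUESTIONS: Dict[str, Tuple[str, int]] = {
    "q1": ("six", 10),
    "q2": ("wicket", 20),
}


def process_answer(player_id: str, answer_json: str) -> Tuple[str, str, str]:
    """Turn one record from the topic 'answer' into one record for 'result'.

    Returns (topic, key, value). The key is the player id, so the result
    ends up on the subject /result/<player_id> only.
    """
    answer = json.loads(answer_json)
    correct_option, points = QUESTIONS[answer["question_id"]]
    won = points if answer["option"] == correct_option else 0
    result = {"question_id": answer["question_id"], "points": won}
    return "result", player_id, json.dumps(result)


topic, key, value = process_answer("42", '{"question_id": "q1", "option": "six"}')
print(topic, key, value)
```

In a real deployment the returned triple would be handed to a Kafka producer, and the input would come from a consumer subscribed to the topic answer.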

Leaderboard View

The following MigratoryData subjects and Kafka topics and keys are used to model the realtime interactions of the Leaderboard View:

Kafka (Topic, Key) | MigratoryData Subject | Description
(gettop, null) | /gettop | Used by a Player to request the leaderboard
(top, x) | /top/x | Used by the Leaderboard Processor to send the current list of the top ten players to the Player with the id x who requested the leaderboard

The realtime information flow between the components of the demo using these subjects, topics, and keys is depicted in the following diagram:

A Player x that opens the Leaderboard View connects to MigratoryData KE over a persistent WebSocket connection, subscribes to the subject /top/x, and publishes a request message containing its id x on the subject /gettop to get the updated list of the top ten Players ❶. MigratoryData KE sends the request message to Kafka on the topic gettop without using a key ❷, where it is consumed by the Leaderboard Processor ❸. Once the request is received, the Leaderboard Processor sends back a response message with the current leaderboard on the topic top, with the key x ❹. As MigratoryData KE is configured to consume the topic top, it gets the response message ❺ and pushes it to the Player on the subject /top/x ❻, to which the Player is subscribed.
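The state kept by a Leaderboard Processor can be sketched as follows. This is a minimal in-memory illustration, not the demo's actual code: the class name, method names, and JSON layout of the response are assumptions, and the Kafka consumers and producer around it are left out.

```python
import heapq
import json
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class Leaderboard:
    """Accumulates points per player and answers leaderboard requests."""

    def __init__(self, size: int = 10) -> None:
        self.size = size
        self.scores: DefaultDict[str, int] = defaultdict(int)

    def on_result(self, player_id: str, points: int) -> None:
        # Called for every record consumed from the topic 'result'.
        self.scores[player_id] += points

    def top(self) -> List[Tuple[str, int]]:
        # Highest-scoring players first, at most `size` entries.
        return heapq.nlargest(self.size, self.scores.items(), key=lambda item: item[1])

    def on_gettop(self, requester_id: str) -> Tuple[str, str, str]:
        # Called for every record consumed from the topic 'gettop'.
        # The response is keyed by the requester so it reaches /top/<requester_id>.
        payload = json.dumps([{"player": p, "points": s} for p, s in self.top()])
        return "top", requester_id, payload


board = Leaderboard()
board.on_result("alice", 10)
board.on_result("bob", 30)
board.on_result("alice", 5)
print(board.on_gettop("alice"))
```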

The source code of the demo is available on GitHub. The folder frontend contains the source code of the UI. The folder backend contains the source code of the Leaderboard Processor, Answers Processor, and Questions Generator.

Vertical Scalability

We’ve performed a benchmark test that demonstrates that a single instance of MigratoryData KE running on a commodity server with 2 x Intel Xeon E5-2670 CPU and 64 GB RAM can handle 1 million concurrent players in the given gamification scenario.

Benchmark Setup

A Players Simulator running on Test Machine 3 opens one million WebSocket connections to a single instance of MigratoryData KE running on Test Machine 2. Kafka, the Answers Processor, and the Questions Generator all run on Test Machine 1.

Benchmark Scenario

The benchmark covers the realtime interactions of the Play View, which represents the main activity of the gamification feature. The activity produced by the Leaderboard View typically involves a small number of concurrent Players and so has been ignored by our benchmark.

Therefore, we use the following benchmark scenario:

  • each player i of the one million players subscribes to the subjects /question and /result/i
  • every 20 seconds, all players receive a live question as a JSON object of under 256 characters, containing the id of the question, the question itself, the answer options, and the number of points to win (the demo project includes the list of questions from which a live question is picked randomly)
  • the player i parses the JSON object containing the live question and, after a random delay of between 0 and 10 seconds (remember, each question has a 10-second deadline), sends back an answer as a JSON object containing the id of the live question and an option selected randomly from the received answer options
  • the player i gets the points it won on the last question on the subject /result/i
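The behaviour of one simulated player in this scenario can be sketched as below. The JSON field names (id, question, options, points, question_id, option) and the function name are assumptions for illustration; the actual Players Simulator and its payload layout may differ. The sketch only computes what a player would send and when, without opening any connection.

```python
import json
import random
from typing import Tuple


def simulate_answer(player_id: int, question_json: str, rng: random.Random) -> Tuple[float, str, str]:
    """Return (delay_seconds, subject, answer_json) for one simulated player."""
    question = json.loads(question_json)
    # Answer at a random moment within the 10-second deadline.
    delay = rng.uniform(0.0, 10.0)
    answer = {
        "question_id": question["id"],
        "option": rng.choice(question["options"]),  # picked at random
    }
    return delay, f"/answer/{player_id}", json.dumps(answer)


# A hypothetical live question, kept under 256 characters as in the scenario.
question = json.dumps({
    "id": "q1",
    "question": "What happens on the next ball?",
    "options": ["dot", "single", "four", "six", "wicket"],
    "points": 10,
})
assert len(question) < 256

delay, subject, answer = simulate_answer(7, question, random.Random(1))
print(subject, answer)
```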

Benchmark Results

MigratoryData KE ran on Test Machine 2 with the IP address 192.168.5.22 and accepted the connections from the Players Simulator on port 5000. We monitored MigratoryData KE using Prometheus and Grafana. The Grafana dashboard below shows 1 million concurrent WebSocket connections, which remain persistent for the duration of the test. It also shows that all messages of a round are processed before the next live question is published, and that no message is lost even though MigratoryData KE is configured to use Standard Message Delivery rather than Guaranteed Message Delivery.

More concretely, during a round in which one question is processed, we can see 1,000,001 messages from Kafka to MigratoryData KE: 1 message is the live question and 1,000,000 messages are the result messages, one for each player. We can also see 1,000,000 messages from the Players Simulator to MigratoryData KE, which are the answer messages the players send in response to a live question. Finally, we see 2,000,000 messages from MigratoryData KE to the Players Simulator, which are the live question and the result messages for each of the one million players. Note that while the result message is specific to each player, the question is the same for all players. Therefore, the single live question message received from Kafka is fanned out by MigratoryData KE into one million messages to the clients.
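The per-round message counts follow directly from the scenario and can be checked with a little arithmetic. The function and key names below are made up for this sketch; the average answer rate assumes answers are spread evenly over the 10-second window, which matches the uniformly random delay used by the players.

```python
from typing import Dict


def messages_per_round(players: int) -> Dict[str, int]:
    """Expected message counts for one question round with `players` players."""
    return {
        # 1 live question + 1 result per player
        "kafka_to_ke": 1 + players,
        # 1 answer per player
        "players_to_ke": players,
        # 1 question + 1 result delivered to every player
        "ke_to_players": 2 * players,
    }


counts = messages_per_round(1_000_000)
print(counts)
# {'kafka_to_ke': 1000001, 'players_to_ke': 1000000, 'ke_to_players': 2000000}

# Average incoming answer rate over the 10-second answer window.
print(counts["players_to_ke"] / 10)  # 100000.0
```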

Linear Horizontal Scalability

MigratoryData KE can be clustered to scale horizontally. Moreover, it does not share any user state across the cluster: each instance running in the cluster is independent of the other instances. Therefore, MigratoryData KE scales linearly.

The benchmark result above has shown that for the given scenario, a cluster of one instance of MigratoryData KE running on a machine with 2 x Intel Xeon E5-2670 CPU and 64 GB RAM can handle one million concurrent users. To exemplify the linear scalability discussed above, we show that by adding a new instance of MigratoryData KE to the cluster, running on a machine with the same hardware, the cluster capacity increases to 2 million concurrent users.

Benchmark Setup

A cluster of MigratoryData KE with two instances is deployed on Test Machine 2 and Test Machine 4, respectively. A Players Simulator A running on Test Machine 3 opens one million WebSocket connections to the first instance of the cluster, and a Players Simulator B running on Test Machine 5 opens one million WebSocket connections to the second instance of the cluster. Kafka, the Answers Processor, and the Questions Generator all run on Test Machine 1.

Benchmark Scenario

We use basically the same scenario as described above, with a change to the subjects and topics to limit the communication between Kafka and a MigratoryData KE instance only to the messages related to the clients of that MigratoryData KE instance:

  • each player i of the Players Simulator A subscribes to the subjects /question and /resultA/i
  • each player j of the Players Simulator B subscribes to the subjects /question and /resultB/j
  • all players receive every 20 seconds a live question on the subject /question
  • the player i of the Players Simulator A and the player j of the Players Simulator B respond to each live question on the subjects /answerA/i and /answerB/j, respectively, after a random delay of between 0 and 10 seconds (remember, each question has a 10-second deadline)
  • the player i of the Players Simulator A and the player j of the Players Simulator B get the points they won on the last question on the subjects /resultA/i and /resultB/j, respectively
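The per-instance naming used in this scenario can be sketched as follows. The helper names are hypothetical. The list of topics each instance consumes is an inference from the stated goal of limiting Kafka traffic to the messages relevant to that instance's own clients; the source does not spell out the exact per-instance configuration.

```python
from typing import Dict, List, Union


def player_subjects(instance: str, player_id: int) -> Dict[str, Union[str, List[str]]]:
    """Subjects used by a player connected to the cluster member `instance`."""
    return {
        "subscribe": ["/question", f"/result{instance}/{player_id}"],
        "publish": f"/answer{instance}/{player_id}",
    }


def consumed_topics(instance: str) -> List[str]:
    """Topics an instance would need to consume (assumption, see lead-in):
    the shared question topic plus only its own result topic."""
    return ["question", f"result{instance}"]


print(player_subjects("A", 5))
print(player_subjects("B", 9))
print(consumed_topics("A"), consumed_topics("B"))
```

With this split, each instance receives one million result messages per round from Kafka rather than the two million produced for the whole cluster.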

Benchmark Results

One instance of MigratoryData KE ran on Test Machine 2 with the IP address 192.168.5.22 and accepted one million concurrent WebSocket connections on port 5000 from one instance of the Players Simulator. A second instance of MigratoryData KE ran on Test Machine 4 with the IP address 192.168.5.24 and accepted one million concurrent WebSocket connections on port 5000 from another instance of the Players Simulator. The Grafana dashboard below shows 2 million concurrent WebSocket connections, which remain persistent for the duration of the test. Counting the messages as explained in the Vertical Scalability section above, we see that all messages of a round are processed before the next live question is published, with no message loss.

Conclusion

In this post we have seen how to use MigratoryData KE and Apache Kafka to build a realtime gamification feature at scale, with a focus on sports. Of course, a realtime gamification feature, and more generally any feature involving realtime interaction with many users, can apply to other fields. Instead of a live sports match, we can have a live video produced by a content creator, a conference or class broadcast live, a live audio chat, or in general any live content, whether video, audio, or even text, that has many users and needs realtime engagement with them.

We’ve also seen that a single instance of MigratoryData KE running on a commodity server with 2 x Intel Xeon E5-2670 CPU and 64 GB RAM can handle one million concurrent users in the given scenario. We’ve also explained the linear scalability of MigratoryData KE and exemplified it: we increased the cluster capacity by another million concurrent users by adding a new instance to the cluster. Therefore, assuming the impact on Apache Kafka remains acceptable as the number of cluster members increases, it is reasonable to extrapolate that the cluster capacity can be increased to many millions of concurrent users in the given gamification scenario, with each cluster member handling one million concurrent users.
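Under that linear-scaling assumption, a first-order cluster size estimate is simple arithmetic. The sketch below is only an extrapolation from the one-million-per-instance figure measured in this post; it was not benchmarked beyond two instances, it ignores headroom for failover, and it assumes Kafka keeps up. The function name is hypothetical.

```python
PLAYERS_PER_INSTANCE = 1_000_000  # measured in this post for the given scenario


def instances_needed(concurrent_users: int, per_instance: int = PLAYERS_PER_INSTANCE) -> int:
    """Smallest number of instances whose combined capacity covers the audience."""
    if concurrent_users < 0 or per_instance <= 0:
        raise ValueError("concurrent_users must be >= 0 and per_instance > 0")
    # Integer ceiling division, avoiding floating-point rounding.
    return (concurrent_users + per_instance - 1) // per_instance


print(instances_needed(2_000_000))   # 2, the cluster benchmarked above
print(instances_needed(25_300_000))  # 26, for an audience of 25.3 million
```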

Should your application need a feature to interact in realtime with your audience, please contact us. We look forward to learning about your application requirements.