Wednesday, January 13, 2021

BookMyShow System Design

Problem: We need to design a ticket booking system like BookMyShow. Here are the features we are targeting:

  1. High Concurrency
  2. Responsive UI
  3. Search
  4. Multiple cities
  5. Payments
  6. Booking
  7. Movie Suggestion
  8. Comments and ratings
  9. Movie Info
  10. Sending tickets via email, SMS, WhatsApp or notification

Design: We will go through the design for each feature, but before we go into depth, let's see what our system looks like.

A client / app calls the BookMyShow APIs, and BookMyShow calls the theater APIs to book the ticket. Here is how our system looks initially:



Let's look at a smaller subset of this system, which is the theater. We won't go into details as this is out of scope for this problem, but let's have a look at the basics:



Basically, every theater will maintain its own server and database. The theater will expose APIs through which BookMyShow or any other service provider can interact with it for bookings, getting the booked seats and other theater-related information.

The theater should maintain an app server at its end: if it just exposed the database to service providers, they might take advantage of it and lock seats for themselves for a long time.

One more thing to mention before jumping to our features: BookMyShow obviously can't run on a single server given the traffic. That means it needs multiple servers and a load balancer to distribute the load among them. For now our design looks like:




Now that we understand a little about the theater, let's go through the different steps of booking:

1. Go to a particular movie: There are two ways in which a user can get to a particular movie:

1.1 Select a particular city (we can use GPS / IP info to auto-detect it) and see the list of movies in that city. The user can then select a movie, which will show the different theaters in which that movie is running. This data fits a relational database, as it has lots of relationships and is not going to be very big. That means we need to maintain a relational database at our end. I came up with a schema which looks as follows:



I understand that at this point there are many tables in the schema which are not used yet, but we will come to those later. Please ignore them for now and focus on the City, Movie, Show and Cinema related tables only; you will see what I am trying to achieve here. This database can be preloaded every Friday using batch API requests to the different theater APIs.
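To make the city, movie and theater lookups concrete, here is a minimal sketch using SQLite. The table and column names (`cities`, `movies`, `cinemas`, `shows`, `starts_at`) and the helper functions are assumptions made for illustration; they are not the schema from the diagram, and a production system would use MySQL or a similar server database.

```python
import sqlite3

# Hypothetical minimal schema. Table and column names are illustrative
# assumptions, not the schema from the post.
SCHEMA = """
CREATE TABLE cities  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE movies  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE cinemas (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                      city_id INTEGER NOT NULL REFERENCES cities(id));
CREATE TABLE shows   (id INTEGER PRIMARY KEY,
                      movie_id  INTEGER NOT NULL REFERENCES movies(id),
                      cinema_id INTEGER NOT NULL REFERENCES cinemas(id),
                      starts_at TEXT NOT NULL);
"""


def create_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def movies_in_city(conn, city_id):
    """Titles of all movies with at least one show in the given city."""
    rows = conn.execute(
        """SELECT DISTINCT m.title
           FROM movies m
           JOIN shows s   ON s.movie_id = m.id
           JOIN cinemas c ON c.id = s.cinema_id
           WHERE c.city_id = ?
           ORDER BY m.title""",
        (city_id,),
    )
    return [title for (title,) in rows]


def cinemas_showing(conn, city_id, movie_id):
    """(cinema name, start time) pairs for one movie in one city."""
    rows = conn.execute(
        """SELECT c.name, s.starts_at
           FROM shows s
           JOIN cinemas c ON c.id = s.cinema_id
           WHERE c.city_id = ? AND s.movie_id = ?
           ORDER BY c.name, s.starts_at""",
        (city_id, movie_id),
    )
    return rows.fetchall()
```

The two queries mirror the two clicks in step 1.1: first the movies for a city, then the theaters and timings for the chosen movie.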

1.2 The user can search directly from the search bar in the UI. In that case we can't just use MySQL or another relational database, as it could require a contains search, that is, matching the query anywhere inside a title. Here we can use Elasticsearch, which is very good at this kind of search and is a distributed system. Once we get the movie id from the search, we can use the same MySQL database to fetch the theaters.

In both approaches, reading from the database can become a bottleneck, as it is a time-consuming task. We can introduce a distributed cache like Redis to hold data such as the movies for each city, and the theaters for each movie along with screen ids and timings. Even the seat arrangements can be cached there.

So now our BookMyShow server looks like this:
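To give a feel for why a search engine handles contains search well, here is a toy in-memory index over character trigrams. It is only a stand-in for the idea of an inverted index and does not use or imitate the Elasticsearch API; the class name `TitleIndex` and its methods are hypothetical.

```python
from collections import defaultdict


def trigrams(text):
    """All 3-character substrings of the lowercased text."""
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}


class TitleIndex:
    """Toy inverted index: trigram -> ids of movies whose title contains it."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._titles = {}

    def add(self, movie_id, title):
        self._titles[movie_id] = title
        for gram in trigrams(title):
            self._postings[gram].add(movie_id)

    def search(self, query):
        """Ids of movies whose title contains the query, case-insensitively."""
        q = query.lower()
        grams = trigrams(q)
        if grams:
            # Only titles that contain every trigram of the query can match.
            candidates = set.intersection(
                *(self._postings.get(g, set()) for g in grams)
            )
        else:
            # Query shorter than 3 characters: fall back to scanning all titles.
            candidates = set(self._titles)
        # Confirm each candidate, since sharing trigrams is necessary
        # but not sufficient for a true substring match.
        return sorted(m for m in candidates if q in self._titles[m].lower())
```

The index narrows the search to a few candidate titles instead of scanning every row, which is the part a plain `LIKE '%...%'` query in a relational database cannot avoid.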
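The read path described above is often called cache-aside: look in the cache first and go to the database only on a miss. Below is a minimal sketch with an in-process dictionary standing in for Redis. The `TTLCache` class, the key format and the `load_from_db` callback are all assumptions made for illustration.

```python
import time


class TTLCache:
    """In-process stand-in for a distributed cache, with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl)


def movies_for_city(city_id, cache, load_from_db):
    """Cache-aside read: serve from the cache, fall back to the database."""
    key = f"city:{city_id}:movies"  # hypothetical key format
    cached = cache.get(key)
    if cached is not None:
        return cached
    movies = load_from_db(city_id)  # slow path, hits the relational database
    cache.set(key, movies)
    return movies
```

Since the show data is refreshed in batches, a fairly long expiry is plausible for movie and theater lists, while seat availability would need a much shorter one.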




2. Seat selection and payment: Once the user selects a timing and then seats, BookMyShow sends a lock request to the theater server. The theater server can maintain a separate table for this, or it can lock a row, so that no provider, including BookMyShow itself, can book the same seats again. This lock can be held for 10 minutes. During this time the user is redirected to the payment gateway, and once the payment is successful we can update the Booking and Payment tables given in the schema above. We can share the booking id with the theater so that it can match this id when the user actually shows the ticket at the theater.

Our task is not done yet: we need to send the tickets to the client. For that we can use async workers (microservices) which listen for items on a queue. Our main server just enqueues the ticket-sending request, and one of the workers picks the request from the queue, processes it and sends the email / SMS / WhatsApp message containing the tickets to the client.
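The 10-minute seat lock can be sketched as a small in-memory table on the theater side. This is one possible shape under stated assumptions: the `SeatLocks` class and its method names are hypothetical, and a real theater server would keep the locks in its database rather than in process memory.

```python
import threading
import time

LOCK_TTL_SECONDS = 10 * 60  # the 10-minute hold described above


class SeatLocks:
    """Temporary, all-or-nothing seat holds that expire on their own."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._mutex = threading.Lock()
        self._held = {}  # (show_id, seat) -> (holder, expires_at)

    def try_lock(self, show_id, seats, holder):
        """Hold every requested seat for the holder, or none of them."""
        now = self._clock()
        with self._mutex:
            for seat in seats:
                entry = self._held.get((show_id, seat))
                # A live lock owned by someone else blocks the whole request.
                if entry and entry[1] > now and entry[0] != holder:
                    return False
            for seat in seats:
                self._held[(show_id, seat)] = (holder, now + LOCK_TTL_SECONDS)
            return True

    def release(self, show_id, seats, holder):
        """Drop the holder's locks, e.g. after a failed or cancelled payment."""
        with self._mutex:
            for seat in seats:
                entry = self._held.get((show_id, seat))
                if entry and entry[0] == holder:
                    del self._held[(show_id, seat)]
```

Checking all seats before taking any of them avoids leaving a user with half of a group booking, and the expiry means an abandoned payment frees the seats without any cleanup call.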
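The enqueue-and-worker flow can be sketched with Python's standard `queue` and `threading` modules standing in for a real message queue and separate worker services. The job fields (`channel`, `to`, `booking_id`) and the `send` callback are assumptions made for illustration.

```python
import queue
import threading


def ticket_worker(jobs, send):
    """Consume ticket jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:  # sentinel: shut this worker down
                return
            # send is a caller-supplied function that delivers the ticket.
            send(job["channel"], job["to"], job["booking_id"])
        finally:
            jobs.task_done()


def start_workers(jobs, send, count=2):
    """Start background workers that all read from the same queue."""
    threads = [
        threading.Thread(target=ticket_worker, args=(jobs, send), daemon=True)
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    return threads
```

The booking request only pays the cost of `jobs.put(...)`, so a slow email or SMS provider does not delay the response to the user. A real worker would also need retries and error handling, which this sketch leaves out.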

Now we are done with our Booking process and our BookMyShow server looks like:

 

This actually covers multiple features:
  1. Search
  2. Multiple cities
  3. Booking
  4. Payments
  5. Concurrency
  6. Sending tickets
Now let's look at the additional features in our design:

Let's see how we can support comments, ratings and movie info like trailers, actors, genre etc. This is big data, and it is very difficult for a relational database to maintain it efficiently, so for this kind of data we can use a NoSQL database like MongoDB or Cassandra. We can also use our cache to hold the frequently accessed part of this data for efficient retrieval. With this info our server will now look like:



Now let's look at how we can support movie suggestions for users. For this we need to feed user activities / logs to ML. We can dump all of this into Hadoop (HDFS) and then use Pig / Hive queries to extract the data; this structured data can then be used to train ML models.
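As a rough picture of the extraction step, the sketch below turns raw activity log lines into structured (user, movie, count) rows, which is the kind of grouping a Hive query would do at scale. The tab-separated log format, the action names and the function names are assumptions made for illustration.

```python
from collections import Counter


def parse_log_line(line):
    """Parse 'user_id<TAB>action<TAB>movie_id'; return None if malformed."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 3:
        return None
    user_id, action, movie_id = parts
    return user_id, action, movie_id


def interaction_counts(lines, actions=("view", "book")):
    """Count how often each user interacted with each movie."""
    counts = Counter()
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is None:
            continue  # skip lines that do not match the assumed format
        user_id, action, movie_id = parsed
        if action in actions:
            counts[(user_id, movie_id)] += 1
    return sorted((u, m, n) for (u, m), n in counts.items())
```

Rows of this shape are a common input for recommendation models, though which model to train on them is outside the scope of this design.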
With this, we now have our final BookMyShow server design:



Now that we have completed our feature set, let's see how we can improve the speed of our UI. To make it more responsive, we can use a frontend cache like Varnish, which caches copies of web pages; whenever the same page is requested again, it can be served from Varnish itself instead of going to the server. We can also use CDNs for quick access to trailers and other large content. With these optimizations our components look like:



I think we have covered all our features. We have put in some optimization techniques too, and that means we are done with our design :)

