Tuesday, July 22, 2025

Design Swiggy

Designing a highly scalable food delivery application like Swiggy or Zomato is a complex challenge that involves optimizing for scale, consistency, low latency, and real-time tracking. These apps handle millions of users, orders, and real-time delivery updates across cities, all while ensuring a seamless user experience.

In this post, I’ll walk through a step-by-step architectural design of a food delivery system, starting from functional and non-functional requirements to API design, services, data modeling, and finally addressing performance, availability, and consistency trade-offs.


Requirements:

1. Functional requirements:

  1. It's an app-only system.
  2. Only registered users can order food.
    • Delivery partner registration is manual and out of scope
  3. Users and delivery partners need to log in to order and deliver food, respectively.
  4. Delivery partners need to make themselves available to receive the delivery order.
  5. Restaurants need to register and upload their menu to receive food orders.
  6. Restaurants need to make themselves available to receive the order.
  7. We need to minimize the:
    • Wait time for the users
    • Distance / Time for delivery partners to deliver the food.
  8. User should be able to search the dishes, cuisines and restaurants.
  9. User should be able to see the top rated nearby restaurants on home page.
  10. Collect payment information from user.
  11. Collect bank information from restaurants to pay them.
  12. Delivery partners should be able to mark the order state, such as in transit or delivered.
  13. Users should be able to track their order.
  14. Users should be able to rate restaurants and delivery partners.

2. Non-functional requirements:

  1. Scalability
    • 100 M total users.
      • 10 M daily orders, i.e. ~115 orders/second on average, or 1000 - 2000 orders/second during peak hours.
    • 100 K delivery partners
    • Collect delivery partner's location every 5 seconds i.e. 20 K messages / second.
    • 300 K total restaurants
    • 100 items per restaurant menu so total menu items are 30 M.
  2. Availability
    • 99.99% or higher
  3. Performance
    • Search < 200 ms
    • Loading home screen < 500 ms
    • Loading Menu < 500 ms
  4. CP over AP
    • Consistency is more important than availability here.
    • We can't serve a stale menu, stale addresses etc.
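The scalability figures above follow from simple arithmetic. Here is a quick back-of-the-envelope check in Python; the variable names are mine, and the downtime figure is just what a 99.99% target implies over a 365-day year.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60

# Inputs taken from the non-functional requirements.
daily_orders = 10_000_000
delivery_partners = 100_000
location_interval_s = 5
restaurants = 300_000
items_per_menu = 100
availability_target = 0.9999

avg_orders_per_s = daily_orders / SECONDS_PER_DAY
location_msgs_per_s = delivery_partners / location_interval_s
total_menu_items = restaurants * items_per_menu
allowed_downtime_min_per_year = MINUTES_PER_YEAR * (1 - availability_target)

print(f"average orders/second:     {avg_orders_per_s:.1f}")
print(f"location messages/second:  {location_msgs_per_s:,.0f}")
print(f"total menu items:          {total_menu_items:,}")
print(f"allowed downtime (min/yr): {allowed_downtime_min_per_year:.1f}")
```

The peak figure of 1000 - 2000 orders/second is roughly 10 - 20 times the average, which is the number the services should be sized for.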

Design:

Step 1: API design:

As usual, for the API design let's first translate our functional requirements into a sequence diagram, which should look as follows:


Given the sequence diagram, we can see the need for a bidirectional connection between the server and delivery partners, and also between the server and users.

We now have clear APIs:

Restaurants:

  • POST /restaurants
  • GET /restaurants
  • GET /restaurants/{restaurant_id}
  • PUT /restaurants/{restaurant_id}
  • POST /restaurants/login
  • POST /restaurants/{restaurant_id}/menu
  • GET /restaurants/{restaurant_id}/menu
  • POST /restaurants/{restaurant_id}/join
  • POST /restaurants/{restaurant_id}/state
Delivery Partners:
  • POST /dpartners/login
  • POST /dpartners/{partner_id}/state
  • POST /dpartners/{partner_id}/location
  • POST /dpartners/{partner_id}/rate
Users:
  • POST /users
  • POST /users/login
  • PUT /users/{user_id}
  • GET /users/{user_id}
Order:
  • POST /orders
  • GET /orders?restaurant_id=<id>
  • GET /orders?partner_id=<id>
  • GET /orders?user_id=<id>
  • POST /orders/{order_id}/state
  • GET /orders/{order_id}
Search:
  • GET /search/restaurants
  • GET /search/dishes


Step 2: Mapping functional requirements to architectural diagram:

1. Users need to register and log in to order food, and should be able to update their info:

We will have a User service to serve these two requirements. We can collect the payment info during registration. User service will use a SQL DB which will have two tables:

  1. User:
    • user_id
    • user_name
    • password
  2. User_Profile:
    • user_id
    • phone_no
    • email
    • address
    • PIN

User Service will forward the user's payment information to another service, Payment Service. This service will have its own SQL DB with the following table:

  1. UserPaymentInfo:
    • user_id
    • credit_card
    • billing_address



2. Restaurant registration, login, upload menu and update these details:

We will have a separate service say Restaurant Service which will have its own SQL DB. It will have following tables:

  1. Restaurants
    • id
    • user_name
    • password
    • name
    • location
    • address
    • pin
    • IsActive
    • menu_id
    • avg_rating
    • num_rating
  2. RestaurantMenu
    • id
    • restaurant_id
    • name
  3. RestaurantMenuItem
    • id
    • menu_id
    • item_id
    • price
    • IsAvailable
    • avg_rating
    • num_rating
  4. MenuItem
    • id
    • Name
    • Description
The payment info will be sent to Payment Service to store the bank details. Payment Service will have another table in its SQL DB to store this info:
  1. RestaurantPaymentInfo
    • restaurant_id
    • bank_details
The menu and restaurant metadata will be sent to a new Search service. Search service will store this metadata in its Elasticsearch database.

3. Restaurants need to make themselves available to receive the order:

The restaurant admin simply calls Restaurant service, which marks the restaurant's IsActive as True. The diagram for FRs 2 and 3 is as below:



4. Delivery Partners need to log in and make themselves available to receive the orders:

To support delivery partners, we will have another service say DPartner Service. This has its own SQL DB to store the delivery partners info. Here are the tables:

  1. DeliveryPartner 
    • id
    • name
    • user_name
    • password
    • rating
    • is_active
    • num_deliveries

Delivery Partners can make themselves available using DPartner Service, which just changes the state in the DeliveryPartner table. Once they have made themselves available, DPartner service will create a bidirectional connection using WebSocket. This has the following benefits:

  • Delivery Partners need to send their location very frequently, so it removes the overhead of setting up a new TCP connection for every update.
  • The server can also push messages like pick_up order etc.

To record these locations we will use a new service, say Location Service. DPartner service will keep queueing these locations to Location service. We will use a write-optimized, LSM-tree-based NoSQL DB to store these location updates. Each record contains the partner_id and the location (latitude and longitude), and we keep updating this record for as long as the partner is active. Here is the schema of the collection:

  • partner_id
  • latitude
  • longitude
  • order_id

We are going to have another service, say ConnectionManager Service. It will store the mapping from user_id (or partner_id) to connection_id in a fast key-value store like Redis, so once the WSS connection is established, DPartner service will call ConnectionManager service to store this info. It is not required yet, but it will be needed when the server pushes a message to a delivery partner or user and has to know which connection to use.



5. When a user orders food, the system assigns the best possible Delivery Partner:

As mentioned in the functional requirements, the user should get the order in minimum time and the delivery partner should travel as little as possible.

To order the food, we will have a new service say Order service. Order service will have its own SQL DB. Here are the tables:

  1. Order
    • id
    • user_id
    • restaurant_id
    • payment_id
    • dpartner_id
  2. OrderDetails:
    • order_id
    • restaurant_menu_item_id

For the order to complete, the payment is made through Payment Service. Payment Service uses a third-party payment provider to charge the user's credit card and pay the restaurant's bank account.

Once the payment is done, Order service queues a message to Notification service, and Notification service pushes the notification to the Restaurant. The Restaurant then fetches the order details from Order service and sets the order state to approved by calling the Order Service API.

To match the delivery partner, we will have a new service Matching Service. Let's see what it does to match the delivery partner to an order. 

  • Matching service sends the user's and restaurant's locations to Location service.
  • Location service finds all the nearby available delivery partners by drawing a circle with a radius of 2 - 3 km, with the restaurant at its center.
  • Location service then sends a triplet {partner_location, restaurant_location, user_location} for each candidate to a third-party service.
  • The third-party service returns the ETA for every triplet.
  • Location service then sends the {partner_id, ETA} pairs to Matching service.

Matching service will then match a partner to the order based on the ETAs and other criteria. Once matched, Matching service will send {partner_id, order_id} to Location service, and Location Service will save the order_id against the partner_id to indicate that the delivery partner is serving an order.

Matching Service will then send {order_id, partner_id} to Order Service, which will update the partner_id for that order_id.

Once the partner_id is saved, Order Service queues another message with order info like restaurant location, user location and order_id to Notification service, which pushes these details to the delivery partner.
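Here is a minimal sketch of the matching steps described above, under some loud assumptions: the `Partner` class, `match_partner` and `straight_line_eta` are names I made up for illustration, the ETA function stands in for the third-party service (it just uses straight-line distance at an assumed 25 km/h), and "other criteria" are ignored so the lowest ETA wins.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

LatLon = Tuple[float, float]


@dataclass
class Partner:
    partner_id: str
    lat: float
    lon: float
    order_id: Optional[str] = None  # set while the partner is serving an order


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def straight_line_eta(partner: LatLon, restaurant: LatLon, user: LatLon,
                      speed_kmh: float = 25.0) -> float:
    """Hypothetical stand-in for the third-party ETA service; returns minutes."""
    km = haversine_km(*partner, *restaurant) + haversine_km(*restaurant, *user)
    return km / speed_kmh * 60


def match_partner(partners: List[Partner], restaurant: LatLon, user: LatLon,
                  eta_fn: Callable[[LatLon, LatLon, LatLon], float],
                  radius_km: float = 3.0) -> Optional[Partner]:
    """Pick the free partner inside the circle around the restaurant with the lowest ETA."""
    candidates = [
        p for p in partners
        if p.order_id is None
        and haversine_km(p.lat, p.lon, *restaurant) <= radius_km
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: eta_fn((p.lat, p.lon), restaurant, user))
```

If no free partner is inside the circle, the function returns None; a real system would then presumably widen the radius or retry after a short delay.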


6. User should be able to track their orders

Once the user reaches the order tracking page, the user app creates a WSS connection with a new service, say Order Tracking Service. Here is the flow:

  1. Once the order is confirmed and a delivery partner is assigned, Order Service queues the order_info like order_id, user_id, state, dpartner_id etc. to Order Tracking Service.
  2. Whenever the restaurant updates the order state using the Order Service API, Order Service also queues this order_info to Order Tracking Service.
  3. When Location Service sees that an order_id is attached to the delivery partner's id, it queues the order_id and location to Order Tracking Service.
  4. Order Tracking Service maintains a Redis cache to store this info.
  5. When the user creates a WSS connection with Order Tracking Service, the service also sends the connection details to Connection Manager Service to store them.
  6. When Order Tracking Service receives any message, it first asks Connection Manager for the connection id and then uses that connection id to send the update to the user.
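The tracking flow above can be sketched roughly as follows. Treat it as an illustration only: the class and method names are mine, plain dictionaries stand in for Redis, and the `send` callback stands in for writing to the real WebSocket connection.

```python
from typing import Callable, Dict, Optional


class ConnectionManager:
    """Maps a user_id to the id of its open connection (Redis stand-in)."""

    def __init__(self) -> None:
        self._by_user: Dict[str, str] = {}

    def register(self, user_id: str, connection_id: str) -> None:
        self._by_user[user_id] = connection_id

    def unregister(self, user_id: str) -> None:
        self._by_user.pop(user_id, None)

    def lookup(self, user_id: str) -> Optional[str]:
        return self._by_user.get(user_id)


class OrderTrackingService:
    def __init__(self, connections: ConnectionManager,
                 send: Callable[[str, dict], None]) -> None:
        self._connections = connections
        self._send = send                       # hypothetical socket writer
        self._latest: Dict[str, dict] = {}      # order_id -> merged latest info

    def _push(self, order_id: str) -> bool:
        """Send the latest known info for an order if its user is connected."""
        info = self._latest.get(order_id, {})
        user_id = info.get("user_id")
        connection_id = self._connections.lookup(user_id) if user_id else None
        if connection_id is None:
            return False
        self._send(connection_id, {"order_id": order_id, **info})
        return True

    def on_connect(self, user_id: str, connection_id: str, order_id: str) -> None:
        """Step 5: store the connection, then send whatever is already cached."""
        self._connections.register(user_id, connection_id)
        self._push(order_id)

    def on_message(self, order_id: str, update: dict) -> bool:
        """Steps 1-4 and 6: cache the queued update, then forward it."""
        self._latest.setdefault(order_id, {}).update(update)
        return self._push(order_id)
```

Caching the merged state means a user who opens the tracking page late still gets the current order state and last known location immediately on connect.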

So this is architectural diagram for order placement and tracking:


7. Delivery partner should be able to mark the order state:

Delivery partners can simply call Order service to mark the order state. With this, here is the complete flow of Orders:



8. User should be able to rate delivery partners and restaurants:

For this FR we can have another service, say Rating service, with its own SQL DB. We could use a single table for both restaurants and delivery partners with one more column, say type, but I am going to use separate tables.

Both tables have the same schema:

  1. RestaurantRating
    • user_id
    • restaurant_id
    • rating
    • created_at
  2. DPartnerRating
    • user_id
    • dpartner_id
    • rating
    • created_at

This also lets us prevent the same user from rating a restaurant or delivery partner multiple times, for example by making (user_id, restaurant_id) and (user_id, dpartner_id) unique. We also show the restaurant's and delivery partner's rating to users. We are not handling NFRs yet, but I will address one small performance concern here; otherwise, showing the details of a restaurant or partner would need two calls (one to Restaurant/DPartner Service and one to Rating Service). Restaurant Service and DPartner Service will run timer jobs at a scheduled interval, say every 1, 2, 4 or 8 hours, which fetch new ratings from Rating service using the created_at field, recalculate the average rating and save it in their own DBs.
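To make the rating flow concrete, here is a small sketch using an in-memory SQLite table for RestaurantRating. The composite primary key, the 1 to 5 rating scale, the integer timestamps and all function names are my assumptions; the idea is only that the timer job can merge new ratings into the stored avg_rating and num_rating without re-reading old rows.

```python
import sqlite3
from typing import Dict, Tuple


def create_rating_table(conn: sqlite3.Connection) -> None:
    # The composite primary key allows one rating per (user, restaurant).
    conn.execute("""
        CREATE TABLE RestaurantRating (
            user_id       INTEGER NOT NULL,
            restaurant_id INTEGER NOT NULL,
            rating        INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),
            created_at    INTEGER NOT NULL,
            PRIMARY KEY (user_id, restaurant_id)
        )
    """)


def add_rating(conn: sqlite3.Connection, user_id: int, restaurant_id: int,
               rating: int, created_at: int) -> bool:
    """Insert a rating; returns False if this user already rated the restaurant."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO RestaurantRating VALUES (?, ?, ?, ?)",
                (user_id, restaurant_id, rating, created_at),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def ratings_since(conn: sqlite3.Connection, since: int) -> Dict[int, Tuple[int, int]]:
    """What the timer job fetches: {restaurant_id: (sum, count)} of newer ratings."""
    rows = conn.execute(
        "SELECT restaurant_id, SUM(rating), COUNT(*) FROM RestaurantRating "
        "WHERE created_at > ? GROUP BY restaurant_id",
        (since,),
    )
    return {rid: (total, count) for rid, total, count in rows}


def merge_average(avg_rating: float, num_rating: int,
                  new_sum: int, new_count: int) -> Tuple[float, int]:
    """Fold new ratings into the stored average without re-reading old ones."""
    total = num_rating + new_count
    if total == 0:
        return 0.0, 0
    return (avg_rating * num_rating + new_sum) / total, total
```

For example, a restaurant stored with avg_rating 4.0 over 10 ratings that receives two new ratings summing to 9 ends up with 49 / 12, about 4.08, over 12 ratings.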



9. User should be able to search restaurants and menu items.

For this we already have a Search Service which uses Elasticsearch. Restaurant Service already pushes the data there on onboarding and on updates.



10. User should be able to see the top rated nearby restaurants on home page:

This is a little tricky, and we will deal with it properly while handling the NFRs. For now, all the required data is available in the Restaurant Service DB, so we can fetch it from there and show it to the user.



Step 3: Mapping non-functional requirements to architectural diagram:

1. Scalability:

To achieve scalability, we need to run multiple instances of each service and also keep replicas of the DBs.

2. Availability:

We have already achieved this by running multiple instances of each service and keeping replicas of the DBs.

3. Performance:

a. Search < 200 ms

For search performance we are already using Elasticsearch, which should serve the purpose. If required, we can add additional details like rating and state to the Elasticsearch documents.

b. Home page load < 500 ms

Here we need to show the top rated restaurants near the user's location, but we can't do this efficiently yet because we only store raw latitude and longitude, so finding nearby restaurants would mean comparing the user's location against every restaurant.

We can use Geohash here to divide the earth into geo cells. The geo cell id of each restaurant's location will be stored in the Restaurant DB. That means we are going to add a new table, RestaurantLocation, to the Restaurant DB. Here is the schema:

  • geo_cell_id
  • restaurant_id
  • restaurant_name
  • restaurant_rating

The circle we draw around the user's location will typically cover around 3 or 4 geo cells. Here is how the flow looks for getting the nearby restaurants:

  • Get the geo cell id for the user's location.
  • Get all the geo cell ids which are covered by a circle of 2 km radius.
  • The query to the DB will then be something like SELECT * FROM RestaurantLocation WHERE geo_cell_id IN (list of geo cells)

To make this query efficient, we can partition this table based on geo_cell_id. 

If that is still not fast enough, we can cache the top restaurants per geo_cell_id in Redis.
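Here is a rough sketch of the geo cell lookup described above. The encoder follows the standard Geohash bit interleaving, but the rest is simplified on purpose: a dictionary stands in for the partitioned RestaurantLocation table, the function names are mine, and the covering cells are found by encoding the center and the four corners of the circle's bounding box, which is only complete while the box is no larger than one cell (true for a 2 km radius at precision 5 near the equator, less so at high latitudes).

```python
import math
from collections import defaultdict
from typing import Dict, List, Set

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 5) -> str:
    """Standard Geohash: interleave longitude and latitude bits, 5 bits per character."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, value = 0, 0
    use_lon = True  # Geohash starts with a longitude bit
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = 1 if lon >= mid else 0
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = 1 if lat >= mid else 0
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = value * 2 + bit
        bits += 1
        use_lon = not use_lon
        if bits == 5:
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def cells_for_radius(lat: float, lon: float, radius_km: float,
                     precision: int = 5) -> Set[str]:
    """Approximate cover: cells of the center and the bounding-box corners."""
    dlat = radius_km / 111.0  # roughly 111 km per degree of latitude
    dlon = radius_km / (111.0 * math.cos(math.radians(lat)))
    points = [(lat, lon)] + [
        (lat + sy * dlat, lon + sx * dlon) for sy in (-1, 1) for sx in (-1, 1)
    ]
    return {geohash_encode(a, b, precision) for a, b in points}


def build_index(restaurants: List[dict], precision: int = 5) -> Dict[str, List[dict]]:
    """Stand-in for the RestaurantLocation table, keyed by geo_cell_id."""
    index: Dict[str, List[dict]] = defaultdict(list)
    for r in restaurants:
        cell = geohash_encode(r["lat"], r["lon"], precision)
        index[cell].append({
            "geo_cell_id": cell,
            "restaurant_id": r["restaurant_id"],
            "restaurant_name": r["restaurant_name"],
            "restaurant_rating": r["restaurant_rating"],
        })
    return index


def nearby_top_rated(index: Dict[str, List[dict]], lat: float, lon: float,
                     radius_km: float = 2.0, limit: int = 10) -> List[dict]:
    """Equivalent of: SELECT ... WHERE geo_cell_id IN (cells) ORDER BY rating DESC."""
    rows = [row for cell in cells_for_radius(lat, lon, radius_km)
            for row in index.get(cell, [])]
    return sorted(rows, key=lambda r: r["restaurant_rating"], reverse=True)[:limit]
```

Note that the cells cover more area than the circle itself, so the result is a superset; an exact distance filter can be applied afterwards if the 2 km limit has to be strict.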

* For matching Delivery Partners we will use the same geohash approach, and a column "geo_hash_id" will be added to the Location DB, just like we did in the design of Uber.

c. Menu load < 500 ms

There are four tables, Restaurants, RestaurantMenu, RestaurantMenuItem and MenuItem, which we need to join to get the whole menu, and that makes loading a menu inefficient. What we can do is maintain another table, RestaurantMenuCache, which will just have the following fields:

  • restaurant_id
  • menu_blob
  • last_updated_at
We can partition this table by restaurant_id, using range-based partitioning, to make lookups more efficient.
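A minimal sketch of the RestaurantMenuCache idea, using in-memory SQLite and a JSON string as the menu_blob. The simplified column lists, the JSON layout and the function names are assumptions for illustration; the point is that the join runs once per menu change, while reads become a single primary-key lookup.

```python
import json
import sqlite3
from typing import List, Optional

SCHEMA = """
CREATE TABLE RestaurantMenu (id INTEGER PRIMARY KEY, restaurant_id INTEGER, name TEXT);
CREATE TABLE MenuItem (id INTEGER PRIMARY KEY, name TEXT, description TEXT);
CREATE TABLE RestaurantMenuItem (
    id INTEGER PRIMARY KEY, menu_id INTEGER, item_id INTEGER,
    price REAL, is_available INTEGER
);
CREATE TABLE RestaurantMenuCache (
    restaurant_id INTEGER PRIMARY KEY, menu_blob TEXT, last_updated_at INTEGER
);
"""


def rebuild_menu_cache(conn: sqlite3.Connection, restaurant_id: int, now: int) -> None:
    """Run the expensive join once and store the result as a single blob."""
    rows = conn.execute(
        """
        SELECT mi.name, mi.description, rmi.price, rmi.is_available
        FROM RestaurantMenu rm
        JOIN RestaurantMenuItem rmi ON rmi.menu_id = rm.id
        JOIN MenuItem mi ON mi.id = rmi.item_id
        WHERE rm.restaurant_id = ?
        ORDER BY mi.name
        """,
        (restaurant_id,),
    ).fetchall()
    blob = json.dumps([
        {"name": name, "description": desc, "price": price, "available": bool(avail)}
        for name, desc, price, avail in rows
    ])
    with conn:  # commit the refreshed blob atomically
        conn.execute(
            "INSERT OR REPLACE INTO RestaurantMenuCache VALUES (?, ?, ?)",
            (restaurant_id, blob, now),
        )


def load_menu(conn: sqlite3.Connection, restaurant_id: int) -> Optional[List[dict]]:
    """Read path: one primary-key lookup, no joins."""
    row = conn.execute(
        "SELECT menu_blob FROM RestaurantMenuCache WHERE restaurant_id = ?",
        (restaurant_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Since we prefer consistency, the blob should be rebuilt as part of every menu or price update rather than on a timer, so users never see an older menu.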

4. CP over AP:

We are already using SQL DBs wherever consistency is required.


In this post, we methodically designed a scalable, consistent, and performant food delivery application from scratch. We identified critical services, database designs, WebSocket communication for real-time updates, and employed techniques like Geohashing for efficient geo-based queries.

We also mapped our non-functional requirements (NFRs) like scalability, performance, availability, and consistency across the architecture, ensuring that the system remains robust even at 100M users and 10M daily orders scale.

This design is a strong foundation — in a real-world scenario, we can extend it further by adding observability, security layers, fallback mechanisms, and cost optimizations.