Friday, April 23, 2021

Distributed Cloud Storage System Design

Problem: Design a distributed cloud storage system like Amazon S3 or Microsoft Azure Blob Storage. We mainly need to take care of the following requirements:

  1. Durability
  2. Availability
  3. Multitenancy
  4. Scalability

Design: Instead of jumping directly to the final design, let's start with a very basic setup: there are clients and just one server, which exposes the APIs and also writes the files/data to its hard disk. Our initial design looks like this:



This will work, but it has huge scalability problems:

  1. A single server can serve only a certain number of requests.
  2. The number of I/O operations is limited.
  3. Parallelization is limited; otherwise the HDD will choke.

To solve this, we can scale the servers horizontally. Basically, we can have multiple replicas of the same server, so our design now looks like:


This solves some scalability problems, as we can now handle more requests and store more data. But here each user's data lives on one particular server, while the client request goes to a load balancer, and the load balancer has to redirect the request to the correct server.

How will the load balancer know which server to go to? We could use sticky sessions or hash-based load balancing, but sticky sessions have their own problems. To resolve this, we can separate the API servers from the DB/HDD servers and introduce a metadata service that records which file is uploaded to which server. With this, here is how our design looks:



Where do we go from here? There are still problems in this design: if the Metadata Service is down, or one of the DB servers is down, we can't achieve our target availability.

That means we need to horizontally scale our metadata servers, and we also need replicas of the DBs. An interesting point here is that we want to back up the DBs to different regions, so that if one region is down, clients' requests can still be served. While replicating the files/data to different regions, we need strong consistency; we can't depend on an asynchronous process. What if data is written to the primary server, but the primary crashes before the data is written to the replica servers? The whole file would be lost, and our durability target couldn't be achieved. Here is how the design looks with these modifications:

Now, how will DB replication happen? The DB server itself can initiate replication to other DBs, but how will it know where to replicate the data? And how will the API server know which DB server to hit when the primary DB is down? We can give all of these responsibilities to our Metadata Service. To scale, we just need to add more API servers and DBs.

Now, with this understanding, let's jump to the actual design. Let's first understand an important component of this design, the Cluster Manager. This component manages the whole cluster / data center. Here are its responsibilities:

  1. Account Management
  2. Disaster Recovery
  3. Resource tracking
  4. Policies, Authentication and Authorization

Let's have a look at what happens when a new account is created:

  1. User sends a create-account request to the Cluster Manager.
  2. Cluster Manager collects the necessary info and makes DB entries.
  3. Cluster Manager creates a URL for the root folder and makes a DNS entry mapping that URL to the IP address of the load balancer, for further requests.
  4. Cluster Manager returns the root-folder URL to the user/client.

Here is how the flow looks:
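The four steps above can be sketched as below. All names here (ClusterManager, create_account, the example domain) are illustrative assumptions, not a real API; the dicts stand in for the real DB and DNS:

```python
import uuid

class ClusterManager:
    def __init__(self, load_balancer_ip):
        self.load_balancer_ip = load_balancer_ip
        self.accounts = {}   # stands in for the Cluster Manager DB
        self.dns = {}        # stands in for real DNS entries

    def create_account(self, user_name):
        account_id = uuid.uuid4().hex
        self.accounts[account_id] = {"user": user_name}     # step 2: DB entry
        root_url = account_id + ".storage.example.com"      # step 3: root-folder URL
        self.dns[root_url] = self.load_balancer_ip          # step 3: DNS -> load balancer IP
        return root_url                                     # step 4: returned to the client

cm = ClusterManager("10.0.0.1")
root_url = cm.create_account("alice")
```

From here on, the client resolves its root-folder URL through DNS and always lands on the load balancer.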

Now let's go to the next step: reading and writing files. Let's look at the complete design:

Before double-clicking into every component, let's understand the flow of, say, writing a file:

  • Client gets the IP of the load balancer from DNS using the root-folder URL.
  • Client sends the write-file request to the load balancer.
  • Load balancer redirects the request to one of the API servers.
  • API Server first checks the auth-related info using the Cluster Manager DB.
  • API Server gets the hash number using the file ID.
  • API Server queries the cache to find out, using the hash number, which partition server we should write the file to.
  • Basically, we can use range-based distribution: say hashes 0-50 go to the 1st partition server, 51-100 to the 2nd partition server, and so on.
  • API Server redirects the request to the partition server.
  • Every partition server has a stream server attached. The partition server forwards the request to the stream server to actually write the content.
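A minimal sketch of that hash-plus-range lookup. The bucket count, range boundaries and server names are assumptions for illustration; the important property is that the same file ID always routes to the same partition server:

```python
import hashlib

# (hash-range upper bound, partition server) pairs, in ascending order of bound
PARTITION_MAP = [(50, "partition-server-1"), (100, "partition-server-2")]

def hash_bucket(file_id, buckets=100):
    # stable hash: the same file id always maps to the same bucket
    digest = hashlib.sha256(file_id.encode()).hexdigest()
    return int(digest, 16) % buckets

def partition_for(file_id):
    bucket = hash_bucket(file_id)
    for upper, server in PARTITION_MAP:
        if bucket < upper:
            return server
    return PARTITION_MAP[-1][1]
```

In a real system this map would live in the cache the API server queries, and rebalancing would move range boundaries rather than rehash everything.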

That's all about the write flow; the read flow is similar. Let's double-click into what a stream server is. A stream server is a layer over a linked list of file servers where only the head is empty (i.e., writable). The file servers are, say, a layer over hard disks. All of this is maintained by the Stream Manager. The Stream Manager can also add more stream servers and notify the Partition Manager to add more partition servers to attach to these stream servers.

The Stream Manager also holds the info about which file server within a stream server a particular file exists on. The Stream Manager can also tell a stream server where to replicate the files.

Having a Partition Manager brings many advantages:

  • Parallelization
  • Merging multiple identical read requests into one request
  • Caching etc.

At the API server we can also maintain a cache, so that if the same read request comes in multiple times, it can be served straight from the cache as an optimization.

Wednesday, April 21, 2021

Web Crawler System Design

Problem: Design a distributed web crawler to crawl billions of pages.

Design: Here are our main architecture goals:

  1. Scalability
  2. Extensibility
  3. Freshness of pages
  4. Priority
  5. Politeness

Here is how our design looks:



Let's jump into every component to see the details:

1. URL Provider: The URL Provider provides URLs to the URL Fetcher workers based on priority and politeness. Here, politeness means how frequently we should hit and fetch pages from a specific domain. Here is how we can implement it:


Let's see the flow:

  • The Prioritizer prioritizes the URLs to be crawled. Priority could be based on the freshness of the page, the importance of the URL, or how frequently the site is updated (like news sites), and each URL is given a priority within some range, say [1...n].
  • There are n priority queues, and the Prioritizer puts each URL into its priority queue: a priority-1 URL goes into priority queue 1, and so on. This handles the priority requirement.
  • The Polite Queue Router fetches URLs from the highest priority queue first, then the second highest, and so on, and puts the URLs into the polite queues. Note that it puts one domain's URLs into exactly one polite queue; this is what ensures politeness. To do this it maintains a domain-to-queue-number mapping.
  • The number of polite queues equals the number of URL Fetcher workers.
  • Whenever a URL Fetcher asks for a URL, the Queue URL Provider serves URLs one by one from the polite queues. (If needed, we can use a heap in case the fetch times of URLs vary widely.)
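The routing above can be sketched as a toy in-memory model. Queue counts and URLs are illustrative, priority 0 stands for the highest priority, and real crawlers would add per-domain delay timers on top of this:

```python
from collections import deque
from urllib.parse import urlparse

N_PRIORITIES = 2        # the doc's [1...n]; here 0 is the highest priority
N_POLITE_QUEUES = 2     # equals the number of URL Fetcher workers

priority_queues = [deque() for _ in range(N_PRIORITIES)]
polite_queues = [deque() for _ in range(N_POLITE_QUEUES)]
domain_to_queue = {}    # politeness: one domain maps to exactly one polite queue

def prioritize(url, priority):
    priority_queues[priority].append(url)

def route_to_polite_queues():
    for pq in priority_queues:               # drain the highest priority first
        while pq:
            url = pq.popleft()
            domain = urlparse(url).netloc
            if domain not in domain_to_queue:
                domain_to_queue[domain] = len(domain_to_queue) % N_POLITE_QUEUES
            polite_queues[domain_to_queue[domain]].append(url)

prioritize("https://a.com/2", 0)
prioritize("https://b.com/1", 0)
prioritize("https://a.com/1", 1)
route_to_polite_queues()
```

After routing, both a.com URLs sit in the same polite queue (in priority order), so a single fetcher worker paces all requests to that domain.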


2. URL Fetcher: These are workers/threads which get a URL from the URL Provider and fetch the content of that URL. Each can use its own DNS resolver to resolve IP addresses faster.

Once a fetcher has the content, it stores it in the DB and also puts it in the cache so that the next components can get the content quickly.

Once the content is in the cache, it pushes the URL into the queue.

3. URL Extractor: The URL Extractor gets the URL, retrieves its content from the cache, and extracts all the unique URLs from the content.

4. URL Filter: There could be one more component here, the URL Filter, which filters out the URLs we are not interested in. Say we are only interested in images; then it can filter out all URLs which do not point to an image.

5. URL Loader: The URL Loader gets the list of URLs from the URL Extractor / URL Filter and may store them in the DB. It also removes, from the input list, the URLs which have already been crawled. It then sends the filtered list of URLs to the URL Provider.
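The "already crawled?" check can be sketched with a plain set. At billions of URLs a Bloom filter (probabilistic, memory-bounded) is the usual substitute; the names here are illustrative:

```python
crawled = set()   # stand-in for a DB lookup or a Bloom filter

def load_urls(extracted_urls):
    # keep only URLs we have never seen before
    fresh = [u for u in extracted_urls if u not in crawled]
    crawled.update(fresh)   # remember them so they are never re-queued
    return fresh            # only these go back to the URL Provider
```

Note a Bloom filter can report false positives (skipping a URL that was never crawled), which is usually an acceptable trade-off for a crawler.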

That's all about the system design of web crawler.

Google Question: Google Docs System Design

Problem: Design a large-scale collaborative editor like Google Docs, OneNote, Word Online, etc.

Design: Before going into the details of components, services and DBs, we are going to address the key question: how do we solve the concurrency issue? The first thing that comes to mind is "locks", but locking introduces too much delay. An editor with a locking mechanism is not even usable once 5 (maybe fewer) or more people are working on the same document.

So far we have established that we need a lock-free, optimistic concurrency control mechanism here. There are two approaches:

  1. Operational Transformation: The client sends raw events (operations) character by character to the server (like: character M inserted at position 3, character O deleted from position 10). This is also called operation-based sync.
  2. Differential Sync: The client can send the delta (difference) between server version and local copy. The server will not know of the exact operation performed at the client in this case.

Let's go in depth of every approach:

Operational Transformation (OT): Let's understand the problem first using an example:



In this scenario, the users start with an initial string “abc”. User1 intends to insert the character “Y” at index 2, and User2 intends to insert the character “Z” at index 2. Both users execute the operation locally, resulting in the strings “abYc” and “abZc” respectively. After local execution, they both send their operations to each other and execute them straight away, subsequently resulting in two different strings: “abZYc” and “abYZc”.

This problem illustrates how naively sending operations between clients will result in inconsistent states between users. The solution to this problem is to transform both the operations with respect to each other, and then apply it to their counterparts.

Now that we have understood the problem, let's understand how OT solves this synchronization problem. Here is the mathematical function:

transform(a, b) = (a', b')  such that  b' ∘ a ≡ a' ∘ b

The math form is a little tough to understand, but it is best explained graphically using the OT diamond illustration:



Two clients perform two operations simultaneously (operations a and b) at their respective ends. These two operations can be fed to a transform function that generates two new operations, a’ and b’.

When b’ is applied to a on one client end, and a’ is applied to b on another client end, both the clients will come to the same consistent state. Let's use the above example to understand the approach better:

Our original string was abc (when everything was in sync).

  • User 1 performed operation a (insert Y at index 2), resulting in the state: abYc.
  • User 2 performed operation b (insert Z at index 2), resulting in the state: abZc.

Note that a timestamp is maintained for both operations, to stick to the laws of optimistic locking. Based on the timestamps we know that operation a happened before operation b.

When we pass both these operations and their order to the transform function, it produces two new operations:

  • operation a’: insert Y at index 2
  • operation b’: insert Z at index 3

User 1: Apply b’ after applying a. User 1 first inserts Y at index 2 into the original string (abc), and then inserts Z at index 3 into the intermediate string (abYc), getting the resultant string abYZc.
User 2: Apply a’ after applying b. User 2 first inserts Z at index 2 into the original string (abc), and then Y at index 2 into the intermediate string (abZc), getting the resultant string abYZc.

You can see that after applying the transformed operations, both clients arrive at the same consistent state.
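The worked example can be sketched in code. This is a toy transform for the insert-insert case only; real OT also handles deletes and uses site IDs or timestamps to break ties. Here operation a is assumed to have happened first, so it wins the tie at equal indices:

```python
def transform(a, b):
    """a happened before b; both are ('insert', char, index) against the
    same base string. Returns (a', b') to apply on the opposite sides."""
    _, ca, ia = a
    _, cb, ib = b
    ia2, ib2 = ia, ib
    if ib >= ia:          # b's target shifts right past a's insertion
        ib2 += 1
    else:                 # a's target shifts right past b's insertion
        ia2 += 1
    return ("insert", ca, ia2), ("insert", cb, ib2)

def apply_op(s, op):
    _, ch, i = op
    return s[:i] + ch + s[i:]

a = ("insert", "Y", 2)    # user 1's operation (earlier timestamp)
b = ("insert", "Z", 2)    # user 2's concurrent operation
a2, b2 = transform(a, b)
```

Applying b' after a on one side, and a' after b on the other, lands both users on the same string.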

The above diamond problem is pretty much the only thing the OT algorithm solves at the fundamental level: given the two upper sides of the diamond, OT can transform the operations with respect to each other and complete the remaining bottom two sides. At scale, though, different problems arise. Let's add just one more concurrent operation and consider the client-server architecture. The server is the single source of truth; clients edit the document concurrently and send their changes to the server to sync.

Consider a case when a client performs two consecutive operations a and b, and sends them one by one to the server. The server, meanwhile, got operation c from another client working concurrently elsewhere.

So now the server has to first merge a and c, because those two operations branched from the last synced server state. The server can apply the transformation to operations a and c and figure out a’. It can then apply a’ on top of the changes from c and get a consistent state. So far so good.

Ideally, at this point, the server should send c’ to the client to make the client state consistent. But just when it is about to do so, operation b arrives and the server is in a fix: it has nothing to transform operation b against, because b was derived from a, not from any server-synced state. To fix this, the server uses a technique called bridging.

To incorporate the changes of operation b on the server side, we need to somehow find a way to transform b, but we only have one side of the diamond. This is where the server creates a bridge to form the other side of the diamond (shown by the dotted green line).

As it turns out, the bridge is nothing but the transformation of operation c against a, which the server knows how to perform because a and c are both derived from the same last-synced server state. Once the bridge is established, the upper two sides of the diamond are complete for operation b, so the server can compute b’ (the lower side of the diamond), apply it to its local copy (the one formed after applying a’ earlier), and make everything consistent at its end. The server has now applied operations c, a and b successfully.

However, the client is not yet in sync. It needs an operation it can apply to its own state to make its end consistent. This is achieved as shown below:


The server transforms c against the composition of a and b, producing c’. It sends c’ back to the client, which applies it to its own state and becomes consistent.

Although the approach demonstrated above works functionally for any number of client and server operations and makes everything consistent, it has huge scalability issues, mainly because the server is required to do all the heavy lifting: the server performs all the transformations. The server has to create these bridges, and for that it has to hold on to every intermediate operation. In the above example, after transforming a and c against each other and getting a’, the server cannot forget c’, because c’ could be needed to create further bridges (as in the case of operation b's bridge).

This sounds like a trivial problem initially, but imagine hundreds of operations pumping in from a client and hundreds of unsynchronized operations pending on the server end. The server would have to remember all the intermediate bottom-half OT operations for bridging, which is practically impossible.

To solve this problem we can follow the rules given below:

  1. Every operation the client sends to the server must be branched out from a known state in the server's history; otherwise, the server will discard the operation. So operation b in the above example would be discarded, because it was not branched from a known server state.
  2. The client will only send one operation at a time, and wait for the acknowledgement from the server before sending the next one.
  3. The server will broadcast every new change it applies to its own state to all the clients, as soon as it applies it.

Of course, simply rejecting divergence would make the whole collaborative system all but useless. So the server offloads all the heavy lifting to the clients: we don't want the server to maintain intermediate states for thousands (or millions) of clients. The server offloads all the compound OT work to the client and only performs simple OT, in which there is never any need to remember an intermediate state.

The client keeps track of its own local unsynced operations and also listens for new server operations that happened after its own last synced version. The client has to keep track of just one server, since there is never going to be more than one (logical) server. This simplifies things from the server's perspective, but the client now has to do double the work, as illustrated below:


  • To begin with, the client performs an operation a from the last synced server state. It starts building the bridge, which it keeps until the operation is synced with the server. It sends operation a to the server, and waits.
  • The client performs another operation (b) locally, but as the rules say, it must not send it to the server until it receives acknowledgment of operation a. So it keeps operation b in a buffer.
  • Meanwhile, at the server end, two more operations (c and d) were already pending to be synced before a. They happened before a, as validated by their timestamps. The server applies c to its local copy, creating a new state, and then broadcasts c to all clients.
  • Our client receives operation c and transforms it against its two local operations (a and b). It generates c’, which it adds to the bridge.
  • The client performs another operation, e, locally. It adds this operation to the bridge and also adds it to the buffer behind b.
  • Meanwhile, the server applies operation d at its end (on top of c) and broadcasts it to all the clients. Our client receives the message, transforms d against the composition of all its buffered operations (a, b and e), and generates d’. It adds d’ to the bridge.
  • Finally, the server picks up operation a from its queue and transforms it against operations c and d (these two operations branched out from the same parent as operation a), generating operation a’. It applies a’ to its local state, bringing itself in sync, and broadcasts it to every client.
  • Our client receives operation a’. It knows this is its own operation because the client generates a unique ID for every operation, which the server returns after transformation and broadcasting.
  • Now the client picks the next operation from the buffer (operation b), transforms it with respect to a’ (since that is now the latest server state), sends it to the server for sync, and so on.
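The one-operation-in-flight discipline from the walk-through can be sketched as below. The OT transformations against incoming server operations are deliberately omitted; this only shows the send/buffer/ack mechanics, with illustrative names:

```python
from collections import deque

class Client:
    def __init__(self, send):
        self.send = send        # callback that ships an operation to the server
        self.in_flight = None   # at most one unacknowledged operation (rule 2)
        self.buffer = deque()   # local operations waiting for their turn

    def local_edit(self, op):
        if self.in_flight is None:
            self.in_flight = op
            self.send(op)
        else:
            self.buffer.append(op)   # hold until the server acks the in-flight op

    def on_ack(self):
        # the server acknowledged the in-flight operation; send the next one
        self.in_flight = None
        if self.buffer:
            self.local_edit(self.buffer.popleft())

sent = []
c = Client(sent.append)
c.local_edit("a")
c.local_edit("b")   # buffered: "a" is still unacknowledged
c.on_ack()          # now "b" goes out
```

In a full client, `on_ack` would first transform the buffered operations against whatever server operations arrived in the meantime (the bridge), before sending the next one.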

Hopefully the whole OT mechanism is clear now; let's move on to the second mechanism.

Differential Synchronization: Differential synchronization is a symmetrical algorithm employing an unending cycle of background difference (diff) and patch operations. Below is an idealized data-flow diagram for differential synchronization. It assumes two documents (say, Client Text and Server Text) located on the same computer with no network.



The following walk-through starts with Client Text, Common Shadow and Server Text all being equal. Client Text and Server Text may be edited at any time. The goal is to keep these two texts as close to each other as possible at all times.
  1. Client Text is diffed against the Common Shadow.
  2. This returns a list of edits which have been performed on Client Text.
  3. Client Text is copied over to Common Shadow. This copy must be identical to the value of Client Text in step 1, so in a multi-threaded environment a snapshot of the text should have been taken.
  4. The edits are applied to Server Text on a best-effort basis.
  5. Server Text is updated with the result of the patch. Steps 4 and 5 must be atomic, but they do not have to be blocking; they may be repeated until Server Text stays still long enough.
The process now repeats symmetrically in the other direction. This time the Common Shadow is the same as Client Text was in the previous half of the synchronization, so the resulting diff will return modifications made to Server Text, not the result of the patch in step 5.

Here's an example of actual data flow.

a. Client Text, Common Shadow and Server Text start out with the same string: "Macs had the original point and click UI."
b. Client Text is edited (by the user) to say: "Macintoshes had the original point and click interface." (edits underlined)
c. The Diff in step 1 returns the following two edits:

@@ -1,11 +1,18 @@
 Mac
+intoshe
 s had th
@@ -35,7 +42,14 @@
 ick 
-UI
+interface
 .

d. Common Shadow is updated to also say: "Macintoshes had the original point and click interface."
e. Meanwhile Server Text has been edited (by another user) to say: "Smith & Wesson had the original point and click UI." (edits underlined)
f. In step 4 both edits are patched onto Server Text. The first edit fails since the context has changed too much to insert "intoshe" anywhere meaningful. The second edit succeeds perfectly since the context matches.
g. Step 5 results in a Server Text which says: "Smith & Wesson had the original point and click interface."
h. Now the reverse process starts. First the Diff compares Server Text with Common Shadow and returns the following edit:

@@ -1,15 +1,18 @@
-Macintoshes
+Smith & Wesson
 had

i. Finally this patch is applied to Client Text, thus backing out the failed "Macs" -> "Macintoshes" edit and replacing it with "Smith & Wesson". The "UI" -> "interface" edit is left untouched. Any changes which have been made to Client Text in the mean time will be patched around and incorporated into the next synchronization cycle.

The method described above is the simplest form of differential synchronization, but it will not work on client-server systems since the Common Shadow is, well, common. In order to execute on two systems, the shadow needs to be split in two and updated separately. Conceptually this is the same algorithm.

Client Text and Server Shadow (or symmetrically Server Text and Client Shadow) must be absolutely identical after every half of the synchronization. This should be the case since "(v1 Diff v2) Patch v1 == v2". Thus assuming the system starts in a consistent state, it should remain in a consistent state. Note that the patches on the shadows should fit perfectly, thus they may be fragile patches, whereas the patches on the texts are best-effort fuzzy patches.
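The "(v1 Diff v2) Patch v1 == v2" invariant can be demonstrated with a toy half-cycle. This sketch assumes no concurrent server edit, so a plain index-based patch applies cleanly; real implementations (e.g. the diff-match-patch library) use fuzzy, context-aware patching for the best-effort side:

```python
import difflib

def make_patch(old, new):
    sm = difflib.SequenceMatcher(a=old, b=new)
    # keep only the changed spans: (old_start, old_end, replacement_text)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]

def apply_patch(text, patch):
    # apply right-to-left so earlier offsets stay valid
    for i1, i2, repl in sorted(patch, reverse=True):
        text = text[:i1] + repl + text[i2:]
    return text

client_text = "Macintoshes had the original point and click interface."
shadow = server_text = "Macs had the original point and click UI."

edits = make_patch(shadow, client_text)         # steps 1-2: diff against the shadow
shadow = client_text                            # step 3: copy client text over the shadow
server_text = apply_patch(server_text, edits)   # steps 4-5: patch the server text
```

Because the shadow equals the pre-edit server text here, the patch is exact; if another user had edited Server Text concurrently, the raw offsets would land in the wrong place, which is exactly why fuzzy patching is needed on the text side.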


However, on a network with best-effort delivery, nothing is guaranteed. Therefore a simple checksum of Client Shadow ought to be sent along with the Edits and compared to Server Shadow after the patches have been applied. If the checksum fails to match, then something went wrong and one side or the other must transmit the whole body of the text to get the two parties back in sync. This will result in data loss equal to one synchronization cycle.

This will work but what will happen in case of network failure.  In this case the client might stop synchronizing for a while until the connection times out. When the connection is restored on the following synchronization, the shadows will be out of sync which requires a transmission of the full text to get back in sync. This will destroy all changes since the previous successful synchronization. If this form of data-loss is unacceptable, a further refinement adds guaranteed delivery.



In a nutshell: Normal operation works identically to the Dual System Method described above. However in the case of packet loss, the edits are queued up in a stack and are retransmitted to the remote party on every sync until the remote party returns an acknowledgment of receipt. The server keeps two copies of the shadow, "Server Shadow" is the most up to date copy, and "Backup Shadow" is the previous version for use in the event that the previous transmission was not received.

  • Normal operation: Client Text is changed by the user. A Diff is computed between Client Text and Client Shadow to obtain a set of edits which were made by the user. These edits are tagged with a client version number ('n') relating to the version of Client Shadow they were created from. Client Shadow is updated to reflect the current value of Client Text, and the client version number is incremented. The edits are sent to the server along with the client's acknowledgment of the current server version number ('m') from the previous connection. The server's Server Shadow should match both the provided client version number and the provided server version number. The server patches the edits onto Server Shadow, increments the client version number of Server Shadow and takes a backup of Server Shadow into Backup Shadow. Finally the server then patches the edits onto Server Text. The process then repeats symmetrically from the server to the client, with the exception that the client doesn't take a backup shadow. During the return communication the server will inform the client that it received the edits for version 'n', whereupon the client will delete edits 'n' from the stack of edits to send.
  • Duplicate packet: The client appears to send Edits 'n' to the server twice. The first communication is processed normally and the response sent. Server Shadow's 'n' is incremented. The second communication contains an 'n' smaller than the 'n' recorded on Server Shadow. The server has no interest in edits it has already processed, so does nothing and sends back a normal response.
  • Lost outbound packet: The client sends Edits 'n' to the server. The server never receives it. The server never acknowledges receipt of the edit. The client leaves the edits in the outbound stack. After the connection times out, the client takes another diff, updates the 'n' again, and sends both sets of edits to the server. The stack of edits transmitted keeps increasing until the server eventually responds with acknowledgment that it got a certain version.
  • Lost return packet: The client sends Edits 'n' to the server. The server receives it, but the response is lost. The client leaves the edits in the outbound stack. After the connection times out, the client takes another diff, updates the 'n' again, and sends both sets of edits to the server. The server observes that the server version number 'm' which the client is sending does not match the server version number on Server Shadow. But both server and client version numbers do match the Backup Shadow. This indicates that the previous response must have been lost. Therefore the server deletes its edit stack and copies the Backup Shadow into Shadow Text (step 4). The server throws away the first edit because it already processed (same as a duplicate packet). The normal workflow is restored: the server applies the second edit, then computes and transmits a fresh diff to the client.
  • Out of order packet: The server appears to lose a packet, and one (or both) of the lost packet scenarios is played out. Then the lost packet arrives, and the duplicate packet scenario is played out.
  • Data corruption in memory or network: There are too many potential failure points to list, however if the shadow checksums become out of sync, or one side's version number skips into the future, the system will reinitialize itself. This will result in data loss for one side, but it will never result in an infinite loop of polling.
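The duplicate-packet rule can be sketched with version numbers alone. The server state and edit payloads are stubbed out; a real implementation would also patch the shadow, take the backup, and send a return diff:

```python
class Server:
    def __init__(self):
        self.expected_n = 0   # the client version number we expect next
        self.applied = []     # stand-in for edits patched onto Server Shadow

    def receive(self, n, edits):
        if n < self.expected_n:        # already processed: a duplicate packet
            return "ignored"
        self.applied.append(edits)
        self.expected_n = n + 1        # increment the client version number
        return "applied"

s = Server()
s.receive(0, "edit-A")
s.receive(0, "edit-A")   # retransmission of the same version is harmless
s.receive(1, "edit-B")
```

The same version counters drive the lost-packet cases: the client keeps retransmitting its stack until the server's acknowledged version catches up.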

Now that we have understood both approaches, let's go for the system design based on OT (Google Docs uses OT):

One thing to notice here is that this is real-time: clients need to send data to the server, and the server also needs to push data to clients, so we are going to use WebSockets here, like we did in our WhatsApp design. Obviously, given the load, we will need a number of servers. We use an API Gateway instead of a load balancer because one request may result in multiple API calls; we don't want the client to make all those calls itself. Instead, the client just hits the API Gateway and gets all the results it needs in one call. For now, our system looks like below:


We can safely assume that whenever a client loads a document, a WebSocket connection is established. Assume the server stores a [user id] to [server id, connection id] mapping in a cache, and also stores a doc-id to list-of-user-ids mapping in the cache. Here is how our Session Server looks:



Our Session Server mainly has three services:

  1. Document Update Service: This applies the OT and saves the document. It also stores the version history in the DB. It then calls the Group Operation Service below with the operation.
  2. Group Operation Service: This service first gets the list of active user-ids for the given document id, and then either queues the operation and user id, or calls the Send Operation Service below asynchronously for all the user ids, passing the operation.
  3. Send Operation Service: This gets the server id and connection id using the user id, and then passes the operation to the WebSocket. The WebSocket passes the operation to the client.
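A minimal sketch of how the three services hand off an operation. All names and the cache layout are illustrative assumptions from the text above; the OT step and DB writes are stubbed out:

```python
doc_users = {"doc-1": ["u1", "u2"]}              # doc-id -> active user ids (cache)
connections = {"u1": ("server-A", "conn-7"),     # user id -> (server id, connection id)
               "u2": ("server-B", "conn-3")}
delivered = []

def send_operation(user_id, op):                 # 3. Send Operation Service
    server_id, conn_id = connections[user_id]
    delivered.append((server_id, conn_id, op))   # stand-in for the WebSocket push

def group_operation(doc_id, op):                 # 2. Group Operation Service
    for user_id in doc_users[doc_id]:            # fan out to every active user
        send_operation(user_id, op)

def document_update(doc_id, op):                 # 1. Document Update Service
    # a real service would apply OT and persist the version history first
    group_operation(doc_id, op)

document_update("doc-1", "insert Y at index 2")
```

Note the operation is sent back to its author too, which matches the earlier point that the server returns each client's own operations (tagged with their unique IDs) after transformation.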

Other than the main Session Service, there could be more services like:

  1. Import / Export
  2. Comments (we have to use a NoSQL DB here, as comments data could be large)
  3. Permissions
That's all about the system design of Google Docs or any collaborative editor. Please put your thoughts in the comments section if you see improvements here or have any queries.

Monday, April 12, 2021

[LeetCode] Unique Binary Search Trees II

Problem: Given an integer n, return all the structurally unique BSTs (binary search trees) which have exactly n nodes with unique values from 1 to n. Return the answer in any order.

Example:

Input: n = 3
Output: [[1,null,2,null,3],[1,null,3,2],[2,1,3],[3,1,null,null,2],[3,2,null,1]]
Input: n = 1
Output: [[1]]

Constraints:

  • 1 <= n <= 8


Approach: The intuition is to take each number i from 1...n as the root. Because we need to create a BST, we know the left subtree will have the numbers 1...i - 1 and the right subtree will have i + 1...n. With this intuition, here is our algorithm:

  • DEFINE generate_trees (start, end)
    • result = new list()
    • IF start > end
      • RETURN new List {null}
    • FOR i = start TO end
      • leftTrees = generate_trees (start, i - 1) // recursively make all left subtrees
      • rightTrees = generate_trees (i + 1, end) // recursively make all right subtrees
      • FOR j = 0 TO leftTrees.Length - 1
        • FOR k = 0 TO rightTrees.Length - 1
          • node = new Node(i)
          • node.left = leftTrees[j]
          • node.right = rightTrees[k]
          • result.Add(node)
    • RETURN result


Implementation in C#:

    public IList<TreeNode> GenerateTrees(int n)
    {
        if (n <= 0)
        {
            return new List<TreeNode>();
        }
        return this.GenerateTrees(1, n);
    }

    private List<TreeNode> GenerateTrees(int start, int end)
    {
        List<TreeNode> result = new List<TreeNode>();
        if (start > end)
        {
            result.Add(null);
            return result;
        }
        for (int i = start; i <= end; ++i)
        {
            // all possible left subtrees built from the values start..i-1
            List<TreeNode> leftTrees = this.GenerateTrees(start, i - 1);
            // all possible right subtrees built from the values i+1..end
            List<TreeNode> rightTrees = this.GenerateTrees(i + 1, end);
            for (int j = 0; j < leftTrees.Count; ++j)
            {
                for (int k = 0; k < rightTrees.Count; ++k)
                {
                    TreeNode node = new TreeNode(i);
                    node.left = leftTrees[j];
                    node.right = rightTrees[k];
                    result.Add(node);
                }
            }
        }
        return result;
    }


Complexity: O(number of BSTs) = O(nth Catalan number) = O((2n)!/((n+1)!*n!))

[LeetCode] Construct String from Binary Tree

Problem: You need to construct a string consisting of parentheses and integers from a binary tree, following its preorder traversal.

A null node is represented by an empty parenthesis pair "()", and you need to omit all empty parenthesis pairs that don't affect the one-to-one mapping between the string and the original binary tree.

Example:

Input: Binary tree: [1,2,3,4]
       1
     /   \
    2     3
   /    
  4     

Output: "1(2(4))(3)"

Explanation: Originally it needs to be "1(2(4)())(3()())",
but you need to omit all the unnecessary empty parenthesis pairs.
And it will be "1(2(4))(3)".
Input: Binary tree: [1,2,3,null,4]
       1
     /   \
    2     3
     \  
      4 

Output: "1(2()(4))(3)"

Explanation: Almost the same as the first example,
except we can't omit the first parenthesis pair to break the one-to-one mapping relationship between the input and the output.


Approach: We can use post-order traversal here: build the strings for the left and right subtrees first, then combine them with the current node's value. The only special case is when the left subtree is empty but the right subtree is not; there we must keep the empty "()" pair.

Have a look at the implementation to understand the approach. It's fairly easy to follow.

 

Implementation in C#:

    public string Tree2str(TreeNode node) 
    {
        if (node == null)
        {
            return string.Empty;
        }
        string leftStr = this.Tree2str(node.left);
        string rightStr = this.Tree2str(node.right);
        string result = node.val.ToString();
        if (leftStr == string.Empty && rightStr == string.Empty)
        {
            return result;
        }
        result += $"({leftStr})";
        if (rightStr != string.Empty)
        {
            result += $"({rightStr})";
        }
        return result;
    }


Complexity: O(n), under the assumption that string concatenation takes constant time.
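To see the omission rules in action, here is a minimal Python sketch of the same recursion; it uses plain (val, left, right) tuples as a stand-in for TreeNode:

```python
def tree2str(node):
    # node is (val, left, right) or None -- a stand-in for TreeNode.
    if node is None:
        return ""
    val, left, right = node
    left_str, right_str = tree2str(left), tree2str(right)
    if not left_str and not right_str:
        return str(val)
    # The left pair is always emitted (even "()"), the right pair only if non-empty.
    result = f"{val}({left_str})"
    if right_str:
        result += f"({right_str})"
    return result

t1 = (1, (2, (4, None, None), None), (3, None, None))
t2 = (1, (2, None, (4, None, None)), (3, None, None))
print(tree2str(t1))  # 1(2(4))(3)
print(tree2str(t2))  # 1(2()(4))(3)
```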

[LeetCode] Subtree of Another Tree

Problem: Given two non-empty binary trees s and t, check whether tree t has exactly the same structure and node values with a subtree of s. A subtree of s is a tree consists of a node in s and all of this node's descendants. The tree s could also be considered as a subtree of itself.

Example:

Given tree s:

     3
    / \
   4   5
  / \
 1   2

Given tree t:

   4 
  / \
 1   2

Return true, because t has the same structure and node values with a subtree of s.

Given tree s:

     3
    / \
   4   5
  / \
 1   2
    /
   0

Given tree t:

   4
  / \
 1   2

Return false.


Approach: We use preorder traversal here. While traversing s, whenever the current node's value equals the value of t's root, we check whether the subtree rooted at the current node is identical to t. If yes, we return true; otherwise we continue the preorder traversal.


Implementation in C#:

    public bool IsSubtree(TreeNode s, TreeNode t) 
    {
        if (s == null)
        {
            return false;
        }
        if (s.val == t.val && this.AreEqual(s, t))
        {
            return true;
        }
        return this.IsSubtree(s.left, t) || this.IsSubtree(s.right, t);
    }

    private bool AreEqual(TreeNode t1, TreeNode t2)
    {
        if (t1 == null && t2 == null)
        {
            return true;
        }
        if (t1 == null || t2 == null)
        {
            return false;
        }
        return t1.val == t2.val && this.AreEqual(t1.left, t2.left) && this.AreEqual(t1.right, t2.right);
    }


Complexity: O(m * n) where m and n are the number of nodes in s and t respectively.
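The same preorder-plus-equality check can be sketched in Python with (val, left, right) tuples (my own representation) to verify both examples:

```python
def is_subtree(s, t):
    # s, t are (val, left, right) tuples or None; t is assumed non-empty.
    def equal(a, b):
        if a is None or b is None:
            return a is b
        return a[0] == b[0] and equal(a[1], b[1]) and equal(a[2], b[2])
    if s is None:
        return False
    return equal(s, t) or is_subtree(s[1], t) or is_subtree(s[2], t)

t = (4, (1, None, None), (2, None, None))
s1 = (3, (4, (1, None, None), (2, None, None)), (5, None, None))
s2 = (3, (4, (1, None, None), (2, (0, None, None), None)), (5, None, None))
print(is_subtree(s1, t))  # True
print(is_subtree(s2, t))  # False
```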

[LeetCode] Binary Tree Tilt

Problem: Given the root of a binary tree, return the sum of every tree node's tilt.

The tilt of a tree node is the absolute difference between the sum of all left subtree node values and all right subtree node values. If a node does not have a left child, then the sum of the left subtree node values is treated as 0. The rule is similar if the node does not have a right child.

Example:

Input: root = [1,2,3]
Output: 1
Explanation: 
Tilt of node 2 : |0-0| = 0 (no children)
Tilt of node 3 : |0-0| = 0 (no children)
Tilt of node 1 : |2-3| = 1 (left subtree is just left child, so sum is 2; right subtree is just right child, so sum is 3)
Sum of every tilt : 0 + 0 + 1 = 1

Input: root = [4,2,9,3,5,null,7]
Output: 15
Explanation: 
Tilt of node 3 : |0-0| = 0 (no children)
Tilt of node 5 : |0-0| = 0 (no children)
Tilt of node 7 : |0-0| = 0 (no children)
Tilt of node 2 : |3-5| = 2 (left subtree is just left child, so sum is 3; right subtree is just right child, so sum is 5)
Tilt of node 9 : |0-7| = 7 (no left child, so sum is 0; right subtree is just right child, so sum is 7)
Tilt of node 4 : |(3+5+2)-(9+7)| = |10-16| = 6 (left subtree values are 3, 5, and 2, which sums to 10; right subtree values are 9 and 7, which sums to 16)
Sum of every tilt : 0 + 0 + 0 + 2 + 7 + 6 = 15

Input: root = [21,7,14,1,1,2,2,3,3]
Output: 9

Constraints:

  • The number of nodes in the tree is in the range [0, 104].
  • -1000 <= Node.val <= 1000


Approach: We can use post-order traversal here. The computation is inherently bottom-up: we first need the left and right subtree sums before we can compute a node's tilt, so each recursive call returns its subtree sum while accumulating the total tilt.

Please look directly at the implementation; it is fairly easy to understand.


Implementation in C#:

    public int FindTilt(TreeNode root) 
    {
        if (root == null)
        {
            return 0;
        }
        int tiltVal = 0;
        this.FindSumAndGetTilt(root, ref tiltVal);
        return tiltVal;
    }

    private int FindSumAndGetTilt(TreeNode node, ref int tilt)
    {
        if (node == null)
        {
            return 0;
        }
        int leftSum = this.FindSumAndGetTilt(node.left, ref tilt);
        int rightSum = this.FindSumAndGetTilt(node.right, ref tilt);
        tilt += Math.Abs(leftSum - rightSum);
        return leftSum + rightSum + node.val;
    }


Complexity: O(n)
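The bottom-up recursion can be sketched in Python with (val, left, right) tuples (my own representation); each call returns the subtree sum and the accumulated tilt, checked against the second example:

```python
def find_tilt(node):
    # Returns (subtree_sum, total_tilt); node is (val, left, right) or None.
    if node is None:
        return 0, 0
    val, left, right = node
    left_sum, left_tilt = find_tilt(left)
    right_sum, right_tilt = find_tilt(right)
    tilt = left_tilt + right_tilt + abs(left_sum - right_sum)
    return val + left_sum + right_sum, tilt

# root = [4,2,9,3,5,null,7]
root = (4, (2, (3, None, None), (5, None, None)), (9, None, (7, None, None)))
print(find_tilt(root)[1])  # 15
```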

[GFG] Check if there is a root to leaf path with given sequence

Problem: Given a binary tree and an array, the task is to find if the given array sequence is present as a root to leaf path in given tree.

Example:



Input: arr = [5,2,4,8]
Output: true
Input: arr = [5,2,4]
Output: false


Approach: We can use preorder traversal here, matching the array element at each depth. Note that the path must end exactly at a leaf. Please have a look at the implementation; it is a straightforward problem to solve.


Implementation in C#:

        public bool IsValidSequence(int[] arr)
        {
            int length = arr?.Length ?? 0;
            if (length == 0)
            {
                return true;
            }
            if (this.Root == null)
            {
                return false;
            }
            return IsValidSequence(this.Root, arr, 0);
        }

        private bool IsValidSequence(BinaryTreeNode node, int[] arr, int index)
        {
            if (node == null || node.Value != arr[index])
            {
                return false;
            }
            if (index == arr.Length - 1)
            {
                // The path must end exactly at a leaf
                return node.LeftNode == null && node.RightNode == null;
            }
            return this.IsValidSequence(node.LeftNode, arr, index + 1) || this.IsValidSequence(node.RightNode, arr, index + 1);
        }


Complexity: O(n)
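A Python sketch of the same traversal (tuples standing in for BinaryTreeNode), including the requirement that the sequence must end exactly at a leaf, run against the two examples above:

```python
def is_valid_sequence(root, arr):
    # root: (val, left, right) tuples or None; True iff arr is a root-to-leaf path.
    def walk(node, index):
        if node is None or node[0] != arr[index]:
            return False
        if index == len(arr) - 1:
            # Must end at a leaf, not merely match a prefix of a longer path.
            return node[1] is None and node[2] is None
        return walk(node[1], index + 1) or walk(node[2], index + 1)
    return bool(arr) and walk(root, 0)

tree = (5, (2, None, (4, None, (8, None, None))), (4, None, None))
print(is_valid_sequence(tree, [5, 2, 4, 8]))  # True
print(is_valid_sequence(tree, [5, 2, 4]))     # False
```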

Sunday, April 11, 2021

[Google Question][LeetCode] Deepest Leaves Sum

Problem: Given the root of a binary tree, return the sum of values of its deepest leaves.

Example:

Input: root = [1,2,3,4,5,null,6,7,null,null,null,null,8]
Output: 15
Input: root = [6,7,8,2,7,1,3,9,null,1,4,null,null,null,5]
Output: 19

Constraints:

  • The number of nodes in the tree is in the range [1, 104].
  • 1 <= Node.val <= 100


Approach: We can use preorder traversal, passing the current level and tracking the max depth seen so far in the recursive method:

  • If the current level equals the max depth, add the node's value to the result sum.
  • If the current level is greater than the max depth, the sum collected so far belongs to a shallower level, so we reset the result sum to 0 and set the max depth to the current level.
  • If the current level is less than the max depth, we leave the result sum and max depth unchanged.

That's all.


Implementation in C#:

    public int DeepestLeavesSum(TreeNode root) 
    {
        if (root == null)
        {
            return 0;
        }
        int maxDepth = 0;
        int leavesSum = 0;
        this.DeepestLeavesSum(root, 0, ref maxDepth, ref leavesSum);
        return leavesSum;
    }

    private void DeepestLeavesSum(TreeNode node, int currLevel, ref int maxDepth, ref int leavesSum)
    {
        if (node == null)
        {
            return;
        }
        if (currLevel > maxDepth)
        {
            maxDepth = currLevel;
            leavesSum = 0;
        }
        if (currLevel == maxDepth)
        {
            leavesSum += node.val;
        }
        this.DeepestLeavesSum(node.left, currLevel + 1, ref maxDepth, ref leavesSum);
        this.DeepestLeavesSum(node.right, currLevel + 1, ref maxDepth, ref leavesSum);
    }


Complexity: O(n)
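The same reset-on-deeper-level idea also works with an explicit stack; here is a Python sketch (my own tuple representation) checked against the first example:

```python
def deepest_leaves_sum(root):
    # root: (val, left, right) tuples or None.
    total, max_depth = 0, -1
    stack = [(root, 0)] if root is not None else []
    while stack:
        node, depth = stack.pop()
        val, left, right = node
        if depth > max_depth:
            max_depth, total = depth, 0  # deeper level found: restart the sum
        if depth == max_depth:
            total += val
        if left is not None:
            stack.append((left, depth + 1))
        if right is not None:
            stack.append((right, depth + 1))
    return total

# root = [1,2,3,4,5,null,6,7,null,null,null,null,8]
tree = (1,
        (2, (4, (7, None, None), None), (5, None, None)),
        (3, None, (6, None, (8, None, None))))
print(deepest_leaves_sum(tree))  # 15
```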

Friday, April 9, 2021

[Facebook Question][LeetCode] Verifying an Alien Dictionary

Problem: In an alien language, surprisingly they also use English lowercase letters, but possibly in a different order. The order of the alphabet is some permutation of lowercase letters.

Given a sequence of words written in the alien language, and the order of the alphabet, return true if and only if the given words are sorted lexicographically in this alien language.

Example:

Input: words = ["hello","leetcode"], order = "hlabcdefgijkmnopqrstuvwxyz"
Output: true
Explanation: As 'h' comes before 'l' in this language, then the sequence is sorted.
Input: words = ["word","world","row"], order = "worldabcefghijkmnpqstuvxyz"
Output: false
Explanation: As 'd' comes after 'l' in this language, then words[0] > words[1], hence the sequence is unsorted.
Input: words = ["apple","app"], order = "abcdefghijklmnopqrstuvwxyz"
Output: false
Explanation: The first three characters "app" match, and the second string is shorter (in size). According to lexicographical rules "apple" > "app", because 'l' > '∅', where '∅' is defined as the blank character which is less than any other character.

Constraints:

  • 1 <= words.length <= 100
  • 1 <= words[i].length <= 20
  • order.length == 26
  • All characters in words[i] and order are English lowercase letters.


Approach: The approach is straightforward. We take each adjacent pair of words and check whether they are in sorted order according to the order string. Here is how we can check whether a pair of words, say word1 and word2, is sorted:

  • Find the first index i where the words differ, i.e. word1[i] != word2[i].
  • If word1[i] appears before word2[i] in the order string, the pair is sorted; otherwise it is not.
  • If there is no such index, the pair is sorted unless word2 is a proper prefix of word1.

That's all!


Implementation in C#:

    public bool IsAlienSorted(string[] words, string order) 
    {
        int[] orderIndices = new int[order.Length];
        for (int i = 0; i < order.Length; ++i)
        {
            orderIndices[order[i] - 'a'] = i;
        }
        for (int i = 0; i < words.Length - 1; ++i)
        {
            string word1 = words[i];
            string word2 = words[i + 1];
            if (word1.Length > word2.Length && word1.StartsWith(word2))
            {
                return false;
            }
            int j = 0;
            while (j < word1.Length && word1[j] == word2[j])
            {
                ++j;
            }
            if (j != word1.Length && orderIndices[word1[j] - 'a'] > orderIndices[word2[j] - 'a'])
            {
                return false;
            }
        }
        return true;
    }


Complexity: O(n) where n is the total number of characters in all the words of the input array.
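The pairwise comparison above is equivalent to mapping each word to its sequence of ranks in the order string and relying on lexicographic sequence comparison; a quick Python sketch (function name is my own) verifying the three examples:

```python
def is_alien_sorted(words, order):
    rank = {c: i for i, c in enumerate(order)}
    keyed = [[rank[c] for c in w] for w in words]
    # Python list comparison is lexicographic, matching the problem's rules
    # (a shorter list that is a prefix of a longer one compares smaller).
    return all(a <= b for a, b in zip(keyed, keyed[1:]))

print(is_alien_sorted(["hello", "leetcode"], "hlabcdefgijkmnopqrstuvwxyz"))    # True
print(is_alien_sorted(["word", "world", "row"], "worldabcefghijkmnpqstuvxyz")) # False
print(is_alien_sorted(["apple", "app"], "abcdefghijklmnopqrstuvwxyz"))         # False
```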

[Google Question][LeetCode] Split BST

Problem: Given a Binary Search Tree (BST) with root node root, and a target value V, split the tree into two subtrees where one subtree has nodes that are all smaller or equal to the target value, while the other subtree has all nodes that are greater than the target value.  It's not necessarily the case that the tree contains a node with value V.

Additionally, most of the structure of the original tree should remain.  Formally, for any child C with parent P in the original tree, if they are both in the same subtree after the split, then node C should still have the parent P.

You should output the root TreeNode of both subtrees after splitting, in any order.

Example:

Input: root = [4,2,6,1,3,5,7], V = 2
Output: [[2,1],[4,3,6,null,null,5,7]]
Explanation:
Note that root, output[0], and output[1] are TreeNode objects, not arrays.

The given tree [4,2,6,1,3,5,7] is represented by the following diagram:

          4
        /   \
      2      6
     / \    / \
    1   3  5   7

while the diagrams for the outputs are:

          4
        /   \
      3      6      and    2
            / \           /
           5   7         1


Note:

  • The size of the BST will not exceed 50.
  • The BST is always valid and each node's value is different.


Approach: Let's take a hint from the problem description. At any node there are two cases:

  1. curr_node.Value <= V: The whole left subtree of curr_node, including curr_node itself, belongs to the first half, but some nodes in the right subtree may belong to the first half while the rest belong to the second half. That means we need to make a recursive call on the right subtree.
  2. curr_node.Value > V: The whole right subtree of curr_node, including curr_node itself, belongs to the second half, but we are not yet sure about the left subtree. That means we need to make a recursive call on the left subtree.

If you understand the above two points, the implementation is simple. Just have a look!


Implementation in C#:

    public TreeNode[] SplitBST(TreeNode root, int V) 
    {
        if (root == null)
        {
            return new TreeNode[] { null, null };
        }
        if (root.val <= V)
        {
            TreeNode[] bsts = this.SplitBST(root.right, V);
            root.right = bsts[0];
            bsts[0] = root;
            return bsts;
        }
        else
        {
            TreeNode[] bsts = this.SplitBST(root.left, V);
            root.left = bsts[1];
            bsts[1] = root;
            return bsts;
        }
    }


Complexity: O(height of BST)
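Here is the same recursion sketched in Python on immutable (val, left, right) tuples (my own representation), returning the pair (tree of values <= V, tree of values > V), checked against the example:

```python
def split_bst(node, v):
    # node: (val, left, right) tuples or None; returns (<= v tree, > v tree).
    if node is None:
        return None, None
    val, left, right = node
    if val <= v:
        # Node and its left subtree stay in the first half; split the right subtree.
        small, large = split_bst(right, v)
        return (val, left, small), large
    else:
        # Node and its right subtree stay in the second half; split the left subtree.
        small, large = split_bst(left, v)
        return small, (val, large, right)

# root = [4,2,6,1,3,5,7], V = 2
tree = (4, (2, (1, None, None), (3, None, None)), (6, (5, None, None), (7, None, None)))
small, large = split_bst(tree, 2)
print(small)  # (2, (1, None, None), None)
print(large)  # (4, (3, None, None), (6, (5, None, None), (7, None, None)))
```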

[Microsoft Question] Inorder Successor in Binary Search Tree

Problem: Given a binary search tree and a node in it, return the in-order successor of that node in the binary search tree. If the given node has no in-order successor in the tree, return null.

The successor of a node is the node with the smallest value greater than value of the given input node.

Example:



Input: Above tree, input = 6
Output: 8


Approach: We could reuse the approach we used to set the inorder successor of every node in a binary tree, but that takes linear time as it does not use the Binary Search Tree's property.

Let's make the solution efficient by using the BST property. We walk down from the root; at every node there are two cases:

  1. curr_node.Value <= input.Value: The successor cannot be curr_node or anything in its left subtree, so we discard them and move to the right child.
  2. curr_node.Value > input.Value: We can discard the right subtree of curr_node, but curr_node itself is a candidate for the inorder successor, so we record it and move to the left child looking for a smaller candidate.

With this understanding, look at the implementation to understand the whole solution.


Implementation in C#:

    public BinaryTreeNode InorderSuccessor(BinaryTreeNode input) 
    {
        if (this.Root == null)
        {
            return null;
        }
        BinaryTreeNode successor = null;
        BinaryTreeNode node = this.Root;
        while (node != null)
        {
            if (node.Value <= input.Value)
            {
                node = node.RightNode;
            }
            else
            {
                successor = node;
                node = node.LeftNode;
            }
        }
        return successor;
    }


Complexity: O(h) where h is the height of the BST: O(log n) when the tree is balanced, O(n) in the worst case.
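The two cases translate directly into an iterative walk; a small Python sketch with (val, left, right) tuples (my own representation), returning the successor value:

```python
def inorder_successor(root, target):
    # root: (val, left, right) tuples or None; target is the input node's value.
    successor = None
    node = root
    while node is not None:
        val, left, right = node
        if val <= target:
            node = right      # successor must be in the right subtree
        else:
            successor = val   # candidate; look for a smaller one on the left
            node = left
    return successor

bst = (8, (4, (2, None, None), (6, None, None)), (10, None, None))
print(inorder_successor(bst, 6))   # 8
print(inorder_successor(bst, 10))  # None
```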


Thursday, April 8, 2021

[Google Question][LeetCode] Bomb Enemy

Problem: Given an m x n matrix grid where each cell is either a wall 'W', an enemy 'E' or empty '0', return the maximum enemies you can kill using one bomb. You can only place the bomb in an empty cell.

The bomb kills all the enemies in the same row and column from the planted point until it hits the wall since it is too strong to be destroyed.

Example:

Input: grid = [["0","E","0","0"],["E","0","W","E"],["0","E","0","0"]]
Output: 3

Input: grid = [["W","W","W"],["0","0","0"],["E","E","E"]]
Output: 1

Constraints:

  • m == grid.length
  • n == grid[i].length
  • 1 <= m, n <= 500
  • grid[i][j] is either 'W', 'E', or '0'.


Approach: The immediate approach which comes to mind is to place bomb at every cell and count the number of enemies which we can kill. We just take the maximum of these counts. Have a look at implementation to understand the approach in detail. 

The above approach will solve the problem but it is expensive as it will take O (m * n * (m + n)) ~= O(n^3) time. Let's try to improve it.

If you see in the above approach, we are solving same subproblem multiple times. Let's look at the following row:

['0', '0', 'E', '0']

If you see, the number of enemies killed in this row will always be 1, regardless of the cell where we place the bomb. Looking at the problem closely, we can find a pattern: the number of enemies killed in a row changes only when we are at the first cell of the row (0th column) or when the previous cell (curr_col - 1) is a wall.

That means we can have one variable, say rowHits, for every row, which is recomputed only at the first cell or at a cell right after a wall.

We can do the same for the columns also but with one change; we need to have colHits array instead of one variable like rowHits. Why? This is because we have loops like following:

  • FOR row = 0 To m - 1
    • For col = 0  TO n - 1

If you look at the loops, we complete one row fully before moving to the next, so a single rowHits variable suffices; but within each row we visit every column once, so the per-column counts must persist across the row iteration. That's why colHits needs to be an array keeping track of all the column hits.

That's all! Look at the implementation for the details.


Implementation in C#:

Approach 1: Brute-force:

    public int MaxKilledEnemies(char[][] grid) 
    {
        int rows = grid.Length;
        int cols = grid[0].Length;
        int maxKilledEnemies = 0;
        for (int i = 0; i < rows; ++i)
        {
            for (int j = 0; j < cols; ++j)
            {
                if (grid[i][j] == '0')
                {
                    maxKilledEnemies = Math.Max(maxKilledEnemies, this.CountReachableEnemies(grid, i, j));
                }
            }
        }
        return maxKilledEnemies;
    }

    private int CountReachableEnemies(char[][] grid, int row, int col)
    {
        return this.CountReachableEnemiesInRow(grid, row, col) + this.CountReachableEnemiesInColumn(grid, row, col);
    }

    private int CountReachableEnemiesInRow(char[][] grid, int row, int col)
    {
        int cols = grid[0].Length;
        int totalEnemies = 0;
        for (int i = col - 1; i >= 0; --i)
        {
            if (grid[row][i] == 'W')
            {
                break;
            }
            if (grid[row][i] == 'E')
            {
                ++totalEnemies;
            }
        }
        for (int i = col + 1; i < cols; ++i)
        {
            if (grid[row][i] == 'W')
            {
                break;
            }
            if (grid[row][i] == 'E')
            {
                ++totalEnemies;
            }
        }
        return totalEnemies;
    }

    private int CountReachableEnemiesInColumn(char[][] grid, int row, int col)
    {
        int rows = grid.Length;
        int totalEnemies = 0;
        for (int i = row - 1; i >= 0; --i)
        {
            if (grid[i][col] == 'W')
            {
                break;
            }
            if (grid[i][col] == 'E')
            {
                ++totalEnemies;
            }
        }
        for (int i = row + 1; i < rows; ++i)
        {
            if (grid[i][col] == 'W')
            {
                break;
            }
            if (grid[i][col] == 'E')
            {
                ++totalEnemies;
            }
        }
        return totalEnemies;
    }

Approach 2: Dynamic Programming:

    public int MaxKilledEnemies(char[][] grid) 
    {
        int rows = grid.Length;
        int cols = grid[0].Length;
        int maxKilledEnemies = 0, rowHits = 0;
        int[] colHits = new int[cols];
        for (int i = 0; i < rows; ++i)
        {
            for (int j = 0; j < cols; ++j)
            {
                if (j == 0 || grid[i][j - 1] == 'W')
                {
                    rowHits = 0;
                    for (int k = j; k < cols; ++k)
                    {
                        if (grid[i][k] == 'W')
                        {
                            break;
                        }
                        if (grid[i][k] == 'E')
                        {
                            ++rowHits;
                        }
                    }
                }
                if (i == 0 || grid[i - 1][j] == 'W')
                {
                    colHits[j] = 0;
                    for (int k = i; k < rows; ++k)
                    {
                        if (grid[k][j] == 'W')
                        {
                            break;
                        }
                        if (grid[k][j] == 'E')
                        {
                            ++colHits[j];
                        }
                    }
                }
                if (grid[i][j] == '0')
                {
                    maxKilledEnemies = Math.Max(maxKilledEnemies, rowHits + colHits[j]);
                }
            }
        }
        return maxKilledEnemies;
    }


Complexity: Approach 1: O(m * n * (m + n)), Approach 2: O(3 * m * n) = O(m * n)
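A Python transcription of Approach 2 (names are my own), checked against the two examples:

```python
def max_killed_enemies(grid):
    if not grid or not grid[0]:
        return 0
    rows, cols = len(grid), len(grid[0])
    best = row_hits = 0
    col_hits = [0] * cols
    for i in range(rows):
        for j in range(cols):
            # Recompute the row count only at column 0 or just after a wall.
            if j == 0 or grid[i][j - 1] == 'W':
                row_hits = 0
                for k in range(j, cols):
                    if grid[i][k] == 'W':
                        break
                    row_hits += grid[i][k] == 'E'
            # Recompute the column count only at row 0 or just below a wall.
            if i == 0 or grid[i - 1][j] == 'W':
                col_hits[j] = 0
                for k in range(i, rows):
                    if grid[k][j] == 'W':
                        break
                    col_hits[j] += grid[k][j] == 'E'
            if grid[i][j] == '0':
                best = max(best, row_hits + col_hits[j])
    return best

print(max_killed_enemies([["0","E","0","0"],["E","0","W","E"],["0","E","0","0"]]))  # 3
print(max_killed_enemies([["W","W","W"],["0","0","0"],["E","E","E"]]))              # 1
```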