Friday, February 19, 2021

[Google Question][LeetCode] Alien Dictionary

Problem: here is a new alien language that uses the English alphabet. However, the order among letters are unknown to you.

You are given a list of strings words from the dictionary, where words are sorted lexicographically by the rules of this new language.

Derive the order of letters in this language, and return it. If the given input is invalid, return "". If there are multiple valid solutions, return any of them.

Example:

Input: words = ["wrt","wrf","er","ett","rftt"]
Output: "wertf"
Input: words = ["z","x"]
Output: "zx"
Input: words = ["z","x","z"]
Output: ""
Explanation: The order is invalid, so return "".


Approach: We can follow below steps:

  1. Build a graph based on the dependency rules we get from the input string array.
  2. Apply Topological sorting.

Now let's see how to build the dependencies graph:

Let's look at the example: ["wrt","wrf","er","ett","rftt"] from this we can say by just looking at the first characters of words that w should come before e and e should come before r. So it looks like w -> e and e -> r. Yes there is one more relation which w -> r but we don't have to say that because if w -> e and e -> r then we can safely say w -> e -> r.

Now what about the other characters, let's see with first two words ["wrt", "wrf"]. Since the first 2 characters are same, we can get the say that t should come before f so we can say t-> f. Similarly we can take every pair of words and see where the first characters are different and create the dependency between those two characters. If we find no such character that means either the words are same or one word starts with another word. 

If the words are same we have no problems at all as in case of ["a", "a"] even there are no dependency but we can still return "a" as "a" will be a single node in the graph.

If there is case like ["abc", "abcde"], there is no problem still because we have no dependencies here and all the nodes are independent and we can return them in any order but in case  of ["abcde", "abc"], we have a problem as we can't get any order here so we will return empty string in such case.

Now let's apply the topological sort:

At first look, we think that if we apply a general topological sort here, we are done but we need to be careful about it as topological sort can be applied only on DAG (Directed Acyclic Graph) but we have no guarantee that there will be no cycle here. If we find any cycle in the above created graph, we will return the empty string. 

That's all about the approach.


Implementation in C#:

        public static string AlienOrder(string[] words)

        {

            Dictionary<char, List<char>> graph = new Dictionary<char, List<char>>();

            InitializeGraph(words, graph);

            if (!BuildGraph(words, graph))

            {

                return string.Empty;

            }

            HashSet<char> visited = new HashSet<char>();

            string result = string.Empty;

            Stack<char> stack = new Stack<char>(); 

            foreach (char key in graph.Keys)

            {

                if (!visited.Contains(key))

                {

                    HashSet<char> onCycle = new HashSet<char>();

                    bool isCycle = false;

                    TopologicalSort(key, graph, visited, onCycle, ref isCycle, stack);


                    if (isCycle)

                    {

                        return string.Empty;

                    }

                }

            }

            while (stack.Count > 0)

            {

                result += stack.Pop();

            }

            return result;

        }


        private static bool BuildGraph(string[] words, Dictionary<char, List<char>> graph)

        {

            for (int i = 1; i < words.Length; ++i)

            {

                string word1 = words[i - 1];

                string word2 = words[i];

                if (word1.Length > word2.Length && word1.StartsWith(word2))

                {

                    return false;

                }

                int lengthToCheck = Math.Min(word1.Length, word2.Length);

                for (int j = 0; j < lengthToCheck; ++j)

                {

                    if (word1[j] != word2[j])

                    {

                        graph[word1[j]].Add(word2[j]);

                        break;

                    }

                }

            }

            return true;

        }


        private static void InitializeGraph(string[] words, Dictionary<char, List<char>> graph)

        {

            foreach (string word in words)

            {

                foreach (char ch in word)

                {

                    if (!graph.ContainsKey(ch))

                    {

                        graph[ch] = new List<char>();

                    }

                }

            }

        }


        private static void TopologicalSort(char node, Dictionary<char, List<char>> graph, HashSet<char> visited, HashSet<char> onCycle, ref bool isCycle, Stack<char> stack)

        {

            visited.Add(node);

            onCycle.Add(node);

            foreach (char neighbor in graph[node])

            {

                if (onCycle.Contains(neighbor))

                {

                    isCycle = true;

                    return;

                }

                if (!visited.Contains(neighbor))

                {

                    TopologicalSort(neighbor, graph, visited, onCycle, ref isCycle, stack);

                }

            }

            onCycle.Remove(node);

            stack.Push(node);

        }


Complexity: O(n) where n is sum of the length of all the words.

No comments:

Post a Comment