Friday, September 4, 2020

[Uber] Group Anagrams

Problem: Given an array of strings strs, group the anagrams together. You can return the answer in any order. It is given that these strings have only lower case characters (a...z).

An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

Example:

Input: strs = ["cat","tac","ten","act","net"]
Output: [["net","ten"],["act","cat","tac"]]


Approach: We can use hashing here where key is the sorted string and value is the list of corresponding words so basically we will have a loop for the strings in strs in which we will sort the string ans use it as a key and then we add this string in the list which is value of the hash. In the end all the values (lists) of the hash is our answer.

The above approach will work but it will take O (n * klogk) time where n is the number of strings in strs and k is the length of largest string in strs as we sort every string in strs. Let's try to do better; if we look at the definition anagram, we will find out that two words are anagram if they have the same characters with same number of occurrences, right?

That means if we can count the characters count efficiently, we can solve this question faster than the first approach. We will still use the hash and we will have almost same approach except the key generation of hash. We can use an array of 26 integers and then based on character count we can modify the arrays values, like in case of string "act", we will have following array:

1 | 0 | 1 | ... | 1 | 0 | ... 

0.  1  2  ...   19  (Indices)

Now the key will be as follows:

1-0-1-0-0...-1-0-0... 

So basically key will be concatenation of all the values in the above integer array separated by a '-'. That's all as all the other things will remain same.


Implementation in C#:

Approach 1 (Key generation using sorting):

        public static IList<IList<string>> GroupAnagrams(string[] strs)

        {

            IList<IList<string>> result = new List<IList<string>>();

            Dictionary<string, List<string>> hash = new Dictionary<string, List<string>>();

            foreach(string str in strs)

            {

                char[] tempKey = str.ToCharArray();

                Array.Sort(tempKey);

                string key = new string(tempKey);

                if (!hash.ContainsKey(key))

                {

                    hash.Add(key, new List<string>());

                }

                hash[key].Add(str);

            }

            foreach(string key in hash.Keys)

            {

                result.Add(hash[key]);               

            }

            return result;

        }


Approach 2 (Key generation using counting):

    public IList<IList<string>> GroupAnagrams(string[] strs) 

    {

        Dictionary<string, List<string>> freqMap = new Dictionary<string, List<string>>();

        foreach (string str in strs)

        {

            string freqStr = this.GetKey(str);

            if (!freqMap.ContainsKey(freqStr))

            {

                freqMap[freqStr] = new List<string>();

            }  

            freqMap[freqStr].Add(str);  

        }

        List<IList<string>> result = new List<IList<string>>();

        foreach (string key in freqMap.Keys)

        {

            result.Add(freqMap[key]);

        }    

        return result;

    }

    

    private string GetKey(string str)

    {

        int[] freq = new int[26];        

        foreach (char ch in str)

        {

            ++freq[ch - 'a'];

        }

        StringBuilder sb = new StringBuilder();

        for (int i = 0; i < 26; ++i)

        {

            sb.Append(freq[i]);

            sb.Append('-');

        }

        return sb.ToString();

    }


Complexity: Approach 1: O (n * klogk)

                     Approach 2: O (n * k)

                     Here n is the number of strings in strs and k is the length of the largest string in strs.

No comments:

Post a Comment