Tuesday, February 2, 2021

LFU Cache

Problem: Design and implement a Least Frequently Used (LFU) cache. The key and value are integers here. Here are the methods which we need to implement:

  1. int get(int key): Return the value of the key if the key exists. Otherwise, returns -1.
  2. void put(int key, int value): Update the value of the key if present, or inserts the key if not already present. When the cache reaches its capacity, it should remove the least frequently used key before inserting a new item. In case of a tie (i.e., two or more keys with the same frequency), the least recently used key would be invalidated.


Approach: If you see this problem is similar to designing data structure with increment, decrement, max and min at constant time. Get and Put operation are also doing kind of Increment operation of the other problem(increasing a counter) but Put is doing one more thing, it is also removing a key which is similar to a part of decrement but here it just does not know the key in advance. That means we just can't save value only we need to save a key value pair corresponding a key.

To know which key to remove in case of overflow, its kind of min operation. With this understanding, here are the modified members:

  1. LinkedList<int> countList : It will maintain all the unique counts in sorted order.
  2. Dictionary<int, LinkedList<KeyValuePair<int, int>>> buckets: This will maintain buckets of key value pairs corresponding to their count. We can't have just key or just value here as we need key for reverse mapping, value for storing the value corresponding to a key. We can't use HashMap here as we need to maintain LRU too in case of tie.
  3. Dictionary<int, Tuple<LinkedListNode<KeyValuePair<int, int>>, LinkedListNode<int>>> keyValueHash: This is our primary hash which will contain the input string as key and value as the tuple of nodes. One node is linked to buckets' linked list node and another node is the node of countList so basically we are maintaining a [key -> value, count] relationship.
  4. currentCount: Number of elements in cache
  5. capacity: Capacity of the cache.

For example let's have a cache of capacity 2. countList, buckets, keyValueHash are all empty. Let's see how following operation will make the changes in these members: 

  • Put(1, 1): 
    • countList:  [{CNode1: 1}] 
    • buckets: [1 => [{B1Node1: {1,1}}]] 
    • keyValueHash: [1 => {B1Node1, CNode1}]
  • Get(1): 
    • countList: [{CNode2: 2}]
    • buckets:[2 => [{B2Node1: {1,1}}]]
    • keyValueHash: [1 => {B2Node1, CNode2}]
  • Put(2,2): 
    • countList: [{CNode1A: 1}->{CNode2: 2}]
    • buckets: [1 => [{B1Node1A: {2,2}}], 2 =>[{B2Node1:{1,1}}]]
    • keyValueHash: [1=>{B2Node1, CNode2}, 2=> {B1Node1A, CNode1A}]
  • Put(3,3): 
    • countList:[{CNode1B: 1}->{CNode2: 2}]
    • buckets:[1 => [{B1Node1B: {3,3}}], 2=> [{B2Node1:{1,1}}]]
    • keyValueHash: [1=>{B2Node1, CNode2}, 3=> {B1Node1B, CNode1B}]
  • Put(3,4): 
    • countList:  [{CNode2: 2}]
    • buckets:[2 => [{B2Node1:{1,1}} -> {B2Node2:{3,4}}]
    • keyValueHash: [1=>{B2Node1, CNode2}, 3=> {B2Node2, CNode2}]               

Hopefully the algorithm is also clear after running above examples. Let's now look at the algorithm of the methods:

  1. Put(key, value):
    • IF key does not exist in keyValueHash
      • IF currentCount and capacity are equal
        • minCount = value at the head of the countList
        • key = buckets[minCount].Head.Value.Key // Maintaining LRU, taking key from first node
        • keyValueHash.Remove(key)
        • buckets[minCount].RemoveFirst()
        • IF buckets[minCount] is empty
          • buckets.Remove(minCount)
          • countList.RemoveFirst()
        • currentCount = currentCount - 1
      • IF buckets does not contain 1
        • countList.AddFirst(1)
      • buckets[1].AddLast({key, value})
      • keyValueHash.Add(key, {bucket[1].Tail, countList.Head})
    • ELSE
      • countNode = keyValueHash[key].Item2 // Item2 is node in the countList
      • currentCount = currNode.Value
      • newCount = currentCount + 1
      • IF buckets does not contain newCount
        • countList.AddAfter(countNode, newCount)
      • buckets[newCount].AddLast({key, value})
      • buckets[currCount].Remove(keyValueHash[key].Item1) // Item1 is the node in one of the buckets' linked list according to count
      • keyValueHash[key] = {buckets[newCount].Tail, countNode.Next}// Pointing to right nodes
      • IF buckets[currCount] is empty
        • buckets.Remove(currentCount)
        • countList.Remove(countNode)
  2. Get(key):
    • IF keyValueHash does not contain key
      • RETURN -1
    • countNode = keyValueHash[key].Item2 // Item2 is node in the countList
    • currentCount = currNode.Value
    • newCount = currentCount + 1
    • IF buckets does not contain newCount
      • countList.AddAfter(countNode, newCount)
    • buckets[newCount].AddLast({key, keyValueHash[key].Item1.Value.Value}) // taking existing value as there is no change 
    • buckets[currCount].Remove(keyValueHash[key].Item1) // Item1 is the node in one of the buckets' linked list according to count
    • keyValueHash[key] = {buckets[newCount].Tail, countNode.Next}// Pointing to right nodes
    • IF buckets[currCount] is empty
      • buckets.Remove(currentCount)
      • countList.Remove(countNode)
    • RETURN keyValueHash[key].Item1.Value.Value


You can see PUT and GET both follows the same steps in case of key exist as they have to increment the count. Hopefully the things are clear now. That's all!


Implementation in C#:

public class LFUCache

    {

        private Dictionary<int, Tuple<LinkedListNode<KeyValuePair<int, int>>, LinkedListNode<int>>> keyValueHash;

        private Dictionary<int, LinkedList<KeyValuePair<int, int>>> buckets;

        private LinkedList<int> countList;

        private int capacity;

        private int currentCount;


        public LFUCache(int capacity)

        {

            this.keyValueHash = new Dictionary<int, Tuple<LinkedListNode<KeyValuePair<int, int>>, LinkedListNode<int>>>();

            this.buckets = new Dictionary<int, LinkedList<KeyValuePair<int, int>>>();

            this.countList = new LinkedList<int>();

            this.capacity = capacity;

            this.currentCount = 0;

        }


        public int Get(int key)

        {

            if (!this.keyValueHash.ContainsKey(key))

            {

                return -1;

            }

            this.ModifyCount(key);

            return this.keyValueHash[key].Item1.Value.Value;

        }


        private void ModifyCount(int key)

        {

            LinkedListNode<int> countNode = this.keyValueHash[key].Item2;

            int currCount = countNode.Value;

            int newCount = currCount + 1;

            if (!this.buckets.ContainsKey(newCount))

            {

                this.buckets.Add(newCount, new LinkedList<KeyValuePair<int, int>>());

                this.countList.AddAfter(countNode, newCount);

            }

            this.buckets[newCount].AddLast(new KeyValuePair<int, int>(key, this.keyValueHash[key].Item1.Value.Value));

            this.buckets[currCount].Remove(this.keyValueHash[key].Item1);

            this.keyValueHash[key] = new Tuple<LinkedListNode<KeyValuePair<int, int>>, LinkedListNode<int>>(this.buckets[newCount].Last, countNode.Next);

            if (this.buckets[currCount].Count == 0)

            {

                this.buckets.Remove(currCount);

                this.countList.Remove(countNode);

            }

        }


        public void Put(int key, int value)

        {

            if (this.keyValueHash.ContainsKey(key))

            {

                this.keyValueHash[key].Item1.Value = new KeyValuePair<int, int>(key, value);

                this.ModifyCount(key);

            }

            else

            {

                this.AddKey(key, value);

            }

        }


        private void AddKey(int key, int value)

        {

            if (this.currentCount == this.capacity)

            {

                if (this.capacity == 0)

                {

                    return;

                }

                this.RemoveLeastFrequentlyUsed();

            }

            if (!this.buckets.ContainsKey(1))

            {

                this.buckets.Add(1, new LinkedList<KeyValuePair<int, int>>());

                this.countList.AddFirst(1);

            }

            this.buckets[1].AddLast(new KeyValuePair<int, int>(key, value));

            this.keyValueHash.Add(key, new Tuple<LinkedListNode<KeyValuePair<int, int>>, LinkedListNode<int>>(this.buckets[1].Last, this.countList.First));

            ++this.currentCount;

        }


        private void RemoveLeastFrequentlyUsed()

        {

            int minCount = this.countList.First.Value;

            this.keyValueHash.Remove(this.buckets[minCount].First.Value.Key);

            this.buckets[minCount].RemoveFirst();

            if (this.buckets[minCount].Count == 0)

            {

                this.buckets.Remove(minCount);

                this.countList.RemoveFirst();

            }

            --this.currentCount;

        }

    }


Complexity: O(1) for both the operations

No comments:

Post a Comment