LFU/All O one Data Structure

Leave a comment

August 31, 2016 by oneOokay

LFU和All O`one Data Structure本质上都是一个double linkedlist和hashmap的结合体


Implement a data structure supporting the following operations:

  1. Inc(Key) – Inserts a new key with value 1. Or increments an existing key by 1. Key is guaranteed to be a non-empty string.
  2. Dec(Key) – If Key’s value is 1, remove it from the data structure. Otherwise decrements an existing key by 1. If the key does not exist, this function does nothing. Key is guaranteed to be a non-empty string.
  3. GetMaxKey() – Returns one of the keys with maximal value. If no element exists, return an empty string "".
  4. GetMinKey() – Returns one of the keys with minimal value. If no element exists, return an empty string "".

Challenge: Perform all these in O(1) time complexity.


创建一个新的class: Node.其中包含:

  • Node prev
  • Node next
  • int val:代表次数
  • List keys: 存keys
  • method: removeKey(String key)
  • method: insertBeforeNode(Node n):在n之前插入当前的node
  • Node(int value):这里不用Node(int value, String key)是为了之后的inc函数方便.

整个AllOne()里:

  • Node head
  • Node tail
  • HashMap<String, Node>:用来确认是否存在key以及找到key的次数

注意尽量general化,可以让代码简洁…以及不要忘了更新所有的地方.

public class AllOne {
    private Node head;
    private Node tail;
    private HashMap<String, Node> keys;
    public AllOne() {
        head = new Node(0);//注意这里head要初始化为0
        tail = new Node(0); 
        head.next = tail;
        tail.prev = head;
        keys = new HashMap<>();
    }
    
     public void inc(String key) {
        Node node = null;
        //尽量得到一个general的node.如果熟悉map的话,就可以用:
        //map.getOrDefault(key, head);
        if (keys.size() == 0 || !keys.containsKey(key)){
            node = head;
        }else {
            node = keys.get(key);
            node.removeKey(key);
        }
        Node next = node.next;//也是做一个general next的处理
        if(next.val != node.val + 1){
            next = new Node(node.val + 1);
//这里决定了val小的node靠近head,val大的node靠近tail
            next.insertBeforeNode(node.next);
        }
//这里统一把key加在next上,所以Node(int i)就可以了
        next.keys.add(key);
        keys.put(key, next);
    }
    
    public void dec(String key) {
        if (!keys.containsKey(key)) return;
        Node n = keys.get(key);
        if (n.val == 1){
            keys.remove(key);
            n.removeKey(key);
            return;
        }
        Node prev = n.prev;
        if (prev.val != n.val -1){
            prev = new Node(n.val - 1);
            prev.insertBeforeNode(n);
        }
        n.removeKey(key);
        prev.keys.add(key);
        keys.put(key, prev);
    }
    
    public String getMaxKey() {
        if(keys.size() == 0) return "";
        List list = tail.prev.keys;
        return list.get(list.size() - 1);
    }
    
    public String getMinKey() {
        if (keys.size() == 0) return "";
        List list = head.next.keys;
        return list.get(list.size() - 1);
    }
    
    class Node{
        Node prev;
        Node next;
        int val;
        List keys;
        
        Node(int v){
            val = v;
            keys = new ArrayList();
        }
        
        void insertBeforeNode(Node n){
            this.next = n;
            n.prev.next = this;
            this.prev = n.prev;
            n.prev = this;
        }
        
        void removeKey(String key){
            keys.remove(key);
            if (keys.size() == 0){
                prev.next = next;
                next.prev = prev;
            }
        }
    }
}

Design and implement a data structure for Least Frequently Used (LFU) cache. It should support the following operations: get and set.

get(key) – Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
set(key, value) – Set or insert the value if the key is not already present. When the cache reaches its capacity, it should invalidate the least frequently used item before inserting a new item. For the purpose of this problem, when there is a tie (i.e., two or more keys that have the same frequency), the least recently used key would be evicted.

Follow up:
Could you do both operations in O(1) time complexity?

Example:

LFUCache cache = new LFUCache( 2 /* capacity */ );

cache.set(1, 1);
cache.set(2, 2);
cache.get(1);       // returns 1
cache.set(3, 3);    // evicts key 2
cache.get(2);       // returns -1 (not found)
cache.get(3);       // returns 3.
cache.set(4, 4);    // evicts key 1.
cache.get(1);       // returns -1 (not found)
cache.get(3);       // returns 3
cache.get(4);       // returns 4


Aug 31

这个题搞了我三天,都要吐了=.=

这题看这个印度人发的paper稍微好懂一些:http://dhruvbird.com/lfu.pdf

放一些截图在这里,以防链接失效。。。<侵权删>

基本思想是:

新建一个CacheNode class,属性包括key,value和frequency

  • 一个hashMap来作为cache,这样当get的时候能够迅速的找到它;get的同时,该值被引用,所以应该update frequency。
  • 一个LinkedHashSet[] 数组来表示frequency,linkedHashSet[i] 表示 frequency = i
  • N个linkedHashSet,每个linkedHashSet[i]里挂一个,来存frequency = i 的nodes。具体见上面的图,比较好理解。

主要的attribute:

  • maxFrequency: 定义为capacity * 2 – 1,此处不能让它渐长为无穷大
  • lowestFrequency: 记录当前的最小的frequency,当插入一个新的值时或者删掉一个值,lowestFrequency根据情况更新。

几个主要的函数:

  • addFrequency, moveToNextFrequency: 当一个node frequency增加时:要把这个node挪到另一个linkedHashSet[i]中去,并且update Cache,并且注意lowestFrequency
  • doEviction: 当cache满的时候,需要移除node。

其他的看代码吧。。。这次只要求实现了get与set,九章上面是一个更为复杂全面的LFUCache,支持remove,当remove引入时,需要引入FindNextLowestFrequency。这个时候做doEviction的时候,不仅仅删一个数,可能会删多个数字,会出现删完了lowestFrequency所有的node还不够的情况,这时候就需要FindNextLowestFrequency了。以后再研究吧。。。

public class LFUCache {
private class CacheNode{
int val;
int key;
int freq;

public CacheNode(int key, int value, int frequency){
this.key = key;
this.val = value;
this.freq = frequency;
}
}

private Map<Integer, CacheNode> cache; //Cache to store values.
private LinkedHashSet[] frequencyList; //frequencyList[i] = node.freq = i 的集合,顺序为出现的先后顺序。
private int maxCacheSize;
private int lowestFrequency;
private int maxFrequency; //设置一个最大的frequency值,使得frequency不无限增长下去

// @param capacity, an integer
public LFUCache(int capacity) {
//Initiate
this.cache = new HashMap<Integer, CacheNode>(capacity);
//frequencyList size的大小:太小了不行,frequency不够用,太大了浪费空间。于是取 capacity * 2,事实上应该capacity * 3 都行。
//frequency: 0 – (capacity * 2 – 1): i 代表着frequency
this.frequencyList = new LinkedHashSet[capacity * 2];
this.lowestFrequency = 0;
this.maxFrequency = capacity * 2 – 1; // 由frequencyList size决定
this.maxCacheSize = capacity;
initFrequencyList();
}

private void initFrequencyList(){
for (int i = 0; i <= maxFrequency; i ++) { //注意此处应该为=号
frequencyList[i] = new LinkedHashSet();
}
}

// @param key, an integer
// @param value, an integer
// @return nothing
public void set(int key, int value) {
// Write your code here
CacheNode current = cache.get(key);
// current key does not exist 意味着这是一个新的需要插入的值
if (current == null) {
if(cache.size() == maxCacheSize)
{
doEviction();
}
//insertion. reset lowest frequency to 0
//由于最后统一addFrequency,所以这里作为一个刚插入的值,frequency赋为0
LinkedHashSet nodes = frequencyList[0]; //frequencyList[0]始终为一个空,是一个temporary放node的地方,当addFrequency之后,这个node就会被移走
current = new CacheNode(key, value, 0);
nodes.add(current);
cache.put(key, current); //cache里面也先加上该node,frequency为0。addFrequency中会update cache中frequency的值。
lowestFrequency = 0; //reset lowestFrequency value to 0
}else
{
current.val = value;
}
addFrequency(current); //AddFrequency() for sets.
}

//拿掉所有在lowestFrequency之下的node;这个method里不需要更新lowestFrequency,it has been taken care of latter.
private void doEviction(){
LinkedHashSet nodes = frequencyList[lowestFrequency];
Iterator iter = nodes.iterator();
//当lowestFrequency list上有多个node时,只删掉第一个,并不能删掉所有。
if(iter.hasNext()){
CacheNode node = iter.next();
iter.remove();
cache.remove(node.key);
}
//这里并不需要FindNextLowestFrequency,因为这里只删除一个,进入删除的前提是cache满了,doEviction之外有reset lowest Frequency to 0 的逻辑。
}

public int get(int key) {
// Write your code here
CacheNode current = cache.get(key);
if (current != null) {
addFrequency(current);
return current.val;
}else{
return -1;
}
}

//update frequencyList
//update value in the cache
private void addFrequency(CacheNode node){
int currentFrequency = node.freq;
if (currentFrequency < maxFrequency){
int nextFrequency = currentFrequency + 1;
LinkedHashSet currentNodes = frequencyList[currentFrequency];
LinkedHashSet newNodes = frequencyList[nextFrequency];
//把node从current frequency list 里头移除,加入到next frequency list 里。
moveToNextFrequency(node, nextFrequency, currentNodes, newNodes);
cache.put(node.key, node); //该node的frequency已经在moveToNextFrequency中+1了,所以需要update cache 中的node,同步更新它的frequency.

//if current node is the last one node of the lowest frequency node list
//it is moved to newNodes, so currentNodes list is empty.//此时需要更新LowestFrequency value.
if ((lowestFrequency == currentFrequency) && currentNodes.isEmpty()){
lowestFrequency = nextFrequency;
}//否则:另一个LowestFrequency 需要更新的情况是:新插入一个值,lowestFrequency重置,这个已经在set里面处理过了。所以这里不需要。
}else{ // 一个有着maxFrequency的node又被访问了一次,frequency不需要再+1,只需要把给node放到list的第一个。
LinkedHashSet nodes = frequencyList[currentFrequency];
nodes.remove(node);
nodes.add(node);
}
}

//把node从current frequency list 里头移除,加入到next frequency list 里。
//这里不能省略掉nextFrequency,不能把nextFrequency 默认为node.frequency + 1. 这是为了防止累计产生无穷大的frequency
private void moveToNextFrequency(CacheNode node, int nextFrequency, LinkedHashSet currentNodes, LinkedHashSet newNodes){
currentNodes.remove(node);
node.freq = nextFrequency; //dont use + 1 here, it will grow infinitely
newNodes.add(node);
}
}

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: