LRUCache的實現原理及利用python實現的方法

2020-02-16 10:46:30

字體：大中小

來源：轉載

供稿：網友

簡介

LRU(Least Recently Used)最近最少使用，最近有時間和空間最近的歧義，所以我更喜歡叫它近期最少使用算法。它的核心思想是，如果一個數據被訪問過，我們有理由相信它在將來被訪問的概率就越高。于是當LRU緩存達到設定的最大值時將緩存中近期最少使用的對象移除。LRUCache內部使用LinkedHashMap來存儲key-value鍵值對，并將LinkedHashMap設置為訪問順序來體現LRU算法。

無論是對某個key的get，還是set都算做是對該key的一次使用。當set一個不存在的key，并且LRU Cache中key的數量超過cache size的時候，需要將使用時間距離現在最長的那個key從LRU Cache中清除。

LRU Cache實現

在Java中，LRUCache是通過LinkedHashMap實現的。鄙人照貓畫虎，實現一個Python版的LRU Cache（可能和其他大神的實現有所區別）。

首先，需要說明的是：

LRU Cache對象內部會維護一個雙端循環鏈表的頭節點

LRU Cache對象內部會維護一個dict

內部dict的value都是Entry對象，每個Entry對象包含：

key的hash_code（hash_code = hash(key)，在本實現中，hash_code相同的不同key，會被當作一個key來處理。因此，對于自定義類，應該實現魔術方法：__hash__） v - (key, value)對中的value prev - 前一個對象 next - 后一個對象

具體實現是：

當從LRU Cache中get一個key的時候：

計算該key的hash_code 從內部dict中獲取到entry 將該entry移動到雙端循環鏈表的第一個位置返回entry.value

當向LRU Cache中set一個(key, value)對的時候：

計算該key的hash_code，

從LRU Cache的內部dict中，取出該hash_code對應的old_entry（可能不存在），然后根據(key, value)對生成一個new_entry，之后執行：

dict[hash_code] = new_entry 將new_entry提到雙端循環鏈表的第一個位置如果old_entry存在，則從鏈表中刪除old_entry 如果是新增了一個(key, value)對，并且cache中key的數量超過了cache size，那么將雙端鏈表的最后一個元素刪除（該元素就是那個最近最少被使用的元素），并且從內部dict中刪除該元素

HashMap的實現原理

（面試過程中也經常會被問到）：數組和鏈表組合成的鏈表散列結構，通過hash算法，盡量將數組中的數據分布均勻，如果hashcode相同再比較equals方法，如果equals方法返回false，那么就將數據以鏈表的形式存儲在數組的對應位置，并將之前在該位置的數據往鏈表的后面移動，并記錄一個next屬性，來指示后移的那個數據。

注意：數組中保存的是entry（其中保存的是鍵值）

Python實現

class Entry: def __init__(self, hash_code, v, prev=None, next=None): self.hash_code = hash_code self.v = v self.prev = prev self.next = next def __str__(self): return "Entry{hash_code=%d, v=%s}" % (  self.hash_code, self.v) __repr__ = __str__class LRUCache: def __init__(self, max_size): self._max_size = max_size self._dict = dict() self._head = Entry(None, None) self._head.prev = self._head self._head.next = self._head def __setitem__(self, k, v): try:  hash_code = hash(k) except TypeError:  raise old_entry = self._dict.get(hash_code) new_entry = Entry(hash_code, v) self._dict[hash_code] = new_entry if old_entry:  prev = old_entry.prev  next = old_entry.next  prev.next = next  next.prev = prev head = self._head head_prev = self._head.prev head_next = self._head.next head.next = new_entry if head_prev is head:  head.prev = new_entry head_next.prev = new_entry new_entry.prev = head new_entry.next = head_next if not old_entry and len(self._dict) > self._max_size:  last_one = head.prev  last_one.prev.next = head  head.prev = last_one.prev  self._dict.pop(last_one.hash_code) def __getitem__(self, k): entry = self._dict[hash(k)] head = self._head head_next = head.next prev = entry.prev next = entry.next if entry.prev is not head:  if head.prev is entry:  head.prev = prev  head.next = entry  head_next.prev = entry  entry.prev = head  entry.next = head_next  prev.next = next  next.prev = prev return entry.v def get_dict(self): return self._dictif __name__ == "__main__": cache = LRUCache(2) inner_dict = cache.get_dict() cache[1] = 1 assert inner_dict.keys() == [1], "test 1" cache[2] = 2 assert sorted(inner_dict.keys()) == [1, 2], "test 2" cache[3] = 3 assert sorted(inner_dict.keys()) == [2, 3], "test 3" cache[2] assert sorted(inner_dict.keys()) == [2, 3], "test 4" assert inner_dict[hash(2)].next.v == 3 cache[4] = 4 assert sorted(inner_dict.keys()) == [2, 4], "test 5" assert inner_dict[hash(4)].v == 4, "test 6"

上一篇：Python2與python3中 for 循環語句基礎與實例分析

下一篇：python中urlparse模塊介紹與使用示例