Caching, at its core, uses a limited amount of fast storage to hold frequently accessed data, reducing dependence on slow storage (such as disk or a database) or expensive computation (such as the $QK$ projections in self-attention). Because cache space is finite, adding new data eventually forces some old data out, which is where cache eviction policies come in. The LRU (Least Recently Used) algorithm evicts the data that has gone unused the longest; by managing cache space this way it keeps hot data resident and improves system performance. LRU is widely used in operating systems, databases, and web services, and its appeal lies in balancing cache hit rate against per-operation cost. The design rests on one assumption, the principle of temporal locality: if a piece of data was accessed recently, it is likely to be accessed again soon; conversely, data that has not been touched for a long time is unlikely to be needed and should be evicted first.
Paper: L. A. Belady, "A study of replacement algorithms for a virtual-storage computer", IBM Systems Journal, 1966.
LRU Algorithm Principles
- Core assumption: by the principle of locality, data accessed recently is more likely to be accessed again in the near future.
- Eviction rule: when the cache reaches its capacity limit, remove the entry that has gone unused the longest.
- Key operations:
  - On every access (read or write), mark the entry as "recently used".
  - When the cache is full, remove the entry that was least recently marked.
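These rules are easy to observe directly: Python's standard library ships an LRU cache for function memoization in functools.lru_cache. The snippet below is a quick illustrative sketch (the function square is just an example, unrelated to the implementation that follows) showing the eviction rule with a capacity of 2:

import functools

@functools.lru_cache(maxsize=2)  # the cache holds at most 2 results
def square(x: int) -> int:
    print(f'computing {x}')
    return x * x

square(1)  # miss: computed and cached         -> cache (LRU..MRU): [1]
square(2)  # miss: computed and cached         -> [1, 2]
square(1)  # hit: 1 becomes most recently used -> [2, 1]
square(3)  # miss: cache full, 2 is evicted    -> [1, 3]
square(2)  # miss again: 2 was evicted and must be recomputed
print(square.cache_info())  # CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)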
LRU Cache Implementation
A common LRU implementation combines two data structures: a doubly linked list and a hash table. The doubly linked list maintains access order: entries near the head are the most recently used and entries near the tail the least recently used; the hash table provides O(1) key-value storage. (In CPython, collections.OrderedDict is built from exactly this pair: a hash table whose entries are also threaded onto a doubly linked list.)
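Before the full, thread-safe implementation, here is a minimal sketch of that idea using collections.OrderedDict, whose move_to_end and popitem(last=False) correspond to moving a node to the head and removing the tail of the linked list (the class name MiniLRU and its methods are illustrative, not part of the implementation below):

from collections import OrderedDict
from typing import Any


class MiniLRU:
    def __init__(self, capacity: int):
        self._capacity = capacity
        self._cache: OrderedDict[Any, Any] = OrderedDict()

    def get(self, key: Any) -> Any | None:
        if key not in self._cache:
            return None
        self._cache.move_to_end(key)  # mark as most recently used ("move to head")
        return self._cache[key]

    def put(self, key: Any, value: Any) -> None:
        if key in self._cache:
            self._cache.move_to_end(key)
        self._cache[key] = value
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used ("remove tail")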
Below is a Python implementation of an LRU cache, built in two layers: a thread-safe BaseCache, and an LRUCache subclass that adds the eviction policy.
# Basic cache implementation
import threading
from collections import OrderedDict
from typing import Any


class BaseCache:
    def __init__(self):
        # OrderedDict provides hash-table lookup plus stable entry order
        self._cache = OrderedDict()
        # re-entrant lock, so nested calls from the same thread do not deadlock
        self._cache_lock = threading.RLock()

    def read(self, key: Any) -> Any | None:
        with self._cache_lock:
            return self._cache.get(key, None)

    def write(self, key: Any, value: Any) -> Any | None:
        with self._cache_lock:
            self._cache[key] = value
        return value

    def flush(self) -> None:
        with self._cache_lock:
            self._cache.clear()

    def delete(self, key: Any) -> Any | None:
        with self._cache_lock:
            # pop returns the removed value, or None if the key is absent
            return self._cache.pop(key, None)

    def __contains__(self, key: Any) -> bool:
        with self._cache_lock:
            return key in self._cache

    def __getitem__(self, key: Any) -> Any:
        return self.read(key=key)

    def __setitem__(self, key: Any, value: Any) -> Any | None:
        return self.write(key=key, value=value)

    def __delitem__(self, key: Any) -> Any | None:
        return self.delete(key=key)

    def __str__(self) -> str:
        # render the cache as an ASCII table of key/value pairs
        with self._cache_lock:
            _max_len = 8
            for key, value in self._cache.items():
                _max_len = max(_max_len, len(str(key)), len(str(value)))
            _padding = 2
            _max_len += _padding
            line_splitter = f'\n{"+" + "=" * (_max_len * 2 + 1) + "+"}\n'
            header = f'{self.__class__.__name__}(size={len(self._cache)})'
            return (f'{header:^{_max_len * 2 + 2}}{line_splitter}'
                    + f'|{"Key":^{_max_len}}|{"Value":^{_max_len}}|{line_splitter}'
                    + line_splitter.join([f'|{str(k):^{_max_len}}|{str(v):^{_max_len}}|' for k, v in self._cache.items()])
                    + (line_splitter if len(self) else ''))

    def __repr__(self) -> str:
        return self.__str__()

    def __len__(self) -> int:
        with self._cache_lock:
            return len(self._cache)

    def __eq__(self, other) -> bool:
        # equality compares rendered tables: same entries in the same order
        return str(self) == str(other)

    def __call__(self, key: Any) -> Any:
        return self.read(key=key)
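A quick usage sketch for BaseCache (the keys and values here are arbitrary examples):

cache = BaseCache()
cache['a'] = 1           # __setitem__ delegates to write
cache.write('b', 2)
print(cache.read('a'))   # 1
print('b' in cache)      # True, via __contains__
cache.delete('a')
print(len(cache))        # 1
print(cache)             # ASCII table of the remaining entries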
# Extend the basic cache with an LRU eviction policy
import threading
from typing import Any

from overrides import override

from vortezwohl.cache.base_cache import BaseCache


class LRUCache(BaseCache):
    def __init__(self, capacity: int):
        super().__init__()
        self._capacity = capacity
        # recency queue: index 0 holds the least recently used key,
        # the last index the most recently used
        self._recently_used = list()
        self._recently_used_lock = threading.RLock()

    def expand_to(self, capacity: int) -> None:
        # shrinking takes effect lazily: the next enqueue evicts down to the new capacity
        self._capacity = capacity

    def __recently_used_enqueue(self, key: Any) -> Any:
        with self._recently_used_lock:
            # move the key to the most-recently-used end of the queue
            if key in self._recently_used:
                self._recently_used.remove(key)
            self._recently_used.append(key)
            # evict least recently used entries until capacity is respected
            while len(self._recently_used) > self._capacity:
                self.delete(self.__recently_used_dequeue())
            return key

    def __recently_used_dequeue(self) -> Any | None:
        with self._recently_used_lock:
            if len(self._recently_used) > 0:
                return self._recently_used.pop(0)
            return None

    @override
    def read(self, key: Any) -> Any | None:
        # refresh recency only on a hit; a missing key must not enter the
        # recency queue, or it could push a live entry out of the cache
        with self._recently_used_lock:
            if key in self._recently_used:
                self.__recently_used_enqueue(key=key)
            return super().read(key=key)

    @override
    def write(self, key: Any, value: Any) -> Any | None:
        self.__recently_used_enqueue(key=key)
        return super().write(key=key, value=value)

    @override
    def delete(self, key: Any) -> Any | None:
        # keep the recency queue consistent with the underlying cache
        with self._recently_used_lock:
            if key in self._recently_used:
                self._recently_used.remove(key)
            return super().delete(key=key)
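And a usage sketch for LRUCache showing eviction in action (the keys are arbitrary examples; the table rendering comes from BaseCache.__str__):

cache = LRUCache(capacity=2)
cache.write('a', 1)      # recency order (LRU..MRU): [a]
cache.write('b', 2)      # [a, b]
cache.read('a')          # hit: a becomes most recently used -> [b, a]
cache.write('c', 3)      # capacity exceeded: b is evicted   -> [a, c]
print('b' in cache)      # False
print(cache.read('a'))   # 1
print(cache.read('c'))   # 3

Note that the list-based recency queue makes each access O(n), since list.remove scans the whole list; replacing it with an OrderedDict, as in the MiniLRU sketch above, brings recency updates down to O(1), which is exactly the benefit of the doubly-linked-list-plus-hash-table design.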