The essence of caching is to keep frequently accessed data in a limited amount of space, reducing dependence on slow storage (such as disks or databases) or on expensive computation (such as the $QK$ projections in self-attention). Because that space is limited, adding new data requires evicting some old data, which is where cache eviction policies come in. The core idea of the LRU (Least Recently Used) algorithm is to evict the least recently used data first; by managing cache space efficiently, it keeps hot data resident and improves system performance. LRU is widely used in operating systems, databases, web services, and similar scenarios, and its elegance lies in balancing cache hit rate against operational cost. The design rests on one assumption, the principle of temporal locality: if a piece of data was accessed recently, it is more likely to be accessed again soon; conversely, data that has gone unused for a long time is unlikely to be needed and should be evicted first.

Paper: A study of replacement algorithms for a virtual-storage computer (L. A. Belady, IBM Systems Journal, 1966)

LRU Algorithm Principles

  • Core assumption: based on the principle of locality, data that was accessed recently is more likely to be accessed again in the near future.

  • Eviction rule: when the cache reaches its capacity limit, remove the entry that has gone unused the longest.

  • Key operations (see the minimal sketch after this list):

    1. On every access (read or write), mark the entry as "most recently used".

    2. When the cache is full, remove the entry that has gone longest without being marked.
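
As a concrete illustration of these two operations, here is a minimal, single-threaded sketch built on Python's collections.OrderedDict (illustrative only; the thread-safe implementation used in this article appears below):

# Minimal LRU sketch: the two key operations map onto OrderedDict primitives
from collections import OrderedDict

cache = OrderedDict()
CAPACITY = 2

def access(key, value):
    if key in cache:
        cache.move_to_end(key)     # 1. mark as "most recently used"
    cache[key] = value
    if len(cache) > CAPACITY:
        cache.popitem(last=False)  # 2. evict the least recently used entry

access('a', 1)
access('b', 2)
access('a', 1)      # 'a' is refreshed; 'b' is now least recently used
access('c', 3)      # capacity exceeded, so 'b' is evicted
print(list(cache))  # ['a', 'c']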

LRU Cache Implementation

A common LRU implementation combines two data structures: a doubly linked list and a hash table. The doubly linked list maintains access order, with entries near the head being the most recently used and entries near the tail the least recently used; the hash table stores the cached key-value pairs and provides O(1) lookup. (Python's collections.OrderedDict pairs a hash table with exactly such a linked ordering, which is why the implementation below builds on it.)
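
For intuition, here is a minimal, single-threaded sketch of that classic layout; the class name LinkedLRU and its method names are made up for this example, and the article's actual implementation follows below:

# Sketch of the classic hash-table + doubly-linked-list LRU layout
class _Node:
    __slots__ = ('key', 'value', 'prev', 'next')

    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None


class LinkedLRU:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.map = {}  # hash table: key -> node, O(1) lookup
        # Sentinels: the head side is most recent, the tail side least recent.
        self.head, self.tail = _Node(), _Node()
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node: _Node) -> None:
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node: _Node) -> None:
        # Splice the node in right after head: it is now most recently used.
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.map.get(key)
        if node is None:
            return None
        self._unlink(node)
        self._push_front(node)  # every hit refreshes recency in O(1)
        return node.value

    def put(self, key, value) -> None:
        if key in self.map:
            node = self.map[key]
            node.value = value
            self._unlink(node)
        else:
            if len(self.map) >= self.capacity:
                lru = self.tail.prev  # evict from the least-recently-used end
                self._unlink(lru)
                del self.map[lru.key]
            node = self.map[key] = _Node(key, value)
        self._push_front(node)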

The following is the full, thread-safe Python implementation: a basic cache first, then an LRU layer on top of it.

# Basic cache implementation
import threading

from collections import OrderedDict
from typing import Any


class BaseCache:
    def __init__(self):
        self._cache = OrderedDict()
        self._cache_lock = threading.RLock()

    def read(self, key: Any) -> Any | None:
        with self._cache_lock:
            return self._cache.get(key)

    def write(self, key: Any, value: Any) -> Any | None:
        with self._cache_lock:
            self._cache[key] = value
        return value

    def flush(self) -> None:
        with self._cache_lock:
            self._cache.clear()
        return

    def delete(self, key: Any) -> Any | None:
        with self._cache_lock:
            if key in self._cache:
                return self._cache.pop(key)
        return None

    def __contains__(self, key: Any) -> bool:
        with self._cache_lock:
            return key in self._cache

    def __getitem__(self, key: Any) -> Any:
        return self.read(key=key)

    def __setitem__(self, key: Any, value: Any) -> Any | None:
        return self.write(key=key, value=value)

    def __delitem__(self, key: Any) -> Any | None:
        return self.delete(key=key)

    def __str__(self) -> str:
        with self._cache_lock:
            # Render the cache as a fixed-width two-column table.
            _max_len = 8
            for key, value in self._cache.items():
                _max_len = max(_max_len, len(str(key)), len(str(value)))
            _padding = 2
            _max_len += _padding
            line_splitter = f'\n{"+" + "=" * (_max_len * 2 + 1) + "+"}\n'
            header = f'{self.__class__.__name__}(size={len(self._cache)})'
            stringify = (f'{header:^{_max_len * 2 + 2}}{line_splitter}'
                         + f'|{"Key":^{_max_len}}|{"Value":^{_max_len}}|{line_splitter}'
                         + line_splitter.join([f'|{str(k):^{_max_len}}|{str(v):^{_max_len}}|' for k, v in self._cache.items()])
                         + (line_splitter if len(self) else ''))
        return stringify

    def __repr__(self) -> str:
        return self.__str__()

    def __len__(self) -> int:
        with self._cache_lock:
            return len(self._cache)

    def __eq__(self, other) -> bool:
        # Compare by rendered content; str(self) takes the lock internally.
        return str(self) == str(other)

    def __call__(self, key: Any) -> Any:
        return self.read(key=key)
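
Before layering the LRU policy on top, a quick sanity check of BaseCache (a hypothetical session exercising the dunder methods above):

# Illustrative BaseCache usage
cache = BaseCache()
cache['x'] = 1          # __setitem__ delegates to write
assert cache('x') == 1  # __call__ delegates to read
assert 'x' in cache     # __contains__
del cache['x']          # __delitem__ delegates to delete
assert len(cache) == 0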

# Extend the basic cache with an LRU eviction policy
import threading

from typing import Any
from overrides import override

from vortezwohl.cache.base_cache import BaseCache


class LRUCache(BaseCache):
    def __init__(self, capacity: int):
        super().__init__()
        self._capacity = capacity
        self._recently_used = list()
        self._recently_used_lock = threading.RLock()

    def expand_to(self, capacity: int) -> None:
        # Adjust capacity under the lock; if the cache shrinks, evict the
        # oldest entries until it fits again.
        with self._recently_used_lock:
            self._capacity = capacity
            while len(self._recently_used) > self._capacity:
                self.delete(self.__recently_used_dequeue())

    def __recently_used_enqueue(self, key: Any) -> Any:
        with self._recently_used_lock:
            # Move the key to the tail of the recency list, i.e. mark it as
            # most recently used. Note: list.remove is O(n); a true doubly
            # linked list would make this O(1).
            if key in self._recently_used:
                self._recently_used.remove(key)
            self._recently_used.append(key)
            # Evict from the head (least recently used) while over capacity.
            while len(self._recently_used) > self._capacity:
                self.delete(self.__recently_used_dequeue())
        return key

    def __recently_used_dequeue(self) -> Any | None:
        key = None
        with self._recently_used_lock:
            if len(self._recently_used) > 0:
                key = self._recently_used.pop(0)
        return key

    @override
    def read(self, key: Any) -> Any | None:
        # Refresh recency only on a hit: marking a missing key as "recently
        # used" would occupy a slot and could evict a live entry.
        if key in self:
            self.__recently_used_enqueue(key=key)
        return super().read(key=key)

    @override
    def write(self, key: Any, value: Any) -> Any | None:
        # Mark the key as most recently used; this may evict the LRU entry.
        self.__recently_used_enqueue(key=key)
        return super().write(key=key, value=value)

    @override
    def delete(self, key: Any) -> Any | None:
        # Keep the recency list consistent when an entry is removed directly.
        with self._recently_used_lock:
            if key in self._recently_used:
                self._recently_used.remove(key)
        return super().delete(key=key)
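
A short usage example of the resulting class: with capacity 2, reading 'a' refreshes its recency, so writing a third key evicts 'b' (an illustrative session, assuming the read/write/delete semantics above):

# Illustrative LRUCache usage
cache = LRUCache(capacity=2)
cache.write('a', 1)
cache.write('b', 2)
cache.read('a')      # 'a' becomes most recently used
cache.write('c', 3)  # over capacity: 'b' (least recently used) is evicted
assert 'b' not in cache
assert cache.read('a') == 1 and cache.read('c') == 3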