Redis中LRU算法的案例

發(fā)布時(shí)間：2020-08-28 14:22:39 來(lái)源：億速云閱讀：142 作者：小新欄目：關(guān)系型數(shù)據(jù)庫(kù)

這篇文章將為大家詳細(xì)講解有關(guān)Redis中LRU算法的案例，小編覺(jué)得挺實(shí)用的，因此分享給大家做個(gè)參考，希望大家閱讀完這篇文章后可以有所收獲。

Redis的LRU算法

LRU算法背后的的思想在計(jì)算機(jī)科學(xué)中無(wú)處不在，它與程序的"局部性原理"很相似。在生產(chǎn)環(huán)境中，雖然有Redis內(nèi)存使用告警，但是了解一下Redis的緩存使用策略還是很有好處的。下面是生產(chǎn)環(huán)境下Redis使用策略：最大可用內(nèi)存限制為4GB，采用 allkeys-lru 刪除策略。所謂刪除策略：當(dāng)redis使用已經(jīng)達(dá)到了最大內(nèi)存，比如4GB時(shí)，如果這時(shí)候再往redis里面添加新的Key，那么Redis將選擇一個(gè)Key刪除。那如何選擇合適的Key刪除呢？

CONFIG GET maxmemory
"maxmemory"
"4294967296"
CONFIG GET maxmemory-policy
"maxmemory-policy"
"allkeys-lru"

在官方文檔Using Redis as an LRU cache描述中，提供了好幾種刪除策略，比如 allkeys-lru、volatile-lru等。在我看來(lái)按選擇時(shí)考慮三個(gè)因素：隨機(jī)、Key最近被訪問(wèn)的時(shí)間、Key的過(guò)期時(shí)間(TTL)

比如：allkeys-lru，"統(tǒng)計(jì)一下所有的"Key歷史訪問(wèn)的時(shí)間，把"最老"的那個(gè)Key移除。注意：我這里加了引號(hào)，其實(shí)在redis的具體實(shí)現(xiàn)中，要統(tǒng)計(jì)所有的Key的最近訪問(wèn)時(shí)間代價(jià)是很大的。想想，如何做到呢？

evict keys by trying to remove the less recently used (LRU) keys first, in order to make space for the new data added.

再比如：allkeys-random，就是隨機(jī)選擇一個(gè)Key，將之移除。

evict keys randomly in order to make space for the new data added.

再比如：volatile-lru，它只移除那些使用 expire 命令設(shè)置了過(guò)期時(shí)間的Key，根據(jù)LRU算法來(lái)移除。

evict keys by trying to remove the less recently used (LRU) keys first, but only among keys that have an expire set, in order to make space for the new data added.

再比如：volatile-ttl，它只移除那些使用 expire 命令設(shè)置了過(guò)期時(shí)間的Key，哪個(gè)Key的存活時(shí)間(TTL KEY 越小)越短，就優(yōu)先移除。

evict keys with an expire set, and try to evict keys with a shorter time to live (TTL) first, in order to make space for the new data added.

volatile 策略(eviction methods) 作用的Key 是那些設(shè)置了過(guò)期時(shí)間的Key。在redisDb結(jié)構(gòu)體中，定義了一個(gè)名為 expires 字典(dict)保存所有的那些用expire命令設(shè)置了過(guò)期時(shí)間的key，其中expires字典的鍵指向redis 數(shù)據(jù)庫(kù)鍵空間(redisServer--->redisDb--->redisObject)中的某個(gè)鍵，而expires字典的值則是這個(gè)鍵的過(guò)期時(shí)間(long類型整數(shù))。
額外提一下：redis 數(shù)據(jù)庫(kù)鍵空間是指：在結(jié)構(gòu)體redisDb中定義的一個(gè)名為"dict"，類型為hash字典的一個(gè)指針，它用來(lái)保存該redis DB中的每一個(gè)鍵對(duì)象、以及相應(yīng)的值對(duì)象。

既然有這么多策略，那我用哪個(gè)好呢？這就涉及到Redis中的Key的訪問(wèn)模式了(access-pattern)，access-pattern與代碼業(yè)務(wù)邏輯相關(guān)，比如說(shuō)符合某種特征的Key經(jīng)常被訪問(wèn)，而另一些Key卻不怎么用到。如果所有的Key都可能機(jī)會(huì)均等地被我們的應(yīng)用程序訪問(wèn)，那它的訪問(wèn)模式服從均勻分布；而大部分情況下，訪問(wèn)模式服從冪指分布(power-law distribution)，另外Key的訪問(wèn)模式也有可能是隨著時(shí)間變化的，因此需要一種合適的刪除策略能夠catch 住 (捕獲住)各種情形。而在冪指分布下，LRU是一種很好的策略：

While caches can’t predict the future, they can reason in the following way: keys that are likely to be requested again are keys that were recently requested often. Since usually access patterns don’t change very suddenly, this is an effective strategy.

Redis中LRU策略的實(shí)現(xiàn)

最直觀的想法：LRU啊，記錄下每個(gè)key 最近一次的訪問(wèn)時(shí)間（比如unix timestamp），unix timestamp最小的Key，就是最近未使用的，把這個(gè)Key移除?？聪聛?lái)一個(gè)HashMap就能搞定啊。是的，但是首先需要存儲(chǔ)每個(gè)Key和它的timestamp。其次，還要比較timestamp得出最小值。代價(jià)很大，不現(xiàn)實(shí)啊。

第二種方法：換個(gè)角度，不記錄具體的訪問(wèn)時(shí)間點(diǎn)(unix timestamp)，而是記錄idle time：idle time越小，意味著是最近被訪問(wèn)的。

The LRU algorithm evicts the Least Recently Used key, which means the one with the greatest idle time.

Redis中LRU算法的案例

比如A、B、C、D四個(gè)Key，A每5s訪問(wèn)一次，B每2s訪問(wèn)一次，C和D每10s訪問(wèn)一次。（一個(gè)波浪號(hào)代表1s），從上圖中可看出：A的空閑時(shí)間是2s，B的idle time是1s，C的idle time是5s，D剛剛訪問(wèn)了所以idle time是0s

這里，用一個(gè)雙向鏈表(linkedlist)把所有的Key鏈表起來(lái)，如果一個(gè)Key被訪問(wèn)了，將就這個(gè)Key移到鏈表的表頭，而要移除Key時(shí)，直接從表尾移除。

It is simple to implement because all we need to do is to track the last time a given key was accessed, or sometimes this is not even needed: we may just have all the objects we want to evict linked in a linked list.

但是在redis中，并沒(méi)有采用這種方式實(shí)現(xiàn)，它嫌LinkedList占用的空間太大了。Redis并不是直接基于字符串、鏈表、字典等數(shù)據(jù)結(jié)構(gòu)來(lái)實(shí)現(xiàn)KV數(shù)據(jù)庫(kù)，而是在這些數(shù)據(jù)結(jié)構(gòu)上創(chuàng)建了一個(gè)對(duì)象系統(tǒng)Redis Object。在redisObject結(jié)構(gòu)體中定義了一個(gè)長(zhǎng)度24bit的unsigned類型的字段，用來(lái)存儲(chǔ)對(duì)象最后一次被命令程序訪問(wèn)的時(shí)間：

By modifying a bit the Redis Object structure I was able to make 24 bits of space. There was no room for linking the objects in a linked list (fat pointers!)

畢竟，并不需要一個(gè)完全準(zhǔn)確的LRU算法，就算移除了一個(gè)最近訪問(wèn)過(guò)的Key，影響也不太。

To add another data structure to take this metadata was not an option, however since LRU is itself an approximation of what we want to achieve, what about approximating LRU itself?

最初Redis是這樣實(shí)現(xiàn)的：

隨機(jī)選三個(gè)Key，把idle time最大的那個(gè)Key移除。后來(lái)，把3改成可配置的一個(gè)參數(shù)，默認(rèn)為N=5：maxmemory-samples 5

when there is to evict a key, select 3 random keys, and evict the one with the highest idle time

就是這么簡(jiǎn)單，簡(jiǎn)單得讓人不敢相信了，而且十分有效。但它還是有缺點(diǎn)的：每次隨機(jī)選擇的時(shí)候，并沒(méi)有利用歷史信息。在每一輪移除(evict)一個(gè)Key時(shí)，隨機(jī)從N個(gè)里面選一個(gè)Key，移除idle time最大的那個(gè)Key；下一輪又是隨機(jī)從N個(gè)里面選一個(gè)Key...有沒(méi)有想過(guò)：在上一輪移除Key的過(guò)程中，其實(shí)是知道了N個(gè)Key的idle time的情況的，那我能不能在下一輪移除Key時(shí)，利用好上一輪知曉的一些信息？

However if you think at this algorithm across its executions, you can see how we are trashing a lot of interesting data. Maybe when sampling the N keys, we encounter a lot of good candidates, but we then just evict the best, and start from scratch again the next cycle.

start from scratch太傻了。于是Redis又做出了改進(jìn)：采用緩沖池(pooling)

當(dāng)每一輪移除Key時(shí)，拿到了這個(gè)N個(gè)Key的idle time，如果它的idle time比 pool 里面的 Key的idle time還要大，就把它添加到pool里面去。這樣一來(lái)，每次移除的Key并不僅僅是隨機(jī)選擇的N個(gè)Key里面最大的，而且還是pool里面idle time最大的，并且：pool 里面的Key是經(jīng)過(guò)多輪比較篩選的，它的idle time 在概率上比隨機(jī)獲取的Key的idle time要大，可以這么理解：pool 里面的Key 保留了"歷史經(jīng)驗(yàn)信息"。

Basically when the N keys sampling was performed, it was used to populate a larger pool of keys (just 16 keys by default). This pool has the keys sorted by idle time, so new keys only enter the pool when they have an idle time greater than one key in the pool or when there is empty space in the pool.

采用"pool"，把一個(gè)全局排序問(wèn)題轉(zhuǎn)化成為了局部的比較問(wèn)題。(盡管排序本質(zhì)上也是比較，囧)。要想知道idle time 最大的key，精確的LRU需要對(duì)全局的key的idle time排序，然后就能找出idle time最大的key了。但是可以采用一種近似的思想，即隨機(jī)采樣(samping)若干個(gè)key，這若干個(gè)key就代表著全局的key，把samping得到的key放到pool里面，每次采樣之后更新pool，使得pool里面總是保存著隨機(jī)選擇過(guò)的key的idle time最大的那些key。需要evict key時(shí)，直接從pool里面取出idle time最大的key，將之evict掉。這種思想是很值得借鑒的。

至此，基于LRU的移除策略就分析完了。Redis里面還有一種基于LFU（訪問(wèn)頻率）的移除策略，后面有時(shí)間再說(shuō)。

JAVA 里面的LRU實(shí)現(xiàn)

JDK中的LinkedHashMap實(shí)現(xiàn)了LRU算法，使用如下構(gòu)造方法，accessOrder 表示"最近最少未使用"的衡量標(biāo)準(zhǔn)。比如accessOrder=true，當(dāng)執(zhí)行java.util.LinkedHashMap#get元素時(shí)，就表示這個(gè)元素最近被訪問(wèn)了。

/**
     * Constructs an empty <tt>LinkedHashMap</tt> instance with the
     * specified initial capacity, load factor and ordering mode.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @param  accessOrder     the ordering mode - <tt>true</tt> for
     *         access-order, <tt>false</tt> for insertion-order
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public LinkedHashMap(int initialCapacity,
                         float loadFactor,
                         boolean accessOrder) {
        super(initialCapacity, loadFactor);
        this.accessOrder = accessOrder;
    }

再重寫：java.util.LinkedHashMap#removeEldestEntry方法即可。

The {@link #removeEldestEntry(Map.Entry)} method may be overridden to impose a policy for removing stale mappings automatically when new mappings are added to the map.

為了保證線程安全，用Collections.synchronizedMap將LinkedHashMap對(duì)象包裝起來(lái):

Note that this implementation is not synchronized. If multiple threads access a linked hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the map.

實(shí)現(xiàn)如下：(org.elasticsearch.transport.TransportService)

final Map<Long, TimeoutInfoHolder> timeoutInfoHandlers =
        Collections.synchronizedMap(new LinkedHashMap<Long, TimeoutInfoHolder>(100, .75F, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry eldest) {
                return size() > 100;
            }
        });

當(dāng)容量超過(guò)100時(shí)，開(kāi)始執(zhí)行LRU策略：將最近最少未使用的 TimeoutInfoHolder 對(duì)象 evict 掉。

關(guān)于Redis中LRU算法的案例就分享到這里了，希望以上內(nèi)容可以對(duì)大家有一定的幫助，可以學(xué)到更多知識(shí)。如果覺(jué)得文章不錯(cuò)，可以把它分享出去讓更多的人看到。

向AI問(wèn)一下細(xì)節(jié)

Redis中LRU算法的案例

Redis的LRU算法

Redis中LRU策略的實(shí)現(xiàn)

JAVA 里面的LRU實(shí)現(xiàn)

猜你喜歡

最新資訊

相關(guān)推薦

相關(guān)標(biāo)簽