Redis is often used for caching, in a setup where a fixed maximum memory to use is specified. When new data arrives, 
we need to make space by removing old data. The efficiency of Redis as a cache is related to how good decisions it 
makes about what data to evict: 
deleting data that is going to be needed soon is a poor strategy, 
while deleting data that is unlikely to be requested again is a good one.


redis 用作缓存，当最大的内存被设置，当内存满了的时候，redis需要决定删除
什么样的数据。如果要删除的数据马上就要被访问则是一个很差的策略， 被删除的
数据应该是接近不会再次被访问的情况。

In other terms every cache has an hits/misses ratio, which is, in qualitative terms, 
just the percentage of read queries that the cache is able to serve. 

换句话说，每个缓存都有命中/未命中比率，用量化术语来说，就是缓存能够服务的读查询的百分比。
Accesses to the keys of a cache are not distributed evenly among the data set in most workloads. 
在大多数工作负载中，对缓存键的访问并不均匀地分布在数据集中。

Often a small percentage of keys get a very large percentage of all the accesses.
Moreover the access pattern often changes over time, which means that as time passes 
certain keys that were very requested may no longer be accessed often, and conversely, 
keys that once were not popular may turn into the most accessed keys.
一小部分数据获得了非常大的访问量，而且访问模式也经常随着时间变化而变化，意味着，某段时间内被频繁访问的键不再频繁被访问。 
相反，之前不是热点数据的key，转变为访问最多的key。


So in general what a cache should try to do is to retain the keys that have the highest probability of 
being accessed in the future. 

所以缓存要做的一件事情就是，要保留未来最有可能被访问的key。

From the point of view of an eviction policy (the policy used to make space to allow new data to enter) 
this translates into the contrary: the key with the least probability of being accessed in the future 
should be removed from the data set. There is only one problem: Redis and other caches are not able to predict the future.

从缓存驱逐策略的角度来说，可以这样理解， 在未来某些可以如果很小的可能性被访问
就应该从数据库汇总删除。  当然，还有一个问题就是缓存和 Redis都不能预测未来。


The LRU algorithm
===

While caches can’t predict the future, they can reason in the following way: 
虽然缓存不能预测未来，但它们可以通过以下方式进行推理
keys that are likely to be requested again are keys that were recently requested often. 

最近被频繁访问的 key， 可能会被再次访问。 

Since usually access patterns don’t change very suddenly, this is an effective strategy. 

由于通常情况下访问模式不会立即改变，这是一种有效的策略。 

However the notion of “recently requested often” is more insidious that it may
 look at a first glance (we’ll return shortly on this). 

然而，“最近请求频繁”的概念可能乍一看更为晦涩

So this concept is simplified into an algorithm that is called LRU, which 
instead just tracks the *last time* a key was requested.

因此，这个概念被简化为一个称为LRU的算法，它只跟踪*上次*请求 Key 的时间。

Keys that are accessed with an higher frequency have a greater probability of being idle (not accessed) 
for a shorter time compared to keys that are rarely accessed.

与很少访问的键相比，以更高的频率访问的键在更短的时间内处于空闲(未访问)的可能性更大【  意味着，频繁访问的键有可能在短期内再次被访问】。


For instance this is a representation of four different keys accesses over time. 
Each “~” character is one second, while the “|” line at the end is the current instant.

~~~~~A~~~~~A~~~~~A~~~~A~~~~~A~~~~~A~~|
~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~|
~~~~~~~~~~C~~~~~~~~~C~~~~~~~~~C~~~~~~|
~~~~~D~~~~~~~~~~D~~~~~~~~~D~~~~~~~~~D|

Key A is accessed one time every 5 seconds, 
key B once every 2 seconds
and key C and D are both accessed every 10 seconds.

Given the high frequency of accesses of key B, it has among the lowest idle  

times, which means its last access time is the second most recent among all the
four keys.

B: 最短的空闲时间 ,但是从时间线上来看，最近使用的Key排名第二。



Similarly A and C idle time of 2 and 6 seconds well reflect the access
frequency of both those keys. However as you can see this trick does not
always work: key D is accessed every 10 seconds, however it has the most
recent access time of all the keys.

严格按照LRU算法的话，这种方式运行的不是很好， A,C 的空闲时间为 2s 和 6s 
D 访问频率是 10s/次 , 但是按时间追踪却是最新的




Still, in the long run, this algorithm works well enough. Usually keys
with a greater access frequency have a smaller idle time.
长期来看这种方式运行的很好， 被频繁访问的key 有更小的空闲时间。
The LRU algorithm evicts the Least Recently Used key, which means the one with
the greatest idle time. 
更长空闲时间的key会被 删除掉 
It is simple to implement because all we need to do is to track the 
last time a given key was accessed, or sometimes
this is not even needed: 
这很容易实现，我们需要做的就是去追踪每个key最后的访问时间，甚至不需要追踪时间，比如通过一个链表来维护

【链表实现方式】
we may just have all the objects we want to
evict linked in a linked list. When an object is accessed we move it
to the top of the list. When we want to evict objects, we evict from
the tail of the list. Tada! Win.

LRU in Redis: the genesis
===

Initially Redis had no support for LRU eviction. It was added later, when memory efficiency was a big concern. 
By modifying a bit the Redis Object structure I was able to make 24 bits of space. 
There was no room for linking the objects in a linked list (fat pointers!), 
moreover the implementation needed to be efficient, since the server performance 
should not drop too much because of the selection of the key to evict.

最初，Redis不支持驱逐LRU。它是后来添加的，当内存效率是一个大问题。
通过修改一点Redis对象结构，我可以创建24位的空间。没有空间来链接链表中的对象(胖指针!)
此外，实现需要高效，因为服务器性能不应因选择退出键而跌得太多。

The 24 bits in the object are enough to store the least significant
bits of the current unix time in seconds. This representation, called
“LRU clock” inside the source code of Redis, takes 194 days to overflow. 
Keys metadata are updated much often, so this was good enough.

对象中的24位足以存储当前unix时间的位(以秒为单位)。这表示,称为
“LRU clock” ，需要194天才能溢出。缓存数据经常更新，所以这已经足够好了。

However there was another more complex problem to solve, how to select
the key with the greatest idle time in order to evict it? The Redis
key space is represented via a flat hash table. To add another data
structure to take this metadata was not an option, however since
LRU is itself an approximation of what we want to achieve, what
about approximating LRU itself?

然而，还有一个更复杂的问题需要解决，那就是如何选择空闲时间最长的Key去删除它? Redis 的key本身用一个哈希表
来存储所有的key。添加另一个数据结构来存储这些过期选项不是一个选项，因为LRU本身就
是我们想要达到的目标的一个近似值关于LRU本身的近似?




The initial Redis algorithm was as simple as that: 
when there is to evict a key, select 3 random keys, and evict the one with the highest
idle time. Basically we do random sampling over the key space and evict
the key that happens to be the better. Later this “3 random keys”
was turned into a configurable “N random keys” and the algorithm
speed was improved so that the default was raised to 5 keys sampling
without losing performances. Considering how naive it was, it worked
well, very well actually. If you think at it, you always never do
the best decision with this algorithm, but is very unlikely to do
a very bad decision too. If there is a subset of very frequently accessed
keys in the data set, out of 5 keys it is hard to be so unlucky to
only sample keys with a very short idle time.

最初的Redis算法是这样简单:
========================================================================
当要驱逐一个Key 时，随机选择3个Key，驱逐最高空闲时间的那个。
基本上我们在Key 空间上做随机抽样和驱逐Key 是要更好。后来这“3个随机键”
变成了一个可配置的“N个随机Key”的算法速度得到了提高，因此默认值提高到了5个键采样
而不损失性能。很简单，它起作用了其实很好。如果你思考，你永远不会去做用这个算法做
出的最佳决策，但不太可能做出一个非常糟糕的决定。如果有一个非常频繁访问的子集
数据集中的键，5个键中很难有这么倒霉的只采样一个非常短的空闲时间键。



However if you think at this algorithm *across* its executions, you
can see how we are trashing a lot of interesting data. Maybe when
sampling the N keys, we encounter a lot of good candidates, but
we then just evict the best, and start from scratch again the next
cycle.

然而，如果你想一下这个算法的执行过程，你可以看出我们是如何丢弃大量有趣的数据的。也许当
对N个键进行抽样，我们会遇到很多好的候选项，但是然后我们只是驱逐最好的，然后重新开始下一个周期。

First rule of Fight Club is: observe your algorithms with naked eyes
===

At some point I was in the middle of working at the upcoming Redis
3.0 release. Redis 2.8 was actively used as an LRU cache in multiple
environments, and people didn’t complained too much about the
precision of the eviction in Redis, but it was clear that it could
be improved even without using a noticeable amount of additional CPU
time, and not a single bit of additional space.

However in order to improve something, you have to look at it. There
are different ways to look at LRU algorithms. You can write, for example,
tools that simulate different workloads, and check the hit/miss ratio
at the end. This is what I did, however the hit/miss ratio depends
a lot on the access pattern, so additionally to this information I
wrote an utility that actually displayed the algorithm quality in a
visual way.

The program was very simple: it added a given number of keys, then
accessed the keys sequentially so that each had a decreasing
idle time. Finally 50% more keys were added (the green ones in the
picture), so that half of the old keys needed to be evicted.

In a perfect LRU implementation no key from the new added keys are evicted, 
and the initial 50% of the old dataset is evicted.

This is the representation produced by the program for different
versions of Redis and different settings:

http://redis.io/images/redisdoc/lru_comparison.png

When looking at the graph remember that the implementation we
discussed so far is the one of Redis 2.8. The improvement you
see in Redis 3.0 is explained in the next section.

LRU V2: don’t trash away important information
===

With the new visual tool, I was able to try new approaches and
test them in a matter of minutes. The most obvious way to improve
the vanilla algorithm used by Redis was to accumulate the otherwise
trashed information in a “pool” of good candidates for eviction.

Basically when the N keys sampling was performed, it was used to
populate a larger pool of keys (just 16 keys by default).
This pool has the keys sorted by idle time, so new keys only enter
the pool when they have an idle time greater than one key in the
pool or when there is empty space in the pool.

基本上，当N个键采样时，它被用来填充一个更大的键池(默认情况下只有16个键)。
这个池按空闲时间对键排序，因此只输入新键当池中有大于一个键的空闲时间时或当池里有空位时。


This small change improved the performances of the algorithm
dramatically as you can see in the image I linked above and
the implementation was not so complex. A couple memmove() here
and there and a few profiling efforts, but I don’t remember
major bugs in this area.

这个小的改变极大地提高了算法的性能，正如我在上面链接的图片中所看到的，实现并不复杂。 


At the same time, a new redis-cli mode to test the LRU precision
was added (see the —lru-test option), so I had another way to
check the performances of the LRU code with a power-law access
pattern. This tool was used to validate with a different test that
the new algorithm worked better with a more real-world-ish workload.
It also uses pipelining and displays the accesses per second, so
can be used in order to benchmark different implementations, at least
to check obvious speed regressions.

Least Frequently Used
===

The reason I’m writing this blog post right now is because a couple
of days ago I worked at a partial reimplementation and to different
improvements to the Redis cache eviction code.

Everything started from an open issue: 

when you have multiple databases
with Redis 3.2, the algorithm evicts making local choices. So
if for example you have all keys with a small idle time in DB number 0,
and all keys with large idle time in DB number 1, Redis will evict
one key from each DB. A more rational choice is of course to start
evicting keys from DB number 1, and only later to evict the other keys.

当您在Redis 3.2中有多个数据库时，算法驱逐从本地选择。所以例如，
如果在db0 中，所有key 有一个比较短的空闲时间, 在db1 中，所有的key 有一个比较大的空闲时间。
Redis 将每个db中开始驱逐key。更合理的做法是从db1 中进行驱逐。 之后再去驱逐其他的key。




This is usually not a big deal, when Redis is used as a cache it is
rarely used with different DBs, however this is how I started to work
at the eviction code again. Eventually I was able to modify the pool
to include the database ID, and to use a single pool for all the DBs
instead of using multiple pools. It was slower, but by profiling and
tuning the code, eventually it was faster than the original
implementation by around 20%.

当Redis用作缓存时很少使用多个db，这通常不是什么大问题。最终我能够修改池
包括数据库ID，并为所有DBs使用一个池而不是使用多个池。它比较慢，但是通过剖析和
对代码进行调优，最终比原来的速度更快大约20%的实施。



However my curiosity for this subsystem of Redis was stimulated again
at that point, and I wanted to improve it. I spent a couple of days
trying to improve the LRU implementation: use a bigger pool maybe?
Account for the time that passes when selecting the best key?

After some time, and after refining my tools, I understood that the
LRU algorithm was limited by the amount of data sampled in the database
and was otherwise very good and hard to improve. This is, actually,
kinda evident from the image showing the different algorithms:
sampling 10 keys per cycle the algorithm was almost as accurate as
theoretical LRU.

Since the original algorithm was hard to improve, I started to test
new algorithms. If we rewind a bit to the start of the blog post, we
said that LRU is actually kinda a trick. What we really want is to
retain keys that have the maximum probability of being accessed in the
future, that are the keys *most frequently accessed*, not the ones with
the latest access.

由于原有的算法难以改进，我开始进行测试新算法。如果我们倒回一点博客文章的开头，我们
说LRU其实是一种技巧。我们真正想要的是方法中具有最大被访问概率的键，将来键是“最常访问的”，而不是“带”
最新的访问。

The algorithm evicting the keys with the least number of accesses
is called LFU. It means Least Frequently Used, which is the feature of
the keys that it attempts to kill to make space for new keys.

该算法以最少的访问次数逐出key被称为LFU。意味着最不常用的key将被驱逐来为新键腾出空间。

In theory LFU is as simple as associating a counter to each key. At
every access the counter gets incremented, so that we know that a given
key is accessed more frequently than another key.

从理论上讲，LFU就像为每个键关联一个计数器一样简单。每次访问key时计数器都递增，
所以我们可以知道一个给定的密钥比另一个key更频繁地被访问。

Well, there are at least a few more problems, not specific to Redis,
general issues of LFU implementations:



1. With LFU you cannot use the “move to head” linked list trick used for LRU 
in order to take elements sorted for eviction in a simple way, since keys 
must be ordered by number of accesses in “perfect LFU”. Moving the accessed 
key to the right place can be problematic because there could be many keys 
with the same score, so the operation can be O(N) in the worst case, even if 
the key frequency counter changed just a little. Also as we’ll see 
in point “2” the accesses counter does not always change just a little, 
there are also sudden large changes.

对于LFU，你不能使用LRU使用的“移动到头部”链表技巧以一种简单的方式对元素进行排序，因为键
必须按照“LFU”中的访问次数排序。

将被访问的键移动到正确的位置可能会有问题，因为可能有许多具有相同得分的键，所以在最坏的情况下，操作可能是O(N)，即使
键的计数器只改变了一点 

同时，我们会在第2点看到访问计数器并不总是会有一点变化，也有突然的巨大变化。



2. LFU can’t really be as trivial as, just increment the access counter
on each access. As we said, access patterns change over time, so a key
with an high score needs to see its score reduced over time if nobody
keeps accessing it. Our algorithm must be albe to adapt over time.

LFU不可能像这样简单，只是在每次访问时增加访问计数器。
正如我们所说，访问模式会随时间变化，所以如果没有人一直访问，那么得分高的键需要看到它的得分随时间下降。我们的算法必须适应时间的变化。

In Redis the first problems is not a problem: we can just use the trick
used for LRU: random sampling with the pool of candidates. The second
problem remains. So normally LFU implementations have some way in order
to decrement, or halve the access counter from time to time.

在Redis中，第一个问题不是问题:我们可以采用 用于LRU中的技巧:具有候选池的随机抽样。第二个
问题依旧存在。所以通常情况下，LFU实现会以某种方式递减，或者将访问计数器减半。


Implementing LFU in 24 bits of space
===

LFU has its implementation peculiarities itself, however in Redis all
we can use is our 24 bit LRU field in order to model LFU. To implement
LFU in just 24 bits per objects is a bit more tricky.
LFU有它自己的实现特性，但是在Redis中可以使用我们的24位LRU 字段来建模LFU。每个对象只有24位的LFU有点棘手。
What we need to do in 24 bits is:

1. Some kind of access frequency counter.
  
某种访问频率计数器。 
2. Enough information to decide when to halve the counter.

足够的信息来决定何时将计数器减半

My solution was to split the 24 bits into two fields:

           16 bits      8 bits
      +----------------+--------+
      + Last decr time | LOG_C  |
      +----------------+--------+

The 16 bit field is the last decrement time, so that Redis knows the
last time the counter was decremented, while the 8 bit field is the
actual access counter.

16 bit field：  记录上次衰减的时间
8 bit :         记录计数器


You are thinking that it’s pretty fast to overflow an 8 bit counter,
right? Well, the trick is, instead of using just a counter, I used
a logarithmic counter. This is the function that increments the
counter during accesses to the keys:

8 bit 很容易就溢出了，技巧 是用一个 逻辑计数器，而不是一个普通的递增计数器


  uint8_t LFULogIncr(uint8_t counter) {
      if (counter == 255) return 255;
      double r = (double)rand()/RAND_MAX;
      double baseval = counter - LFU_INIT_VAL;
      if (baseval < 0) baseval = 0;
      double p = 1.0/(baseval*server.lfu_log_factor+1);
      if (r < p) counter++;
      return counter;
  }

Basically the greater is the value of the counter, the less probable
is that the counter will really be incremented: the code above computes
a number ‘p’ between 0 and 1 which is smaller and smaller as the counter
increases. Then it extracts a random number ‘r’ between 0 and 1 and only
increments the counter if ‘r < p’ is true.

基本上，计数器的值越大，计数器真正增加的可能性就越小:上面的代码计算一个在0和1之间的数字“p”，随着计数器的增加，这个数字会越来越小。
然后它提取一个介于0和1之间的随机数“r”，并且仅当 ‘ r < p ‘ 是正确的。





You can configure how much aggressively the counter is implemented
via redis.conf parameters, but for instance, with the default
settings, this is what happens:

你可以通过redis.conf参数配置计数器执行的递增程度，例如，在默认设置下，如下所示:

After 100 hits the value of the counter is 10
After 1000 is 18
After 100k is 142
After 1 million hits it reaches the 255 limit and no longer increments



Now let’s see how this counter is decremented. The 16 bits are used in
order to store the less significant bits of the UNIX time converted
to minutes. As Redis performs random sampling scanning the key space
in search of keys to populate the pool, all keys that are encountered
are checked for decrement. If the last decrement was performed more than
N minutes ago (with N configurable), the value of the counter is halved
if it is an high value, or just decremented if it is a lower value
(in the hope that we can better discriminate among keys with few
accesses, given that our counter resolution is very small).

现在让我们看看这个计数器是如何递减的。
这16位用于存储转换为分钟的UNIX时间中不太重要的位。
当Redis对key 空间进行随机抽样扫描时
在搜索填充池的键时，将检查遇到的所有键是否递减。
如果最后一个减量在超过N分钟之前执行(N 可配置)，计数器的值就会减半


There is yet another problem, new keys need a chance to survive after
all. In vanilla LFU a just added key has an access score of 0, so it
is a very good candidate for eviction. In Redis new keys start with
an LFU value of 5. This initial value is accounted in the increment
and halving algorithms. Simulations show that with this change keys have
some time in order to accumulate accesses: keys with a score less than
5 will be preferred (non active keys for a long time).


还有另一个问题，新的Key毕竟需要生存的机会。
在普通的LFU中，一个刚添加的键的访问分数为0，所以它是一个很好的驱逐对象。
在Redis的实现中新Key开始LFU值为5。这个初始值被计入增量中和减半算法。
模拟表明，通过这种改变，键有一定的时间来积累访问:
分数小于5的键将被优先使用(在很长一段时间内非活动键)。




Code and performances
===

The implementation described above can be found in the “unstable” branch
of Redis. My initial tests show that it outperforms LRU in power-law
access patterns, while using the same amount of memory per key, however
real world access patterns may be different: time and space locality
of accesses may change in very different ways, so I’ll be very happy
to learn from real world use cases how LFU is performing, and how the
two parameters that you can tune in the Redis LFU implementation change
the performances for different workloads.

Also an OBJECT FREQ subcommand was added in order to report the
frequency counter for a given key, this can be both useful in order
to observe an application access pattern, and in order to debug the
LFU implementation.

Note that switching at runtime between LRU and LFU policies will have
the effect to start with almost random eviction, since the metadata
accumulated in the 24 bits counter does not match the meaning of the
newly selected policy. However over time it adapts again.

There are probably many improvements possible.

Ben Manes pointed me to this interesting paper, describing an algorithm
called TinyLRU (http://arxiv.org/pdf/1512.00727.pdf).

The paper contains a very neat idea: instead of remembering the
access frequency of the current objects, let’s (probabilistically)
remember the access frequency of all the objects seen so far, this
way we can even refuse new keys if, from the name, we believe they
are likely to get little accesses, so that no eviction is needed at all,
if evicting a key means to lower the hits/misses ratio.

My feeling is that this technique, while very interesting for plain
GET/SET LFU caches, is not applicable to the data structure server
nature of Redis: users expect the key to exist after being created
at least for a few milliseconds. Refusing the key creation at all
seems semantically wrong for Redis.

However Redis maintains LFU informations when a key is overwritten, so
for example after a:

    SET oldkey some_new_value

The 24 bit LFU counter is copied to the new object associated to the
old key.

The new eviction code of Redis unstable contains other good news:

1. Policies are now “cross DB”. In the past Redis made local choices as 
   explained at the start of this blog post. Now this is fixed for all the policies, not just LRU.

2. The volatile-ttl eviction policy, which is the one that evicts based on the remaining time 
   to live of keys with an expire set, now uses the pool like the other policies.

3. Performances are better by reusing SDS objects in the pool of keys.

This post ended a lot longer than I expected it to be, but I hope it offered 
a few insights on the new stuff and the improvements to the old things we already had. 
Redis, more than a “solution” to solve a specific problem, is a generic tool. 
It’s up to the sensible developer to apply it in the right way. Many people 
use Redis as a caching solution, so improvements in this area are always 
investigated from time to time.