Friday, 17 February 2017

How to make Redis more proactive in expiring keys

Redis expires keys in two ways: an active way and a passive way.

In the 'passive' way, when a user tries to get a key, Redis checks whether the key should have expired. If so, Redis expires the key and tells the user that the key does not exist. This is called the 'passive' way because Redis did not 'actively' expire the key by itself, but only 'passively' or 'lazily' expired it when the user's request brought the key to its attention.
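
The behaviour is easy to observe from the command line. The sketch below assumes a Redis server reachable with redis-cli's default options; the key name demo:key, the function name and the REDIS_CLI variable are made up for illustration. Note that the active process may also have removed the key during the sleep; either way the client is told the key is gone.

```shell
# Command used to talk to Redis (hypothetical variable; override it to add -h/-p options).
REDIS_CLI=${REDIS_CLI:-redis-cli}

passive_expiry_demo() {
  # Store a key that expires after 1 second.
  $REDIS_CLI set demo:key hello ex 1 >/dev/null
  # Wait until the key is past its expiry time.
  sleep 2
  # Accessing the key makes Redis check its expiry; the reply is 0 (key does not exist).
  $REDIS_CLI exists demo:key
}
```

Calling passive_expiry_demo takes about two seconds and ends with the reply to EXISTS.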

In the 'active' way, Redis runs a background job that picks a random sample of the keys that have an expiry set and expires whichever of them are due. If more than 25% of the sampled keys had expired, it immediately takes another sample. Let's call this the 'keys expiring process'. The process is described here.
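
The sampling loop can be sketched as a toy simulation. This is not Redis's code: it only tracks counts, it ignores the time limit Redis puts on each cycle, and the function name expire_cycle is made up. The sample size of 20 and the 25% threshold are the values given in the Redis documentation for EXPIRE.

```shell
# expire_cycle TOTAL STALE
# TOTAL = number of keys that have an expiry set, STALE = how many of them are already due.
# Prints how many stale keys are left after one simulated cycle.
expire_cycle() {
  awk -v total="$1" -v stale="$2" 'BEGIN {
    srand()
    sample = 20        # keys looked at per round
    threshold = 0.25   # take another round while more than 25% of the sample was expired
    do {
      hits = 0
      for (i = 0; i < sample && total > 0; i++) {
        # Pick a random key; it is stale with probability stale / total.
        if (int(rand() * total) < stale) {
          hits++; stale--; total--   # expire (delete) the key
        }
      }
    } while (hits > sample * threshold)
    print stale
  }'
}
```

For example, expire_cycle 1000 1000 always prints 0, because every sampled key is stale and the loop keeps going until nothing is left. With fewer stale keys, such as expire_cycle 1000 100, the loop usually stops after a round or two and leaves most of the stale keys in place.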

This means that, at any point, on average up to 25% of the keys with an expiry set may already be past their expiry time without having been removed. Needless to say, each of these keys occupies memory.

When we restart Redis, however, it loads all the keys and expires every key that is already due.

There is also a way to direct Redis to be more 'proactive' in expiring keys without restarting it.
The Redis configuration has a parameter named 'hz', which sets how many times per second Redis runs its background tasks, including the 'keys expiring process' described above.

The default value is 10, so Redis runs 10 such cycles per second.
According to the redis.conf file, 'hz' can range from 1 to 500, but a value over 100 is usually not a good idea.

We can raise it to a higher value, such as 50.

redis-3.0.6 $ redis-cli config get hz
1) "hz"
2) "10"
redis-3.0.6 $ cat redis.conf | grep 'hz '
hz 10
redis-3.0.6 $ redis-cli config set hz 50

redis-3.0.6 $ 
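
One thing to keep in mind: 'config set' only changes the running server, so redis.conf still says 'hz 10' and that value comes back after a restart. A small sketch that checks the value and then persists it with CONFIG REWRITE is shown below. The function name set_hz and the REDIS_CLI variable are made up for illustration, and CONFIG REWRITE only works if the server was started with a configuration file.

```shell
set_hz() {
  hz=$1
  # Reject anything that is not a plain non-negative integer.
  case "$hz" in
    '' | *[!0-9]*) echo "hz must be an integer" >&2; return 1 ;;
  esac
  # redis.conf documents the valid range as 1 to 500.
  if [ "$hz" -lt 1 ] || [ "$hz" -gt 500 ]; then
    echo "hz must be between 1 and 500" >&2
    return 1
  fi
  # Apply at runtime, then write the running configuration back to redis.conf.
  ${REDIS_CLI:-redis-cli} config set hz "$hz" &&
    ${REDIS_CLI:-redis-cli} config rewrite
}
```

With this in place, set_hz 50 does the same as the transcript above and also updates the configuration file.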

Note that increasing 'hz' can slightly increase CPU usage, so it is worth checking the CPU usage after changing it.

When we did this in our production stack, the number of keys dropped and then stabilized, and memory usage did the same. CPU usage did not show any spike, so we were quite pleased to recover some memory.
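
To watch these numbers on your own server, the output of INFO can be filtered with a small helper. This is a minimal sketch: info_field is a made-up name, and it relies only on INFO printing one name:value pair per line.

```shell
info_field() {
  # Read INFO output on stdin and print the value of the field named in $1.
  # INFO lines end in \r\n, so strip the carriage returns first.
  tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2 }'
}

# Example usage against a running server:
#   redis-cli info stats    | info_field expired_keys   # keys expired so far
#   redis-cli info memory   | info_field used_memory    # bytes allocated by Redis
#   redis-cli info cpu      | info_field used_cpu_sys   # system CPU seconds consumed
#   redis-cli info keyspace | info_field db0            # key counts for database 0
```

Running these before and after changing 'hz', and again some minutes later, shows whether keys and memory really went down and whether CPU time is growing faster than before.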


