Thursday, 22 December 2016

Redis Cluster Monitoring - part 2, global cluster monitoring script.

In the first part of this series on how to effectively monitor redis cluster stacks here, we saw the first of the four types of scripts that are useful for us: a monitoring shell script that runs on every machine and monitors the health of the redis instances on that particular machine.

So, let's say we have the redis instance monitoring script (described in part 1) running on a particular redis server, scheduled by crontab to check every 2 minutes that the redis server process is up on that machine.
But what happens if the machine itself is down? In that case, the cron will not run and we will get no notification about the redis nodes on that machine. :(

In this part, we will see how we can globally maintain a script to fix the above problem.
In this script, we use the "cluster info" command to find out whether the cluster is in OK state or not.

#!/bin/bash

erroringStacks=''

# Expects five arguments: 1st is the cluster identifier,
# 2nd and 3rd are the IP and port of a master in the cluster,
# 4th and 5th are the IP and port of the slave of that master.
checkCluster() {
   msg="Redis Cluster $1 $2 $3 $4 $5 "
   # try to get the cluster state using the first IP/port combination.
   val=`redis-cli -c -h $2 -p $3 cluster info 2> /dev/null | grep 'cluster_state'`

   if [[ $val == *"ok"* ]]
   then
        msg="$msg OK"
   else
        # if the first IP/port did not give an ok response,
        # try to get the cluster state from the second IP/port combination.
        val=`redis-cli -c -h $4 -p $5 cluster info 2> /dev/null | grep 'cluster_state'`

        if [[ $val == *"ok"* ]]
        then
                msg="$msg OK"
        else
                msg="$msg down"
                erroringStacks="$erroringStacks $1"
        fi
   fi
   echo "$msg"
}

# Expects three arguments: 1st is the stack identifier, 2nd and 3rd are the IP and port.
checkStandalone() {
   msg="Redis Standalone $1 $2 $3 "
   # try to set a simple value in the standalone instance, and check the output.
   val=`redis-cli -h $2 -p $3 set healthcheck:abc def 2> /dev/null`
   if [[ $val == *"OK"* ]]
   then
        msg="$msg OK"
   else
        msg="$msg down, not able to insert data"
        erroringStacks="$erroringStacks $1"
   fi
   echo "$msg"
}

checkCluster C1 127.0.0.1 7000 127.0.0.1 7001
checkStandalone S1 127.0.0.1 6380
checkCluster C2 127.0.0.1 7010 127.0.0.1 7011


if [[ "$erroringStacks" == "" ]]
then
    echo "all well" 
else
    echo "The following stacks are erroring out: $erroringStacks " 
    # send an email to the concerned team.
fi


The above script checks whether the various stacks are up or not. 

For clusters, it gets the state from the redis-cli "cluster info" command. Further, if the master node specified is not up, it checks the slave to see that it is up and that the cluster state is ok. If neither the specified master nor the slave reports an OK state, the cluster is bound to be down and an error is recorded. Note that this method will not work for a cluster if cluster-require-full-coverage is set to 'no' in the redis.conf file, which means the cluster is still reported as up even if some of its slots are not being served properly.
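If cluster-require-full-coverage is set to 'no', one possible workaround is to inspect the slot counters that "cluster info" also reports (cluster_slots_assigned and cluster_slots_fail), so unserved or failing slots are noticed even while cluster_state stays ok. The sketch below is not part of the original script; the checkSlotCoverage name is our own, and it assumes a standard 16384-slot cluster.

```shell
# Sketch: flag coverage problems that cluster_state can hide when
# cluster-require-full-coverage is 'no'. $1 = host, $2 = port of any node.
checkSlotCoverage() {
    local info assigned failed
    # cluster info lines end in \r, so strip it before comparing.
    info=$(redis-cli -c -h "$1" -p "$2" cluster info 2> /dev/null | tr -d '\r')
    assigned=$(echo "$info" | grep 'cluster_slots_assigned' | cut -d: -f2)
    failed=$(echo "$info" | grep 'cluster_slots_fail' | cut -d: -f2)
    if [[ "$assigned" != "16384" || "$failed" != "0" ]]
    then
        echo "cluster at $1:$2 slot coverage problem: assigned=$assigned failed=$failed"
    fi
}
# usage: checkSlotCoverage 127.0.0.1 7000
```

A stack flagged by this function can be appended to erroringStacks just like the checks above.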

For a standalone instance, it checks the state by inserting a dummy key-value pair into redis. If the pair cannot be inserted, an error is recorded.

The same functions can be reused for multiple stacks by passing a stack name as the identifier. Here the cluster stacks are named C1 and C2, and the standalone stack is S1.

In case an error is recorded, an appropriate alert can be generated.

This script can be scheduled in crontab to run at a fixed interval, for example every 5 minutes.

*/5 * * * * sh /redis/monitorGlobalRedisStacks.sh

Also, ideally this script should run on two or three machines, so that at least one of them is guaranteed to be up.

This concludes the second part of this series. In the third part, we will see how to run an hourly script that checks the thresholds of various important parameters on the redis servers, and mails an alert when a memory, connection, or replication threshold is reached.

:)

Wednesday, 21 December 2016

Redis Cluster monitoring - part 1 - node monitoring script

Redis is an in-memory database used for caching which provides very high performance and can run uninterrupted for months. Considering redis stores all the data in memory, if our data size is more than the memory for a single machine, we have to distribute the data on various machines. This is where Redis Cluster comes in, and it provides us a way to distribute the data on different machines, add/remove new machines etc.

However, once we have a large number of nodes in a redis cluster, it becomes imperative to continuously monitor the state and health of each redis cluster node/system, because the chances of failure of one or more nodes increase.

Also, it helps to have automatic scripts which can monitor the redis nodes and alert in case they sense an error or that some memory/connection threshold has been reached.

We should have the following monitoring in a cluster.
  1. monitoring individual nodes.
  2. monitoring overall cluster health.
  3. monitoring stats on redis nodes.
  4. viewing the redis stats over time on some graph

Monitoring individual nodes basically involves monitoring the redis process running on each node and restarting it if it stops.

Monitoring overall cluster health is required because one of the machines may be down, in which case the monitoring script running on it cannot send an alert. In this case, the global redis monitoring script should try a basic insert in each cluster stack, and if it fails, trigger the email alert.

Monitoring stats on redis nodes is required so that we don't have to wait for things to go bad in redis; we can identify whenever a threshold is reached for various metrics. This involves automatic monitoring of individual redis nodes for connections, memory, replication lag etc.

Finally, we need to have the stats of various redis nodes represented in terms of graphs. This is required to identify uneven patterns in the data usage/access and to have a global view of how the redis stack is used.

In this part, we will go through how we can monitor individual nodes.

Individual nodes can be monitored by a shell script: essentially, a shell script runs every 30 seconds or so and checks whether a redis server is running on each of the defined ports; if the redis server is not running on a defined port, it restarts the redis server.

This can be achieved using a simple shell script as below.

#!/bin/bash

START_PORT=7000
END_PORT=7003

error=0
ports=''
checkRedis(){
        count=`ps -ef | grep "redis-server" | grep ":$1" | grep -v grep | wc -l`
        if [[ $count -ne 1 ]]
        then
                error=1
                ports="$ports $1"
                echo "starting redis on port $1"
                # start redis either by redis-server or by service if redis is installed as a service
                # service redis-$1 start
                src/redis-server cluster-test/$1/redis.conf
        fi
}

for ((i=START_PORT;i<=END_PORT;i++)); do
    checkRedis $i
done

if [[ $error -eq 1 ]]
then
        echo "need to send mail that redis was started on ports $ports"
fi


The above script should run on each machine that hosts one or more redis nodes. If a machine runs redis on different ports, those can be specified via START_PORT and END_PORT. In the above script, we specified that redis will be running on ports 7000, 7001, 7002, and 7003.

The above script can be saved in a file like 'monitorIndividualNodes.sh' and can be run every 2 minutes in crontab using

*/2 * * * * sh /redis/monitorIndividualNodes.sh

The script can be configured through crontab (or any other trustworthy scheduling service) to run at a fixed interval, such as every minute, and will check whether a redis server is running on the predefined ports on those machines. If it is not running, it will start redis. Optionally, it should also send an email to alert the concerned team.

Also, even in case of system restart, cron will run the script appropriately and all the redis instances will start.

Considering redis is very stable, and does not stop unless there is a machine restart, we don't have to worry about receiving too many emails. :)

In the next part, we will see the script to monitor the overall health of the cluster. This can be useful in case one or more machines are down, as a result of which the monitoring script for individual nodes cannot run on them and no alert is generated by them.

Happy redising. :)

Tuesday, 6 December 2016

handling and killing idle clients in redis

Considering redis is single threaded, it is best if an application maintains a single connection and uses it to query redis, because redis will process all requests one by one anyway.

Many times, we have to kill idle connections in redis. This is useful when an application creates a number of connections but does not close them.

There are some ways to avoid the idle connections in redis.

One way is to set an idle timeout in redis.
This is done by setting the "timeout" value in seconds in redis.conf which requires a redis restart. Below is the entry in redis.conf

#Close the connection after a client is idle for N seconds (0 to disable)
timeout 3600

This can also be done without a redis restart, by setting the timeout through config with the below command.

src/redis-cli CONFIG SET timeout 3600

Considering that the config value set using method 2 above is lost if redis restarts, it is important both to set it in redis.conf (method 1) and to set it using config (method 2), so that redis does not require a restart and the value also persists across restarts.

Note that by default the value of timeout is 0, i.e. idle connections are everlasting and are never killed by the redis server.
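Assuming Redis 2.8 or newer and a server that was started with a config file, the CONFIG REWRITE command can combine the two methods: it writes the current runtime configuration back into redis.conf, so a separate manual edit is not needed. A sketch:

```shell
# apply the idle timeout at runtime, then persist the running config
# back into redis.conf. CONFIG REWRITE works only on Redis 2.8+ and only
# if the server was started with a config file; otherwise edit redis.conf
# by hand as described above.
src/redis-cli CONFIG SET timeout 3600
src/redis-cli CONFIG REWRITE
```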

The Catch
The catch here is that the timeout only applies to normal clients, not to pub-sub ones, so a pub-sub client will not be timed out even if its idle time exceeds the idle timeout defined. This is because the default behavior of pub-sub clients is to wait for events. This is mentioned here.
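To see which clients are actually sitting idle before deciding what to kill, the idle= field of CLIENT LIST can be parsed. The sketch below is our own illustration (the listIdleClients name is hypothetical); it searches for the idle= field by name rather than by position, since the field order can differ between redis versions.

```shell
# Sketch: print every client whose idle time exceeds a threshold.
# $1 = host, $2 = port, $3 = idle threshold in seconds.
# CLIENT LIST prints one space-separated line of key=value pairs per client.
listIdleClients() {
    redis-cli -h "$1" -p "$2" CLIENT LIST | awk -v limit="$3" '
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^idle=/) {
                split($i, kv, "=")
                if (kv[2] + 0 > limit) print
            }
        }
    }'
}
# usage: listIdleClients 127.0.0.1 6379 3600
```

The printed lines can then be fed into the CLIENT KILL pipelines shown further below.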


Killing clients of a particular type in redis

Sometimes if you want to kill all the clients of a particular type, the below command may come in handy.

# kills all normal clients
src/redis-cli CLIENT KILL TYPE normal

# kills all pub-sub clients
src/redis-cli CLIENT KILL TYPE pubsub

# kills all slave clients
src/redis-cli CLIENT KILL TYPE slave

It is important to note that the above commands should be executed only after due consideration. If our redis clients (like Jedis/Lettuce clients in Java) reconnect automatically, the above command will kill all connections, the valid ones will reconnect, and the application will be fine.

However, if our clients don't reconnect, then we need to manually identify the clients/IP addresses that are expired or need to be terminated, and kill only those clients.


Killing redis connections from an ip address

If we want to kill connections of a particular type from a particular ip address, then we can identify those connections using the CLIENT LIST command in redis, and kill them using the CLIENT KILL command in redis.

The below command will kill all pub-sub connections of a given ip address '10.150.20.30'

redis-cli -h 127.0.0.1 -p 6379 CLIENT LIST | grep 'sub=1' | grep '10.150.20.30' | awk '{print $2}' | awk -F "=" '{print "CLIENT KILL ADDR " $2}' | redis-cli -h 127.0.0.1 -p 6379

In the above, "redis-cli -h 127.0.0.1 -p 6379 CLIENT LIST | grep 'sub=1' | grep '10.150.20.30'" gets the pub-sub connections from the redis CLIENT LIST command; the first awk extracts the second field, "addr=10.150.20.30:port"; the second awk turns it into "CLIENT KILL ADDR <ip:port>" and passes it to the redis-cli command.

This will ensure that all the connections satisfying the criteria are killed.

Similar to the above, the below command can also be used, but it is a bit slower.

redis-cli -h 127.0.0.1 -p 6379 CLIENT LIST | grep '10.150.20.30' | awk '{print $2}' | awk -F "=" '{print "CLIENT KILL TYPE pubsub addr " $2}' | redis-cli -h 127.0.0.1 -p 6379

The complete details of the redis client kill command can be found here.

:)



Friday, 2 December 2016

How to get the rdb location, and config file location in redis.

Sometimes when we log in to a redis server, we can see redis running on it, but we don't know where the configuration files and the rdb backup files for redis are located. In this post, we will see how to find out the location of the various configuration files in redis.

The below are the important files in redis.


  1. redis.conf     => main configuration file in redis
  2. dump.rdb       => backup file generated by background save in redis
  3. sentinel.conf  => sentinel configuration file, required only while running sentinel
  4. nodes.conf     => cluster configuration file, auto-generated by running redis in cluster mode.

Location of redis.conf
redis.conf is the major redis configuration file which has lots of very useful info and lots of highly configurable parameters.
Whether you are a developer who wants to dive deeper into redis or a redis admin, it is very important to read the file and try to understand the configuration params. See the sample here.

For finding out the location of redis.conf, we will use the INFO command.
The info command gives a lot of very useful information about the redis server, one piece of which is the location of the config file.

So, the below command gives us the location of the redis.conf file.

src/redis-cli -p 6379 info | grep 'config_file'

Location of sentinel.conf file
sentinel.conf file is only required if you are using a sentinel in redis. See the sample sentinel.conf file here. It is normally present in the same directory as redis.conf.
For finding out the sentinel.conf location, we can execute the command on the sentinel (note that sentinel by default runs on port 26379 while redis runs on port 6379).

src/redis-cli -p 26379 info | grep 'config_file'

Location of dump.rdb file
dump.rdb file is the default file in which redis will save the data to disk if you enable rdb based persistence in the redis.conf file.
The location of dump.rdb can be obtained in either of the two ways

  • reading the "dir" value in redis.conf file found from the location of redis.conf
           cat redis.conf | grep "dir "
  • getting it from redis config
            Sometimes if we get the "dir " from the redis.conf file, it just shows the current directory( "./")
            In this case, it is better to get it from the redis config by executing the command below.

           src/redis-cli -p 6379 CONFIG GET dir

Note that the second method will only work if the CONFIG command has not been renamed/disabled in redis.
If it has, you will have to find out the renamed command and use it instead of CONFIG.

Location of nodes.conf file
In a redis cluster, each cluster node/instance automatically generates a nodes.conf file where that node's view of the cluster is stored.
For every node in the cluster, it stores information like

  1. the node id and address, and whether the node is a master or a slave.
  2. the master it replicates, if it is a slave.
  3. the slot ranges for which it holds data (if it is a master).
  4. the config epoch.
The nodes.conf file is stored in the same location as dump.rdb.
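The file name itself is controlled by the cluster-config-file parameter, so its full path can be assembled from two CONFIG GET calls. A sketch (the clusterConfigPath name is ours; it assumes CONFIG has not been renamed/disabled and that cluster-config-file holds a relative name, which is resolved against "dir"):

```shell
# Sketch: print the full path of a cluster node's nodes.conf.
# $1 = port of the cluster node. CONFIG GET replies with the parameter
# name on one line and its value on the next, so tail -1 keeps the value.
clusterConfigPath() {
    local dir file
    dir=$(redis-cli -p "$1" CONFIG GET dir | tail -1)
    file=$(redis-cli -p "$1" CONFIG GET cluster-config-file | tail -1)
    echo "$dir/$file"
}
# usage: clusterConfigPath 7000
```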
 

Sunday, 27 November 2016

single command to kill all the redis processes running on a system

A running redis instance can be stopped by executing the command SHUTDOWN. This command will save the running redis into a backup RDB file and stop the redis server. It also has the option of not saving the RDB file and just stopping the redis server by executing "SHUTDOWN NOSAVE".

src/redis-cli SHUTDOWN NOSAVE



Sometimes in redis, particularly when testing redis cluster, we have a number of redis instances running on a system.



When we need to stop all the instances of the redis cluster, we can execute "SHUTDOWN NOSAVE" on all the instances one by one.

A better way is to have a single for loop which does the job

for port in {7000..7005}; do src/redis-cli -p $port SHUTDOWN NOSAVE; done



What if the SHUTDOWN command is renamed or disabled?

Many times, we prefer to disable or rename the SHUTDOWN command as described here, so that it cannot be used by a rogue client. In that case, the redis servers will need to be stopped by killing the redis processes. The normal way is to find the processes using ps -ef | grep "redis-server" and then manually pass the process ids to the kill command. However, this can be done with a single command like the one below.

ps -ef | grep 'redis-server' | grep -v grep | awk '{ print $2}' | xargs kill -9

The above command finds all the processes matching 'redis-server' (excluding the grep itself), extracts their ids, and passes them to the kill command.

:)

Friday, 25 November 2016

redis data types and basic commands for each in redis

Sometimes when working with redis, we need to know the data type of a particular key.
This is useful to know what kind of commands should be executed against that key.
A key can have any one of the below data types.
  1. String
  2. Hash
  3. List
  4. Set
  5. Sorted Set
  6. HLL (stored as the string type internally)
The "TYPE" command in redis can be used to find out type of a particular key.

Let's say we have a redis instance and we list the keys by executing the "keys *" command. For simplicity, I have created a separate key of each type in redis.


For each of these keys, the 'type' command will give the type of the key.
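The same per-key inspection can be scripted: iterate the keyspace with SCAN (which, unlike "keys *", does not block the server on a large dataset) and run TYPE on each key. This sketch is our own (the listKeyTypes name is hypothetical); it assumes redis-cli's raw output when piped, where the first line of a SCAN reply is the next cursor and the remaining lines are keys.

```shell
# Sketch: print every key with its type using SCAN + TYPE.
listKeyTypes() {
    local cursor=0 reply key
    while :; do
        reply=$(redis-cli SCAN "$cursor")
        # first line of the raw reply is the next cursor, the rest are keys.
        cursor=$(echo "$reply" | head -1)
        for key in $(echo "$reply" | tail -n +2); do
            echo "$key => $(redis-cli TYPE "$key")"
        done
        # a cursor of 0 means the iteration is complete.
        [[ "$cursor" == "0" ]] && break
    done
}
# usage: listKeyTypes
```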


For performing basic view operations on each of the types of redis keys, below are the types and commands for each.

String
          GET  command in redis can be used to get the data in a string key.

 src/redis-cli GET mystringkey

    


Hash

         HGETALL command can be used to view the data in a hash. It returns a list of values, consisting of each field followed by the value of that field. In the example below, in the key "myhashkey", field1 has value value1 while field2 has value value2. Hashes are normally used to store an object with many fields: for example, a user can have fields like name, age, and gender, so that the key can be user:<userid>, the fields can be name, age, gender etc. and the values can be 'Jurgen', 50, 'm'.

src/redis-cli HGETALL myhashkey



List
        Lists in redis are in fact linked lists, and data can be pushed or removed from either end. The LRANGE command can be used to view the data in a list, while LLEN can be used to get the length of the list.

src/redis-cli LLEN mylistkey
src/redis-cli LRANGE mylistkey -100 100

     


Set
          A set in redis consists of unique elements. The values in a set can be viewed using the SMEMBERS command.

src/redis-cli SMEMBERS mysetkey




Zset (Sorted Set)
          A sorted set in redis contains string data along with a score. Finding any entry for a particular score takes O(log N) time, where N is the number of entries stored in the key. The ZRANGEBYSCORE command can be used to view the data in a sorted set. Sorted sets are normally used for leaderboard-style data where each user has a particular score. Another use case where sorted sets can be useful is reporting, where the data can be stored as the value and the date in YYMMDD format as the score; to get the data between two dates, a ZRANGEBYSCORE query can be used.

src/redis-cli ZRANGEBYSCORE mysortedsetkey -INF +INF WITHSCORES
src/redis-cli ZRANGEBYSCORE mysortedsetkey 90 100 WITHSCORES



HLL
         HyperLogLog (HLL) in redis is used to estimate the number of unique values in a key without storing the actual values! Sounds too good to be true; go through this link to understand more. HLLs are stored as strings, so you can do a normal GET operation on one, but to approximately count the number of unique items in a key, the command to use is PFCOUNT.

src/redis-cli PFCOUNT myhllkey


So what happens when we try to get a key using a different command than what is expected for that type, say when we try to GET a hash key instead of using HGETALL?
We get the error "WRONGTYPE Operation against a key holding the wrong kind of value", which is expected, since we should be using the HGETALL command to get the data from a hash instead of GET.

:)

Sunday, 20 November 2016

How to get a random key in redis

Sometimes when we are at the redis prompt on an unknown database, we want to see some random key to know what sort of keys exist in the redis instance.
We can try the "keys *" command, but it scans all the keys.
To get a random key, the redis "randomkey" command can be used, as documented here.


As expected randomkey returns a random key from the dataset.



Monday, 14 November 2016

Redis Cluster - How to fix a cluster which is in broken state after migration

Sometimes in redis cluster, we need to expand the cluster by adding more machines. This is accomplished by adding the machines, making them part of the cluster, and resharding the existing cluster as explained here.

However, many times the cluster gets stuck in an inconsistent state when there is an error during resharding. This can happen for many reasons, such as sudden termination of the reshard script, a redis timeout error while moving slots (in case the keys are too big), or a key already existing in the target database (because resharding involves migration).

There is no quick fix to these problems, and one would have to understand the internals of how a slot movement happens.

But there is a general rule on what can be done to try to fix if a redis cluster is stuck in an inconsistent state during reshard/migration.

The following two are important ways to fix a cluster which was broken because of migration.


Run the Fix Script

Fixing a resharding error can be done by running the fix script provided by redis.
It can be run using

./redis-trib.rb fix 127.0.0.1:7000

You will need to change the ip address and port as per your configuration. Also, you need to provide the ip address/port of only a single node which is part of the cluster; the configuration of the other nodes is read automatically by the script.


If the above cannot fix the cluster state, then you can follow the below step.

Setting the Cluster slots to a particular node.

Manually check the keys in the unstable slot, and set the slot to be served by a particular node. We can execute the "cluster nodes" command on all the nodes and see if any slot is set in a migrating/importing state. If we are sure a slot belongs to a particular node and that node holds and serves the data for that slot, we can set the slot to that node by executing the cluster setslot <slot> node <nodeid> command as described here.

If a node 127.0.0.1:7000 does not have the correct configuration as per the cluster nodes command executed on it, and it shows that slot 1000 is with some other node, while all other nodes show that it is with the node with node id abcdefgghasgdkashd, then we need to correct the configuration of that node (127.0.0.1:7000). It can be done by executing the following.

redis-cli -h 127.0.0.1 -p 7000 cluster setslot 1000 node abcdefgghasgdkashd

The above command just assures the node 127.0.0.1:7000 that slot 1000 is served by the node with node id abcdefgghasgdkashd, so that it corrects its configuration accordingly.

Note that you should run it only if you are sure that all other nodes agree (via their cluster nodes command), and that the data for slot 1000 actually resides on the node abcdefgghasgdkashd.


Friday, 11 November 2016

redis cluster - slave sync with master failed - client scheduled to be closed ASAP for overcoming of output buffer limits.

Sometimes while working with redis cluster, during master-slave replication, the slave is not able to sync itself with the master.
While syncing, the slave's connection to the master breaks again and again.

The following are the logs on the slave instance(IP:Port are masked).

10 Nov 17:09:20.387 * Connecting to MASTER IP1:PORT1
10 Nov 17:09:20.387 * MASTER <-> SLAVE sync started
10 Nov 17:09:20.387 * Non blocking connect for SYNC fired the event.
10 Nov 17:09:20.387 * Master replied to PING, replication can continue...
10 Nov 17:09:20.388 * Trying a partial resynchronization (request 479e75d20ac03150de943a24d9b61300b6fa59d0:266086101935).
10 Nov 17:09:20.894 * Full resync from master: 479e75d20ac03150de943a24d9b61300b6fa59d0:266259284057
10 Nov 17:09:20.894 * Discarding previously cached master state.
10 Nov 17:10:31.281 * MASTER <-> SLAVE sync: receiving 3811948917 bytes from master
10 Nov 17:10:40.138 * MASTER <-> SLAVE sync: Flushing old data
10 Nov 17:11:11.933 * MASTER <-> SLAVE sync: Loading DB in memory
10 Nov 17:12:37.115 * MASTER <-> SLAVE sync: Finished with success
10 Nov 17:12:37.849 # Connection with master lost.
10 Nov 17:12:37.849 * Caching the disconnected master state.
10 Nov 17:12:38.866 * Connecting to MASTER IP1:PORT1
10 Nov 17:12:38.866 * MASTER <-> SLAVE sync started
10 Nov 17:12:38.866 * Non blocking connect for SYNC fired the event.

The error keeps repeating itself.
Somehow the error above does not give much of a clue: just when the master-slave sync succeeds, the connection with the master is lost.

Looking at the master logs gives more of a clue.
Below are the master logs.

10 Nov 17:09:20.383 * Slave IP2:PORT2 asks for synchronization
10 Nov 17:09:20.383 * Unable to partial resync with slave IP2:PORT2 for lack of backlog (Slave request was: 266086101935).
10 Nov 17:09:20.383 * Starting BGSAVE for SYNC with target: disk
10 Nov 17:09:20.889 * Background saving started by pid 37651
10 Nov 17:10:30.557 * DB saved on disk
10 Nov 17:10:30.844 * RDB: 283 MB of memory used by copy-on-write
10 Nov 17:10:31.276 * Background saving terminated with success
10 Nov 17:10:39.561 * Synchronization with slave IP2:PORT2 succeeded
10 Nov 17:11:09.060 # Client id=2292912 addr=IP2:23666 fd=233 name= age=109 idle=9 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=7104 omem=127259816 events=rw cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.
10 Nov 17:11:09.160 # Connection with slave IP2:PORT2 lost.
10 Nov 17:12:38.861 * Slave IP2:PORT2 asks for synchronization
10 Nov 17:12:38.861 * Unable to partial resync with slave IP2:PORT2 for lack of backlog (Slave request was: 266261610586).
10 Nov 17:12:38.861 * Starting BGSAVE for SYNC with target: disk


The Key part is cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.

It shows that the slave connection was closed because it reached the output buffer limits on the master.
We need to fix this.
There is a setting in the conf file:
client-output-buffer-limit slave 256mb 64mb 60

Reading the excellent documentation suggests that 256 mb and 64 mb are two limits, the hard limit and the soft limit: the connection is closed if the client output buffer for a slave exceeds 256 mb, or continuously exceeds 64 mb for 60 seconds.

We could increase this setting in redis.conf and restart the master server, but that seems too much of an effort and a risk, as restarting a master may make its outdated slave try to become a master.
Thankfully, this can be changed at runtime by executing the following commands on the master server.


redis-cli -h MASTER_IP -p MASTER_PORT CONFIG get client-output-buffer-limit

It gives the output as 
1) "client-output-buffer-limit"
2) "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60"

Notice that the hard and soft limits for slaves are 256 MB/64 MB (268435456/67108864 bytes).
We will update them to 1 GB/256 MB (1073741824/268435456 bytes) by executing the following command.


redis-cli -h MASTER_IP -p MASTER_PORT CONFIG set client-output-buffer-limit "normal 0 0 0 slave 1073741824 268435456 60 pubsub 33554432 8388608 60"

Note that depending on how far your slave is lagging behind the master and how fast your slave is able to consume the data from the master while syncing, you may have to increase the limit to more than 1GB/256 MB. Just be sure to go through the excellent documentation of client-output-buffer-limit in redis.conf.

After executing this command, we see that the slave succeeds in syncing with the master and does not give any other error.  


Below are the logs on slave after executing the command.

10 Nov 17:15:58.692 * MASTER <-> SLAVE sync started
10 Nov 17:15:58.692 * Non blocking connect for SYNC fired the event.
10 Nov 17:15:58.692 * Master replied to PING, replication can continue...
10 Nov 17:15:58.692 * Trying a partial resynchronization (request 479e75d20ac03150de943a24d9b61300b6fa59d0:266435075281).
10 Nov 17:15:59.195 * Full resync from master: 479e75d20ac03150de943a24d9b61300b6fa59d0:266612963059
10 Nov 17:15:59.195 * Discarding previously cached master state.
10 Nov 17:17:10.490 * MASTER <-> SLAVE sync: receiving 3811953308 bytes from master
10 Nov 17:17:19.613 * MASTER <-> SLAVE sync: Flushing old data
10 Nov 17:17:53.203 * MASTER <-> SLAVE sync: Loading DB in memory
10 Nov 17:19:19.332 * MASTER <-> SLAVE sync: Finished with success
10 Nov 18:01:31.139 * Background saving started by pid 16989


:)

Tuesday, 1 November 2016

How to move a single slot in redis cluster

In redis cluster, there are a total of 16384 logical slots, which are divided between multiple masters.
Often, when we need to add nodes to the redis cluster, we need to reshard the data.
This is done with the redis-trib.rb utility provided by redis.
However, while the redis-trib.rb utility can move any number of slots between masters, it does not provide a function to move a single specified slot.
Moving a slot normally involves 4 steps, as described here.


  1. setting the destination node to receive a slot.
  2. setting the source node to migrate a slot.
  3. migrating the data from source node to destination node.
  4. communicating source and destination nodes that the destination node is the owner of the slot.

Below is shown an existing cluster where the slots are equally divided between nodes.
Also, each node has some data.



We will try to move a slot, say slot no 102, from the redis running on port 7000 to the redis running on port 7001.
The java code for it is below.



import java.util.List;

import com.google.common.collect.Lists;
import com.lambdaworks.redis.MigrateArgs;
import com.lambdaworks.redis.RedisClient;
import com.lambdaworks.redis.RedisURI;
import com.lambdaworks.redis.api.sync.RedisCommands;

public class RedisMoveSingleSlot {

    public static void main(String[] args) {
        String sourceHost = "127.0.0.1";
        int sourcePort = 7000;
        String destHost = "127.0.0.1";
        int destPort = 7001;
        int slotToMove = 102;
        moveSlot(sourceHost, sourcePort, destHost, destPort, slotToMove);
    }

    private static void moveSlot(String sourceHost, int sourcePort, String destHost, int destPort, int slot) {
        // Create connections to the source and destination nodes.
        RedisClient sourceClient = RedisClient.create(RedisURI.create(sourceHost, sourcePort));
        RedisClient destinationClient = RedisClient.create(RedisURI.create(destHost, destPort));
        RedisCommands<String, String> source = sourceClient.connect().sync();
        RedisCommands<String, String> destination = destinationClient.connect().sync();

        // Get the source node id and the destination node id.
        String sourceNodeId = source.clusterMyId();
        String destinationNodeId = destination.clusterMyId();

        // STEP 1: set the destination to be importing the slot from the source.
        destination.clusterSetSlotImporting(slot, sourceNodeId);

        // STEP 2: set the source to be migrating the slot to the destination.
        source.clusterSetSlotMigrating(slot, destinationNodeId);

        // STEP 3: move all the keys of the slot from the source to the destination,
        // in batches of 100 keys.
        List<String> keys = source.clusterGetKeysInSlot(slot, Integer.MAX_VALUE);
        int batchSize = 100;
        System.out.println("Slot " + slot + ": got " + keys.size() + " keys");
        if (!keys.isEmpty()) {
            int noOfKeysMoved = 0;
            for (List<String> batch : Lists.partition(keys, batchSize)) {
                MigrateArgs<String> migrateArgs = new MigrateArgs<>();
                migrateArgs.keys(batch);
                migrateArgs.replace();
                // Migrate the keys to db 0 (the only db in cluster mode),
                // with a timeout of 5000000 ms, i.e. 5000 seconds.
                source.migrate(destHost, destPort, 0, 5000000, migrateArgs);
                noOfKeysMoved = noOfKeysMoved + batch.size();
                System.out.println("Moved " + noOfKeysMoved + " keys so far for slot " + slot);
            }
        }
        // STEP 3 finished.

        // STEP 4: assign the slot to the destination node on both nodes,
        // then close the connections.
        source.clusterSetSlotNode(slot, destinationNodeId);
        destination.clusterSetSlotNode(slot, destinationNodeId);
        source.close();
        destination.close();

        // STEP 4 done.
        System.out.println("Moved slot " + slot);
    }
}
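The batching above uses Guava's Lists.partition. If Guava is not on the classpath, the same batching can be sketched with just the JDK (this helper is illustrative, not part of the Lettuce API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Batches {

    // Splits a list into consecutive sublists of at most `size` elements,
    // like Guava's Lists.partition, but copying each batch.
    public static <T> List<List<T>> partition(List<T> list, int size) {
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < list.size(); i += size) {
            parts.add(new ArrayList<>(list.subList(i, Math.min(i + size, list.size()))));
        }
        return parts;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(partition(keys, 2)); // [[1, 2], [3, 4], [5]]
    }
}
```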

After the code is executed, it moved the 2 keys belonging to slot 102, along with the slot itself, from the Redis on port 7000 to the Redis on port 7001.
Below is the output of the "cluster nodes" and "dbsize" commands on each node.
Note that slot 102 moved, and the number of keys on the Redis with port 7001 increased by 2.
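As an aside, the reason exactly those 2 keys moved is that Redis Cluster maps every key to a slot as CRC16(key) mod 16384 (hashing only the {...} hash tag if one is present), and those keys hashed to slot 102. A self-contained sketch of that mapping:

```java
import java.nio.charset.StandardCharsets;

public class ClusterKeySlot {

    // CRC16 (CCITT/XMODEM variant, polynomial 0x1021), as used by Redis Cluster.
    public static int crc16(byte[] bytes) {
        int crc = 0;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    // HASH_SLOT = CRC16(key) mod 16384; if the key contains a non-empty
    // {...} hash tag, only the tag is hashed.
    public static int keySlot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        System.out.println("slot of foo = " + keySlot("foo"));
        // Keys sharing a hash tag always land in the same slot:
        System.out.println(keySlot("{user1}.name") == keySlot("{user1}.age")); // true
    }
}
```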


Storing files in redis

In Redis, data can be stored in various types like String, Hash(HashMap), Set, SortedSet, List etc.
Today we will see the Java code to store a file in Redis.
A file is normally stored in Redis in the form of a byte array.
First, we start the Redis server by executing the below in the Redis home directory.
src/redis-server


This will start the server.
We can see that there is no data in it by executing the "dbsize" command.


The Java code below for uploading the file uses the Jedis client, although the Lettuce client can also be used.
The code saves redis-3.2.4.tar.gz in Redis, but any other file can be stored the same way.
It saves the byte array in Redis, then reads the data back from Redis and writes it to disk as a file with a different name.

<pre class="brush: java">
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import redis.clients.jedis.Jedis;

public class RedisUploadFile {

    public static void main(String[] args) throws IOException {
        String host = "localhost";
        int port = 6379;
        String filePath = "/Users/test/redis/";
        String fileName = "redis-3.2.4.tar.gz";
        String newFileName = "new_" + fileName;

        Jedis jedis = new Jedis(host, port);
        byte[] bytes = Files.readAllBytes(Paths.get(filePath + fileName));
        byte[] key = (filePath + fileName).getBytes();
        System.out.println("Setting bytes of length " + bytes.length + ", read from file " + filePath + fileName + ", in Redis.");
        jedis.set(key, bytes);

        byte[] retrievedBytes = jedis.get(key);
        System.out.println("Saving data in file " + filePath + newFileName);
        Files.write(Paths.get(filePath + newFileName), retrievedBytes);
        System.out.println("Done.");
        jedis.close();
    }
}
</pre>

After the above code is run, it saves the tar file in Redis, then reads it back from Redis and saves it under a different name.
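One way to verify the round trip is to compare a checksum of the original and retrieved byte arrays; a small sketch using the JDK's MessageDigest (the inputs here are illustrative stand-ins for the file bytes):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class VerifyRoundTrip {

    // Returns the SHA-256 digest of the given bytes as a lowercase hex string.
    public static String sha256Hex(byte[] bytes) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytes);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available on the JDK
        }
    }

    public static void main(String[] args) {
        byte[] original = "some file content".getBytes();
        byte[] retrieved = "some file content".getBytes();
        // If the digests match, the bytes written to and read back from Redis are identical.
        System.out.println(sha256Hex(original).equals(sha256Hex(retrieved)) ? "round trip OK" : "MISMATCH");
    }
}
```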

After that, we can confirm the data is stored in Redis: the "dbsize" command returns 1, and the "keys *" command returns the key.
Getting the key returns the stored binary representation of the file.


We can also see that the new file has been created on the file system.




Friday, 14 October 2016

Renaming Redis Commands and executing Monitor command after renaming.

Renaming commands is one of the important steps in securing your Redis instance. It is particularly important to rename commands like
FLUSHDB, FLUSHALL, KEYS, CONFIG, RENAME, MONITOR and SHUTDOWN.
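For reference, commands are renamed in redis.conf with the rename-command directive; renaming to an empty string disables the command entirely. A sketch (the replacement names are illustrative):

```
rename-command CONFIG some-obscure-name
rename-command FLUSHALL ""
```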

Most of the commands work as expected after renaming. However, the MONITOR command behaves a bit differently.
Here, we take a look at why it behaves differently and what can be done about it.

Let's fire up our Redis instance on the default port 6379 by navigating into the Redis directory and executing src/redis-server redis.conf




Now, open another terminal and insert some data continuously by executing the below.
while true ; do src/redis-cli set testMonitorKey testMonitorValue ; sleep 1; done


The above will keep inserting a key into Redis. Let's leave it running.

Let's see whether we can use the MONITOR command from redis-cli.
This is done by executing src/redis-cli MONITOR


We can see that the MONITOR command is able to monitor the traffic on the redis instance.
So, monitor command works as expected when it is not renamed.

Now, let's rename the MONITOR command to "EAVESDROP" (or anything else) in the redis.conf file.
After that, we need to restart the Redis server.





Now, let's try to execute the monitor command, both the original (MONITOR) and the renamed one (EAVESDROP):
src/redis-cli MONITOR
src/redis-cli EAVESDROP


The MONITOR command gives an error saying it is unknown (since we renamed it), yet strangely the terminal keeps waiting for further output.
Also, the EAVESDROP command just returns OK and does nothing else.

One wonders why.
Let's give it another shot.
Let's go inside the redis shell using redis-cli and execute EAVESDROP and then MONITOR, one after the other, in the same session.


It works!

The issue is that for MONITOR to work properly, both the client and the server should know that the client has asked for monitoring, so that the client keeps listening for whatever the server feeds it.

This can be seen in the redis-cli.c file in the Redis codebase: redis-cli special-cases the MONITOR command.


Now that we have renamed it at the server, we either need to rename it at the client as well, or send the renamed command to the server and the MONITOR command to the client.


When we execute two separate commands like "redis-cli EAVESDROP" and "redis-cli MONITOR", they run in two separate sessions of redis-cli.
When we renamed the MONITOR command to "EAVESDROP", we changed the state at the server but not at the client (redis-cli).
Executing the EAVESDROP command asked the server to start sending the monitoring data to that client.
However, only after we also executed the MONITOR command in the same redis-cli session did the client start listening for it.

But there is a problem: the redis-cli shell does not write its output to a file. What if a large number of commands are being executed on Redis and we want the output redirected to a file, so that we can search it easily (or view the tail)?
In that case, we should make sure that a single instance of redis-cli executes both these commands and writes the output to a file.

This can be done by creating a file (rename.txt here), feeding it as input to redis-cli, and redirecting the output to an output file.

cat rename.txt 
cat rename.txt | redis-cli > runningCommands.out
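Here rename.txt simply needs to hold the two commands in order, the renamed command for the server followed by MONITOR for redis-cli itself; assuming MONITOR was renamed to EAVESDROP, it would contain:

```
EAVESDROP
MONITOR
```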


Also, in another terminal, watch the output of runningCommands.out using 
tail -f runningCommands.out


So, finally, we were able to keep monitoring commands from redis-cli after renaming the MONITOR command. :)


Sunday, 25 September 2016

Redis Cluster - How to create a cluster without redis-trib.rb file

The Redis documentation here describes creating a Redis cluster using the redis-trib.rb script. However, to understand how it works internally, one should know how to create the cluster without the redis-trib.rb script.

This blog assumes that you have 6 redis instances running on ports 7000, 7001, 7002, 7003, 7004, and 7005.

The redis cluster creation involves the below steps.

  1. Start all the Redis nodes in cluster mode.
  2. Make all the nodes meet each other.
  3. Assign slots to the nodes.
  4. Assign slaves to the masters.
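For step 1, each node needs cluster mode enabled in its configuration; a minimal per-node redis.conf might look like the below (the port and file names are illustrative, repeated for ports 7000 to 7005):

```
port 7000
cluster-enabled yes
cluster-config-file nodes-7000.conf
cluster-node-timeout 5000
appendonly yes
```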

Step 1: Assuming the Redis nodes are started in cluster mode, check that they are indeed running using the 'ps -ef | grep redis' command.



Step 2: Making the nodes meet each other.
This is done using the cluster meet command as mentioned here.
redis-cli -c -p 7000 cluster meet 127.0.0.1 7001
redis-cli -c -p 7000 cluster meet 127.0.0.1 7002
redis-cli -c -p 7000 cluster meet 127.0.0.1 7003
redis-cli -c -p 7000 cluster meet 127.0.0.1 7004
redis-cli -c -p 7000 cluster meet 127.0.0.1 7005

This way, all the nodes meet node 7000 and, in turn, each other.



Now all the nodes have met each other. Execute the "redis-cli -p 7000 cluster info" command to check the output. It should show cluster_state as fail, and cluster_slots_assigned as 0.



Step 3: Assign Slots to the nodes.
This is done using the "cluster addslots" command as mentioned here.
Below we use an inline shell loop to assign the slots to the three nodes, discarding the output by redirecting it to "/dev/null". If we don't redirect the output anywhere, it will still work; it will just print "OK" many times.

for slot in {0..5461}; do redis-cli -p 7000 CLUSTER ADDSLOTS $slot > /dev/null; done;
for slot in {5462..10923}; do redis-cli -p 7001 CLUSTER ADDSLOTS $slot > /dev/null; done;
for slot in {10924..16383}; do redis-cli -p 7002 CLUSTER ADDSLOTS $slot > /dev/null; done;

Each command may take some time, around a minute. Executing "cluster info" afterwards shows the cluster_state as "ok" and cluster_slots_assigned as 16384.
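The three ranges above split the 16384 slots as evenly as possible across the three masters. A small sketch (not part of any Redis tooling) of computing such ranges for any number of masters:

```java
public class SlotRanges {

    // Splits the 16384 hash slots into contiguous ranges, one per master.
    // Each master gets ceil(16384 / masters) slots, except the last one,
    // which gets whatever remains.
    public static int[][] ranges(int masters) {
        int totalSlots = 16384;
        int chunk = (totalSlots + masters - 1) / masters; // ceil(16384 / masters)
        int[][] result = new int[masters][2];
        for (int i = 0; i < masters; i++) {
            result[i][0] = i * chunk;
            result[i][1] = Math.min((i + 1) * chunk, totalSlots) - 1;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] range : ranges(3)) {
            System.out.println(range[0] + "-" + range[1]);
        }
        // Prints 0-5461, 5462-10923 and 10924-16383, matching the loops above.
    }
}
```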


Step 4: Assign Slaves to masters.
Note that we assigned slots to 7000, 7001 and 7002, and plan to make the other nodes their slaves.
Slaves are assigned using the "cluster replicate" command as described here.
The syntax of the command is
"redis-cli -c -p <port_of_slave> cluster replicate <node_id_of_its_master>"
To find the node ids of the masters, we can use the "cluster nodes" command and note the ids corresponding to ports 7000, 7001 and 7002.

After that, we execute the below (note that your node ids will be different):
redis-cli -p 7003 cluster replicate 0db8943cccb9b8c3963af78019baf8d2db827f14 
redis-cli -p 7004 cluster replicate add93eed2b5f05371afab5d2611381ebc363d9c7
redis-cli -p 7005 cluster replicate 12e0e6c2e59b8ac284d413623b7d73fd3ec56383

Executing "redis-cli -p 7000 cluster nodes" afterwards shows that you now have slaves as well.


Congratulations, your cluster is up and running.
Now, you can add some dummy data in it and start playing around with it.