July 2nd 2011

Concurrency and Redis

Redis is an amazing global data structure server. The fact that it's global makes it ideal for a multi-process or multi-threaded system that wants to get some concurrency action going. It also means that many of the precautions you take when working with shared memory apply when Redis is operating in a concurrent or distributed environment. In this article, I am going to go over a couple of gotchas to watch out for when working with Redis. It is by no means an attempt at an exhaustive monograph on concurrency and Redis, but rather something to get your feet wet.

Having the rug pulled from under you

Check out the following code:

if redis.exists("some_key")
  puts "Yay! Redis' got it"
  compute_primes                # perform some time-intensive computation
  val = redis.get("some_key")   # the key may be gone by now
  render :json => { :value => val }
end

This code checks for the existence of a key in redis and then performs some conditional logic, part of which involves retrieving that same key from redis. It has a race condition in it: in between the exists check and the get, another process could have deleted the key. A quick fix for this:

if val = redis.get("some_key")
  #rest of the code here ...
end
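
Putting the two snippets together, a minimal sketch of the fixed handler (reusing the compute_primes and render calls from the original example) could look like this:

if val = redis.get("some_key")   # fetch once; nil means the key is gone
  puts "Yay! Redis' got it"
  compute_primes                 # perform some time-intensive computation
  render :json => { :value => val }
end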

Modifying keys safely

Again, an example is far more illustrative:

def update_safe_ips
  redis.del("safe_ips")   # nukes the list before rebuilding it
  safe_ip_ids = SafeIps.select(:id).all.map(&:id)
  safe_ip_ids.each { |safe_ip_id| redis.lpush("safe_ips", safe_ip_id) }
end

What this method is supposed to do is update the safe_ips redis list with stuff from the SafeIps table in the DB. The problem with this code is that it's too eager to delete the "safe_ips" list. As soon as the del executes, the safe_ips list is nixed from redis. If a different process runs at that moment, and that process depends upon the safe_ips list existing, it's going to blow up. So what's the solution? For any operation that involves updating a redis data structure, avoid deleting it. Instead, lean towards building a "temp" version of the data structure and using the rename command, which is atomic. A second pass at fixing the code looks something like this:

def update_safe_ips
  safe_ip_ids = SafeIps.select(:id).all.map(&:id)
  safe_ip_ids.each { |safe_ip_id| redis.lpush("safe_ips_temp", safe_ip_id) }
  redis.rename "safe_ips_temp", "safe_ips"
end

While this code looks like it should work great, it still has a race condition in it. If two processes were to concurrently hit this method, they would each populate the same "safe_ips_temp" list, thereby creating dups. In essence, if:

SafeIps.select(:id).all.map(&:id) # => ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

Then if two processes were to execute this update_safe_ips method at the same time, the "safe_ips_temp" list could end up as ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.1", "10.0.0.2", "10.0.0.3"]. To protect against this:

def update_safe_ips
  safe_ip_ids = SafeIps.select(:id).all.map(&:id)
  temp_list   = "safe_ips_temp:#{SecureRandom.uuid}"   # SecureRandom ships with Ruby's stdlib
  safe_ip_ids.each { |safe_ip_id| redis.lpush(temp_list, safe_ip_id) }
  redis.rename temp_list, "safe_ips"
end

This code uses a UUID generator (Ruby's standard SecureRandom here), which returns a unique ID on every call. Now, if more than one process were to run, they would each create their own temp_list. This way, dups will not be created.
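
For completeness, here is a minimal reader-side sketch (hypothetical consumer code, not from the examples above). Because rename swaps the key atomically, a reader always sees either the old, complete safe_ips list or the new, complete one, never a half-built list:

# Hypothetical consumer: reads the whole list in one shot.
# RENAME replaces the key atomically, so this never observes a
# partially populated "safe_ips" list.
safe_ips = redis.lrange("safe_ips", 0, -1)
puts "Currently trusting #{safe_ips.length} IPs"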

Running a piece of code only once

Oftentimes, we would like a certain piece of code to run successfully exactly once. A classic example of this is something like setting up some auth tokens:

def setup_auth_tokens
  username   = redis.hget "web_service_creds", "username"
  pwd        = redis.hget "web_service_creds", "pwd"
  auth_token = get_auth_token username, pwd
  redis.set "web_service_auth_token", auth_token
end

Now, we know that this setup_auth_tokens method is going to be called multiple times in a concurrent environment. How do we ensure it executes successfully just once, in the lightest possible manner? A first stab could be:

def setup_auth_tokens
  if redis.setnx "setting_up_auth_token", true
    username   = redis.hget "web_service_creds", "username"
    pwd        = redis.hget "web_service_creds", "pwd"
    auth_token = get_auth_token username, pwd
    redis.set "web_service_auth_token", auth_token
  end
end

The setnx command returns true only if the key did not already exist (and was therefore just set). This way we can force a certain block of code to be executed only once. While this ensures that the guarded block runs only once, if for some reason an exception gets thrown inside the block, no other attempts are made at re-setting the auth token. A quick fix:

def setup_auth_tokens
  begin
    if redis.setnx "setting_up_auth_token", true
      username   = redis.hget "web_service_creds", "username"
      pwd        = redis.hget "web_service_creds", "pwd"
      auth_token = get_auth_token username, pwd
      redis.set "web_service_auth_token", auth_token
    end
  rescue
    redis.del "setting_up_auth_token"  # release the lock so a later call can retry
    raise                              # let the caller know the setup failed
  end
end
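
As a quick sanity check, here is a minimal sketch of hypothetical driver code (assuming a shared, thread-safe redis client and the get_auth_token helper from above) with several threads racing into setup_auth_tokens; setnx lets exactly one of them do the actual token setup:

# Hypothetical driver: five threads all call setup_auth_tokens at once.
# Only the thread that wins the SETNX calls get_auth_token; the rest
# fall straight through.
threads = 5.times.map { Thread.new { setup_auth_tokens } }
threads.each(&:join)
puts redis.get("web_service_auth_token")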

Conclusion

Redis is shared memory on steroids. Working with redis in a concurrent environment is both fun and highly performant. Enjoy!
