Alex Kahn

Notes on Redis: Data Modeling, Hashes, and Namespaces

March 30, 2010

What is Redis?

Redis is a simple database. It stores its data in memory and saves a snapshot to disk in the background periodically. It can be used as a simple key-value store like Memcached, but it also supports more complex data types such as lists, sets, sorted sets and hashes. These will come in handy. Redis is versatile and is used for a plethora of tasks: as a job queue, as a cache store, for tracking downloads of Ruby Gems, for storing A/B testing results, and even as the backend for a chat server.

Redis has been receiving a lot of praise lately and is the subject of lot of excitement online. This praise is deserved – it’s an excellent piece of software. I’ve had the pleasure of using Redis recently and here are some of my findings.

Our Use Case

On mediaFAIL, we wanted to track how many mentions a fail receives on Twitter. We wanted to show a count of a fail’s Twitter mentions and roll that number into a fail’s votes – this feature lies somewhere between act.ly and Tweetmeme. Couldn’t we just store each tweet in our votes table? Perhaps. But this would break our “user has many fails through votes” association. How do I store this data in Redis?

Data Modeling

My first pass at the problem took advantage Redis’s sets. For each fail that was mentioned on Twitter, I created a set containing the users that mention a fail. So, key fail:1:tweets contained the set ['alexanderkahn', 'levjoy', 'TimKarr']. Since elements in sets must be unique, I wouldn’t have to worry about a user making multiple mentions of a fail getting counted more than once. This was an elegant solution: it allowed looking up the number of mentions a fail has received using SLEN (set length) and retrieving all the Twitter usernames for a fail using SMEMBERS (set members).

As I continued work on this feature, I realized that I wanted to link to the specific tweet where a Twitter user mentioned a fail. One way to do that would be to set a string key for each member of a set that corresponds to a member of the tweet:1:tweets set. So, for the above example, I would also set fail:1:tweet:alexanderkahn to “12330508057”, the tweet ID for my mention of the fail. But this would cause a lot of keys to be created and mean that looking up usernames and tweet IDs for a fail would require a great deal of queries for a popular fail. Thanks to Redis’s new hash data structure, there was a cleaner way. In a hash, I can store both the username and the tweet ID under the same fail:1:tweets key. The hash for this key would look like {"alexanderkahn" => "12330508057", "levjoy" => "12330508067", "TimKarr" => "12330518057"}. Since hash keys have to be unique, a Twitter mention can’t be counted twice, just like with a set. Now I can look up how many mentions a fail has received with HLEN (hash length) and fetch both usernames an tweet IDS with one HGETALL (hash get all) query. Groovy.

Redis and Ruby

The Ruby library for Redis, written by Ezra Zygmuntowicz, is a pleasure to use. It has certain touches that make it feel like Redis was born for Ruby. One example of this quality is how it handles Redis’s MULTI/EXEC transactions:

r = Redis.new
r.multi do
  r.set 'foo', 'bar'
  r.incr 'baz'
end

In the above code, if an exception is raised during the execution of the block, none of the operations inside are committed. This is just a basic transaction like ActiveRecord provides when working with a SQL database, but is neatly implemented in idiomatic Ruby. The block returns the result of each Redis operation in an array.

Another nice Ruby touch is the way the library translates a Redis hashes into a Ruby hashes. When retrieving the entire contents of a hash using HGETALL, rather than returning a flat list (1. key, 2.value, 3. secondkey, 4. secondvalue) as the Redis internal protocol does, the library turns this list into a Ruby hash using Hash::[]. I had never run into this way of creating a Ruby hash out of an array, but its use here struck me as clever. Check out ri Hash::[] for more on this.

Another good project for working with Redis in Ruby is redis-namespace by Chris Wanstrath. This library helps you compartmentalize your Redis keys to keep different sets of data separate. We’re using this to keep tweets, described above, separate from our A/B testing data. To use it, you just interact with a Redis::Namespace object rather than a Redis::Client object.

redis = Redis.new
namespaced = Redis::Namespace.new(:my_feature, :redis => redis)

# Interact with namespaced Redis as you would a normal Redis client:
namespaced.set "foo", "bar"
namespaced.get "foo"
# => 'bar'
namespaced.keys "*"
# => ['foo']

# The actual key in Redis is prefixed with our namespace:
redis.keys "*"
# => ['my_feature:foo']

Nice of Chris to take care of that for us, no?

One last note. How am I accessing Redis from within my Rails app? Currently, I’m taking a page out of Nick Quaranto’s book and setting a global variable (in environment.rb):

$redis = Redis.new # Or a namespaced equivalent

This variable provides access to Redis from anywhere in the application. This is not necessarily the best way, but for now it works. I’m happy to hear how I could improve upon this.

If you haven’t had a chance to explore Redis yet, I recommend doing so. I hope this post helps push you in the right direction. Some handy resources will be the Redis command reference, this PDF cheat sheet, and following @antirez, Redis’s author, on Twitter.