Yesterday I’ve presented this at Locaweb/Brazil as part of our NoSQL cycle. Many other people presented interesting new ideas and I’ve opted for this intro and a piece about Redis which I may publish soon.
NoSql Introduction
View more presentations from gleicon.

…or how to do proper caching without heavily modifing your code.

I’ve been using this technique on django views and Juno for some time. It’s just a decorator which inspect the parameters given to a function to search for a key on memcached.

You will find two files on github: and Both of them search on twitter for a given term. The first script executes the search everytime, and the second script implements a 60 second cache, which uses the parameter (search term) as key.

I’m sure there must be another way to deal with parameters without the wrapper class, but so far nothing I’ve tried gave the same result.

For  Django views, one could go so far as using return HttpResponse(your_response, mimetype=’text_plain’) too.


More on memcached

February 17, 2009

Sharing memcache data between different applications is useful and easy, be it as a glorified IPC, a robust distributed cache, rate limit control or any other suggested architecture approach.

There are some caveats tho:

  1. The captain obvious one: if its the case, make sure the way you store your data is readable between different languages. For example, storing in python and reading in java or ruby a pickled object is trivial, but persisting some specific objects, like rails is prone to do, may render the data almost unreadable. Try to use simple serialization formats if possible (like yaml, json, xml).
  2. The other captain obvious one: saving and invalidating data must be done by the application responsible for its integrity, for simplicity and safety sake. Remember cache 101: a cache is not a database. It’s not searchable, and its data must reflect a coherent source of data.
  3. The not so obvious one: if you use more than a memcached server, make sure both clients understand the hashing algorithm which is used to select the right server for the key you are asking. When using the same language and client this is transparent, but there different known ways to select the right server as:
  • md5 hash of the key
  • crc32 based hash
  • native hash (as String.hashCode() in java)
  • pure magic hash (some clients implement non-standard memcache

The case in point is a ruby application using the memcache-client gem and a java application using whalin’s client. If you use more than one server, the ruby client uses it’s unique algorithm, which is CRC32 based. The java client defaults to a NATIVE based algorithm, but contains 3 more algorithms. Keys would never get correct hits this way.

Let’s see how it works :

Code for hashing in Ruby (straight from memcache_client gem)

# Note that the method crc32_ITU_T is a patch for the String class from memcache_client

def hash_for(key)
 (key.crc32_ITU_T >> 16) & 0x7fff

Code for the right hashing algorithm in JAVA:

private static long newCompatHashingAlg( String key ) {
                CRC32 checksum = new CRC32();
                checksum.update( key.getBytes() );
                long crc = checksum.getValue();
                return (crc >> 16) & 0x7fff;

The algorithm is selected by this piece of code, whalin’s memcache client library:

    case NATIVE_HASH:
        return (long)key.hashCode();
        case OLD_COMPAT_HASH:
        return origCompatHashingAlg( key );
        return newCompatHashingAlg( key );
        return md5HashingAlg( key );
        // use the native hash as a default
        hashingAlg = NATIVE_HASH;
        return (long)key.hashCode();

So, before using the client in java, we need to issue setHashingAlg( SockIOPool.NEW_COMPAT_HASH ); on the right SockIOPool object.

That’s it.

Now, for a change …

Really unneccessary section !

We can test the CRC32 based algorithm like this:

Start irb and type:

irb --> require "rubygems"
    ==> true
irb --> require "memcache"
    ==> true
irb --> a = "mykey"
    ==> "mykey"
irb --> (a.crc32_ITU_T() >> 16) & 0x7fff
    ==> 17510

From this, we see that 17510 is the resulting hash for “mykey” key.

The memcache client was required just to attach the crc32_ITU_T() method to the String class, but if you dont want to install it, just paste the following code (which is part of memcache_client) instead:

class String

  # Uses the ITU-T polynomial in the CRC32 algorithm.

  def crc32_ITU_T
    n = length
    r = 0xFFFFFFFF

    n.times do |i|
      r ^= self[i]
      8.times do
        if (r & 1) != 0 then
          r = (r>>1) ^ 0xEDB88320
          r >>= 1

    r ^ 0xFFFFFFFF


Let’s test it in JAVA’s end:


public class TestCRC {

        public static void main(String[] args) {

                CRC32 checksum = new CRC32();
                long crc = checksum.getValue();
                System.out.println(((crc >> 16) & 0x7fff));

Compile and run as:

$ javac
$ java -cp . TestCRC

Again, 17510, as in Ruby. That’s the right value for “mykey”.

Both cases lent 17510 as result, which would then be divided by the number of machines in the pool (e.g. 2) and the mod of this operation is the index of the right server, both in JAVA and Ruby. Weee.

I’ve been using inkscape to draw diagrams, specially abusing its import openclipart feature, but I found out that for simple sequence diagrams, there is another great tool:

Their API is clean and the text parsing very accurate.Choose “Napkin” in the style combo and click draw to see their demo.

I found it via another gem, this caching 101 page for dummies. Things like this and this ‘Threads primer’ are necessary reminder nowadays. There are a lot of butchered applications and architectures popping everywhere, and without these kind souls providing the best of them going thru basic concepts, we’re all doomed.

There goes my contribution, a kind of memcached 101, both in napkin and blue modern styles !

memcached 101 napkin style

memcached 101 napkin style

memcached 101 blue modern style

memcached 101 blue modern style

Paste the code below to generate these diagrams. Note the alt 'command' I used to put both cases in the same diagram.
Gotta love that.

Alice->Application: Asks for her Profile
Application->Memcached: is Alice profile there \?
alt Data is not cached yet
 Memcached->Application: No, it's not here
 Application->Database: get me Alice's Profile
 Database->Application: here is the data - it took me a lot of time, k \?
 Application->Memcached: set Alice Profile there
else data is already cached yay
 Memcached->Application: Got it
Application-->Alice: Response (her profile data)