ein Blog

Avoiding cache rebuild deadlock

rebuilding a cache object from a database can take some time. on high traffic sites, many users can simultaneously query an expired cache item and trigger multiple rebuilds at the same time. i wrote this pseudocode that should prevent that:

function cache_get(key, default, rebuildcallback)
{
    cacheobject = get_from_cache(key);
    if(empty(cacheobject)){
        return default;
    }

    if(cacheobject.expire >= time()){
        return cacheobject.value;
    }elseif(cacheobject.rebuilding){ // <-
        return cacheobject.value;    // <-
    }

    cache_set_rebuilding(key);       // <-
    newvalue = rebuildcallback();    // <-
    cache_set(key, newvalue);
    return newvalue;
}

function cache_set_rebuilding(key)
{
    cacheobject = get_from_cache(key);
    cacheobject.rebuilding = true;
    write_to_cache(key, cacheobject);
}

function cache_set(key, value)
{
    cacheobject = get_from_cache(key);
    cacheobject.rebuilding = false;
    cacheobject.value = value;
    cacheobject.expire = time() + 3600;
    write_to_cache(key, cacheobject);
}

(write_to_cache() and get_from_cache() are assumed to be functions that directly write/read from/to the cache backend)

the rebuilding property of the cache object could also be implemented as a timestamp to trigger a rebuild again if it doesn't finish in a certain amount of time.