Avoiding cache rebuild deadlock
rebuilding a cache object from a database can take some time. on high traffic sites, many users can simultaneously query an expired cache item and trigger multiple rebuilds at the same time. i wrote this pseudocode that should prevent that:
function cache_get(key, default, rebuildcallback)
{
cacheobject = get_from_cache(key);
if(empty(cacheobject)){
return default;
}
if(cacheobject.expire >= time()){
return cacheobject.value;
}elseif(cacheobject.rebuilding){ // <-
return cacheobject.value; // <-
}
cache_set_rebuilding(key); // <-
newvalue = rebuildcallback(); // <-
cache_set(key, newvalue);
return newvalue;
}
function cache_set_rebuilding(key)
{
cacheobject = get_from_cache(key);
cacheobject.rebuilding = true;
write_to_cache(key, cacheobject);
}
function cache_set(key, value)
{
cacheobject = get_from_cache(key);
cacheobject.rebuilding = false;
cacheobject.value = value;
cacheobject.expire = time() + 3600;
write_to_cache(key, cacheobject);
}
(write_to_cache()
and get_from_cache()
are assumed to be functions that directly write/read from/to the cache backend)
the rebuilding
property of the cache object could also be implemented as a timestamp to trigger a rebuild again if it doesn't finish in a certain amount of time.