Implementing Cache Tagging with Redis


Redis has quickly become one of the top open source key-value stores around. We at Stackify have been using Windows Azure’s Managed Cache but have had a long history of intermittent problems with it. So we decided it was time to bite the bullet and give Redis a try, which Azure now supports and actually recommends over their previous Managed Cache. We had one big problem though: several important parts of our system relied on cache tags, and Redis has no out-of-the-box support for tags.

Summary (TL;DR)

We use a Lua script to create a Redis “SET” for each tag that contains all the keys associated with the tag. We have some logic to ensure that the expiration of each tag corresponds properly with its keys, and we have some cleanup code to eliminate old keys created for tags that are no longer needed. See the sample project on GitHub, based on ServiceStack and C#.

How we use tags

We use tags for a few different purposes, but the most important is to track which users are currently logged in to our system and what they are currently looking at in the UI. Stackify collects monitoring metrics data as it flows into our system. We use SignalR to update the UI in real time to show the latest monitoring data and alert status. It would be a huge waste of resources to send every result over SignalR to the UI if no one was there to see it.

For every logged-in user, we store what they are viewing in the cache, roughly like this:

Cache key: Client1-User1-UniquePageID
Cache value: true  (not really used)

Related Tags:

  • ServerID1
  • ServerID2
  • ServerID3

When results come in, we look up all cache keys that have a specific tag, “ServerID1” for example. From the cache keys that have that tag, we know which users are logged in and whether we should send data to the UI via SignalR.

How we implemented tagging

We chose to implement tagging using Redis’ SET datatype. Redis SETs are unordered and are a true set – meaning duplicate values are not allowed. Their rough analog in the .NET framework is HashSet<T>. To achieve tagging functionality in Redis, we represent each tag as a SET whose values are the keys to the cache entries that have been associated with that tag. From our example it looks something like this:

Key: tag:ServerID1

Value: Redis “SET”

  • Client1-User1-UniquePageID
  • Client1-User2-UniquePageID
  • Client1-User3-UniquePageID

The process of using the tag is then to get all members of the tag’s SET and take action (MGET, DEL, EXISTS, etc.) on each member accordingly. In order to keep our operations atomic, we’ve taken advantage of Redis’ built-in scripting capabilities. Redis guarantees that execution of Lua scripts will happen atomically (just as any other single command does), so we don’t have to worry about leaving our key-tag associations in an indeterminate state.
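To make the data model concrete, here is a minimal in-memory sketch in C#. It is an illustration only, not the code from the sample project, and it does not talk to Redis: the `TagIndex` class and its method names are hypothetical, and a `Dictionary<string, HashSet<string>>` stands in for the Redis SETs. Because everything lives in one process, it also says nothing about atomicity, which is what the Lua script provides on the Redis side.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the Redis tag SETs (illustration only).
public class TagIndex
{
    // One entry per tag, e.g. "tag:ServerID1" -> set of cache keys.
    private readonly Dictionary<string, HashSet<string>> _sets =
        new Dictionary<string, HashSet<string>>();

    // Mirrors SADD: returns 1 if the key was newly added to the tag, 0 if it was already there.
    public int Add(string tag, string cacheKey)
    {
        HashSet<string> members;
        if (!_sets.TryGetValue(tag, out members))
        {
            members = new HashSet<string>();
            _sets[tag] = members;
        }
        return members.Add(cacheKey) ? 1 : 0;
    }

    // Mirrors SMEMBERS (one tag) or SUNION (several tags).
    public HashSet<string> GetKeysByAnyTag(params string[] tags)
    {
        var result = new HashSet<string>();
        foreach (var tag in tags)
        {
            HashSet<string> members;
            if (_sets.TryGetValue(tag, out members))
            {
                result.UnionWith(members);
            }
        }
        return result;
    }
}

public static class TagIndexDemo
{
    public static void Main()
    {
        var index = new TagIndex();
        index.Add("tag:ServerID1", "Client1-User1-UniquePageID");
        index.Add("tag:ServerID1", "Client1-User2-UniquePageID");
        index.Add("tag:ServerID2", "Client1-User1-UniquePageID");

        // Adding the same key to the same tag again changes nothing: sets hold no duplicates.
        int added = index.Add("tag:ServerID1", "Client1-User1-UniquePageID");
        Console.WriteLine(added); // 0

        // Every key tagged with ServerID1 (enumeration order is not guaranteed).
        foreach (var key in index.GetKeysByAnyTag("tag:ServerID1"))
        {
            Console.WriteLine(key);
        }
    }
}
```

Looking up “tag:ServerID1” returns the two user keys that were tagged with it, which is the information needed to decide who should receive an update.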

Once we implement our tagging Lua script in our cache library, adding cache items with tags is simple. Here’s an example below of the simple C# call to PUT a key-value pair with the tags “red” and “blue” associated to it.

Note: We are using the ServiceStack library to connect to Redis, and our sample project on GitHub also uses ServiceStack, with some additional methods we have built around tags. The Lua scripts in our sample project could be used from any programming language.

int numberOfTagsSet = cache.SetWithTags(cacheKeyToSet, cacheValueToSet, 
        new[] { "red", "blue" }, TimeSpan.FromSeconds(60));

Lua script to add the tags. Need to learn more about Redis and Lua? Check out this nice beginner’s guide: Lua: A Guide for Redis Users

For adding tag(s) to a specific cache item.
Given a list of tags (KEYS[2..n]), marks the designated cache item
(id passed as KEYS[1]) with the specified tag(s).

local tagcount = 0
local cacheKey = KEYS[1]
local exp = ARGV[2]
local setValue = ARGV[1]

-- loop through all the tags
for i = 2, #KEYS do
    -- find existing expiration of this tag
    local tagTtl = redis.call('ttl', KEYS[i])

    -- Add the cacheKey to the list of keys related to the tag in the Redis SET
    tagcount = tagcount + redis.call('sadd', KEYS[i], cacheKey)

    -- Sets/updates expiration on the tag key
    redis.call('expire', KEYS[i], math.max(tagTtl, exp or 3600))
end

if (setValue) then
    if (exp ~= nil) then
        redis.call('setex', cacheKey, exp, setValue) -- sets the cached value with expiration
    else
        redis.call('set', cacheKey, setValue) -- sets the cached value
    end
end

return tagcount
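The expiration rule inside the loop is worth spelling out. A tag’s SET should live at least as long as the longest-lived key it refers to, so the script keeps whichever is larger: the tag’s current TTL or the new item’s expiration, falling back to 3600 seconds when no expiration is passed. The C# sketch below only models that arithmetic. `NextTagTtl` is a hypothetical helper, not part of the sample project. Redis’ TTL command returns -2 for a key that does not exist and -1 for a key with no expiration, so a brand-new tag always takes the item’s expiration.

```csharp
using System;

public static class TagTtlDemo
{
    // Hypothetical helper mirroring math.max(tagTtl, exp or 3600) from the script.
    // currentTagTtl: result of TTL on the tag key (-2 = missing, -1 = no expiration).
    // itemExpirySeconds: expiration requested for the cache item, or null if none.
    public static long NextTagTtl(long currentTagTtl, long? itemExpirySeconds)
    {
        long requested = itemExpirySeconds ?? 3600;
        return Math.Max(currentTagTtl, requested);
    }

    public static void Main()
    {
        Console.WriteLine(NextTagTtl(-2, 60));   // new tag takes the item's expiration: 60
        Console.WriteLine(NextTagTtl(300, 60));  // tag already outlives the item: 300
        Console.WriteLine(NextTagTtl(30, 60));   // tag is extended to cover the item: 60
        Console.WriteLine(NextTagTtl(-2, null)); // no expiration passed: 3600
    }
}
```

Note that a tag’s TTL is only ever extended by this rule, never shortened, which is one reason the tag SETs can outlive the keys they list.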

Sample of getting related keys by tag from our sample project:

var keys = cache.GetKeysByAnyTag(new[] { "red", "blue" });

GetKeysByAnyTag is an extension method that we added on top of ServiceStack; it does the following. You could extend this to return not only the keys but also the cached objects themselves. As part of that, you may want to ensure that those keys still exist and that their TTL has not expired. We have done several variations in our own libraries, but we opted to keep this sample project simple.

public static HashSet<string> GetKeysByAnyTag(this IRedisClient client, params string[] tags)
{
    if (client == null) throw new ArgumentNullException("client");

    if (tags == null || tags.Length == 0) return new HashSet<string>();

    // ServiceStack methods that get all the SETs based on the tag names
    // and then pull together all of the keys related to the tags
    HashSet<string> taggedKeys = (tags.Length == 1)
        ? client.GetAllItemsFromSet(tags[0])
        : client.GetUnionFromSets(tags);

    // NOTE: depending on your specific use case, you may want to
    // check that the returned keys are still valid by checking
    // EXISTS or TTL on each one, or you may want to call the tag
    // cleanup script prior to retrieving the keys in the first
    // place.

    return taggedKeys;
}

Cleaning up the tag lists

One thing we do have to worry about in this scenario is tag-set growth. While Redis will happily persist a cache entry that never expires, we use it as a cache, so almost everything we persist to Redis is given an explicit TTL. This presents a problem for our chosen tagging strategy, because Redis only allows the expiration of entire entries: an expiration cannot be set on a single value within any of Redis’ collection data types. This means our tag SETs will eventually contain values that point to keys that have expired. For very active tags with an unbounded set of associated keys, this can result in a large amount of invalid data (up to the point of consuming all allocated memory) and time wasted attempting to operate on cache entries that don’t exist. It’s a gotcha without a particularly elegant solution; you just have to choose the least-ugly option and make it work. For us, that turned out to be a tag-cleanup Lua script which is triggered probabilistically by other tag-related cache operations:

public static void CleanupTags(this IRedisClient client)

The cleanup is accomplished by iterating through every key in Redis that is a tag and evaluating whether it is still used. You will see advice elsewhere suggesting that Redis’ KEYS operation should never be run in production. Our experience so far suggests that its performance is acceptable for this type of operation, which is only called a few times per day.
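The two ideas here, pruning stale members and triggering the pruning only occasionally, can be sketched in memory with C#. This is not the cleanup script from the sample project. `Cleanup` and `MaybeCleanup` are hypothetical names, the dictionary stands in for the tag SETs, the `liveKeys` set stands in for an EXISTS check, and treating “still used” as “has at least one member whose key still exists” is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagCleanupDemo
{
    // Removes members whose cache key no longer exists and drops tag sets left empty.
    // tagSets: tag name -> member cache keys (stand-in for the Redis SETs).
    // liveKeys: cache keys that still exist (stand-in for EXISTS).
    // Returns the number of stale members removed.
    public static int Cleanup(Dictionary<string, HashSet<string>> tagSets, HashSet<string> liveKeys)
    {
        int removed = 0;
        // Copy the tag names first so the dictionary can be modified while looping.
        foreach (var tag in tagSets.Keys.ToList())
        {
            removed += tagSets[tag].RemoveWhere(key => !liveKeys.Contains(key));
            if (tagSets[tag].Count == 0)
            {
                tagSets.Remove(tag);
            }
        }
        return removed;
    }

    // Runs the cleanup on roughly `probability` of calls (0.0 = never, 1.0 = always).
    // Returns true if the cleanup ran.
    public static bool MaybeCleanup(Random rng, double probability, Action cleanup)
    {
        if (rng.NextDouble() >= probability) return false;
        cleanup();
        return true;
    }

    public static void Main()
    {
        var tagSets = new Dictionary<string, HashSet<string>>
        {
            { "tag:red",  new HashSet<string> { "key1", "key2" } },
            { "tag:blue", new HashSet<string> { "key2" } }
        };
        var liveKeys = new HashSet<string> { "key1" }; // key2 has expired

        int removed = Cleanup(tagSets, liveKeys);
        Console.WriteLine(removed);                         // 2
        Console.WriteLine(tagSets.ContainsKey("tag:blue")); // False

        // A tag-related operation might call this with a small probability such as 0.001.
        MaybeCleanup(new Random(), 0.001, () => Cleanup(tagSets, liveKeys));
    }
}
```

The probability is the tuning knob: it trades how long stale members linger against how often the full scan runs.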

Tag Naming Consistency

It is important that your caching framework around Redis uses a consistent format for your tags. For example, our developers may use “blue” and “red” as tags, but we actually prefix each with “tag:” so it becomes “tag:blue” within Redis. This lets us know which Redis keys are tags and which are normal keys, without the developers having to think about the formatting of the tag names. It is also important for cleaning up tags that are no longer being used. You may also need to prefix your tags (and keys) with a customer number or some other identifier if you have a multi-tenant system.

How well does it work?

We have been using this new implementation for over a couple of months now, and so far it has worked very well for us. We are doing thousands of Redis queries per minute looking up “tags” the way we implemented them, and we are seeing 3-4 ms response times for those queries.