Redis expire on large set of keys
My problem is: I have a set of keys, each of which needs its own expire value. Code:

set a:11111:22222 someValue
expire a:11111:22222 604800  # usually equal to a week

In a perfect world I would have put all those values in a hash and given each of them its appropriate expire value, but Redis does not allow expiring individual hash fields.

The problem is that I also have a process that needs to get all those keys about once an hour:

keys a:*

This command is really expensive and, according to the Redis documentation, can cause performance issues. I have about 25,000-30,000 keys at any given moment.

Does someone know how I can solve such a problem? Thumbs up guaranteed (-;
Roy

Ancier answered 14/11, 2013 at 12:46 Comment(4)
Why do you need to get all keys? Would it be better to just get the keys that changed compared to your last run? You can use publish/subscribe to notify changes with loose coupling, for instance. Or append the keys to check for the next run to a list.Treadwell
I need to do work on all the keys, most importantly on the keys that were added since the previous run. However, I liked your idea of appending keys... but I will still need to run over the appended list and get each key (for its value). I'd guess that it's more than half of the keys, so wouldn't it be better just to use my previous method?Ancier
Look at @RienNeVaPlus's answer: you can use a zset to store all your keys. Storing all keys in a zset will be faster than keys a:*, though it will take some space in Redis. Then you get your zset and iterate through each member to fetch its value. Use the expiration as the score for your zset; then you can compare each key's expiration to the current time and delete the value manually if it has passed. The best way is to implement it in Lua.Treadwell
here is the answer to your questions check this: #13175115Ballot

Let me propose an alternative solution.

Rather than asking Redis to scan all the keys, why not perform a background dump, and parse the dump to extract the keys? This way, there is zero impact on the Redis instance itself.

Parsing the dump file is not as scary as it sounds, because you can use the excellent redis-rdb-tools package:

https://github.com/sripathikrishnan/redis-rdb-tools

You can either convert the dump file into a JSON file and then parse the JSON, or use the Python API to extract the keys yourself.
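As a sketch of the JSON route: redis-rdb-tools can export a dump as JSON (roughly one object per database, mapping keys to values), which you can then filter offline. The exact export layout can vary between versions, so the inline sample below is an assumption standing in for a real export file:

```python
import json
import fnmatch

# Hypothetical excerpt of a redis-rdb-tools JSON export
# (e.g. produced by: rdb --command json dump.rdb > dump.json).
# The exact shape is an assumption; adjust to your tool version.
dump_json = '[{"a:11111:22222": "someValue", "a:11111:22223": "someValue2", "other:key": "x"}]'

databases = json.loads(dump_json)

# Collect every key matching the a:* pattern, across all databases,
# without touching the live Redis instance at all.
keys = [k for db in databases for k in db if fnmatch.fnmatch(k, "a:*")]
print(keys)
```

Since this runs against a file, it can be scheduled hourly on a separate machine with zero load on Redis itself.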

Reflux answered 14/11, 2013 at 15:19 Comment(2)
It can be slow if the dump is big, but +1 for a new idea, and I think it can actually be a good design to separate concerns (it can even be run on another server: great).Treadwell
Very nice suggestion (-: ... couldn't find rdb-tools for Java... will see on Monday what the Java client can give us. Thanks!Ancier

As you've already mentioned, using KEYS is not a good solution to get your keys:

Warning: consider KEYS as a command that should only be used in production environments with extreme care. It may ruin performance when it is executed against large databases. This command is intended for debugging and special operations, such as changing your keyspace layout. Don't use KEYS in your regular application code. If you're looking for a way to find keys in a subset of your keyspace, consider using sets.

Source: Redis docs for KEYS

As the docs suggest, you should build your own indices! A common way of building an index is to use a sorted set. You can read more about how it works in my question over here.

Building references to your a:* keys in a sorted set will also let you select only the required keys, filtered by a date or any other integer value, in case you're filtering the results!
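To illustrate the index idea, here is a minimal in-memory sketch in Python. The functions mirror the Redis commands they stand in for (ZADD, ZRANGEBYSCORE); in production these would be real calls against a sorted set, but the selection logic is the same:

```python
import bisect

# Simulated sorted-set index: each a:* key is stored with its expiry
# timestamp as the score, so a cheap range query replaces KEYS a:*.
index = []  # kept sorted as (score, member) pairs

def zadd(score, member):
    # Stand-in for: ZADD user:index <score> <member>
    bisect.insort(index, (score, member))

def zrangebyscore(lo, hi):
    # Stand-in for: ZRANGEBYSCORE user:index <lo> <hi>
    return [member for score, member in index if lo <= score <= hi]

zadd(1385112435, "a:11111:22222")
zadd(1385113289, "a:11111:22223")

# Only keys whose expiry score falls inside the window are returned.
matching = zrangebyscore(1385112435, 1385113289)
```

The hourly job then iterates over `matching` and fetches each value with GET, never scanning the keyspace.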

And yes: it would be awesome if hash fields could expire. Sadly, it looks like it's not going to happen, but there are in fact creative alternatives for taking care of it yourself.

Lowdown answered 14/11, 2013 at 13:7 Comment(0)

Why don't you use a sorted set?

Here is a sample data-creation sequence:

redis 127.0.0.1:6379> setex a:11111:22222 604800 someValue
OK
redis 127.0.0.1:6379> zadd user:index 1385112435 a:11111:22222   // 1384507635 + 604800
(integer) 1
redis 127.0.0.1:6379> setex a:11111:22223 604800 someValue2
OK
redis 127.0.0.1:6379> zadd user:index 1385113289 a:11111:22223  // 1384508489 + 604800
(integer) 1
redis 127.0.0.1:6379> zrangebyscore user:index 1385112435 1385113289
1) "a:11111:22222"
2) "a:11111:22223"

This has no select-performance issue, but it spends more memory and adds an insert cost per key.
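One caveat with the sequence above: Redis expires the a:* keys itself, but the entries in user:index linger after their keys are gone, so the index needs a periodic trim. A ZREMRANGEBYSCORE-style cleanup, sketched here in plain Python over the same (score, member) layout, drops members whose expiry score has passed:

```python
import bisect

# Simulated user:index after the two ZADDs above: sorted
# (expiry_score, member) pairs.
index = [(1385112435, "a:11111:22222"), (1385113289, "a:11111:22223")]

def zremrangebyscore(max_score):
    """Stand-in for ZREMRANGEBYSCORE user:index -inf <max_score>:
    removes members whose expiry score is <= max_score."""
    # "\xff" sorts after any ASCII key name, so every entry with
    # score <= max_score lands left of the cut point.
    cut = bisect.bisect_right(index, (max_score, "\xff"))
    removed = index[:cut]
    del index[:cut]
    return len(removed)

now = 1385112500  # pretend current unix time; the first key has expired
zremrangebyscore(now)
remaining = [member for _, member in index]
```

Running this once an hour, just before the bulk fetch, keeps the index in step with what SETEX has already expired.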

Ballot answered 15/11, 2013 at 9:52 Comment(0)

Also, you can run more complex commands with EVAL, like this:


EVAL "local keys = redis.call('KEYS', ARGV[1]) for _, key in ipairs(keys) do redis.call('EXPIRE', key, ARGV[2]) end return 1" 0 '*' 86400

In this script, the '*' and 86400 at the end of the command are the key pattern and the expiration time in seconds (quote the pattern so the shell does not expand it). Note that the script still calls KEYS, so the performance warning quoted above applies here as well.

Swish answered 24/6 at 14:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.