Redis list with expiring entries?

I'm looking for a way to store a list of items for a user, that will expire within 24 hours. Is there a way to accomplish this using Redis? I was thinking of just using the list and setting an expiration for each individual item, is there a better way?

Boaster answered 31/12, 2017 at 18:3 Comment(1)
This would be a good feature in Redis.Ingenue

NO, you CANNOT set an expiration for each item in a LIST. You can only set an expiration for the entire LIST.

In order to achieve what you want, you need to have a key for each item:

SET user1:item1 value EX 86400
SET user1:item2 value EX 86400
SET user2:item1 value EX 86400

To get all items of a specified user, you can use the SCAN command with a pattern (or use Keyspace Notifications to achieve better performance, at the cost of more complex code):

SCAN 0 MATCH user1:*
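
For reference, here is a minimal sketch of the same pattern in Python; it assumes the redis-py client and a local Redis instance, and the key names are only illustrative, not part of the original answer:

import redis

r = redis.Redis(decode_responses=True)

def add_item(user_id, item_id, value, ttl_seconds=86400):
    # One key per item, each with its own 24-hour expiration.
    r.set(f"{user_id}:{item_id}", value, ex=ttl_seconds)

def get_items(user_id):
    # scan_iter wraps the SCAN cursor loop; MATCH filters by key pattern.
    items = {}
    for key in r.scan_iter(match=f"{user_id}:*", count=100):
        value = r.get(key)
        if value is not None:  # a key may expire between SCAN and GET
            items[key] = value
    return items
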
Celestyna answered 1/1, 2018 at 9:10 Comment(9)
Be aware that SCAN is not guaranteed to return the requested item count. Otherwise this solution works.Artificiality
@itamarhaber's response is on point. ZADD and ZREMRANGEBYSCORE work really well with an epoch timestamp as the scoreLid
@Artificiality I don't understand your advice. The documentation at redis.io/commands/scan says something that is not clear to me. Could you please go deeper into this? Maybe an example?Culbertson
@Encroach NO, KEYS is dangerous, while SCAN is NOT. You can control the QPS of SCAN, and the number of keys returned by each call, so that it won't hurt performance.Celestyna
However, the doc at redis.io/commands/scan says the behavior is similar to KEYS when you use the MATCH pattern, so you must use this command carefully to avoid turning a SCAN operation into a KEYS call. Also check this for more details: docs.keydb.dev/blog/2020/08/10/blog-post/….Encroach
@Encroach The doc mentions the behavior is similar to KEYS, however, its implementation is different. Check the source code, and you can see that Redis applies the MATCH filter after it gets the scan result. So the extra cost of MATCH is only some string matching work. It won't be a big deal.Celestyna
@Encroach The KeyDB test is a little misleading. It uses large COUNT values in the test, i.e. 100, 1,000, 10,000. As I mentioned above, you should control the COUNT option, e.g. the default COUNT is 10. When COUNT is too large, it works as if you ran MGET with lots of keys, which, of course, hurts performance. Also, as I mentioned, you need to control the QPS, since the more SCAN commands Redis handles, the fewer other commands it can handle.Celestyna
@for_stack, that is my point: if you want to get all items of the user using SCAN, it is not good in performance terms, as you will need to call SCAN multiple times until you get 0 as the cursor, which means a full SCAN is O(N), with N being the total number of keys in the global keyspace. So, if you don't have many keys in your DB, it's ok, but if you have millions of keys, your app's performance will be drastically impacted. I had to face this problem in my work, and the solution was to use a sorted set, whose operations are O(log(N)), with N being the number of members in the set (far lower than the global key count)Encroach
Using SCAN can be a performance issue when used with a big database. MATCH won't really help here. See thisOtti

I use:
ZADD - add a new unique value to the sorted set.
ZRANGE - get all current values from the set, ordered by score. (ZRANGEBYSCORE has been deprecated in favor of ZRANGE ... BYSCORE)
ZREMRANGEBYSCORE - remove all members between two scores from the set.

In this solution, the score is a timestamp.

For example, inserting 3 values:

ZADD mykey 160 val1        // 1
ZADD mykey 161 val2        // 1
ZADD mykey 120 val3        // 1

Get sorted values by score (between -infinity and 400):

ZRANGE mykey -inf 400 BYSCORE      // ['val3', 'val1', 'val2'] 

Remove values with scores between -infinity and 121 (val3 will be removed):

ZREMRANGEBYSCORE mykey -inf 121      // 1

Again, get sorted values by score (between -infinity and 400):

ZRANGE mykey -inf 400 BYSCORE       // ['val1', 'val2'] 
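
A minimal sketch of this pattern in Python, assuming the redis-py client and a 24-hour retention window (both are illustrative choices, not part of the original answer):

import time
import redis

r = redis.Redis(decode_responses=True)

WINDOW_SECONDS = 24 * 60 * 60  # entries older than this are pruned

def add_value(key, value):
    # The score is the insertion timestamp (epoch seconds).
    r.zadd(key, {value: time.time()})

def get_values(key):
    cutoff = time.time() - WINDOW_SECONDS
    # Drop everything older than the window, then return what is left, ordered by score.
    r.zremrangebyscore(key, "-inf", cutoff)
    return r.zrangebyscore(key, cutoff, "+inf")
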
Elo answered 1/1, 2018 at 16:8 Comment(4)
If the elements should not all expire after 24 hours but at different intervals, it is possible to define the score as the expiration timestamp and therefore to call ZREMRANGEBYSCORE from -inf to now. The insertion order will no longer be respectedLitigate
To complement the comment from @Alex83690, you can check this post: levelup.gitconnected.com/…. I was able to implement it in my auth service with great performance results. I've also added auto removal of expired entries, session counts, etc.Encroach
If elements are added in timestamp order, a regular list of serialized timestamp/values with RPUSH would work. Use LINDEX 0 to check the oldest entry and LPOP to expire it. This is more efficient since these operations are all O(1).Imperturbable
Note that it is ZRANGEBYSCORE that has been deprecated (in favor of ZRANGE with the BYSCORE option), not ZREMRANGEBYSCORECarrington

First of all, don't use the SCAN command to retrieve the keys if you have a lot of keys in the database. It is very inefficient: each call is O(1), but a complete iteration is O(N) over all the keys in the database, and you will probably need to make a lot of calls to iterate the entire keyspace:

O(1) for every call. O(N) for a complete iteration, including enough command calls for the cursor to return back to 0. N is the number of elements inside the collection.

Also, the MATCH filter won't help in terms of iteration performance:

It is important to note that the MATCH filter is applied after elements are retrieved from the collection, just before returning data to the client. This means that if the pattern matches very little elements inside the collection, SCAN will likely return no elements in most iterations.
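
To make the iteration cost concrete, here is a rough Python sketch (assuming the redis-py client) of a full keyspace scan: SCAN must be called repeatedly until the cursor comes back as 0, so the complete loop touches every key in the database.

import redis

r = redis.Redis(decode_responses=True)

def scan_all(pattern, count=10):
    # Each SCAN call is cheap, but the loop only ends when the cursor
    # returns to 0, so a complete iteration is O(N) over the whole keyspace.
    keys, cursor = [], 0
    while True:
        cursor, batch = r.scan(cursor=cursor, match=pattern, count=count)
        keys.extend(batch)
        if cursor == 0:
            break
    return keys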

Optimizing storage space by automatically removing data and using a list as an index

On insert

1). Create a new key with expiration:

SET user:{UID}:entry:{EID} {data} EX 300

UID is a user ID in your system.

EID is an entry ID. You can use a monotonically increasing ID to keep it small, or a hashed or random value.

2). Add the key to the list:

LPUSH user:{UID}:entries {EID}

When retrieving

1). Get the key from the end of the list:

RPOP user:{UID}:entries

2). Use the key to fetch the value:

GET user:{UID}:entry:{EID}

3). If the value has expired, GET will return nil; just ignore it and fetch another key from the list (i.e. go back to step 1).

Example:

SET user:1:entry:1 {data} EX 300
LPUSH user:1:entries 1

SET user:1:entry:2 {data} EX 300
LPUSH user:1:entries 2

SET user:1:entry:3 {data} EX 300
LPUSH user:1:entries 3

SET user:2:entry:1 {data} EX 300
LPUSH user:2:entries 1

This solution helps optimize storage by automatically deleting expired data; however, the index list itself must be garbage-collected manually, since expired entry IDs stay in it until they are popped. Use this approach if you are storing a significant amount of data.
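
A rough Python sketch of the insert/retrieve flow above, assuming the redis-py client; the 300-second TTL and helper names are illustrative:

import redis

r = redis.Redis(decode_responses=True)

TTL_SECONDS = 300

def add_entry(uid, eid, data):
    # The entry key expires on its own; the list only stores the entry ID.
    r.set(f"user:{uid}:entry:{eid}", data, ex=TTL_SECONDS)
    r.lpush(f"user:{uid}:entries", eid)

def pop_entry(uid):
    # Pop IDs from the tail until one still has a live value.
    while True:
        eid = r.rpop(f"user:{uid}:entries")
        if eid is None:
            return None  # the list is empty
        data = r.get(f"user:{uid}:entry:{eid}")
        if data is not None:
            return data  # expired IDs are silently skipped (manual garbage collection)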

Otti answered 9/10, 2023 at 19:43 Comment(0)
