SCAN vs KEYS performance in Redis

Asked 16/9, 2015 at 8:57 Answered 23/1 at 15:42

A number of sources, including the official Redis documentation, note that using the KEYS command is a bad idea in production environments due to possible blocking. If the approximate size of the dataset is known, does SCAN have any advantage over KEYS?

For example, consider a database with at most 100 keys of the form data:number:X where X is an integer. If I want to retrieve all of these, I might use the command KEYS data:number:*. Is this going to be significantly slower than using SCAN 0 MATCH data:number:* COUNT 100? Or are the two commands essentially equivalent in this circumstance? Would it be accurate to say that SCAN is preferable to KEYS because it protects against the scenario where an unexpectedly large set would be returned?

Drachma answered 16/9, 2015 at 8:57 Comment(1)

I guess, there is no performance difference if you are not using pagination. – Coverage 23/8, 2017 at 4:43

What matters is not how long the current command takes on its own but its impact on all other commands, since Redis processes commands on a single thread (i.e. while one command is executing, all others must wait until it finishes).

While keys or scan might provide you similar or identical performance executed alone in your case, some milliseconds blocking Redis will significantly decrease overall I/O.

This is the main reason to use KEYS for development purposes and SCAN in production environments.

OP said:

"While keys or scan might provide you similar or identical performance executed alone in your case, some milliseconds blocking Redis will significantly decrease overall I/O." - This sentence seems to indicate that one command blocks Redis, and the other doesn't, which can't be the case. If I am guaranteed 100 results from my call to KEYS, in what way is it worse than SCAN? Why do you feel that one command is more prone to blocking?

The difference shows up when you can paginate the search. Being forced to fetch 100 keys in a single pass is not the same as being able to paginate and fetch 100 keys 10 at a time (or 50 and 50). The short gaps between calls let Redis process other commands sent by the application layer. See what the official Redis documentation says about this:

Since these commands allow for incremental iteration, returning only a small number of elements per call, they can be used in production without the downside of commands like KEYS or SMEMBERS that may block the server for a long time (even several seconds) when called against big collections of keys or elements
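The pagination described above can be sketched as a cursor loop. This is a minimal illustration, not a definitive implementation: InMemoryKeyspace is a hypothetical stand-in so the example runs without a server, and it only mimics the call shape of redis-py's scan(cursor, match, count), which returns a (next_cursor, keys) pair. The function name scan_pages is likewise made up for this example.

```python
from fnmatch import fnmatchcase


class InMemoryKeyspace:
    """Hypothetical stand-in for a Redis client (assumption, not a real API).

    It copies only the shape of scan(cursor, match, count) -> (next_cursor, keys).
    """

    def __init__(self, keys):
        self._keys = list(keys)

    def scan(self, cursor=0, match=None, count=10):
        # Examine at most `count` keys per call, then filter by pattern,
        # so a page may be shorter than `count` or even empty.
        window = self._keys[cursor:cursor + count]
        next_cursor = cursor + count
        if next_cursor >= len(self._keys):
            next_cursor = 0  # a returned cursor of 0 marks the end of the iteration
        matched = [k for k in window if match is None or fnmatchcase(k, match)]
        return next_cursor, matched


def scan_pages(client, pattern, count=10):
    """Yield one page of matching keys per SCAN call until the cursor is 0 again."""
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        yield keys
        if cursor == 0:
            break


store = InMemoryKeyspace(f"data:number:{i}" for i in range(100))
pages = list(scan_pages(store, "data:number:*", count=10))
all_keys = [key for page in pages for key in page]
print(len(pages), len(all_keys))
```

Between two consecutive calls the server is free to run commands from other clients, which is the whole point of paginating. The loop ends on the cursor value rather than on an empty page, because a page can be empty before the iteration is complete.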

Megalopolis answered 16/9, 2015 at 9:9 Comment(5)

Very true. However, with only 100 keys it is likely to have no practical difference. Also, one should note that SCAN's COUNT argument is only a hint rather than a directive. – Lend 16/9, 2015 at 9:41

@ItamarHaber BTW, I like to architect solutions with the right approach by design instead of relying on "this will never happen". On the other hand, don't you think that needing KEYS or SCAN over the global keyspace suggests there are better ways to store the data and access it by pages? lrange, zrange :D – Selfsupport 16/9, 2015 at 9:56

"While keys or scan might provide you similar or identical performance executed alone in your case, some milliseconds blocking Redis will significantly decrease overall I/O." - This sentence seems to indicate that one command blocks Redis, and the other doesn't, which can't be the case. If I am guaranteed 100 results from my call to KEYS, in what way is it worse than SCAN? Why do you feel that one command is more prone to blocking? – Drachma 17/9, 2015 at 9:40

What if I use Redis in an app and the rest of the program's execution depends on the KEYS (or SCAN) result? I mean the program needs all the data from this operation before it can continue its calculation. I think this is a case where KEYS is preferable to SCAN, isn't it? – Unapproachable 2/9, 2016 at 13:55

@Unapproachable I don't think so. Most programming languages have frameworks or libraries to handle these scenarios using thread synchronization, or even process synchronization with mutexes and other approaches :D – Selfsupport 2/9, 2016 at 16:53

The answer is in the SCAN documentation

These commands allow for incremental iteration, returning only a small number of elements per call, they can be used in production without the downside of commands like KEYS or SMEMBERS that may block the server for a long time (even several seconds) when called against big collections of keys or elements.

So ask for small chunks of data rather than fetching all of it at once.

Also, as Matías Fidemraizer pointed out, Redis is single-threaded and KEYS is a blocking call, so it blocks all incoming requests until KEYS has finished executing.

Whether your data is small or not, it never hurts to apply best practices.
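To make the blocking argument concrete, here is a toy model of a single-threaded server. It is only a sketch under stated assumptions: the durations are invented for illustration, the iteration calls are assumed to run back to back starting at time 0, and the helper name wait_for_other_command is hypothetical. It ignores the client round trip between SCAN calls.

```python
def wait_for_other_command(arrival_us, call_durations_us):
    """Return how long a command arriving at `arrival_us` waits for the server.

    Model (assumption): the server is single-threaded and can only pick up
    another client's command between two iteration calls.
    """
    start = 0
    for duration in call_durations_us:
        end = start + duration
        if start <= arrival_us < end:
            return end - arrival_us  # must wait for the running call to finish
        start = end
    return 0  # the server is already idle


# Hypothetical numbers: the same 100 ms of work done in one call or in ten.
one_keys_call = [100_000]
ten_scan_calls = [10_000] * 10

# Another client's command arrives 1 ms after the iteration started.
wait_behind_keys = wait_for_other_command(1_000, one_keys_call)
wait_behind_scan = wait_for_other_command(1_000, ten_scan_calls)
print(wait_behind_keys, wait_behind_scan)
```

In this model the total work is identical; only the longest uninterrupted stretch changes, and that stretch is what other clients feel.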

Honorific answered 16/9, 2015 at 10:16 Comment(1)

Clarification - all commands in Redis are blocking on the server since it's single-threaded. Keys just has the potential to block for a long time trying to get all the keys in one operation rather than sending small amounts of keys back in several operations. – Orography 5/7, 2016 at 2:6

There is no performance difference between KEYS and SCAN other than pagination (COUNT), which controls the number of bytes transferred (I/O) from Redis to the client per call.
The COUNT option has its own quirk: a call may sometimes return no data while the cursor is still non-zero, in which case the data arrives in later iterations. So COUNT should be set to a reasonable value, say 200 up to some maximum, to avoid too many round trips. I think the right value depends on the total number of keys in your database.
There is no difference between using SCAN inside Lua and using KEYS: although no I/O is involved, both still block other calls until the entire big collection has been iterated. I haven't tried this; it is my guess.
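The round-trip trade-off can be estimated with simple arithmetic. The sketch below assumes each SCAN call examines roughly COUNT keys (COUNT is only a hint, so this is an estimate) and uses a made-up network round-trip time of 0.5 ms; both function names are hypothetical.

```python
import math


def estimated_scan_calls(total_keys, count):
    """Rough number of SCAN calls needed for one full pass over the keyspace."""
    return max(1, math.ceil(total_keys / count))


def estimated_round_trip_ms(total_keys, count, rtt_ms):
    """Network time spent on round trips alone, excluding server-side work."""
    return estimated_scan_calls(total_keys, count) * rtt_ms


# Assumed values: 50,000 keys and a 0.5 ms round trip per call.
for count in (10, 200, 10_000):
    calls = estimated_scan_calls(50_000, count)
    total_ms = estimated_round_trip_ms(50_000, count, 0.5)
    print(f"COUNT {count}: ~{calls} calls, ~{total_ms} ms of round trips")
```

A larger COUNT means fewer round trips but a longer stretch of server time per call, so the value is a balance between the two.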

Coverage answered 23/8, 2017 at 17:3 Comment(3)

Please leave a comment when you downvote, so that I can learn and correct the answer. – Coverage 13/11, 2017 at 6:53

I can confirm that when using Lua with SCAN, the work can still be split into many commands without blocking other calls. Or at least it generates multiple commands? – Gaia 23/1 at 15:46

@KanagaveluSugumar there is absolutely a performance difference. Redis, by its nature, will frequently be used in a distributed environment. Because it is single-threaded, using KEYS is dangerous if the operation is expensive, as it will block other processes from accessing your node. – Hartshorn 15/2 at 20:55

The main difference between KEYS and SCAN is that KEYS returns the matched keys in one request while SCAN uses a cursor. Behind the scenes, both iterate over all the keys.

For my testing, I created a Redis database with 50k records and ran both KEYS and SCAN against it.

Here is what I have:

KEYS takes 3000 microseconds
SCAN - No COUNT, takes 11000 microseconds
SCAN - COUNT:100000, takes 14000 microseconds. Split into two commands
SCAN - COUNT:10000, takes 14000 microseconds. Split into 6 commands

My conclusion is:

As long as we have a fairly small number of keys, we can still use KEYS, which is a little faster than SCAN.
If we use SCAN, we need to choose a good COUNT so that the work is split into smaller chunks. Even then, the performance is sometimes not that good compared to KEYS. For example, when I chose 10k, each chunk took 2300 microseconds.
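As a quick arithmetic check of the figures reported above, the per-call time for the COUNT 10000 run can be derived from the totals. This sketch assumes the 14000 microseconds are spread evenly across the 6 calls, which the test does not state explicitly.

```python
# Figures reported in the test above (50k keys); all values in microseconds.
KEYS_TOTAL_US = 3000
SCAN_TOTAL_US = 14000  # SCAN with COUNT 10000
SCAN_CALLS = 6

# Average time per SCAN call, assuming an even split across calls.
scan_per_call_us = SCAN_TOTAL_US / SCAN_CALLS

# End-to-end cost of the full SCAN iteration relative to one KEYS call.
scan_overhead_ratio = SCAN_TOTAL_US / KEYS_TOTAL_US

print(f"average per SCAN call: {scan_per_call_us:.0f} us")
print(f"total SCAN time vs KEYS: {scan_overhead_ratio:.1f}x")
```

The result is consistent with the roughly 2300 microseconds per chunk quoted above: the whole iteration costs more than a single KEYS call, but each individual call is shorter.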

There is no difference between using SCAN inside Lua and using KEYS: although no I/O is involved, both still block other calls until the entire big collection has been iterated. I haven't tried this; it is my guess.

-> From the screenshot, I can see that when using Lua with SCAN, the work is still split into smaller chunks without blocking other calls

Lua script

EVAL "local cursor = '0' local count = 0 repeat local result = redis.call('SCAN', cursor, 'match', ARGV[1], 'count', 100000) cursor = result[1] local keys = result[2] count = count + #keys until cursor == '0' if count == 0 then return 'no keys' else return count end" 0 USER:BB*

Gaia answered 23/1 at 15:42 Comment(0)
