What is the easiest way to find the biggest objects in Redis?
I have a 20GB+ RDB dump in production. I suspect a specific set of keys is bloating it. I'd like a way to spot the 100 biggest objects, either by static dump analysis or by asking the server itself, which by the way holds over 7M objects.

Dump analysis tools like rdbtools are not helpful for this (I think) really common use case!

I was thinking of writing a script that iterates the whole keyset with "redis-cli debug object", but I have the feeling there must be some tool I'm missing.

Snell answered 2/12, 2012 at 19:44 Comment(0)
An option was added to redis-cli: redis-cli --bigkeys

Sample output based on https://gist.github.com/michael-grunder/9257326

$ ./redis-cli --bigkeys

# Press ctrl+c when you have had enough of it... :)
# You can use -i 0.1 to sleep 0.1 sec every 100 sampled keys
# in order to reduce server load (usually not needed).

Biggest string so far: day:uv:483:1201737600, size: 2
Biggest string so far: day:pv:2013:1315267200, size: 3
Biggest string so far: day:pv:3:1290297600, size: 5
Biggest zset so far: day:topref:2734:1289433600, size: 3
Biggest zset so far: day:topkw:2236:1318723200, size: 7
Biggest zset so far: day:topref:651:1320364800, size: 20
Biggest string so far: uid:3467:auth, size: 32
Biggest set so far: uid:3029:allowed, size: 1
Biggest list so far: last:175, size: 51


-------- summary -------

Sampled 329 keys in the keyspace!
Total key length in bytes is 15172 (avg len 46.12)

Biggest   list found 'day:uv:483:1201737600' has 5235597 items
Biggest    set found 'day:uvx:555:1201737600' has 47 members
Biggest   hash found 'day:uvy:131:1201737600' has 2888 fields
Biggest   zset found 'day:uvz:777:1201737600' has 1000 members

0 strings with 0 bytes (00.00% of keys, avg size 0.00)
19 lists with 5236744 items (05.78% of keys, avg size 275618.11)
50 sets with 112 members (15.20% of keys, avg size 2.24)
250 hashs with 6915 fields (75.99% of keys, avg size 27.66)
10 zsets with 1294 members (03.04% of keys, avg size 129.40)
Hewitt answered 8/1, 2015 at 0:34 Comment(5)
Finally after 2 years a real solution to this. Thanks.Snell
This is a "new" feature beginning with 2.8 (and backported to 2.6)... so it's been around for the last 18mo or so... just sayin' (heya Simone! ;))Ephod
Hi Itamar :D I guess I just stopped looking for a solution to this when I moved everything to ARDB. But good to know!Snell
Is there anything like this that operates on the entirety of a Redis cluster's keys? The only way I can think to leverage this functionality in cluster is to script this to run on each master node, then aggregate. Would be awesome if there were a better way though.Roselba
Echoing Thomp's comment, is there a way to do this with clustered Redis?Position
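A sketch of the per-node aggregation idea from the comments above (the seed address is a placeholder, and the CLUSTER NODES field layout assumed here is the Redis 4+ format): list the master nodes, then run --bigkeys against each one.

```shell
# List the cluster's masters via CLUSTER NODES, then run --bigkeys on each.
# 127.0.0.1:6379 is a placeholder for any reachable cluster node.
redis-cli -h 127.0.0.1 -p 6379 cluster nodes |
  awk '$3 ~ /master/ {split($2, a, "@"); print a[1]}' |
  while IFS=: read -r host port; do
    echo "== $host:$port =="
    redis-cli -h "$host" -p "$port" --bigkeys
  done
```

Each node reports only the keys in its own slots, so the per-node summaries still have to be merged by hand (or with a bit more scripting).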
redis-rdb-tools has a memory report that does exactly what you need. It generates a CSV file with the memory used by every key, which you can then sort to find the top N keys.
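For example, using the rdb command that ships with redis-rdb-tools (the filenames here are placeholders; the report's 4th CSV column is size_in_bytes):

```shell
# Dump a per-key memory report to CSV (requires: pip install rdbtools).
rdb -c memory dump.rdb -f memory.csv

# Columns: database,type,key,size_in_bytes,encoding,num_elements,len_largest_element
# Skip the header, sort numerically on column 4 descending, keep the 100 biggest.
tail -n +2 memory.csv | sort -t, -k4 -nr | head -n 100
```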

There is also an experimental memory profiler that started to do what you need. It's not yet complete, and so isn't documented, but you can try it: https://github.com/sripathikrishnan/redis-rdb-tools/tree/master/rdbtools/cli. And of course, I'd encourage you to contribute as well!

Disclaimer: I am the author of this tool.

Supranatural answered 3/12, 2012 at 10:44 Comment(4)
Upvote for the offline approach to the problem. the CSV file has come out really big and needs parsing/sorting.Snell
@Snell - I created issue github.com/sripathikrishnan/redis-rdb-tools/issues/19 to track this problem. I will update the parser to maintain and display the top N keys by memory usage when I get a chance.Supranatural
Having that feature would be really useful. Thanks!Snell
I've recently added an argument to filter by the key size as well as only returning the N largest keys, both arguments are in my fork of the project - github.com/joshtronic/redis-rdb-tools - cheers!Tag
I am pretty new to bash scripting. I came up with this:

for line in $(redis-cli keys '*' | awk '{print $1}'); do echo $(redis-cli DEBUG OBJECT "$line" | awk '{print $5}' | sed 's/serializedlength://g') "$line"; done | sort -h

This script:

  • Lists all the keys with redis-cli keys "*"
  • Gets each key's size with redis-cli DEBUG OBJECT
  • Sorts the output numerically by the size prepended to each key name

This may be very slow because bash loops through every single Redis key (and KEYS '*' blocks the server while it runs). With 7M keys you may want to cache the output of KEYS to a file first.
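A gentler variant of the same idea (just a sketch): redis-cli --scan iterates with SCAN instead of KEYS, so the server isn't blocked while the script walks the keyspace:

```shell
# Walk the keyspace with SCAN, extract serializedlength from DEBUG OBJECT
# output for each key, and keep the 100 biggest at the end.
redis-cli --scan |
  while read -r key; do
    len=$(redis-cli DEBUG OBJECT "$key" | tr ' ' '\n' |
          sed -n 's/^serializedlength://p')
    echo "$len $key"
  done | sort -n | tail -n 100
```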

Finely answered 3/12, 2012 at 3:10 Comment(2)
serializedlength is actually a poor indicator of memory. For small objects that use Redis' special encoding, it is accurate. But for larger objects, it is completely off the mark. Larger objects have overheads due to pointers, and they may be compressed in the serialized version - both of which are not accounted for in the serialized length.Supranatural
Thanks @peterpan, however the bash cycle |sort never has been an option given the number of the keys and the need to interact directly the production server in such a massive way.Snell
To show the memory used by the big keys, you can use redis-cli --memkeys. Per the Redis docs on mem keys:

Mem Keys: Similarly to big keys, mem keys will look for the biggest keys but also report on the average sizes.

$ redis-cli --memkeys

# Scanning the entire keyspace to find biggest keys as well as
# average sizes per key type.  You can use -i 0.1 to sleep 0.1 sec
# per 100 SCAN commands (not usually needed).

[00.00%] Biggest string found so far '"counter:__rand_int__"' with 62 bytes
[00.00%] Biggest string found so far '"key:__rand_int__"' with 63 bytes
[00.00%] Biggest hash   found so far '"myhash"' with 86 bytes
[00.00%] Biggest list   found so far '"mylist"' with 860473 bytes

-------- summary -------

Sampled 4 keys in the keyspace!
Total key length in bytes is 48 (avg len 12.00)

Biggest   list found '"mylist"' has 860473 bytes
Biggest   hash found '"myhash"' has 86 bytes
Biggest string found '"key:__rand_int__"' has 63 bytes

1 lists with 860473 bytes (25.00% of keys, avg size 860473.00)
1 hashs with 86 bytes (25.00% of keys, avg size 86.00)
2 strings with 125 bytes (50.00% of keys, avg size 62.50)
0 streams with 0 bytes (00.00% of keys, avg size 0.00)
0 sets with 0 bytes (00.00% of keys, avg size 0.00)
Charlacharlady answered 11/9, 2023 at 14:4 Comment(0)
If you have keys that follow the pattern "A:B" or "A:B:*", I wrote a tool that analyzes existing content and also monitors things such as hit rate, number of gets/sets, network traffic, lifetime, etc. The output is similar to the one below.

https://github.com/alexdicianu/redis_toolkit

$ ./redis-toolkit report -type memory -name NAME
+----------------------------------------+----------+-----------+----------+
|                     KEY                | NR  KEYS | SIZE (MB) | SIZE (%) |
+----------------------------------------+----------+-----------+----------+
| posts:*                                |      500 |      0.56 |     2.79 |
| post_meta:*                            |      440 |     18.48 |    92.78 |
| terms:*                                |      192 |      0.12 |     0.63 |
| options:*                              |      109 |      0.52 |     2.59 |
Unanimous answered 18/11, 2017 at 22:5 Comment(0)
Try redis-memory-analyzer - a console tool that scans the Redis keyspace in real time and aggregates memory usage statistics by key pattern. You can use this tool on production servers without a maintenance window. It shows you detailed statistics about each key pattern in your Redis server.

You can also scan the Redis DB by all or selected Redis types such as "string", "hash", "list", "set", and "zset". Match patterns are supported as well.

RMA also tries to discern key names by pattern; for example, if you have keys like 'user:100' and 'user:101', the application will pick out the common pattern 'user:*' in the output, so you can analyze the most memory-hungry data in your instance.

Limner answered 11/2, 2016 at 11:3 Comment(2)
It seems to be solely meant for python >= 3.4. Any suggestion for python 2.7?Vinylidene
@misterion: I installed RMA by python3.6 -m pip install rma and it installed successfully. When I type python3.6 -m rma it says /usr/local/bin/python3.6: No module named rma.__main__; 'rma' is a package and cannot be directly executedVeneaux

© 2022 - 2024 — McMap. All rights reserved.