How to find read hotkeys in an Aerospike cluster?

2

5

We have an Aerospike cluster of 8 nodes. During peak hours, one node shows a significantly higher load average than the others. The AMC dashboard also shows that this node has only a 30% read success rate. After reading a few similar issues posted in the Aerospike community, we suspect that hotkeys are the culprit.

Following (https://discuss.aerospike.com/t/how-to-identify-read-hotkeys/4193), we captured a few hotkey digests in real time with tcpdump. Among the top 10 digests, one stands out: it appears 90% of the time. We then followed (https://discuss.aerospike.com/t/faq-how-keys-and-digests-are-used-in-aerospike/4663) to find the user key/record for those digests. We were able to map every digest to a user key except the one that appears 90% of the time.

Is there any way we can identify that hotkey?

Sammer answered 2/12, 2019 at 8:56 Comment(0)
4

Depending on your version of Aerospike, you can also raise the logging level for the rw-client context, which prints the digest of each transaction in the logs. That may remove any false positives from the tcpdump method.

Turn on detail-level logging for the rw-client context:

asinfo -v "set-log:id=0;rw-client=detail"

Turn it back to info:

asinfo -v "set-log:id=0;rw-client=info"
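
Once detail logging has run for a short while, the digests in the log can be tallied. Below is a minimal Python sketch, not an official tool. It assumes only that a digest is 20 bytes and therefore shows up in a log line as a 40-character hex token (optionally prefixed with 0x); the exact rw-client log format varies by server version, so the pattern may need adjusting. The function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: a digest is 20 bytes, i.e. a standalone 40-hex-character token,
# optionally prefixed with "0x". Adjust the pattern to your server's log format.
DIGEST_RE = re.compile(r"\b(?:0x)?([0-9a-fA-F]{40})\b")


def count_digests(lines):
    """Count how often each 40-hex-character token appears in the given lines."""
    counts = Counter()
    for line in lines:
        for match in DIGEST_RE.finditer(line):
            # Normalize case so the same digest is not counted twice.
            counts[match.group(1).lower()] += 1
    return counts


def top_digests(lines, n=10):
    """Return the n most frequent digests as (digest_hex, count) pairs."""
    return count_digests(lines).most_common(n)
```

To use it, pass an open log file (for example the file your logging configuration writes to) to top_digests and compare the result with the digests seen in tcpdump.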

Also, did you try the UDF from the above article to determine the set and key? (The original key is only stored if the client has explicitly enabled the sendKey policy.) Were there any corresponding record write failures, such as record too big? Or is the client possibly trying to read a non-existent record (read not found)? Write failures from a record that is too big would have the most impact on your network infrastructure. In both of these cases, the record would never make it to storage, so the digest would not match any existing record.

Cannery answered 4/12, 2019 at 19:12 Comment(0)
2

It is possible that the frequent read requests for the rogue digest are failing with a 'not found' error (hence only 30% read success). Aerospike still spends resources (CPU) searching for this digest in the index tree. If this is the case, there will be no record in the database corresponding to the digest that you found via tcpdump, so you will not get any details about it from the database. How did you identify the keys of the other digests? And what issue are you facing in finding the key corresponding to the rogue digest?

Another option is to trace it back to the application. Check in the tcpdump output whether all the requests for this rogue digest come from a single machine. That will narrow down your search greatly. We have seen bots create such a mess in the past.
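
A rough Python sketch of that check follows. It assumes you have already extracted (client IP, digest) pairs from the capture by whatever means you used to find the digests; the input shape and function names are hypothetical and not part of any Aerospike tool.

```python
from collections import Counter, defaultdict


def sources_by_digest(requests):
    """Map each digest to a Counter of the client IPs that requested it.

    requests: a list of (client_ip, digest_hex) pairs, assumed to have been
    extracted from the tcpdump capture beforehand.
    """
    result = defaultdict(Counter)
    for client_ip, digest_hex in requests:
        result[digest_hex.lower()][client_ip] += 1
    return result


def dominant_source(requests, digest_hex):
    """Return (client_ip, share_of_requests) for the top requester of a digest.

    Returns None if the digest does not occur in the requests at all.
    """
    counts = sources_by_digest(requests)[digest_hex.lower()]
    total = sum(counts.values())
    if total == 0:
        return None
    client_ip, hits = counts.most_common(1)[0]
    return client_ip, hits / total
```

If one IP accounts for nearly all requests for the rogue digest, that machine is the place to start looking.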

Sexpartite answered 3/12, 2019 at 3:27 Comment(2)
Check point 7 of discuss.aerospike.com/t/…. It says that even though the hash cannot be reversed, if the client application maintains some kind of dictionary of key structures and their associated digests, one can easily look up the "User key" and "Set Name" from the digest. - Sammer
So, are you storing your key and set name in the record? Is that how you are finding the user keys? - Sexpartite
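
The dictionary idea from the comment above could look roughly like the Python sketch below. It is an illustration under stated assumptions: the digest is taken to be RIPEMD-160 over the set name, a one-byte key type (1 for integer, 3 for string), and the key bytes. Verify this against your client library's own digest function (the Python client exposes aerospike.calc_digest) before relying on it. The helper names are hypothetical, and hashlib's "ripemd160" may be unavailable on some OpenSSL builds, which is why the digest function is a parameter.

```python
import hashlib

# Assumed key type bytes; check them against your client library.
KEY_TYPE_INTEGER = 1
KEY_TYPE_STRING = 3


def aerospike_digest(set_name, user_key):
    """Sketch: RIPEMD-160 over set name + key type byte + key bytes (assumed layout)."""
    if isinstance(user_key, int):
        type_byte = KEY_TYPE_INTEGER
        key_bytes = user_key.to_bytes(8, "big", signed=True)
    else:
        type_byte = KEY_TYPE_STRING
        key_bytes = str(user_key).encode("utf-8")
    h = hashlib.new("ripemd160")  # may raise ValueError on some OpenSSL builds
    h.update(set_name.encode("utf-8"))
    h.update(bytes([type_byte]))
    h.update(key_bytes)
    return h.hexdigest()


def build_digest_index(candidates, digest_fn=aerospike_digest):
    """Build a digest -> (set_name, user_key) lookup from candidate keys.

    candidates: iterable of (set_name, user_key) pairs the application might use.
    digest_fn: function computing the hex digest; swap in your client's own.
    """
    index = {}
    for set_name, user_key in candidates:
        index[digest_fn(set_name, user_key)] = (set_name, user_key)
    return index
```

Looking up the rogue digest in such an index only succeeds if the key that produces it is among the candidates, so it helps most when the application's key format is known and enumerable.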
