We have an aerospike cluster of 8 nodes. We saw that during peak hours one of the nodes is having a significantly higher load average in comparison to other nodes. Also in the AMC dashboard, we saw that the node is having only 30% read success. After following few similar issues posted in the aerospike community, we thought that the presence of hotkeys might be the possible culprit.
After following (https://discuss.aerospike.com/t/how-to-identify-read-hotkeys/4193), we found out a few hotkey digests with TCPdump in real-time. Among the top 10 digests, the interesting thing is that one key is present in 90% of the time. We then followed (https://discuss.aerospike.com/t/faq-how-keys-and-digests-are-used-in-aerospike/4663) to find out UserKey/record from those digests. We were able to map user key from all those except for one key which is present in 90% of the time.
Is there any way we can identify that hotkey?