Why is swapping not a good idea for ZooKeeper and Kafka?
I have read instructions saying

do not use swap

for both ZooKeeper and Kafka. I know that Kafka depends on the page cache to keep parts of its sequential logs cached in memory even after they have been written to disk.

But I cannot understand how swapping can harm ZooKeeper and Kafka.

Fisticuffs answered 1/11, 2015 at 15:22

Swapping may cause performance as well as stability problems; in your case, you don't want the Linux kernel to "mistakenly/accidentally" swap out memory that belongs to your Kafka or ZooKeeper processes.
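To see whether the kernel has already swapped out part of a broker or ZooKeeper process, you can look at the `VmSwap` field that Linux exposes in `/proc/<pid>/status`. The following is only a minimal sketch, assuming a Linux host; the helper names `parse_vmswap_kb` and `swapped_kb` are made up for illustration and are not part of Kafka or ZooKeeper.

```python
import re
from pathlib import Path


def parse_vmswap_kb(status_text: str) -> int:
    """Return the VmSwap value in kB from the text of /proc/<pid>/status.

    Returns 0 when the field is missing (e.g. kernel threads or kernels
    that do not report it).
    """
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else 0


def swapped_kb(pid: int) -> int:
    """Read /proc/<pid>/status (Linux only) and return the swapped-out size in kB."""
    # Hypothetical helper: the PID of the Kafka/ZooKeeper JVM must be supplied by the caller.
    return parse_vmswap_kb(Path(f"/proc/{pid}/status").read_text())
```

Calling `swapped_kb(pid)` with the PID of the JVM gives a quick indication: a value that is non-zero and growing suggests that heap pages are being pushed to swap.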

Also, swapping may be particularly bad for JVM processes such as Kafka and ZooKeeper, quoting:

[The] JVM generally won't do a full GC cycle until it has run out of its allowed heap, so most of your heap is likely occupied by not-yet-collected garbage. Since these pages aren't being touched (because they are garbage and thus unreferenced), the OS happily swaps them out. When GC finally runs, you have a ridiculous swap storm, pulling in all these pages only to then discover that they are in fact filled with garbage and should be discarded; this can easily make your GC cycle take many minutes!

Hence the recommendation to keep swapping to a minimum by setting vm.swappiness to 0, though on some operating systems, such as RHEL 6.5, the value should actually be 1 (because the semantics of the value 0 were changed on these systems). Note that some swapping may still occur.
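As a rough illustration of checking this setting, the sketch below reads the current value from `/proc/sys/vm/swappiness` on a Linux host. The function names and the "at most 1" threshold are assumptions made for this example, based only on the recommendation above; they are not an official Kafka or ZooKeeper check.

```python
from pathlib import Path

# Standard location of the setting on Linux.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def parse_swappiness(text: str) -> int:
    """Parse the content of /proc/sys/vm/swappiness into an integer."""
    return int(text.strip())


def is_swappiness_low(value: int, max_allowed: int = 1) -> bool:
    """Return True if the value is within the recommended range.

    max_allowed=1 is an assumption taken from the advice above
    (0 on most systems, 1 where the meaning of 0 was changed).
    """
    return 0 <= value <= max_allowed


def current_swappiness(path: Path = SWAPPINESS_PATH) -> int:
    """Read the live value (Linux only)."""
    return parse_swappiness(path.read_text())
```

If `is_swappiness_low(current_swappiness())` returns False, the value can be changed at runtime with `sysctl -w vm.swappiness=1` and made persistent by adding `vm.swappiness=1` to `/etc/sysctl.conf`.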

The following links may shed further light on your question. They explain why to disable swapping for Hadoop and Elasticsearch, respectively, and it's for the same reasons you should disable swapping for Kafka and ZooKeeper:

Bauer answered 2/11, 2015 at 8:22
