I am using the latest Sensu Core version, 0.26.1. I set up the Sensu server on one CentOS machine and a single Sensu client on another.
The client runs about 500 checks, and I keep seeing "previous check command execution in progress" in sensu-client.log. However, each check actually finishes very quickly (most take less than 0.1 seconds, and the interval is 60 seconds). I confirmed this by running the checks as the sensu user:
sudo su sensu -c "{run my check}"
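The manual timing described above could also be scripted so that every check is measured the same way. The following is a minimal sketch in Python, not part of Sensu: the function names `time_check` and `slow_checks` and the sample commands are hypothetical, and it assumes each check is a plain command line that can be run without a shell. It would need to be run as the sensu user to match what the client does.

```python
import shlex
import subprocess
import time


def time_check(command, timeout=10.0):
    """Run one check command and return (exit_status, elapsed_seconds).

    exit_status is None when the command was killed for exceeding `timeout`.
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(
            shlex.split(command),          # assumption: no shell features needed
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,               # the child is killed on expiry
        )
        status = completed.returncode
    except subprocess.TimeoutExpired:
        status = None
    return status, time.monotonic() - start


def slow_checks(commands, threshold=1.0, timeout=10.0):
    """Return [(name, elapsed)] for checks slower than `threshold`, slowest first."""
    slow = []
    for name, command in commands.items():
        _status, elapsed = time_check(command, timeout=timeout)
        if elapsed > threshold:
            slow.append((name, elapsed))
    return sorted(slow, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical stand-ins; replace with the real check command lines.
    checks = {
        "fast_check": "sleep 0.05",
        "slow_check": "sleep 2",
    }
    for name, elapsed in slow_checks(checks, threshold=1.0):
        print(f"{name}: {elapsed:.2f}s")
```

This runs the checks one at a time, so it only shows whether an individual check is slow in isolation; it does not reproduce the load of several hundred checks scheduled by the client.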
However, the Uchiwa dashboard shows that many of my checks ran for more than 1 minute, so it seems the Sensu client is seriously stuck or slow. I tried the following approaches:
- Removed several long-running checks and restarted the Sensu server and client.
- Added a timeout definition to my checks, with the timeout set to 10. This resulted in many of the checks reporting "Execution timed out".
- Ran only 1, 10, and 50 checks, and everything worked normally. However, as soon as the number of checks reaches a certain point, maybe 200-300, the issue occurs.
None of the above worked. Is there a way to debug which check(s) are actually blocking? Or can I configure Sensu to simply kill a check when it exceeds its timeout definition, so that I won't see the "previous check command execution in progress" message in the log?
I'm blocked by this and need help :)