How to troubleshoot Jenkins UI slowness?

I'm trying to troubleshoot a problem and I'm hoping you could help me with the method :) Hopefully this could also benefit other people.

I have a Jenkins server running (v2.46.2). For some reason, when browsing the web UI, some requests are very slow (up to 10 minutes). This seems to happen especially when loading the UI for the first time; after that, it is generally very responsive. If you then wait 10 to 15 minutes, the first request is very slow again, and subsequent requests are fast for a while.

How would you go about troubleshooting this? So far, this is what I did:

  1. Checked general stats of the host (CPU, RAM, etc.). All looks OK.
  2. Checked the stats of the Java VM. Memory usage looks OK to me.
  3. Set up better monitoring to track the number of incoming requests, the time to process them, and their status codes. This confirmed that some requests are delayed (up to 10 minutes).
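
The "first request after an idle period is slow" pattern from step 3 can also be checked from the client side. The following is a minimal sketch, not the monitoring actually used here: it sleeps for a chosen idle period, times a single request, and repeats. The function names (`fetch_status`, `probe`, `idle_experiment`) and the 900-second timeout are assumptions made for illustration.

```python
import time
import urllib.request
from typing import Callable, Iterable, List, Tuple


def fetch_status(url: str, timeout: float = 900.0) -> int:
    """Fetch `url` and return the HTTP status code.

    urllib raises HTTPError for 4xx/5xx responses, so error responses
    surface as exceptions rather than as return values.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # include the body transfer in the measured time
        return response.status


def probe(url: str,
          fetch: Callable[[str], int] = fetch_status,
          clock: Callable[[], float] = time.monotonic) -> Tuple[float, int]:
    """Time one request and return (elapsed_seconds, status_code)."""
    start = clock()
    status = fetch(url)
    return clock() - start, status


def idle_experiment(url: str,
                    idle_periods: Iterable[float],
                    fetch: Callable[[str], int] = fetch_status,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic
                    ) -> List[Tuple[float, float, int]]:
    """Sleep for each idle period, then time exactly one request.

    Returns a list of (idle_seconds, elapsed_seconds, status_code).
    """
    results = []
    for idle in idle_periods:
        sleep(idle)
        elapsed, status = probe(url, fetch, clock)
        results.append((idle, elapsed, status))
    return results
```

For example, calling `idle_experiment("http://localhost:8080/login", [0, 60, 300, 900])` (the URL is hypothetical) would show whether the delay grows with the length of the preceding idle period. The `fetch`, `sleep`, and `clock` parameters are injectable only so the logic can be exercised without a live server.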

I've read a number of interesting things on the internet, including "Jenkins GUI only shown after waiting for 2 minutes". In this case, the info is a bit old (outdated?), and as my jobs only keep a limited number of builds, it did not help much.

This blog post was also super interesting: https://jenkins.io/blog/2016/11/21/gc-tuning/. In my case I am not convinced the problem comes from garbage collection.

I am left with a number of hypotheses such as:

  1. I am using a lot of Categorized Views (with regexps); maybe they are inefficient? But that doesn't seem enough to explain minutes of delay.
  2. The fact that only the first page is slow to load makes me think some caching is involved. But is it caching of HTML? Of credentials? Etc.

At the moment, these are only guesses. Ideally, I would like to profile the time the server spends answering requests, and from there find where this time goes.

Is this possible at all? I tried using VisualVM, but it only shows global data, right? Is it possible to isolate the resources used to answer a single request? How would you approach it?

Note: I am discovering the Java world (coming from Python) so please do not assume I know well how the Java VM works or the tools you use :-)

Thanks a lot!

Katzenjammer answered 12/6, 2017 at 11:7 Comment(3)
You can use JMeter to monitor Java processes: jmeter.apache.org - Empower
Could it be that the hypervisor is very slow? That would not show up in your VM stats. Is the hypervisor swapping, for instance? - Empower
Thanks Rik for your comments! JMeter will only allow me to generate traffic, right? But that won't really help me understand why some requests take so long, I guess? Yes, the hypervisor is also on my radar :-) - Katzenjammer

I am a few years late to the party but this seems to be a common problem, and this seems like a reasonable post that comes up on search, so I will post my experiences here.

I debugged Jenkins UI slowness using:

  • updating Jenkins to the latest version to address all the issues that I found in the issue tracker and forums, and that were marked as fixed in the newer versions. This did not fix it.

  • "top" on the Linux host to see what resources are used. It showed 100+% CPU load for "jenkins" almost all the time when using the GUI. Memory did not seem to be an issue.

  • memory use was at about 4GB, so I tried setting the Jenkins JVM minimum heap size to 16GB and the maximum to 22GB (the host had 24GB) to ensure memory and GC were not an issue. They were not: the UI was consistently slow and memory use stayed at the minimum setting.

  • "tcpdump" on the host to see what requests it is getting, and if Jenkins tried to poll some network resources that could be slow to respond. Some interesting findings but no real resolution.

  • using the YourKit Java profiler to profile the Jenkins JVM, while manually and continuously accessing the GUI to trigger the slowness. I collected dumps of CPU use, method invocations, thread scheduling, and memory use. I connected the profiler to the Jenkins JVM over SSH, which worked great.

  • downloading the Jenkins source code from GitHub to match the profiler results to the source and see what those places in the code were doing. I added the profiler plugin and the Jenkins source code to IntelliJ IDEA to debug them.

My findings:

Difficult to pinpoint the issue as there seems to be very little visibility.

Per thread, the profiler showed numerous threads with occasional high load but nothing consistent. The overall flame graph showed a large portion of time spent in Thread.sleep for some plugin. This seemed a bit strange, as sleeping should not consume CPU. It appears YourKit includes it to give an overall view of wall-clock time used. So, consider filtering Thread.sleep out to see the real issues.
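
The same filtering idea can be approximated without a commercial profiler by sampling thread dumps and counting frames. The sketch below is an illustration under stated assumptions, not what was done here: it assumes the dumps are plain text in the format printed by the JDK's `jstack <pid>` tool, and the names `parse_thread_dump` and `hot_frames` are hypothetical. Stacks whose innermost frame is `Thread.sleep` are skipped, so the counts reflect threads that are plausibly doing work rather than wall-clock waiting.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Sequence, Tuple

# Matches a frame line such as "\tat java.lang.Thread.sleep(Native Method)"
FRAME_RE = re.compile(r"^\s+at\s+([^\s(]+)\(")


def parse_thread_dump(dump: str) -> Dict[str, List[str]]:
    """Map thread name -> frames (innermost first) for one jstack-style dump."""
    stacks: Dict[str, List[str]] = {}
    current = None
    for line in dump.splitlines():
        if line.startswith('"'):
            # Thread header: the name is the text between the quotes.
            current = line[1:line.rindex('"')]
            stacks[current] = []
        elif current is not None:
            match = FRAME_RE.match(line)
            if match:
                stacks[current].append(match.group(1))
            elif not line.strip():
                current = None  # a blank line ends the thread's block
    return stacks


def hot_frames(dumps: Iterable[str],
               idle_markers: Sequence[str] = ("java.lang.Thread.sleep",),
               top: int = 10) -> List[Tuple[str, int]]:
    """Count in how many non-idle stack samples each frame appears."""
    counts: Counter = Counter()
    for dump in dumps:
        for frames in parse_thread_dump(dump).values():
            if not frames:
                continue  # thread without a Java stack
            if frames[0].endswith(tuple(idle_markers)):
                continue  # sleeping thread: wall-clock time, not CPU time
            counts.update(set(frames))  # count each frame once per sample
    return counts.most_common(top)
```

Feeding `hot_frames` a few dozen dumps taken while the UI is slow gives a rough ranking of frames that keep showing up in active threads. Other waiting frames can be added to `idle_markers` if they dominate the output; which ones matter depends on the JVM and is left as an assumption to verify.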

The flame graph showed a lot of logging statements interleaved with everything else, which seemed slightly suspicious. Looking at the Jenkins source code for the profiler-highlighted traces, I saw that Jenkins supports very detailed logging. So I looked a bit deeper into the Jenkins logs and how to configure them. I also considered enabling traces (logs) for the profiler-highlighted parts to see what was happening.

These logs are set up as log recorders under Manage Jenkins -> System Log -> Log Recorders. I found a previously added log recorder there. It was for a Jenkins Java package that was accessed all the time and had detailed tracing enabled; it was likely added to trace some issue but never removed. Remembering the profiler results, it looked suspicious given the amount of logs it generated, so I removed it.

After this, Jenkins performance improved to a good level: not quite instant, but close, and very much good enough. So in this case the issue was the excess logging configuration, which was taking the CPU to 100% and slowing the GUI to a crawl (running in the single main thread, I guess). The main Jenkins log is in /var/log/jenkins, and it did not show any of this logging, which was also a bit confusing as it was one of the first places I looked.

This finding is of course just one potential issue to check. But the above approach worked for me, and might be useful more generally...

Casing answered 26/10, 2019 at 7:50 Comment(1)
Good stuff. Thanks for taking the time to write it!Katzenjammer
  • Navigate to Manage Jenkins. Go to your Jenkins dashboard and click on "Manage Jenkins" on the left-hand side.
  • Access System Configuration. In the Manage Jenkins section, select "Configure System".
  • Update Jenkins Location URL. Find the field labeled "Jenkins Location" or "Jenkins URL". Update it to match the URL you use to access Jenkins, including the correct protocol (http:// or https://) and domain.
  • Save Changes. Scroll down and click on "Save" to apply the changes.
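
Whether the configured URL really differs from the one typed into the browser can be checked mechanically. This is a minimal sketch using only Python's standard `urllib.parse`; the helper name `url_mismatches` and the example hostnames are hypothetical, and both URLs are assumed to be copied by hand (one from the Jenkins URL field, one from the browser address bar).

```python
from typing import List
from urllib.parse import urlsplit

# Ports implied by the scheme when the URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def url_mismatches(configured: str, accessed: str) -> List[str]:
    """List the differences in scheme, host, and port between two URLs."""
    conf, used = urlsplit(configured), urlsplit(accessed)
    problems = []
    if conf.scheme.lower() != used.scheme.lower():
        problems.append(f"protocol: {conf.scheme} vs {used.scheme}")
    # urlsplit lowercases hostnames, so the comparison ignores case.
    if conf.hostname != used.hostname:
        problems.append(f"host: {conf.hostname} vs {used.hostname}")
    conf_port = conf.port or DEFAULT_PORTS.get(conf.scheme.lower())
    used_port = used.port or DEFAULT_PORTS.get(used.scheme.lower())
    if conf_port != used_port:
        problems.append(f"port: {conf_port} vs {used_port}")
    return problems
```

For instance, `url_mismatches("http://localhost:8080/", "https://ci.example.com/")` reports a protocol, host, and port difference, while an empty list means the two URLs agree on all three. The path is deliberately ignored, since the browser URL usually points deeper than the Jenkins root.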

This change resolved my Jenkins slowness. It is a useful and easy way to speed up the Jenkins UI response.

Orvilleorwell answered 2/5 at 18:53 Comment(0)
