In Jenkins, how do I set SCM behavior for the master node rather than the build nodes?

Asked 3/11, 2021 at 7:48 Answered 9/11, 2021 at 21:8

jenkins jenkins-pipeline jenkins-plugins gerrit

I'm aware I'm lacking basic Jenkins concepts, but with my current knowledge it's hard for me to research this successfully. Maybe you can give me some hints I can use to re-word my question if needed.

I currently have a setup with several build nodes in which the Jenkins master machine is running out of disk space, because Jenkins clones git repositories on both the master and the build nodes (and the master only has limited space). This question explains why.

Note: the master node itself does not build anything - it just clones the repo to a local workspace folder (I guess it just needs the Jenkinsfiles).

Going through the job configurations and googling this issue I find options regarding shallow and sparse clones or cleaning up the workspace before or after the build using the Cleanup Plugin. But those settings and plugins only care about the checkout done with checkout(SCM) on the build nodes, not the master.

But in case I want to leave the situation as is on the build nodes but keep the workspace folders on the Jenkins master machine slim, how do I approach this? What do I have to search for?

And as a side question - isn't it possible to have something like "git exports"? I.e. having the .git folders removed after checking out the commit I need?

In case it depends on the kind of job I use, I'm using scripted pipeline jobs.

Yearbook answered 3/11, 2021 at 7:48 Comment(0)

I've got a similar setup: A master node, multiple build nodes.

I simply set the number of executors to 0 on the master node (from Manage Jenkins -> Manage Nodes), so every job lands on a build node.

The only repo cloned on the master is the shared library.

Gossipmonger answered 5/11, 2021 at 10:22 Comment(3)

That's actually the easiest and quickest way out, I believe. – Lapillus 7/11, 2021 at 21:43

This way you avoid building on the master but not cloning the repository. What do you mean by "shared library"? – Yearbook 9/11, 2021 at 7:0

The shared library is a repo with common functions and snippets used by the pipelines. It's optional. With this setup, the master doesn't clone the git repos: it fetches only the Jenkinsfile with a "Lightweight checkout". The full repo content is not there – Gossipmonger 9/11, 2021 at 8:10

Running Jenkins builds on the master node is discouraged for two main reasons:

First, many concurrent builds can degrade the responsiveness of Jenkins itself, for example by delaying certain operations.
Second, it is a well-known security problem, as pointed out by the documentation:

Any builds running on the built-in node have the same level of access to the controller file system as the Jenkins process.

It is therefore highly advisable to not run any builds on the built-in node, instead using agents (statically configured or provided by clouds) to run builds.

The same wiki page gives details on these security problems, such as what an attacker can do, and describes an alternative that lets you keep building on the master node while mitigating some of the listed problems. That solution is based on a plugin called Job Restrictions Plugin.

That said, the most common choice is to let the build nodes do the builds:

To prevent builds from running on the built-in node directly, navigate to Manage Jenkins » Manage Nodes and Clouds. Select master in the list, then select Configure in the menu. Set the number of executors to 0 and save. Make sure to also set up clouds or build agents to run builds on, otherwise builds won’t be able to start.
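If the controller is configured by script rather than through the UI, the same setting can be sketched in the Script Console. This is only a minimal sketch, assuming administrator rights and a Jenkins version recent enough to provide Jenkins.get(); it is not part of the quoted documentation.

```groovy
// Run from Manage Jenkins » Script Console (administrator rights assumed).
import jenkins.model.Jenkins

def controller = Jenkins.get()

// With zero executors, no build can be scheduled on the built-in (master) node.
controller.setNumExecutors(0)

// Write the configuration to disk; harmless if the setter already saved it.
controller.save()

println "Executors on the built-in node: ${controller.numExecutors}"
```

As noted in the quoted text, builds will stay queued unless at least one agent or cloud is available to run them.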

If you really have strong reasons to build on the master node, you can always apply a different git clone strategy based on the value of the env.NODE_NAME environment variable. It is set to master if the pipeline job is run on the master node, otherwise it is filled with the node name (of course). Nonetheless, I have never seen anyone customizing the git clone command based on the node used, so... Don't do it 😉
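For illustration only, branching on env.NODE_NAME in a scripted pipeline could look roughly like the sketch below. The repository URL and branch name are placeholders, and the shallow CloneOption is just one possible "different strategy"; newer Jenkins versions report the controller as built-in instead of master, so the sketch checks both names.

```groovy
// Scripted pipeline sketch; URL and branch are hypothetical placeholders.
node {
    // 'master' on older controllers, 'built-in' on newer ones (assumption: no agent uses these names).
    def onController = ['master', 'built-in'].contains(env.NODE_NAME)

    // Shallow clone only when running on the controller; default clone behavior elsewhere.
    def cloneExtensions = onController
        ? [[$class: 'CloneOption', shallow: true, depth: 1, noTags: true]]
        : []

    checkout([
        $class: 'GitSCM',
        branches: [[name: '*/main']],
        userRemoteConfigs: [[url: 'https://example.com/your/repo.git']],
        extensions: cloneExtensions
    ])
}
```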

About sparse checkout and shallow clone:

Sparse checkout creates an incomplete working directory: instead of materializing all the trees and blobs of the current commit, it checks out only the paths you specify. Does that save much space? Or rather, is your project tree so heavy that you need something like this? Sparse checkout is generally used when you want a clean working tree without unnecessary files.
A shallow clone can be useful to reduce download time, especially when the history is huge. The most common option is --depth=1, which instructs git to retrieve only the most recent commit. As far as I know, Jenkins already applies some optimizations to speed up the clone, but it generally keeps the entire history. Again, I am not sure you would gain much space.
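In a scripted pipeline, both techniques are usually requested through extensions of the Git plugin's checkout step. The following is a sketch under assumptions: the URL, the branch and the ci/ path are hypothetical, and it only affects the checkout performed by this step on the node that runs it.

```groovy
// Scripted pipeline sketch; URL, branch and sparse path are hypothetical placeholders.
node {
    checkout([
        $class: 'GitSCM',
        branches: [[name: '*/main']],
        userRemoteConfigs: [[url: 'https://example.com/your/repo.git']],
        extensions: [
            // Shallow clone: fetch only the most recent commit and skip tags.
            [$class: 'CloneOption', shallow: true, depth: 1, noTags: true],
            // Sparse checkout: populate the working tree with the listed paths only.
            [$class: 'SparseCheckoutPaths', sparseCheckoutPaths: [[path: 'ci/']]]
        ]
    ])
}
```

The shallow option reduces the size of the .git directory, while the sparse option reduces the size of the working tree, so which one helps more depends on whether the history or the checked-out files dominate.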

A valid alternative (at least for me) to space optimizations on the git files is to build in Docker containers. Jenkins integrates well with Docker, and using it has many advantages, including the disposal of the workspace once the job has finished.
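A minimal sketch of such a build, assuming the Docker Pipeline plugin is installed and Docker is available on the agent; the image name and the shell command are placeholders chosen for illustration.

```groovy
// Scripted pipeline sketch; requires the Docker Pipeline plugin (assumption).
node {
    // 'scm' is available when the Jenkinsfile itself is loaded from source control.
    checkout scm

    // Run the build steps inside a throwaway container; the image is a placeholder.
    docker.image('alpine:3').inside {
        sh 'ls -la'
    }
}
```

The container is removed when the closure exits. As far as I understand, inside mounts the agent's workspace into the container, so files written there remain on the agent; the workspace disappears automatically only when the agent itself is an ephemeral container.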

Tautologism answered 5/11, 2021 at 15:7 Comment(4)

The master node doesn't build anything in my case - it just clones the repo in order to have the Jenkinsfiles available to know what and where to build – Yearbook 8/11, 2021 at 7:7

I have seen the edit, what happens if you manually clean the git repo in the master? Is it completely cloned again on the next build? – Tautologism 8/11, 2021 at 10:24

yes, it gets completely cloned, and this question: stackoverflow.com/questions/39452030 explains the background - I'd just like to have the checkout be removed afterwards – Yearbook 9/11, 2021 at 6:49

I was mistakenly convinced that the master node did not contain the entire workspace, or at least that some kind of optimization was applied, but it seems that is not the case. At this point, I would expect the Workspace Cleanup plugin to work on the master too. Have you tried it? – Tautologism 9/11, 2021 at 10:23

I haven't used the pipeline feature myself so far -- but conceptually it is clear that the master requires initial access to the Jenkinsfile. It will therefore be difficult to avoid this step entirely.

If Jenkins itself does not provide an option to fine-tune the clone/checkout behavior on the master side, then I'd see these options:

Create a custom version of Jenkins (or of the corresponding plugin) which hard-codes the behavior that you need (like, shallow/sparse clone). Modifying and building both Jenkins and its plugins is surprisingly simple; often, the most difficult part is to locate the code that you need to touch.
Tune the master's clone in-place. Shallowness and sparse-checkout properties can be set for existing clones. If you set these properties after the initial clone (possibly in the Jenkinsfile itself or in a post-build step), then Jenkins may possibly maintain those properties.
Constantly re-cloning and deleting the repo on master side increases the load both on the Jenkins master and on your Git server, so better be careful with that (especially since your repository has a size where disk space matters already). If you really want to go that way, you could try to force-remove the clone on the master in a post-build step -- this should be relatively easy to implement. You need to check that this hack will not interfere with Jenkins' access to the Jenkinsfile.

Pulling answered 9/11, 2021 at 21:8 Comment(0)
