Concourse: how to pass job's output to a different job

It's not clear to me from the documentation whether it's even possible to pass one job's output to another job (not from task to task, but from job to job).

I don't know whether I'm approaching this correctly, and maybe it should be modeled differently in Concourse, but what I'm trying to achieve is a pipeline for a Java project split into several granular jobs that can run in parallel and be triggered independently if I need to re-run one of them.

How I see the pipeline:

  1. First job:
    • pulls the code from the GitHub repo
    • builds the project with Maven
    • deploys artifacts to the Maven repository (mvn deploy)
    • updates SNAPSHOT versions of the Maven project submodules
    • copies artifacts (jar files) to the output directory (output of the task)
  2. Second job:
    • picks up the jars from the output
    • builds Docker containers for all of them (in parallel)
  3. Pipeline goes on

I was unable to pass the output from job 1 to job 2. I am also curious whether any changes I make to the original git repo resource will be present in the next job (from job 1 to job 2).

So the questions are:

  1. What is the proper way to pass build state from job to job (I know jobs might get scheduled on different nodes, and definitely in different containers)?
  2. Is it necessary to store the state in a resource (say, S3/git)?
  3. Is Concourse stateless by design (in this context)?
  4. Where's the best place to get more info? I've tried the manual, but it's just not that detailed.

What I've found so far:

  1. Task outputs are not passed from job to job
  2. Any changes put to the resource (pushed to the GitHub repo) are fetched in the next job, but changes made only in the working copy are not

Minimal example (if the commented-out lines are uncommented, it fails with the error: missing inputs: gist-upd, gist-out):

---
resources:
  - name: gist
    type: git
    source:
      uri: "[email protected]:snippets/foo/bar.git"
      branch: master
      private_key: {{private_git_key}}

jobs:
  - name: update
    plan:
      - get: gist
        trigger: true

      - task: update-gist
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: concourse/bosh-cli}

          inputs:
            - name: gist

          outputs:
            - name: gist-upd
            - name: gist-out

          run:
            path: sh
            args:
              - -exc
              - |
                git config --global user.email "[email protected]"
                git config --global user.name "Concourse"
                git clone gist gist-upd
                cd gist-upd
                echo `date` > test
                git commit -am "upd"
                cd ../gist
                echo "foo" > test
                cd ../gist-out
                echo "out" > test

      - put: gist
        params: {repository: gist-upd}

  - name: fetch-updated
    plan:
      - get: gist
        passed: [update]
        trigger: true

      - task: check-gist
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: alpine}

          inputs:
            - name: gist
            #- name: gist-upd
            #- name: gist-out

          run:
            path: sh
            args:
              - -exc
              - |
                ls -l gist
                cat gist/test
                #ls -l gist-upd
                #cat gist-upd/test
                #ls -l gist-out
                #cat gist-out/test
Dungeon answered 6/3, 2017 at 20:27

To answer your questions one by one:

  1. All build state must be passed from job to job as a resource, and that resource must live in some external store.
  2. Yes, the state has to live in an external store. Each resource type handles the upload and download itself, so for your specific case I would check out this Maven custom resource type, which seems to do what you want.
  3. Yes, this statelessness is the defining trait of Concourse. The only stateful element in Concourse is a resource, which must be strictly versioned and stored in an external data store. Combining containerized tasks with externally stored resources is what gives Concourse its reproducibility. Every version of a resource is backed by a data store, so even if the data center your CI runs on goes down completely, each of your CI builds can still be reproduced exactly.
  4. To get more info, I would recommend working through a tutorial and building a pipeline yourself. Stark & Wayne have a tutorial that could be useful. There is also a resources tutorial, which might help you specifically with understanding resources.
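
As a rough illustration of points 1 and 2, here is a minimal sketch of two jobs that hand a jar over through an external store. It assumes the s3 resource type; the repository URL, bucket name, jar naming scheme, credential variables, and all resource and job names are hypothetical and not taken from the question. The `((...))` variable syntax is the one used by current Concourse versions.

```yaml
---
resources:
  - name: source-code
    type: git
    source:
      uri: https://github.com/example/project.git   # hypothetical repository
      branch: master

  - name: app-jar
    type: s3
    source:
      bucket: example-ci-artifacts                  # hypothetical bucket
      regexp: jars/app-(.*).jar                     # the capture group is the version
      access_key_id: ((aws_access_key_id))
      secret_access_key: ((aws_secret_access_key))

jobs:
  - name: build
    plan:
      - get: source-code
        trigger: true

      - task: package
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: maven}

          inputs:
            - name: source-code

          outputs:
            - name: built

          run:
            path: sh
            args:
              - -exc
              - |
                cd source-code
                mvn -B package
                # Assumes the build produces target/app-<version>.jar
                cp target/app-*.jar ../built/

      # Upload the task output so that other jobs can fetch it
      - put: app-jar
        params: {file: built/app-*.jar}

  - name: build-image
    plan:
      # Only versions that went through the build job are eligible
      - get: app-jar
        passed: [build]
        trigger: true

      - task: inspect
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: alpine}

          inputs:
            - name: app-jar

          run:
            path: sh
            args:
              - -exc
              - |
                ls -l app-jar
```

The second job never sees the `built` directory from the first one. It only sees what the `get: app-jar` step downloads into a directory named `app-jar`.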

As for your specific error: you see missing inputs because Concourse looks for directories named after each of those inputs, and in a new job those directories are only created by resource gets. So you would need to get resources named gist-upd and gist-out before starting the task.
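
Applied to the pipeline in the question, the downstream job could look roughly like the fragment below. This is only a sketch: it assumes a resource named gist-out has been declared under `resources:` (for example backed by S3) and that the update job ends with a matching `put: gist-out`, neither of which exists in the original pipeline. A separate gist-upd input should not be needed, since the commit made in gist-upd already reaches this job through `put: gist`.

```yaml
# Fragment: assumes a resource named gist-out is declared under resources:
# and that the update job finishes with a `put: gist-out` step.
jobs:
  - name: fetch-updated
    plan:
      - get: gist
        passed: [update]
        trigger: true

      - get: gist-out
        passed: [update]

      - task: check-gist
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: alpine}

          inputs:
            - name: gist      # satisfied by `get: gist`
            - name: gist-out  # satisfied by `get: gist-out`

          run:
            path: sh
            args:
              - -exc
              - |
                cat gist/test
                ls -l gist-out
```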

Hyperboloid answered 6/3, 2017 at 21:55 Comment(2)
I'd note that while job-to-job state requires declared resources, Concourse does not require external resources to pass data between steps of the same job. Directories produced by get steps and task outputs are automatically made available to later steps. A task can declare an output that is local to the job and is not declared as a resource in the pipeline, and a later task in the same job can declare an input with the same name to receive it.
Integrator
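
A minimal sketch of that within-job case, assuming nothing beyond two tasks in one job (the job and task names are made up for illustration). The first task declares an output, and the second task receives the same directory by declaring an input with the same name. No resource is involved.

```yaml
---
jobs:
  - name: produce-and-consume
    plan:
      - task: produce
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: alpine}

          outputs:
            - name: gist-out

          run:
            path: sh
            args:
              - -exc
              - |
                echo "out" > gist-out/test

      - task: consume
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: alpine}

          # Matched by name against the output of the previous task
          inputs:
            - name: gist-out

          run:
            path: sh
            args:
              - -exc
              - |
                cat gist-out/test
```

This only works inside a single job. Once the build finishes, the gist-out directory is not available to any other job unless it is put to a resource.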
