Looping over multiple files with GitHub Actions

I have multiple .md files in several subdirectories of my repository. They all follow the same naming convention, e.g. seminar1/slides.md, seminar2/slides.md, etc. These *.md files need to be processed using pandoc. I would like to do this automatically every time I commit to the repository, and I decided to implement this as an action that runs on GitHub.

I have created the following workflow as a .yml file, which GitHub recognises as an action. It works if I have a single subdirectory, e.g., /seminar1/*.md, but fails with more subdirectories.

name: Make slides

on: push

jobs:
  convert_via_pandoc:
    runs-on: ubuntu-18.04
    steps:
      - uses: actions/checkout@v2
        with:
          ref: slides
      - run: |
          echo "::set-env name=FILELIST::$(printf '"%s"' seminar*/*.md)"
      - uses: docker://pandoc/latex:2.9
        with:
          args: -t beamer --output=${{env.FILELIST}}.pdf ${{env.FILELIST}}
      - uses: actions/upload-artifact@v2
        with:
          name: seminar-slides
          path: seminar*/*.md.pdf        

How can I make a script detect all of the seminar*/*.md files and act on them?

Also, I need some help with general usability:

  1. All of the scripts run from the root directory. This means I have to modify the content of the .md file to include the directory, e.g. seminar1/bridge.jpg rather than just including bridge.jpg. How can I change the working directory for each $env.FILELIST?
  2. How can I strip the extension from the filename and use this in $env.FILELIST?
Detoxicate answered 27/7, 2020 at 12:27 Comment(0)

GitHub Actions supports 'matrix' for iterating in jobs, but it's hard to use and I could not get it to work with a list from a string. The only working solution I found was splitting the string myself and just using bash.

Here is my solution. It does not use docker://pandoc/latex:2.9; it is only meant to make the concepts clear.

  • Iterating in GitHub Actions

    You have a string of comma-separated values, like "a,b,c". You will need to parse it into a real array in every step and iterate over the values.

    IFS stands for "internal field separator". The shell uses it to decide how to do word splitting, i.e. how to recognise word boundaries. We set it to a comma, feed the read command our array-as-string, and afterwards just iterate over the real array.

    name: Looping over values in GitHub Actions

    on: push

    env:
      VALUE_ARRAY_AS_STRING: 'a.md,b.md,c.md'

    jobs:
      run-my-stuff:
        name: Iterating over comma-separated values
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v2

          - name: Echo values from ENV
            run: |
              # Split the comma-separated string into a bash array.
              IFS="," read -r -a myarray <<< "${{ env.VALUE_ARRAY_AS_STRING }}"
              for i in "${myarray[@]}"; do
                echo "Value: ${i}"
                echo "Value: ${i%.*}"   # same value without the extension
              done

          - name: Finding files and store to output
            id: finding-files
            run: |
              # Join the paths with commas so the output stays on one line.
              echo "::set-output name=FILELIST::$(find . -name '*.md' -print | paste -sd, -)"

          - name: Processing my found files from output
            run: |
              IFS="," read -r -a myarray <<< "${{ steps.finding-files.outputs.FILELIST }}"
              for i in "${myarray[@]}"; do
                file_path=$(dirname "${i}")
                file_name=$(basename "${i}")
                # Subshell, so the cd does not leak into the next iteration.
                (cd "${file_path}" && cat "${file_name}")
              done
    
  • Finding all *.md files

    find . -name '*.md' -print
    
  • Stripping file extensions

    From https://mcmap.net/q/73575/-how-do-i-remove-the-file-suffix-and-path-portion-from-a-path-string-in-bash or How to extract directory path from file path?

    x="filename.md"
    echo "${x%.*}"   # prints: filename
    
  • Changing the working directory

    You can do this per step with the working-directory key (it applies to run steps only, not to uses steps), or with cd if you just run a command. I didn't look into it, but I would guess docker://pandoc/latex:2.9 has an argument for the working directory too. You will need to check the documentation.
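
The three pieces above (finding the files, stripping the extension, changing the working directory) could be combined in a single shell function called from a run step. This is only a sketch under a few assumptions: the function name convert_all and the CONVERTER variable are made up for illustration, the pandoc flags are the ones from the question, and pandoc is assumed to be installed on the runner instead of being used through docker://pandoc/latex:2.9.

```shell
# convert_all ROOT: convert every ROOT/seminar*/*.md to a PDF next to it.
# CONVERTER is a hypothetical override for this sketch; it defaults to pandoc.
convert_all() (
  root="${1:-.}"
  converter="${CONVERTER:-pandoc}"
  for md in "$root"/seminar*/*.md; do
    [ -e "$md" ] || continue            # the glob matched nothing
    dir=$(dirname "$md")
    name=$(basename "$md" .md)          # drop the directory and the .md suffix
    # Subshell: the cd ends with the parentheses, so the next iteration
    # starts from the original directory again.
    ( cd "$dir" && "$converter" -t beamer --output="$name.pdf" "$name.md" ) || exit 1
  done
)
```

Because the converter runs inside each seminar directory, an image referenced as bridge.jpg resolves relative to the .md file, and the output is named slides.pdf rather than slides.md.pdf. In a workflow, the upload path would then be seminar*/*.pdf.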

Wengert answered 13/1, 2021 at 8:45 Comment(3)
Thank you! No need to apologise for contributing! - Detoxicate
The scripting in this is all goofed up. You can't cd into every iteration of a for loop without at least changing back to the original directory (pushd/popd) or using a subshell so your script CWD doesn't move around. But even if you fix that this doesn't fix using a GH Action on something in a different path than it expects. There is a way, but this isn't it. - Conductive
@Conductive would you be able to provide an answer? - Detoxicate

As of October 2022, set-output is deprecated and was supposed to be removed on May 31st 2023.

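While the removal was delayed, it made no sense to me to keep using it, so here is what worked for me. The first step writes a multi-line value to $GITHUB_ENV, which makes FILELIST available as an environment variable in the following steps.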

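- name: Finding files
  id: finding-files
  run: |
    # Multi-line values use the NAME<<DELIMITER ... DELIMITER syntax.
    {
      echo 'FILELIST<<EOF'
      find . -name '*.md' -print
      echo EOF
    } >> "$GITHUB_ENV"
- name: Do something with each file
  run: |
    # Read line by line so that paths containing spaces stay intact.
    while IFS= read -r i; do
      echo "Do something with file ${i}"
    done <<< "$FILELIST"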
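
To check the format of that first step on a local machine, the write can be pointed at an ordinary file instead of $GITHUB_ENV. The following is a rough sketch, not part of the original workflow: the function name write_file_list and the delimiter are invented for illustration, and sort is only there to make the order predictable.

```shell
# write_file_list ENV_FILE [ROOT]: append a multi-line FILELIST entry to
# ENV_FILE in the NAME<<DELIMITER format. On a runner, ENV_FILE would be
# "$GITHUB_ENV". The function name and delimiter are hypothetical.
write_file_list() (
  env_file="$1"
  root="${2:-.}"
  delimiter="FILELIST_EOF_$$"          # unlikely to collide with a file name
  {
    echo "FILELIST<<$delimiter"
    find "$root" -name '*.md' -print | sort
    echo "$delimiter"
  } >> "$env_file"
)
```

Inside a workflow step this would be called as write_file_list "$GITHUB_ENV", and the later step would read $FILELIST exactly as before.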
Attached answered 6/12, 2023 at 23:3 Comment(0)

My simple solution was to write the processing I wanted as a Python script and have the workflow run that script.

This is much simpler to code and has the advantage of being testable on my local machine.

Detoxicate answered 5/5, 2023 at 6:31 Comment(0)
