Recursive wildcards in GNU make?

It's been a while since I've used make, so bear with me...

I've got a directory, flac, containing .FLAC files. I've got a corresponding directory, mp3, containing MP3 files. If a FLAC file is newer than the corresponding MP3 file (or the corresponding MP3 file doesn't exist), then I want to run a bunch of commands to convert the FLAC file to an MP3 file, and copy the tags across.

The kicker: I need to search the flac directory recursively, and create corresponding subdirectories in the mp3 directory. The directories and files can have spaces in the names, and are named in UTF-8.

And I want to use make to drive this.

Toadeater answered 20/3, 2010 at 13:28 Comment(9)
Any reason for selecting make for this purpose? I'd have thought writing a bash script would be simpler. - Collotype
(...or I could write it in Ruby or Python). I'd like to have a play with make beyond the basics, and this is a 'project' I have open right now. - Toadeater
@Neil, make's concept of pattern-based file system transformation is the best way to approach the original problem. Perhaps implementations of this approach have their limitations, but make is closer to implementing it than bare bash. - Orthochromatic
@Pavel Only if it works! - Collotype
@Pavel Well, a sh script that walks through the list of flac files (find | while read flacname), makes an mp3name from that, runs "mkdir -p" on the dirname of "$mp3name", and then, if [ "$flacname" -nt "$mp3name" ], converts "$flacname" into "$mp3name" is not really magic. The only feature you actually lose compared to a make-based solution is the ability to run N file conversion processes in parallel with make -jN. - Crosshatch
Admittedly, make's declarative approach is nicer than any imperative language could ever offer. - Crosshatch
@Crosshatch That's the first time I have ever heard make's syntax be described as "nice" :-) - Collotype
Using make and having spaces in file names are contradictory requirements. Use a tool appropriate for the problem domain. - Jaan
Related: #3775068 - Nickinickie
C
141

I would try something along these lines:

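# Collect every .flac file under flac/ by shelling out to find
FLAC_FILES = $(shell find flac/ -type f -name '*.flac')
# Derive the matching mp3/ path for each flac/ path
MP3_FILES = $(patsubst flac/%.flac, mp3/%.mp3, $(FLAC_FILES))

.PHONY: all
all: $(MP3_FILES)

# Rebuild an MP3 whenever its FLAC source is newer or the MP3 is missing
mp3/%.mp3: flac/%.flac
	@mkdir -p "$(@D)"
	@echo convert "$<" to "$@"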

A couple of quick notes for make beginners:

  • The @ in front of the commands prevents make from printing the command before actually running it.
  • $(@D) is the directory part of the target file name ($@).
  • Make sure that the lines with shell commands in them start with a tab, not with spaces.

Although this should handle UTF-8 characters in names, it will fail on spaces in file or directory names: make uses spaces to separate words in makefiles, and I am not aware of a way to work around that. So that leaves you with just a shell script, I am afraid :-/
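
For names with spaces, the same path mapping and "is the target out of date?" check can be done outside make. The following Python sketch is only an illustration under assumptions: the helper names mp3_path_for, is_stale and stale_pairs are made up for this example, and the real conversion command is left as a print, just like the echo in the makefile.

```python
import os


def mp3_path_for(flac_path, src_root="flac", dst_root="mp3"):
    """Map <src_root>/<sub>/<name>.flac to <dst_root>/<sub>/<name>.mp3."""
    rel = os.path.relpath(flac_path, src_root)
    stem, _ext = os.path.splitext(rel)
    return os.path.join(dst_root, stem + ".mp3")


def is_stale(source, target):
    """True if target is missing or older than source (make's rebuild test)."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)


def stale_pairs(src_root="flac", dst_root="mp3"):
    """Yield (flac, mp3) pairs that need converting; spaces are harmless."""
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            if name.endswith(".flac"):
                flac = os.path.join(dirpath, name)
                mp3 = mp3_path_for(flac, src_root, dst_root)
                if is_stale(flac, mp3):
                    yield flac, mp3


if __name__ == "__main__":
    for flac, mp3 in stale_pairs():
        # Equivalent of: mkdir -p "$(@D)"
        os.makedirs(os.path.dirname(mp3), exist_ok=True)
        # Placeholder for the real conversion command
        print("convert", flac, "to", mp3)
```

Unlike make, this does not run conversions in parallel or stop on the first error; it only shows that the dependency check itself is a few lines once file names are handled as whole strings rather than whitespace-separated words.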

Crosshatch answered 20/3, 2010 at 13:35 Comment(18)
This is where I was going...fast fingers for the win. Though it looks like you may be able to do something clever with vpath. Must study that one of these days. - Gander
Doesn't appear to work when the directories have spaces in the names. - Toadeater
Didn't realize that I'd have to shell out to find to get the names recursively... - Toadeater
Oh. Spaces. Well, make will not work with spaces. Make syntax uses spaces for its own purposes. - Crosshatch
@Roger: No it doesn't. There is a Groucho Marx skit involving a doctor... But I suppose that the file naming is not in your control. - Gander
What, even in the filenames? Ick. - Toadeater
Fair enough; I'll investigate other options. Marking this as the answer anyway. - Toadeater
See the follow-up question about rake, instead: https://mcmap.net/q/188994/-recursive-wildcards-in-rake - Toadeater
Does make not use lazy evaluation on dependency lists? I put something like this into my Makefile and it still runs the find command even if I tell make to build a target that doesn't need it (and yes, I am using '=' instead of ':='). - Orrin
Perfect. I needed a variant on it that turns one subtree's coffee into another subtree's javascript: gist.github.com/johan/5490763 - Pali
Is there a way to parallelize and/or asynchronize the conversion recipe? It's rather slow when I run it in sequence, but when I send all the file names to the convert command directly it's many times faster. - Citystate
@PaulKonova: Run make -jN. For N use the number of conversions which make should run in parallel. Caution: Running make -j without an N will start all conversion processes at once in parallel, which might be equivalent to a fork bomb. - Crosshatch
Yeah, that works. Also, in this particular case, is there a way to remove old directories and files from the MP3 directory which are no longer in the FLAC directory? To have the MP3 directory maintain parity with the FLAC directory? - Citystate
@PaulKonova Not with make. - Crosshatch
Use FLAC_FILES = $(shell find flac -type f -name '*.flac'), removing the / after the flac folder, if you don't want double slashes such as flac//file.flac. - Anaheim
What is the line .PHONY: all for? - Taxiway
@Adrian: The .PHONY: all line tells make that the recipe for the all target is to be executed even if there is a file called all newer than all the $(MP3_FILES). - Crosshatch
Thank you, been trying to do exactly this for an hour already. Glad I stumbled upon this. - Jem
T
87

You can define your own recursive wildcard function like this:

rwildcard=$(foreach d,$(wildcard $(1:=/*)),$(call rwildcard,$d,$2) $(filter $(subst *,%,$2),$d))

The first parameter ($1) is a list of directories, and the second ($2) is a list of patterns you want to match.

Examples:

To find all the C files in the current directory and its subdirectories:

$(call rwildcard,.,*.c)

To find all the .c and .h files in src and its subdirectories:

$(call rwildcard,src,*.c *.h)

This function is based on the implementation from this article, with a few improvements.
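
For readers who find the one-liner hard to parse, here is a rough Python analogue of its structure. It is a sketch under assumptions, not a definition of make's behaviour: make_filter is a hypothetical helper that models $(filter ...) on the understanding that only the first % in a pattern acts as a wildcard and that the pattern must match the whole word (the whole path here).

```python
import glob
import os


def make_filter(patterns, words):
    """Rough model of $(filter ...): only the first % is a wildcard."""
    kept = []
    for word in words:
        for pattern in patterns:
            if "%" in pattern:
                prefix, suffix = pattern.split("%", 1)
                matched = (len(word) >= len(prefix) + len(suffix)
                           and word.startswith(prefix)
                           and word.endswith(suffix))
            else:
                matched = word == pattern
            if matched:
                kept.append(word)
                break
    return kept


def rwildcard(dirs, patterns):
    """Python analogue of the recursive make function above."""
    # $(subst *,%,$2): turn shell-style * into make-style %
    make_patterns = [p.replace("*", "%") for p in patterns]
    found = []
    for directory in dirs:
        # $(wildcard $(1:=/*)): every entry directly inside the directory
        for entry in glob.glob(os.path.join(glob.escape(directory), "*")):
            # Recurse first (plain files simply yield nothing) ...
            found.extend(rwildcard([entry], patterns))
            # ... then keep the entry itself if it matches a pattern
            found.extend(make_filter(make_patterns, [entry]))
    return found
```

If that reading of filter is right, a pattern with two wildcards such as *.subst.* becomes %.subst.%, where the second % is taken literally, so it would not match ordinary file names.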

Tinderbox answered 15/8, 2013 at 17:37 Comment(9)
This doesn't seem to work for me. I've copied the exact function and it still won't look recursively. - Roach
I am using GNU Make 3.81, and it seems to work for me. It won't work if any of the filenames have spaces in them, though. Note that the filenames it returns have paths relative to the current directory, even if you are only listing files in a subdirectory. - Tinderbox
This is truly an example that make is a Turing tar pit (see here: yosefk.com/blog/fun-at-the-turing-tar-pit.html). It is not even that hard, but one has to read this: gnu.org/software/make/manual/html_node/Call-Function.html and then "understand recurrence". YOU had to write this recursively, in the verbatim sense; it's not the everyday understanding of "automatically include stuff from subdirs". It's actual RECURRENCE. But remember - "To understand recurrence, you have to understand recurrence". - Hesta
@TomaszGandor You don't have to understand recurrence. You have to understand recursion, and in order to do that you must first understand recursion. - Taejon
My bad, I fell for a linguistic false friend. Comments can't be edited after such a long time, but I hope everybody got the point. And they also understand recursion. - Hesta
This is more portable than calling a shell function. Also, I was able to use it to get a list of directories: $(sort $(dir $(call rwildcard,src,*))) (where src is my top-level dir). - Spanish
Despite the tar pit, this is the right answer because it is portable and does not depend on shell commands. - Idolist
Interestingly, this command seems to fail for search strings which have an _ in them. For instance, $(call rwildcard,src,hw*.h) works, but $(call rwildcard,src,hw_*.h) does not. This is not the case for a normal wildcard call, e.g. $(wildcard src/hw_*.h). - Idolist
It's a very cool function. I wish it worked with two wildcards, like *.subst.*, but it doesn't seem to. - Jinajingle
W
6

If you're using Bash 4.x, you can use a new globbing option, for example:

SHELL:=/bin/bash -O globstar
list:
	@echo Flac: $(shell ls flac/**/*.flac)
	@echo MP3: $(shell ls mp3/**/*.mp3)

This kind of recursive wildcard can find all the files of your interest (.flac, .mp3 or whatever).
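
For comparison, Python's standard glob module supports the same ** pattern when recursive=True is passed. This is a minimal sketch; the helper name find_recursive is made up for the example, and the flac and mp3 directory names are taken from the question.

```python
import glob
import os


def find_recursive(root, extension):
    """Return every path under root whose name ends in extension."""
    pattern = os.path.join(glob.escape(root), "**", "*" + extension)
    # recursive=True makes ** match zero or more directory levels
    return sorted(glob.glob(pattern, recursive=True))


if __name__ == "__main__":
    print("Flac:", find_recursive("flac", ".flac"))
    print("MP3:", find_recursive("mp3", ".mp3"))
```

Because the result is a list of strings rather than a whitespace-separated word list, names containing spaces come through intact.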

Winn answered 10/5, 2015 at 18:1 Comment(9)
To me, even just $(wildcard flac/**/*.flac) seems to work. OS X, GNU Make 3.81. - Indian
I tried $(wildcard ./**/*.py) and it behaved the same as $(wildcard ./*/*.py). I don't think make actually supports **, and it just doesn't fail when you use two *s next to each other. - Carlie
@Carlie It should work when you are invoking commands via the Bash shell and have enabled the globstar option. Maybe you're not using GNU make, or something else is going on. You may also try this syntax instead. Check the comments for some suggestions. Otherwise it's a matter for a new question. - Winn
@Winn No no, I didn't even try your thing because I wanted to avoid shell invocation for this particular thing. I was using akauppi's suggested thing. The thing I went with looked like larskholte's answer, though I got it from somewhere else because the comments here said this one was subtly broken. shrug :) - Carlie
@Carlie In this case ** won't work, because the extended globbing is a bash/zsh thing. - Winn
Just want to chime in that $(wildcard flac/**/*.flac) does not work for me on OS X Catalina with GNU Make 4.2.1. Which is a bummer :/ - Dunne
@SơnTrần-Nguyễn Make sure to enable it with shopt -s globstar and try it in the shell first. Check the option with shopt | grep globstar. - Winn
@Winn Catalina switched the shell from bash to zsh. I did test ls css/**/*.css and it worked. - Dunne
Bash switched to GNU license with version 4.0, so Apple is stuck on 3.2. You have to upgrade it yourself or switch to a different shell (switch to bash with a newer version? upgrading feels dangerous). - Secondguess
T
2

FWIW, I've used something like this in a Makefile:

RECURSIVE_MANIFEST = `find . -type f -print`

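The example above will search from the current directory ('.') for all "plain files" ('-type f'). Note that make stores the backquoted command as literal text in RECURSIVE_MANIFEST; the shell runs find only when the variable is expanded inside a recipe, so the file list is available to the recipe's commands rather than to make functions. If make itself needs the list, use $(shell find . -type f -print) instead. To narrow what is returned, supply more arguments to find. See the man page for find.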

Thorson answered 3/12, 2011 at 20:9 Comment(0)
A
2

My solution is based on the one above, but uses sed instead of patsubst to mangle the output of find AND escape the spaces.

Going from flac/ to ogg/

OGGS = $(shell find flac -type f -name "*.flac" | sed 's/ /\\ /g;s/flac\//ogg\//;s/\.flac/\.ogg/' )

Caveats:

  1. Still barfs if there are semi-colons in the filename, but they're pretty rare.
  2. The $(@D) trick won't work (outputs gibberish), but oggenc creates directories for you!
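
To make the effect of the sed expression concrete, here is a small Python sketch of the same transformation. The function name ogg_target is invented for this illustration, and unlike the sed version it only rewrites the leading flac/ prefix and the trailing .flac suffix.

```python
def ogg_target(flac_path, src_root="flac", dst_root="ogg"):
    """Map flac/<name>.flac to a make-escaped ogg/<name>.ogg target name."""
    prefix = src_root + "/"
    if not flac_path.startswith(prefix) or not flac_path.endswith(".flac"):
        raise ValueError("not a .flac file under " + src_root + ": " + flac_path)
    stem = flac_path[len(prefix):-len(".flac")]
    # Put a backslash before each space so make keeps the name as one word
    return (dst_root + "/" + stem + ".ogg").replace(" ", "\\ ")


if __name__ == "__main__":
    print(ogg_target("flac/My Album/01 Song.flac"))
```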
Almonte answered 15/1, 2014 at 18:52 Comment(0)
V
2

Here's a Python script I quickly hacked together to solve the original problem: keep a compressed copy of a music library. The script will convert .m4a files (assumed to be ALAC) to AAC format, unless the AAC file already exists and is newer than the ALAC file. MP3 files in the library will be linked, since they are already compressed.

Just beware that aborting the script (Ctrl-C) will leave behind a half-converted file.

I originally also wanted to write a Makefile to handle this, but since it cannot handle spaces in filenames (see the accepted answer) and because writing a bash script is guaranteed to put me in a world of pain, Python it is. It's fairly straightforward and short, and thus should be easy to tweak to your needs.

from __future__ import print_function


import os
import subprocess


UNCOMPRESSED_DIR = 'Music'
COMPRESSED = 'compressed_'

UNCOMPRESSED_EXTS = ('m4a', )   # files to convert to lossy format
LINK_EXTS = ('mp3', )           # files to link instead of convert


for root, dirs, files in os.walk(UNCOMPRESSED_DIR):
    out_root = COMPRESSED + root
    if not os.path.exists(out_root):
        os.mkdir(out_root)
    for file in files:
        file_path = os.path.join(root, file)
        file_root, ext = os.path.splitext(file_path)
        if ext[1:] in LINK_EXTS:
            if not os.path.exists(COMPRESSED + file_path):
                print('Linking {}'.format(file_path))
                link_source = os.path.relpath(file_path, out_root)
                os.symlink(link_source, COMPRESSED + file_path)
            continue
        if ext[1:] not in UNCOMPRESSED_EXTS:
            print('Skipping {}'.format(file_path))
            continue
        out_file_path = COMPRESSED + file_path
        # Compare modification times; ctime also changes on chmod or rename
        if (os.path.exists(out_file_path)
            and os.path.getmtime(out_file_path) > os.path.getmtime(file_path)):
            print('Up to date: {}'.format(file_path))
            continue
        print('Converting {}'.format(file_path))
        subprocess.call(['ffmpeg', '-y', '-i', file_path,
                         '-c:a', 'libfdk_aac', '-vbr', '4',
                         out_file_path])

Of course, this can be enhanced to perform the encoding in parallel. That is left as an exercise to the reader ;-)
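
One possible way to approach that exercise is sketched below, using a thread pool from the standard library. It is only an outline under assumptions: run_all is a made-up helper, and it expects the caller to collect the ffmpeg argument lists first instead of calling subprocess.call inside the loop.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, max_workers=4):
    """Run each command (an argv list) and return exit codes in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Threads suffice here: each one just waits for its child process
        return list(pool.map(subprocess.call, commands))


if __name__ == "__main__":
    # Stand-in commands; the real script would queue its ffmpeg argv lists
    demo = [[sys.executable, "-c", "print('job %d')" % i] for i in range(3)]
    print(run_all(demo, max_workers=2))
```

The max_workers value plays the same role as N in make -jN: it caps how many conversions run at once.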

Virginia answered 25/12, 2016 at 19:38 Comment(0)
S
0

To find files recursively without resorting to external dependencies like find, you can use functions. Then use the result as in the other answer to convert the files.

rwildcard=$(wildcard $1) $(foreach d,$1,$(call rwildcard,$(addsuffix /$(notdir $d),$(wildcard $(dir $d)*))))

FLAC_FILES = $(call rwildcard,flac/*.flac)
MP3_FILES = $(patsubst flac/%.flac, mp3/%.mp3, $(FLAC_FILES))

.PHONY: all
all: $(MP3_FILES)

mp3/%.mp3: flac/%.flac
	@mkdir -p "$(@D)"
	@echo convert "$<" to "$@"
Samoyed answered 25/8, 2023 at 11:42 Comment(1)
See github.com/markpiffer/gmtt#call-wildcard-reclist-of-globs for a beefed-up version of recursive wildcards. - Nippur
