Java library to return a List<File> for glob or Ant-like pattern "*foo/**/*.txt"?

I'm looking for a library with a method that returns the list of files matching a given Ant-like pattern.

For *foo/**/*.txt I'd get

foo/x.txt
foo/bar/baz/.txt
myfoo/baz/boo/bar.txt

etc. I know it's achievable with Commons IO's DirectoryWalker and

PathMatcher mat = FileSystems.getDefault().getPathMatcher("glob:" + filesPattern);

, but I'd rather use a maintained library. I expected Commons IO to have this, but it doesn't.
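For reference, the JDK-only route mentioned above could look roughly like the sketch below. It is not a maintained library; the class name GlobFiles and the method list are made up for illustration. It walks the tree with Files.walk (Java 8) and matches each file's path relative to the base directory. One caveat: in the JDK glob syntax, `**/` needs at least one directory in between, so `*foo/**/*.txt` does not match foo/x.txt the way the Ant pattern does; `*foo/**.txt` comes closer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of any library.
final class GlobFiles {

    /** Returns the regular files under baseDir whose path relative to baseDir matches the glob. */
    static List<Path> list(Path baseDir, String glob) throws IOException {
        PathMatcher matcher = baseDir.getFileSystem().getPathMatcher("glob:" + glob);
        // Files.walk holds directory handles open, so close the stream when done.
        try (Stream<Path> paths = Files.walk(baseDir)) {
            return paths
                .filter(Files::isRegularFile)
                // The glob is relative, so compare against the relative path.
                .filter(p -> matcher.matches(baseDir.relativize(p)))
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

Calling GlobFiles.list(Paths.get("."), "*foo/**.txt") would return paths that start with the base directory; convert with Path.toFile() if a List<File> is needed.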

Update: I'm happy with reusing Ant's code, but would prefer something smaller than whole Ant.

Keppel answered 4/6, 2013 at 17:23 Comment(2)
Is File.list(FilenameFilter) not helpful either? - Plautus
That's not recursive. - Vanzandt

So I sacrificed a few MB of the app's size for the sake of speed and used Ant's DirectoryScanner in the end.

Also, there's Spring's PathMatchingResourcePatternResolver.

//files = new PatternDirWalker( filesPattern ).list( baseDir );
files = new DirScanner( filesPattern ).list( baseDir );


import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.tools.ant.DirectoryScanner;

public class DirScanner {

    private final String pattern;

    public DirScanner( String pattern ) {
        this.pattern = pattern;
    }

    public List<File> list( File dirToScan ) throws IOException {

        DirectoryScanner ds = new DirectoryScanner();
        String[] includes = { this.pattern };
        //String[] excludes = {"modules\\*\\**"};
        ds.setIncludes(includes);
        //ds.setExcludes(excludes);
        ds.setBasedir( dirToScan );
        //ds.setCaseSensitive(true);
        ds.scan();

        // getIncludedFiles() returns paths relative to the base dir,
        // so resolve each one against dirToScan.
        String[] matches = ds.getIncludedFiles();
        List<File> files = new ArrayList<File>(matches.length);
        for (String match : matches) {
            files.add( new File(dirToScan, match) );
        }
        return files;
    }

}// class

And here's the implementation I started to write myself. It is unfinished; I'm posting it in case someone would like to complete it. The idea was to keep a stack of pattern segments, traverse the directory tree, and compare each entry against the segment at the current stack depth, or against the rest of the pattern in the case of **.

But I resorted to PathMatcher and then to Ant's impl.

public class PatternDirWalker {
    //private static final Logger log = LoggerFactory.getLogger( PatternDirWalker.class );

    private String pattern;
    private List<Segment> segments;
    private PathMatcher mat;

    public PatternDirWalker( String pattern ) {
        this.pattern = pattern;
        this.segments = parseSegments(pattern);
        this.mat = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
    }

    public List<File> list( final File dirToScan ) throws IOException{

        return new DirectoryWalker<File>() {
            List<File> files = new LinkedList<File>();

            @Override protected void handleFile( File file, int depth, Collection<File> results ) throws IOException {
                // The glob is relative, so match against the path relative to the scanned dir.
                if( PatternDirWalker.this.mat.matches( dirToScan.toPath().relativize( file.toPath() ) ) )
                    results.add( file );
            }

            public List<File> findMatchingFiles( File dirToWalk ) throws IOException {
                this.walk( dirToWalk, this.files );
                return this.files;
            }
        }.findMatchingFiles( dirToScan );

    }// list()

    private List<Segment> parseSegments( String pattern ) {
        String[] parts = StringUtils.split(pattern, "/");  // Commons Lang: split(str, separatorChars)
        List<Segment> segs = new ArrayList<Segment>(parts.length);
        for( String part : parts ) {
            Segment seg = new Segment(part);
            segs.add( seg );
        }
        return segs;
    }

    class Segment {
        public final String pat;  // TODO: Tokenize
        private Segment( String pat ) {
            this.pat = pat;
        }
    }

}// class
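For anyone who wants to finish the segment idea, here is a minimal sketch of how the matching part might work, assuming Ant-like semantics where a ** segment stands for zero or more directories. It is not Ant's implementation; the class SegmentMatcher and its methods are hypothetical names, and it only handles * and ? inside a segment.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of segment-by-segment matching; not Ant's code.
final class SegmentMatcher {

    /** Matches a '/'-separated relative path against a '/'-separated Ant-like pattern. */
    static boolean matches(String pattern, String relativePath) {
        return match(pattern.split("/"), 0, relativePath.split("/"), 0);
    }

    private static boolean match(String[] pat, int p, String[] path, int s) {
        if (p == pat.length) {
            return s == path.length;          // pattern used up: path must be used up too
        }
        if (pat[p].equals("**")) {
            // "**" may swallow zero or more path segments; try every split point.
            for (int next = s; next <= path.length; next++) {
                if (match(pat, p + 1, path, next)) {
                    return true;
                }
            }
            return false;
        }
        return s < path.length
            && segmentMatches(pat[p], path[s])
            && match(pat, p + 1, path, s + 1);
    }

    /** Matches one name against one segment: '*' is any run of characters, '?' is one character. */
    private static boolean segmentMatches(String segment, String name) {
        StringBuilder regex = new StringBuilder();
        for (char c : segment.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return name.matches(regex.toString());
    }
}
```

With these semantics, SegmentMatcher.matches("*foo/**/*.txt", "foo/x.txt") is true, because ** is allowed to match no directories at all. A directory walker would call it with each file's path relative to the base directory.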
Keppel answered 5/6, 2013 at 3:6 Comment(0)

Since Java 7 the JDK has a recursive directory scan (Files.walkFileTree). Java 8 improves on it a bit syntactically.

    Path start = FileSystems.getDefault().getPath(",,,");
    walk(start, "**.java");

One needs a glob-matching class, ideally one that also works at the directory level, so that directories which cannot contain a match are skipped entirely.

class Glob {
    public boolean matchesFile(Path path) {
        return ...;
    }

    public boolean matchesParentDir(Path path) {
        return ...;
    }
}

Then the walking would be:

public static void walk(Path start, String searchGlob) throws IOException {
    final Glob glob = new Glob(searchGlob);
    Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
        @Override
        public FileVisitResult visitFile(Path file,
                BasicFileAttributes attrs) throws IOException {
            if (glob.matchesFile(file)) {
                ...; // Process file
            }
            return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult preVisitDirectory(Path dir,
                BasicFileAttributes attrs) throws IOException {
            return glob.matchesParentDir(dir)
                ? FileVisitResult.CONTINUE : FileVisitResult.SKIP_SUBTREE;
        }
    });
}

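One possible way to fill in the elided parts of Glob is sketched below. The assumptions are mine, not the answer's: the constructor also takes the start directory (so paths can be made relative), file matching is delegated to the JDK's glob PathMatcher, and directory pruning only looks at the leading pattern segments before the first **. Patterns with {} groups that span a / or with escaped characters are not handled.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical Glob: JDK glob for files, per-segment checks for pruning directories.
final class Glob {
    private final Path base;
    private final PathMatcher fileMatcher;
    private final List<PathMatcher> leadingDirs = new ArrayList<>();
    private final boolean unbounded;   // true if the pattern contains "**"
    private final int maxDirDepth;     // deepest directory level that can hold a match without "**"

    Glob(Path base, String pattern) {
        FileSystem fs = base.getFileSystem();
        this.base = base;
        this.fileMatcher = fs.getPathMatcher("glob:" + pattern);
        String[] segments = pattern.split("/");
        this.maxDirDepth = segments.length - 1;
        boolean sawDoubleStar = false;
        for (int i = 0; i < segments.length; i++) {
            if (segments[i].contains("**")) {
                sawDoubleStar = true;   // from here on any depth is possible
                break;
            }
            if (i < segments.length - 1) {
                // Every segment except the last constrains exactly one directory level.
                leadingDirs.add(fs.getPathMatcher("glob:" + segments[i]));
            }
        }
        this.unbounded = sawDoubleStar;
    }

    boolean matchesFile(Path file) {
        return fileMatcher.matches(base.relativize(file));
    }

    /** True if files below dir could still match, so the walk should descend into it. */
    boolean matchesParentDir(Path dir) {
        if (dir.equals(base)) {
            return true;
        }
        Path relative = base.relativize(dir);
        int depth = relative.getNameCount();
        if (!unbounded && depth > maxDirDepth) {
            return false;
        }
        int checked = Math.min(depth, leadingDirs.size());
        for (int i = 0; i < checked; i++) {
            if (!leadingDirs.get(i).matches(relative.getName(i))) {
                return false;
            }
        }
        return true;
    }

    /** Same walk as above, collecting the matches instead of processing them in place. */
    static List<Path> find(final Path start, String pattern) throws IOException {
        final Glob glob = new Glob(start, pattern);
        final List<Path> found = new ArrayList<>();
        Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (glob.matchesFile(file)) {
                    found.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                return glob.matchesParentDir(dir)
                    ? FileVisitResult.CONTINUE : FileVisitResult.SKIP_SUBTREE;
            }
        });
        return found;
    }
}
```

With a pattern such as src/**.java, a sibling directory like docs is skipped without being read, which is the point of checking at the directory level.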

Demicanton answered 23/3, 2015 at 9:16 Comment(2)
Could you please elaborate on the Glob? I'm not quite sure how to implement the missing parts. - Peacemaker
My answer is not optimal, as it does not exploit the constant parts of the glob pattern, like /src/main/. The Glob could be implemented by handling each leading subdirectory segment in turn: for *foo/**/*.txt, first search the current directory for *foo. - Demicanton

Google Guava has a TreeTraverser for files that lets you do depth-first and breadth-first enumeration of files in a directory. You could then filter the results based on a regex of the filename, or anything else you need to do.

Here's an example (requires Guava):

import java.io.File;
import java.util.List;
import java.util.regex.Pattern;
import com.google.common.base.Function;
import com.google.common.base.Predicates;
import com.google.common.io.Files;
import com.google.common.collect.Iterables;
import com.google.common.collect.TreeTraverser;

public class FileTraversalExample {

  private static final String PATH = "/path/to/your/maven/repo";
  private static final Pattern SEARCH_PATTERN = Pattern.compile(".*\\.jar");

  public static void main(String[] args) {
    File directory = new File(PATH);
    TreeTraverser<File> traverser = Files.fileTreeTraverser();
    Iterable<String> allFiles = Iterables.transform(
        traverser.breadthFirstTraversal(directory),
        new FileNameProducingPredicate());
    Iterable<String> matches = Iterables.filter(
      allFiles,
      Predicates.contains(SEARCH_PATTERN));
    System.out.println(matches);
  }

  private static class FileNameProducingPredicate implements Function<File, String> {
    public String apply(File input) {
      return input.getAbsolutePath();
    }
  }

}

Guava will let you filter by any Predicate, using Iterables.filter, so you don't have to use a Pattern if you don't want to.
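For comparison, the same breadth-first, filter-by-predicate idea can be sketched with the JDK alone. This is not Guava's implementation; BreadthFirstFiles and find are hypothetical names, and the predicate is a plain java.util.function.Predicate (Java 8).

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical JDK-only helper; unrelated to Guava's TreeTraverser internals.
final class BreadthFirstFiles {

    /** Visits root and everything below it level by level, keeping entries the filter accepts. */
    static List<File> find(File root, Predicate<File> filter) {
        List<File> matches = new ArrayList<>();
        Deque<File> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            File current = queue.remove();        // FIFO order gives breadth-first traversal
            if (filter.test(current)) {
                matches.add(current);
            }
            File[] children = current.listFiles(); // null for non-directories or on I/O error
            if (children != null) {
                for (File child : children) {
                    queue.add(child);
                }
            }
        }
        return matches;
    }
}
```

A call such as BreadthFirstFiles.find(new File(PATH), f -> f.isFile() && f.getName().endsWith(".jar")) plays the role of the Pattern filter above; any other predicate works the same way.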

Bacteriostat answered 23/3, 2015 at 8:2 Comment(0)
