How to "git log --follow <path>" in JGit? (To retrieve the full history including renames)
How do I extend the following LogCommand so that it behaves like git log --follow?

Git git = new Git(myRepository);
Iterable<RevCommit> log = git.log().addPath("com/mycompany/myclass.java").call();

This option is implemented in JGit, but I don't know how to use it. None of LogCommand's methods appear to expose it. Thank you!

Bosnia answered 13/7, 2012 at 13:50 Comment(3)
First Google result for "jgit follow renames": dev.eclipse.org/mhonarc/lists/jgit-dev/msg00426.html - Guaranty
It's not JGit, but I found another project called "JavaGit" that seems to offer git's whole high-level API, including a "Detect Renames" option for the LogCommand. However, unlike JGit, it requires an installed git client on a Linux or Windows OS. - Bosnia
Just did some further research. JavaGit hasn't been maintained since 2008 ;( - Bosnia
B
16

During some midnight work I got the following:

The oldest commit returned by the LogCommand is checked for renames against older commits until a rename operation is found. The log is then repeated for the old path, and this cycle continues until no further rename is found.

However, that search can take some time, especially on the final pass, when it iterates over all commits down to the root without finding another rename. So I am open to any improvement. I guess git normally uses indexes to perform the follow option in a shorter time.

import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.api.errors.GitAPIException;
import org.eclipse.jgit.diff.DiffEntry;
import org.eclipse.jgit.diff.RenameDetector;
import org.eclipse.jgit.errors.MissingObjectException;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.treewalk.TreeWalk;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Create a Log command that enables the follow option: git log --follow -- < path >
 * User: OneWorld
 * Example for usage: ArrayList<RevCommit> commits =  new  LogFollowCommand(repo,"src/com/mycompany/myfile.java").call();
 */
public class LogFollowCommand {

    private final Repository repository;
    private String path;
    private Git git;

    /**
     * Create a Log command that enables the follow option: git log --follow -- < path >
     * @param repository
     * @param path
     */
    public LogFollowCommand(Repository repository, String path){
        this.repository = repository;
        this.path = path;
    }

    /**
     * Returns the result of a git log --follow -- < path >
     * @return
     * @throws IOException
     * @throws MissingObjectException
     * @throws GitAPIException
     */
    public ArrayList<RevCommit> call() throws IOException, MissingObjectException, GitAPIException {
        ArrayList<RevCommit> commits = new ArrayList<RevCommit>();
        git = new Git(repository);
        RevCommit start = null;
        do {
            Iterable<RevCommit> log = git.log().addPath(path).call();
            for (RevCommit commit : log) {
                if (commits.contains(commit)) {
                    start = null;
                } else {
                    start = commit;
                    commits.add(commit);
                }
            }
            if (start == null) return commits;
        }
        while ((path = getRenamedPath( start)) != null);

        return commits;
    }

    /**
     * Checks for renames in history of a certain file. Returns null, if no rename was found.
     * Can take some seconds, especially if nothing is found... Here might be some tweaking necessary or the LogFollowCommand must be run in a thread.
     * @param start
     * @return String or null
     * @throws IOException
     * @throws MissingObjectException
     * @throws GitAPIException
     */
    private String getRenamedPath( RevCommit start) throws IOException, MissingObjectException, GitAPIException {
        // git.log().add(start) walks from start back through its ancestors
        Iterable<RevCommit> olderCommits = git.log().add(start).call();
        for (RevCommit commit : olderCommits) {
            // try-with-resources closes the TreeWalk (AutoCloseable as of JGit 4.0)
            try (TreeWalk tw = new TreeWalk(repository)) {
                tw.addTree(commit.getTree()); // old side of the diff
                tw.addTree(start.getTree());  // new side of the diff
                tw.setRecursive(true);
                RenameDetector rd = new RenameDetector(repository);
                rd.addAll(DiffEntry.scan(tw));
                List<DiffEntry> files = rd.compute();
                for (DiffEntry diffEntry : files) {
                    boolean renamedOrCopied = diffEntry.getChangeType() == DiffEntry.ChangeType.RENAME
                            || diffEntry.getChangeType() == DiffEntry.ChangeType.COPY;
                    // equals() instead of contains(): contains() also matches unrelated paths that merely include this one
                    if (renamedOrCopied && diffEntry.getNewPath().equals(path)) {
                        System.out.println("Found: " + diffEntry.toString() + " return " + diffEntry.getOldPath());
                        return diffEntry.getOldPath();
                    }
                }
            }
        }
        return null;
    }
}
Bosnia answered 16/7, 2012 at 12:19 Comment(5)
Setting a path filter on the tree walk saved some time: tw.setFilter(PathFilter.create("src/main/java/")); - Bosnia
Works great! But I think you should also add the start ObjectId (if != null) to the log command in call(). As it stands, if a file with the old name is added again AFTER the rename, it shows up in the log of the new file. - Lustrous
Thank you for providing the code. Because of your JavaDoc comments I knew immediately how to use it. Excellent! Such good code examples are rare nowadays. +1! :) - Fruitcake
Stackoverflow ain't gonna stop me from dropping a +1 comment, or maybe they will. Thank you OneWorld for this code, getting a git log --follow to work was absolute death until I stumbled upon this. +NickL if you're able to remember what you meant exactly, I'd love it if you elaborated a little. I'm getting exactly the issue you describe but I don't know how to catch it with a check. - Amygdala
Based on your example I wrote some Scala code that gets the first commit of a file. Thanks a lot! Maybe this will help someone: gist.github.com/wobu/ccfaccfc6c04c02b8d1227a0ac151c36 - So
R
0

I recall trying OneWorld's solution on a previous occasion, and while it worked, it was very slow. I thought I'd google around to see if there were any other possibilities out there.

Yes: in this Eclipse thread, someone suggested using org.eclipse.jgit.revwalk.FollowFilter and pointed to RevWalkFollowFilterTest.java for a usage example.

So I thought I'd give that a try, resulting in code that looks like this:

private static class DiffCollector extends RenameCallback {
    List<DiffEntry> diffs = new ArrayList<DiffEntry>();

    @Override
    public void renamed(DiffEntry diff) {
        diffs.add(diff);
    }
}

private DiffCollector diffCollector;

private void showFileHistory(String filepath)
{
    try
    {
        Config config = repo.getConfig();
        config.setBoolean("diff", null, "renames", true);

        RevWalk rw = new RevWalk(repo);
        diffCollector = new DiffCollector();

        org.eclipse.jgit.diff.DiffConfig dc = config.get(org.eclipse.jgit.diff.DiffConfig.KEY);
        FollowFilter followFilter =
                 FollowFilter.create(filepath, dc);
        followFilter.setRenameCallback(diffCollector);
        rw.setTreeFilter(followFilter);
        rw.markStart(rw.parseCommit(repo.resolve(Constants.HEAD)));

        for (RevCommit c : rw)
        {
            System.out.println(c.toString());
        }
    }
    catch(...

The results were, erm, ok I guess... The RevWalk did manage to walk through a simple rename of a file in the git-repo's history (performed by a "git mv {filename}" action).

However, it was unable to handle messier situations, such as when a colleague performed this set of actions in the repo's history:

  • 1st commit: Renamed a file with "git mv"
  • 2nd commit: Added a copy of that file in a new sub-folder location
  • 3rd commit: Deleted the old location's copy

In this scenario, JGit's follow capabilities will only get me from HEAD to that 2nd commit, and stop there.

The real "git log --follow" command, however, seems to have enough smarts to figure out that:

  • The file added in the 2nd commit is the same as that in the 1st commit (even though they are in different locations)
  • It will give you the entire history:
    • from HEAD-to-2nd-commit (added copy of newly-named file in new location)
    • skips any mention of the 3rd-commit (delete of old file in old path)
    • followed by the 1st-commit and its history (old location and name of file)

So JGit's follow capabilities seem a little weaker compared to real Git. Ah well.

But anyway, I can confirm that using JGit's FollowFilter technique did work a lot faster than the previously suggested technique.

Ringtail answered 21/4, 2020 at 22:46 Comment(0)
N
0

It seems that few people run into this problem. I have seen various solutions, but none of them is ideal.

In the end I used a Java subprocess to invoke the git command line, parsed its output to find the commit I was after, and then analyzed that commit. This approach avoids searching for renamed files yourself and relies entirely on git's own rename handling. I hope it helps people with the same need!

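import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;

// Run git directly in the repository directory; no shell (cmd /c ... cd) is needed.
ProcessBuilder pb = new ProcessBuilder("git", "log", "--follow", "--", "file_you_want");
pb.directory(new File("path/to/dir"));
pb.redirectErrorStream(true); // merge stderr into stdout so the process cannot block on it
Process p = pb.start();
StringBuilder text = new StringBuilder();
try (BufferedReader input = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
    String line;
    while ((line = input.readLine()) != null) {
        text.append(line).append('\n'); // keep the line breaks for later parsing
        System.out.println("Line:" + line);
    }
}
int exitCode = p.waitFor(); // non-zero means git reported an error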
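
The output still has to be parsed before it is useful. Here is a minimal sketch of how that could look, assuming git is asked for machine-friendly output with the standard --format and --name-only options. The class GitFollowLog, its methods, and the "commit:" marker are made up for illustration and are not part of the code above. It also assumes that no tracked path starts with the marker and that git does not quote the path (see core.quotePath).

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: maps each commit hash to the path the file had in that commit. */
class GitFollowLog {

    // Marker placed in front of every hash so commit lines can be told apart from path lines.
    // Assumption: no tracked path starts with this marker.
    private static final String MARKER = "commit:";

    /** Pure parsing step: commit hash -> path, in the order git printed the commits. */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> pathByCommit = new LinkedHashMap<>();
        String current = null;
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // git separates the hash line from the path with blank lines
            }
            if (line.startsWith(MARKER)) {
                current = line.substring(MARKER.length());
                pathByCommit.put(current, null); // path not seen yet
            } else if (current != null) {
                pathByCommit.put(current, line); // path reported for the current commit
            }
        }
        return pathByCommit;
    }

    /** Runs "git log --follow" in repoDir and parses its output. Requires git on the PATH. */
    static Map<String, String> followLog(File repoDir, String path)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
                "git", "log", "--follow", "--format=" + MARKER + "%H", "--name-only", "--", path);
        pb.directory(repoDir);
        // Send git's error messages to our own stderr so they are not parsed as paths.
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process p = pb.start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("git exited with code " + exit);
        }
        return parse(lines);
    }
}
```

Calling GitFollowLog.followLog(new File("path/to/dir"), "file_you_want") would then give you the hashes in log order, each with the name the file had at that commit, so a rename shows up as a change in the map's values. Keeping parse separate from the process call makes it easy to check against canned output without a repository.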
Nonobjective answered 3/4, 2023 at 6:49 Comment(0)
