GitPython list all files affected by a certain commit

4

10

I am using this for loop to loop through all commits:

repo = Repo("C:/Users/shiro/Desktop/lucene-solr/")
for commit in list(repo.iter_commits()):
    print commit.files_list  # how to do that ?

How can I get a list of the files affected by a specific commit?

Dunsinane answered 26/9, 2016 at 16:21 Comment(0)
23

Try this:

for commit in repo.iter_commits():
    print(commit.stats.files)  # dict keyed by file path, with per-file line counts
Shastashastra answered 12/11, 2016 at 21:20 Comment(2)
Keep in mind that this method is very slow. It runs git diff --numstat, which calculates a lot more than just the file list. - Octad
FWIW: that does strange things with renames/moves. - Notate
1
from git import Repo
repo = Repo('/home/worldmind/test.git/')
prev = repo.commit('30c55d43d143189698bebb759143ed72e766aaa9')
curr = repo.commit('5f5eb0a3446628ef0872170bd989f4e2fa760277')
diff_index = prev.diff(curr)
for diff in diff_index:
    print(diff.change_type)
    print(f"{diff.a_path} -> {diff.b_path}")
Hadsall answered 6/12, 2019 at 8:53 Comment(3)
Things get more complicated when curr has more than one parent. Dealing with that will likely involve something like for prev in curr.parents: but exactly what to do from there is a function of what you are trying to do. - Notate
There is also the question of what to do when curr is an initial commit and doesn't have a parent. - Notate
After some more tinkering, it looks like commit.diff(None) is useful (e.g. with initial commits?) in some cases. But I haven't yet played around with it enough to know the details. - Notate
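The parent questions raised in the comments above could be handled along these lines. This is only a sketch of one possible policy, not the answerer's method: changed_paths is a hypothetical helper name, merge commits are treated as the union of the changes against every parent, and a root commit is compared against GitPython's NULL_TREE sentinel. Only paths are collected, since how change types are reported for a root commit is not something this sketch verifies.

```python
def changed_paths(commit):
    """Return the sorted paths that differ between `commit` and its parents.

    Assumed policy: for a merge, take the union over all parents;
    for a root commit, compare against the empty tree.
    """
    if commit.parents:
        # parent.diff(commit) yields one Diff object per changed file
        diffs = [d for parent in commit.parents for d in parent.diff(commit)]
    else:
        # Imported lazily so the helper can be tried with stand-in objects
        from git import NULL_TREE
        diffs = list(commit.diff(NULL_TREE))

    paths = set()
    for d in diffs:
        # a_path and b_path differ for renames; either may be None
        paths.update(p for p in (d.a_path, d.b_path) if p)
    return sorted(paths)
```

With a real repository this would be called as changed_paths(repo.commit('5f5eb0a3446628ef0872170bd989f4e2fa760277')), or inside a loop over repo.iter_commits().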
0

commit.stats.files works, but it's very slow. It will take several seconds to process a large commit.

This is much faster:

repo = Repo("C:/Users/shiro/Desktop/lucene-solr/")
for commit in repo.iter_commits():
    print(repo.git.show(commit.hexsha, name_only=True).split('\n'))
Octad answered 7/4, 2023 at 10:29 Comment(1)
That works but also generates some "human readable" lines that will need to be filtered. - Notate
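One way to avoid the extra lines mentioned in the comment above is to ask git for an empty header format, so that only the paths remain. The following is a minimal sketch under that assumption: paths_from_name_only and files_in_commit are hypothetical helper names, and pretty="format:" relies on GitPython turning keyword arguments into command-line options (here --pretty=format:). For merge commits, git show may report fewer files than a diff against a single parent would.

```python
def paths_from_name_only(output):
    """Return the non-blank lines of `git show --name-only` output."""
    return [line for line in output.splitlines() if line.strip()]


def files_in_commit(repo, commit):
    """List the paths git reports for `commit`, without header lines."""
    # Runs roughly: git show --name-only --pretty=format: <sha>
    output = repo.git.show(commit.hexsha, name_only=True, pretty="format:")
    return paths_from_name_only(output)
```

Usage would look like files_in_commit(repo, commit) inside the loop over repo.iter_commits().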
-5

I solved this problem for SCM Workbench. The important file is:

https://github.com/barry-scott/scm-workbench/blob/master/Source/Git/wb_git_project.py

Look at cmdCommitLogForFile() and its helper __addCommitChangeInformation().

The trick is to diff the tree objects.

Bettis answered 30/9, 2016 at 13:22 Comment(2)
StackOverflow highly discourages people from putting answers in linked pages. Linked pages often get moved or removed, then the answer is gone. - Ribeiro
Can you please update your answer? I'm searching for the same thing and don't want to search this big file when you have already found the solution there. I want to get all files that are in the current commit, not all commits since the start of the repo. - Bellerophon
