In GitPython, I can iterate over the diff information for each change in the tree by calling the diff() method between two commit objects. If I call diff() with the create_patch=True keyword argument, a patch string is created for every change (additions, deletions, renames), which I can access through the resulting diff object and dissect for the changes.
However, the first commit has no parent to compare against.

    import git
    from git.compat import defenc

    repo = git.Repo("path_to_my_repo")
    commits = list(repo.iter_commits('master'))
    commits.reverse()  # oldest commit first

    for i in commits:
        if not i.parents:
            # First commit, don't know what to do
            continue
        else:
            # Has a parent; note that i.diff(other) describes the change from i to other
            diff = i.diff(i.parents[0], create_patch=True)

        for k in diff:
            try:
                # Get the patch message
                msg = k.diff.decode(defenc)
                print(msg)
            except UnicodeDecodeError:
                continue

One option is to call

    diff = repo.git.diff_tree(i.hexsha, '--', root=True)

But this runs git diff-tree on the whole tree with the given arguments and returns a single string, so I cannot get the information for each file separately.
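
To illustrate what working with that single string would involve, here is a minimal sketch that splits patch text into one chunk per file. It assumes the string is in git's patch format, where every file section starts with a line beginning with "diff --git "; the helper name split_patch_by_file and the sample text are made up for the example, and the result is plain strings rather than GitPython diff objects.

```python
def split_patch_by_file(patch_text):
    """Split a multi-file patch string into one string per file.

    Assumption: each file section starts with a line that begins
    with 'diff --git ', as in git's patch output.
    """
    chunks = []
    for line in patch_text.splitlines(keepends=True):
        if line.startswith("diff --git "):
            chunks.append([])          # start a new per-file chunk
        if chunks:                     # skip anything before the first header
            chunks[-1].append(line)
    return ["".join(chunk) for chunk in chunks]


# Hypothetical two-file patch, used only to exercise the helper.
sample = (
    "diff --git a/a.txt b/a.txt\n"
    "new file mode 100644\n"
    "--- /dev/null\n"
    "+++ b/a.txt\n"
    "@@ -0,0 +1 @@\n"
    "+hello\n"
    "diff --git a/b.txt b/b.txt\n"
    "new file mode 100644\n"
    "--- /dev/null\n"
    "+++ b/b.txt\n"
    "@@ -0,0 +1 @@\n"
    "+world\n"
)

for chunk in split_patch_by_file(sample):
    print(chunk.splitlines()[0])  # header line of each file's section
```

This recovers the per-file text, but not the metadata (paths, rename information, modes) that the diff objects expose directly.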
Maybe there is a way to create a root object of some sort. How can I get the changes introduced by the first commit in a repository?
EDIT
A dirty workaround seems to be comparing to the empty tree by directly using its hash:

    EMPTY_TREE_SHA = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

    ....

    if not i.parents:
        diff = i.diff(EMPTY_TREE_SHA, create_patch=True, **diffArgs)
    else:
        diff = i.diff(i.parents[0], create_patch=True, **diffArgs)

But this hardly seems like a real solution. Other answers are still welcome.
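
For context, the constant in the workaround is not arbitrary: git names an object by the SHA-1 of a header of the form "<type> <size>\0" followed by the object's content, and a tree with no entries has empty content. The sketch below recomputes the value with the standard library only; the helper name git_object_sha1 is made up for the example, and it assumes a repository using git's default SHA-1 object format.

```python
import hashlib


def git_object_sha1(obj_type, content):
    """Return the SHA-1 id git gives an object with this type and raw content.

    Assumption: the repository uses the SHA-1 object format.
    """
    header = f"{obj_type} {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()


# A tree with no entries has zero bytes of content.
EMPTY_TREE_SHA = git_object_sha1("tree", b"")
print(EMPTY_TREE_SHA)  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

Deriving the value this way at least documents where the hard-coded string comes from.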