GitPython get tree and blob object by sha

I'm using GitPython with a bare repository and I'm trying to get a specific Git object by its SHA. If I were using git directly, I would just do this:

git ls-tree sha_of_tree
git show sha_of_blob

Since I'm using GitPython and I want to get a specific tree, I do the following:

repo = Repo("path_to_my_repo")
repo.tree("b466a6098a0287ac568ef0ad783ae2c35d86362b")

And get this back

<git.Tree "b466a6098a0287ac568ef0ad783ae2c35d86362b">

Now I have a tree object, but I cannot access its attributes like path, name, blobs, etc.

repo.tree("b466a6098a0287ac568ef0ad783ae2c35d86362b").path
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Python27\lib\site-packages\gitdb\util.py", line 238, in __getattr__
    self._set_cache_(attr)
  File "c:\Python27\lib\site-packages\git\objects\tree.py", line 147, in _set_cache_
    super(Tree, self)._set_cache_(attr)
  File "c:\Python27\lib\site-packages\git\objects\base.py", line 157, in _set_cache_
    raise AttributeError( "path and mode attributes must have been set during %s object creation" % type(self).__name__ )
AttributeError: path and mode attributes must have been set during Tree object creation

But if I type the following, it works:

repo.tree().trees[0].path

The other part of my question is how to get a blob object with GitPython. I noticed that only the tree object has a blobs attribute, so in order to get a blob by SHA, I have to (a) first know which tree it belongs to, (b) find the blob in that tree, and then (c) read its data_stream. I could just do

repo.git.execute("git show blob_sha")

but I would first like to know whether this is the only way to do it.

Soph answered 23/5, 2012 at 10:11 Comment(0)

Try this:

def read_file_from_branch(self, repo, branch, path, charset='ascii'):
    """
    Return the contents of a file in a branch, without checking out the
    branch.
    """
    if branch in repo.heads:
        # The "/" operator looks up an entry in the tree by its path.
        blob = (repo.heads[branch].commit.tree / path)
        if blob:
            data = blob.data_stream.read()
            if charset:
                return data.decode(charset)
            return data
    return None
Value answered 17/4, 2015 at 20:50 Comment(1)
Please add some more explanation of your code. Code-only answers are not very helpful. Thanks. - Robertoroberts

In general, a tree has children which are blobs and more trees. The blobs are files that are direct children of that tree, and the other trees are directories that are direct children of that tree.

Accessing the files directly below that tree:

repo.tree().blobs # returns a list of blobs

Accessing the directories directly below that tree:

repo.tree().trees # returns a list of trees

How about looking at the blobs in the subdirectories:

for t in repo.tree().trees:
    print(t.blobs)

Let's get the path of the first blob from earlier:

repo.tree().blobs[0].path # gives the relative path
repo.tree().blobs[0].abspath # gives the absolute path

Hopefully this gives you a better idea of how to navigate this data structure and how to access the attributes of these objects.
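
To tie the pieces above together, here is a minimal sketch of walking the whole structure recursively. The helper name `iter_blob_paths` is made up for illustration and is not part of GitPython; it only assumes that a tree exposes `.blobs` and `.trees` and that each blob has a `.path`, as in the snippets above.

```python
def iter_blob_paths(tree):
    """Yield the path of every blob under `tree`, depth first.

    `tree` is assumed to behave like a GitPython Tree: it exposes `.blobs`
    and `.trees` lists, and each blob has a `.path` attribute.
    """
    for blob in tree.blobs:       # files directly inside this tree
        yield blob.path
    for subtree in tree.trees:    # subdirectories: recurse into each one
        yield from iter_blob_paths(subtree)


# Hypothetical usage against a real repository (requires GitPython):
#   from git import Repo
#   repo = Repo("path_to_my_repo")
#   for path in iter_blob_paths(repo.tree()):
#       print(path)
```

GitPython trees also have a `traverse()` method that may make a hand-written recursion unnecessary; the sketch is only meant to show how blobs and trees nest.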

Lyns answered 12/10, 2012 at 20:0 Comment(0)

I was looking for this because I had the same issue, and I found a solution:

>>> import binascii
>>> import git
>>> id_to_find = repo.head.commit.tree['README'].hexsha  # For example
>>> id_to_find
'aee35f14ee5515ee98d546a170be60690576db4b'
>>> git.objects.blob.Blob(repo, binascii.a2b_hex(id_to_find))
<git.Blob "aee35f14ee5515ee98d546a170be60690576db4b">
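
The `a2b_hex` call is there because, as far as I can tell, the `Blob` constructor expects the 20-byte binary SHA rather than the 40-character hex string. A small sketch of just that conversion, using only the standard library (the variable names are illustrative):

```python
import binascii

hexsha = "aee35f14ee5515ee98d546a170be60690576db4b"  # SHA from the session above

binsha = binascii.a2b_hex(hexsha)  # 40 hex characters -> 20 raw bytes
assert len(binsha) == 20

# Converting back recovers the original hex string.
roundtrip = binascii.b2a_hex(binsha).decode("ascii")
assert roundtrip == hexsha
```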

I feel like there should be a way to reference Blob through the repo, but I couldn't find it.
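
One candidate for going through the repo is `repo.rev_parse`, which, to my understanding, resolves a full hex SHA to an object of the matching type (blob, tree, commit or tag). The sketch below is built on that assumption; `read_blob_by_sha` is a hypothetical helper, not a GitPython function.

```python
def read_blob_by_sha(repo, hexsha):
    """Return the raw bytes of the blob identified by `hexsha`.

    `repo` is assumed to be a GitPython Repo whose `rev_parse` method
    resolves a SHA to an object with `.type` and `.data_stream`.
    """
    obj = repo.rev_parse(hexsha)    # resolve the SHA to a Git object
    if obj.type != "blob":          # could also be "tree", "commit" or "tag"
        raise ValueError("%s is a %s, not a blob" % (hexsha, obj.type))
    return obj.data_stream.read()   # file contents as bytes


# Hypothetical usage (requires GitPython and a real repository):
#   from git import Repo
#   repo = Repo("path_to_my_repo")
#   data = read_blob_by_sha(repo, "aee35f14ee5515ee98d546a170be60690576db4b")
```

An object looked up by SHA alone still has no `path` or `mode`, which matches the AttributeError in the question: the same blob or tree can appear under many paths, so the path is only known when the object is reached through its parent tree.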

Complicated answered 15/2, 2014 at 22:41 Comment(0)
