Git repository unique id
M

6

8

I need to find out if a commit belongs to a particular git repository.

The idea is to generate a unique id for every repository I need to test. Then I can compare this id to an id calculated from the commit being tested.

For example, take the SHA of the initial changeset. Can it uniquely identify the repository?

Motherhood answered 14/2, 2011 at 14:59 Comment(5)
When you say "belongs to", do you mean "originated in" or "is found in"? – Tui
Well... I can't check whether the commit can really be found in the repo (that takes too much time). But I would like to know whether some ancestor of the tested commit exists in the repo. I think that means "originated in" :) – Motherhood
You're right... but let's look at it from another side: my repository and your repository are cloned from the same origin. Can we find out that the origin is the same (without trying to push/fetch)? – Motherhood
There's certainly some meaning in two repositories having the same initial commit. What's your true end goal here? – Pietje
"Well... I can't check if the commit really can be found in repo" . . . of course you can: if git cat-file -e $thecommit; then the commit exists in the repo; fi – Rephrase
S
5

The SHA1 key identifies content (of a blob or of a tree), not a repository.
If the content differs from repo to repo, then the histories have no common ancestor, so I don't think a change-set-based solution will work.
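
To make the "content, not repository" point concrete, here is a minimal Python sketch of how a blob id is derived. It assumes Git's default SHA-1 object format (repositories created with the SHA-256 format would hash differently), and git_blob_id is a name made up for this illustration, not a Git API.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id Git's SHA-1 object format assigns to a blob.

    The id is the SHA-1 of the header "blob <size>\\0" followed by the
    raw bytes. Nothing about the repository enters the computation.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The same bytes get the same id no matter which repository stores them.
id_in_repo_one = git_blob_id(b"x1\n")
id_in_repo_two = git_blob_id(b"x1\n")
print(id_in_repo_one == id_in_repo_two)

# Different bytes get a different id.
print(git_blob_id(b"x1\n") != git_blob_id(b"y1\n"))
```

So an object id alone cannot tell you which repository it came from; it only tells you what the object contains.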

Maybe (not tested) you could add some marker (without having to change all the SHA1) through git notes.
See for instance GitHub deploy-notes which uses this mechanism to track deployments.

Subpoena answered 14/2, 2011 at 15:14 Comment(3)
Thanks for the answer. I just supposed that the SHA also includes some timestamp of when 'git init' was executed, or something like that... – Motherhood
@Ilya: It's much more than that: the SHA1 of a commit depends on all of its contents: the metadata (date, author, message), the tree, and its parent(s). If there is any difference at all in a commit or any of its ancestry, the SHA1 will change. – Pietje
Thanks! This probably saved a couple of hours of reading manuals :) – Motherhood
J
1

(moved from comment)

That's not possible if you don't have the parent of the particular commit already in your repository (in which case you can trivially answer the question). While the commit holds a reference to the parent and maintains the whole tree's integrity that way, you cannot reconstruct a commit just from the hash if you don't have that commit, so you can't find out that parent's parent and so on until you find a parent which actually is within your repository.
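
The walk described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the parents mapping stands in for "the commit objects we actually have", the commit ids c1..c3 are hypothetical, and first_known_ancestor is not a Git function.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def first_known_ancestor(
    start: str,
    parents: Dict[str, List[str]],
    known: Set[str],
) -> Optional[str]:
    """Walk from start towards the roots until a commit in known is found.

    parents maps a commit id to its parent ids, but only for commits
    whose objects are available. A commit that is missing from parents
    cannot be followed any further, because a bare hash does not reveal
    the parents of the commit it names.
    """
    queue = deque([start])
    seen = {start}
    while queue:
        commit = queue.popleft()
        if commit in known:
            return commit
        for parent in parents.get(commit, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None


# Hypothetical history c1 <- c2 <- c3; our repository only contains c1.
history = {"c3": ["c2"], "c2": ["c1"], "c1": []}
print(first_known_ancestor("c3", history, known={"c1"}))

# With only the hash "c3" and no commit objects, the walk cannot start.
print(first_known_ancestor("c3", {}, known={"c1"}))
```

The second call is the situation in the question: without the commit objects there is nothing to traverse.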

Josselyn answered 14/2, 2011 at 15:18 Comment(0)
T
0

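To check whether the commit you are looking for exists, git filter-branch is not the right tool, since it rewrites history; git cat-file -e <sha> (mentioned in the comments on the question) tests whether the object is present.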

A hash of the initial commit does not give you much info about the repository itself. There's no way to uniquely identify a repository.

Thirion answered 14/2, 2011 at 15:14 Comment(0)
C
0

In Rietveld we cannot force everybody to use 'git notes' when people want to find reviews made against their repositories, so we are going to use the last hash in the output of git rev-list --parents HEAD.
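
A rough Python sketch of that idea, not Rietveld's actual code: each line of git rev-list --parents output is a commit id followed by its parent ids, so a line with a single field is a root commit. The function names are made up for this example, and the sample text reuses the x1/x2 hashes from the transcript in a later answer on this page.

```python
import subprocess
from typing import List


def root_commits(rev_list_output: str) -> List[str]:
    """Return the commits that have no parents.

    Expects the text printed by `git rev-list --parents <rev>`, where
    every line is "<commit> <parent1> <parent2> ...".
    """
    roots = []
    for line in rev_list_output.splitlines():
        fields = line.split()
        if len(fields) == 1:  # a commit id with no parent ids after it
            roots.append(fields[0])
    return roots


def repo_root_commits(repo_path: str) -> List[str]:
    """Run git in repo_path and return the root commits reachable from HEAD."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--parents", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return root_commits(result.stdout)


# Sample output for a two-commit history (x2 on top of x1).
SAMPLE = (
    "1c3fe8665851e89d37f49633cd2478900217b91c "
    "1208fb0f721005207c6afe6a549a9ed0dcc5b0a8\n"
    "1208fb0f721005207c6afe6a549a9ed0dcc5b0a8\n"
)
print(root_commits(SAMPLE))
```

Collecting every parentless line instead of only the last one also covers histories with more than one root, for example after merging unrelated repositories.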

Carroty answered 22/9, 2011 at 12:44 Comment(0)
D
0

Compare with Mercurial, which performs this check in mercurial/treediscovery.py (Mercurial repository identification):

base = list(base)
if base == [nullid]:
    if force:
        repo.ui.warn(_("warning: repository is unrelated\n"))
    else:
        raise util.Abort(_("repository is unrelated"))

The base variable stores the last common parts of the two repositories.

Git makes the same assumption when it emits "warning: no common commits" on fetch/push. I haven't grepped the Git sources to confirm this, since that takes time.

Given how Mercurial checks push/pull, we may assume that repositories are related if they have common roots. For Mercurial this means that the hashes from the command:

$ hg log -r "roots(all())"

for both repositories must have a non-empty intersection.

You cannot trick the roots check by carefully crafting repositories, because building two repositories that look like this (with common parts but different roots):

0 <--- SHA-1-XXX <--- SHA-1-YYY <--- SHA-1-ZZZ
0 <--- SHA-1-YYY <--- SHA-1-ZZZ

is impossible: it would mean reversing SHA-1, since each subsequent hash depends on the previous values. That is true for both Mercurial and Git.
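
The dependency of each hash on the previous ones can be illustrated with a small Python sketch that mimics Git's commit object layout. Assumptions: Git's default SHA-1 object format, a made-up fixed author/committer line so the result is deterministic, and the well-known empty-tree id as the tree of every commit. commit_id is a helper invented for this example.

```python
import hashlib
from typing import Optional

# Id of the empty tree in Git's SHA-1 object format.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 of "<kind> <size>\\0" followed by the object body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree: str, parent: Optional[str], message: str) -> str:
    """Id of a commit object; the parent id is part of the hashed bytes."""
    lines = [f"tree {tree}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    # Fixed, made-up identity and timestamp keep the example deterministic.
    lines.append("author A U Thor <author@example.com> 0 +0000")
    lines.append("committer A U Thor <author@example.com> 0 +0000")
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return git_object_id("commit", body.encode("utf-8"))


# Chain 1: XXX <- YYY <- ZZZ
xxx = commit_id(EMPTY_TREE, None, "XXX")
yyy_on_xxx = commit_id(EMPTY_TREE, xxx, "YYY")
zzz_on_chain_1 = commit_id(EMPTY_TREE, yyy_on_xxx, "ZZZ")

# Chain 2: YYY as a root <- ZZZ
yyy_as_root = commit_id(EMPTY_TREE, None, "YYY")
zzz_on_chain_2 = commit_id(EMPTY_TREE, yyy_as_root, "ZZZ")

# Same message and tree, but a different ancestry gives different ids.
print(yyy_on_xxx != yyy_as_root)
print(zzz_on_chain_1 != zzz_on_chain_2)
```

Dropping the root changes the id of YYY, which in turn changes the id of ZZZ, so the second chain in the diagram cannot share commits with the first.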

Corresponding command to see roots in Git is:

$ git log --format=oneline --all --max-parents=0
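
As a sketch of the comparison itself, the root lists of two repositories can be parsed and intersected in Python. The function names are invented for this illustration, and the sample strings reuse the root hashes printed in the transcript in this answer.

```python
from typing import Set


def parse_oneline_roots(log_output: str) -> Set[str]:
    """Collect commit ids from `git log --format=oneline --all --max-parents=0`.

    Each non-empty line is "<full hash> <subject>"; only the hash is kept.
    """
    return {line.split()[0] for line in log_output.splitlines() if line.strip()}


def share_a_root(roots_a: Set[str], roots_b: Set[str]) -> bool:
    """Two repositories are treated as related if their root sets intersect."""
    return bool(roots_a & roots_b)


ROOTS_ONE = "1208fb0f721005207c6afe6a549a9ed0dcc5b0a8 x1\n"
ROOTS_TWO_BEFORE_FETCH = "ff56a8e7e9145d2b1b5a760bbc9b12451927ab0c y1\n"
ROOTS_TWO_AFTER_FETCH = (
    "ff56a8e7e9145d2b1b5a760bbc9b12451927ab0c y1\n"
    "1208fb0f721005207c6afe6a549a9ed0dcc5b0a8 x1\n"
)

one = parse_oneline_roots(ROOTS_ONE)
print(share_a_root(one, parse_oneline_roots(ROOTS_TWO_BEFORE_FETCH)))
print(share_a_root(one, parse_oneline_roots(ROOTS_TWO_AFTER_FETCH)))
```

Before the fetch the two root sets are disjoint; once repository two contains the history of repository one, they share the x1 root.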

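You can experiment with this yourself (in the transcript below, md, git ci and git co appear to be the author's own aliases for creating and entering a directory, git commit and git checkout):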

bash# md git
/home/user/tmp/git

bash# md one
/home/user/tmp/git/one

bash# git init
Initialized empty Git repository in /home/user/tmp/git/one/.git/

bash# echo x1 > x1
bash# git add x1
bash# git ci -m x1
[master (root-commit) 1208fb0] x1

bash# echo x2 > x2
bash# git add x2
bash# git ci -m x2
[master 1c3fe86] x2

bash# cd ..

bash# md two
/home/user/tmp/git/two

bash# git init
Initialized empty Git repository in /home/user/tmp/git/two/.git/

bash# echo y1 > y1
bash# git add y1
bash# git ci -m y1
[master (root-commit) ff56a8e] y1

bash# echo y2 > y2
bash# git add y2
bash# git ci -m y2
[master 18adff5] y2

bash# git fetch ../one/
warning: no common commits
remote: Counting objects: 6, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 6 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (6/6), done.
From ../one
 * branch            HEAD       -> FETCH_HEAD

bash# git co --orphan one
Switched to a new branch 'one'

bash# git merge FETCH_HEAD

bash# git log --format=oneline --all
18adff541c7ce9f1a1f2be2804d6d0e5792ff086 y2
ff56a8e7e9145d2b1b5a760bbc9b12451927ab0c y1
1c3fe8665851e89d37f49633cd2478900217b91c x2
1208fb0f721005207c6afe6a549a9ed0dcc5b0a8 x1

bash# git log --format=oneline --all --max-parents=0
ff56a8e7e9145d2b1b5a760bbc9b12451927ab0c y1
1208fb0f721005207c6afe6a549a9ed0dcc5b0a8 x1

bash# git log --all --graph

* commit 18adff541c7ce9f1a1f2be2804d6d0e5792ff086
|     y2
|  
* commit ff56a8e7e9145d2b1b5a760bbc9b12451927ab0c
      y1

* commit 1c3fe8665851e89d37f49633cd2478900217b91c
|     x2
|  
* commit 1208fb0f721005207c6afe6a549a9ed0dcc5b0a8
      x1

NOTE: Git allows partial checkouts. I didn't check how --max-parents=0 behaves in that case.

Darnel answered 23/10, 2015 at 11:17 Comment(0)
A
0

When I have write access to a repo, I find it useful to generate a random UUID and store it in a .gituuid file, which is also committed:

uuidgen > .gituuid
git add .gituuid
git commit -m "Add: git uuid" .gituuid

This solves the problem of uniquely identifying a repo globally, but it is only relevant if you have write permissions.
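
For scripting the same idea, a minimal Python sketch could look like this. ensure_repo_uuid is a hypothetical helper, not part of Git; it only creates or reads the file, so the git add / git commit steps shown above still have to be run separately.

```python
import uuid
from pathlib import Path


def ensure_repo_uuid(repo_dir: str, filename: str = ".gituuid") -> str:
    """Return the UUID stored in repo_dir/filename, creating it if missing.

    An existing file is never overwritten, so the id stays stable once
    it has been committed.
    """
    path = Path(repo_dir) / filename
    if path.exists():
        return path.read_text().strip()
    new_id = str(uuid.uuid4())  # random (version 4) UUID
    path.write_text(new_id + "\n")
    return new_id
```

Calling ensure_repo_uuid(".") from the top of a working tree returns the same value on every later call, which is what makes the file usable as an identifier.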

Note: I have some other scripts that track those git UUIDs and let me locate the associated repos on my file system, but that is out of scope.

Apple answered 14/2, 2022 at 19:17 Comment(1)
This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From Review – Nosology
