I would like to include the current git hash in the output of a Python script (as the version number of the code that generated that output).
How can I access the current git hash in my Python script?
The git describe command is a good way of creating a human-presentable "version number" of the code. From the examples in the documentation:
With something like git.git current tree, I get:
[torvalds@g5 git]$ git describe parent
v1.0.4-14-g2414721
i.e. the current head of my "parent" branch is based on v1.0.4, but since it has a few commits on top of that, describe has added the number of additional commits ("14") and an abbreviated object name for the commit itself ("2414721") at the end.
From within Python, you can do something like the following:
import subprocess
label = subprocess.check_output(["git", "describe"]).strip()
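As a hedged sketch, the call above can be wrapped so that a missing tag, a missing repository, or a missing git executable does not crash the script. The function name describe_version and the "unknown" fallback are illustrative assumptions, not part of the original answer; --always is a standard git describe option that falls back to the abbreviated commit hash when no tag is found.

```python
import subprocess


def describe_version(repo_dir=".", fallback="unknown"):
    """Return the `git describe --always` label for repo_dir, or fallback on failure.

    Hypothetical helper for illustration only.
    """
    try:
        out = subprocess.check_output(
            ["git", "describe", "--always"],
            cwd=repo_dir,
            stderr=subprocess.DEVNULL,  # keep git's "fatal: ..." message off the console
        )
    except (subprocess.CalledProcessError, OSError):
        # CalledProcessError: git exited non-zero (for example, not a repository).
        # OSError: git could not be started, or repo_dir does not exist.
        return fallback
    return out.decode("ascii", errors="replace").strip()
```

Calling describe_version() inside a repository should return a label such as the one shown above; anywhere else it returns the fallback string instead of raising.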
fatal: No names found, cannot describe anything. – Pyromania
git describe --always will fall back to the last commit if no tags are found – Tojo
<last tag>-<num commits after tag>-<hash> I had to use git describe --long --tags – Raggletaggle
>>> label = subprocess.check_output(["git", "describe"])
fatal: No names found, cannot describe anything.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['git', 'describe']' returned non-zero exit status 128
– Decibel
git describe normally requires at least one tag. If you don't have any tags, use the --always option. See the git describe documentation for more information. – Mohl
fatal: Not a valid object name parent – Paulapauldron
parent is an example provided in the documentation. You would use your own branch name there. – Mohl
<branch_name> or emphasised in text – Paulapauldron
git describe --always is no good because it returns the annotated tag if one exists – Apomorphine
No need to hack around getting data from the git command yourself. GitPython is a very nice way to do this and a lot of other git stuff. It even has "best effort" support for Windows.
After pip install gitpython you can do
import git
repo = git.Repo(search_parent_directories=True)
sha = repo.head.object.hexsha
Something to consider when using this library. The following is taken from gitpython.readthedocs.io
Leakage of System Resources
GitPython is not suited for long-running processes (like daemons) as it tends to leak system resources. It was written in a time where destructors (as implemented in the __del__ method) still ran deterministically.
In case you still want to use it in such a context, you will want to search the codebase for __del__ implementations and call these yourself when you see fit.
Another way to assure proper cleanup of resources is to factor out GitPython into a separate process which can be dropped periodically.
ImportError: No module named gitpython. You cannot rely on the end user having gitpython installed, and requiring them to install it before your code works makes it not portable. Unless you are going to include automatic installation protocols, at which point it is no longer a clean solution. – Donny
pip / requirements.txt) on all platforms. What's not "clean"? – Indignation
pip is not available on all systems. For that matter, neither is the external internet access needed by pip to install said packages. – Donny
import numpy as np can be assumed throughout the whole of stackoverflow but installing gitpython is beyond 'clean' and 'portable'. I think this is by far the best solution, because it does not reinvent the wheel, hides away the ugly implementation and does not go around hacking the answer of git from subprocess. – Malta
subprocess is a standard method for interacting with CLI programs from within Python. Installing 3rd party libraries as a crutch to solve every simple problem in Python is not a great practice and causes issues the moment you need to run your code on any other system. If you want to hide the 'ugly implementation', then use a function. If the code is never going to be run by anyone or anywhere else, then of course use whatever solution you like. – Donny
pip or the ability to easily install pip. In these modern scenarios, a pip solution is just as portable as a "standard library" solution. – Xenon
This post contains the command, Greg's answer contains the subprocess command.
import subprocess

def get_git_revision_hash() -> str:
    return subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()

def get_git_revision_short_hash() -> str:
    return subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD']).decode('ascii').strip()
when running
print(get_git_revision_hash())
print(get_git_revision_short_hash())
you get output:
fd1cd173fc834f62fa7db3034efc5b8e0f3b43fe
fd1cd17
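To connect this back to the question, a minimal sketch of stamping a script's output with the hash might look like the following. The stamp function and the "# code revision:" header format are hypothetical choices for illustration; the short-hash helper is repeated only so the block is self-contained.

```python
import subprocess


def get_git_revision_short_hash() -> str:
    # Same rev-parse call as in the answer; raises if git or the repository is missing.
    return subprocess.check_output(
        ['git', 'rev-parse', '--short', 'HEAD']
    ).decode('ascii').strip()


def stamp(text: str, revision: str = None) -> str:
    """Prefix text with a header line naming the code revision (hypothetical format)."""
    if revision is None:
        # Only shell out to git when the caller did not supply a revision.
        revision = get_git_revision_short_hash()
    return f"# code revision: {revision}\n{text}"
```

A script would then write stamp(results) instead of results; passing revision explicitly keeps the function usable where git is unavailable.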
subprocess.check_output(['git', 'rev-parse', '--abbrev-ref', 'HEAD']) for the branch name – Disestablish
.decode('ascii').strip() to decode the binary string (and remove the line break). – Lovash
universal_newlines=True to get a string. – Mcnair
cwd=os.path.dirname(os.path.realpath(__file__)) as a parameter for check_output – Extortionary
Here's a more complete version of Greg's answer:
import subprocess
print(subprocess.check_output(["git", "describe", "--always"]).strip().decode())
Or, if the script is being called from outside the repo:
import subprocess, os
print(subprocess.check_output(["git", "describe", "--always"], cwd=os.path.dirname(os.path.abspath(__file__))).strip().decode())
Or, if the script is being called from outside the repo and you like pathlib:
import subprocess
from pathlib import Path
print(subprocess.check_output(["git", "describe", "--always"], cwd=Path(__file__).resolve().parent).strip().decode())
os.chdir, the cwd= arg can be used in check_output to temporarily change the working directory before executing. – Derosier
If subprocess isn't portable and you don't want to install a package to do something this simple you can also do this.
import pathlib

def get_git_revision(base_path):
    git_dir = pathlib.Path(base_path) / '.git'
    with (git_dir / 'HEAD').open('r') as head:
        ref = head.readline().split(' ')[-1].strip()

    with (git_dir / ref).open('r') as git_hash:
        return git_hash.readline().strip()
I've only tested this on my repos but it seems to work pretty consistently.
numpy has a nice looking multi-platform routine in its setup.py:
import os
import subprocess

# Return the git revision as a string
def git_version():
    def _minimal_ext_cmd(cmd):
        # construct minimal environment
        env = {}
        for k in ['SYSTEMROOT', 'PATH']:
            v = os.environ.get(k)
            if v is not None:
                env[k] = v
        # LANGUAGE is used on win32
        env['LANGUAGE'] = 'C'
        env['LANG'] = 'C'
        env['LC_ALL'] = 'C'
        out = subprocess.Popen(cmd, stdout=subprocess.PIPE, env=env).communicate()[0]
        return out

    try:
        out = _minimal_ext_cmd(['git', 'rev-parse', 'HEAD'])
        GIT_REVISION = out.strip().decode('ascii')
    except OSError:
        GIT_REVISION = "Unknown"

    return GIT_REVISION
numpy found it necessary to "construct a minimal environment"? (assuming they had good reason to) – Fakieh
env dict was necessary for cross-platform functionality. Yuji's answer does not, but perhaps that works on both UNIX and Windows. – Halfassed
.decode('ascii') works - otherwise the encoding is unknown. – Mcnair
from numpy.setup import git_version and it didn't work – Zucchetto
setup.py, it is not part of the numpy package, so it isn't possible to import it from numpy. To use it, you would need to add this method to your own code somewhere. – Halfassed
This is an improvement of Yuji 'Tomita' Tomita's answer.
import subprocess

def get_git_revision_hash():
    full_hash = subprocess.check_output(['git', 'rev-parse', 'HEAD'])
    full_hash = str(full_hash, "utf-8").strip()
    return full_hash

def get_git_revision_short_hash():
    short_hash = subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD'])
    short_hash = str(short_hash, "utf-8").strip()
    return short_hash

print(get_git_revision_hash())
print(get_git_revision_short_hash())
If you want a bit more data than the hash, you can use git-log:
import subprocess

def get_git_hash():
    return subprocess.check_output(['git', 'log', '-n', '1', '--pretty=tformat:%H']).strip()

def get_git_short_hash():
    return subprocess.check_output(['git', 'log', '-n', '1', '--pretty=tformat:%h']).strip()

def get_git_short_hash_and_commit_date():
    return subprocess.check_output(['git', 'log', '-n', '1', '--pretty=tformat:%h-%ad', '--date=short']).strip()

For the full list of formatting options, check out git log --help
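When several fields are needed, one possible variation is to request them in a single git log call and split the result. This is only a sketch: the function names and the dictionary keys are hypothetical, while %H, %h, %ad and %n (newline) are standard git log format placeholders.

```python
import subprocess

# Full hash, short hash and author date, one per line.
LOG_FORMAT = "%H%n%h%n%ad"


def parse_commit_info(raw: bytes) -> dict:
    """Split the three-line output produced by LOG_FORMAT into named fields."""
    full_hash, short_hash, date = raw.decode("utf-8").strip().split("\n")
    return {"hash": full_hash, "short_hash": short_hash, "date": date}


def get_commit_info(repo_dir: str = ".") -> dict:
    """Run git log once in repo_dir and return the parsed fields."""
    raw = subprocess.check_output(
        ["git", "log", "-n", "1", f"--pretty=tformat:{LOG_FORMAT}", "--date=short"],
        cwd=repo_dir,
    )
    return parse_commit_info(raw)
```

Keeping the parsing separate from the subprocess call makes the string handling easy to check without a repository at hand.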
I ran across this problem and solved it by implementing this function. https://gist.github.com/NaelsonDouglas/9bc3bfa26deec7827cb87816cad88d59
from pathlib import Path

def get_commit(repo_path):
    git_folder = Path(repo_path, '.git')
    head_name = Path(git_folder, 'HEAD').read_text().split('\n')[0].split(' ')[-1]
    head_ref = Path(git_folder, head_name)
    commit = head_ref.read_text().replace('\n', '')
    return commit

r = get_commit('PATH OF YOUR CLONED REPOSITORY')
print(r)
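Reading .git/HEAD this way assumes HEAD names a branch whose ref exists as a loose file. As a hedged sketch, the variant below also covers a detached HEAD, where the file holds the commit hash itself, and returns None when the ref file is absent (for instance when the ref is stored only in .git/packed-refs). The name read_head_commit is hypothetical, and layouts where .git is a file rather than a directory (linked worktrees, submodules) are not handled.

```python
from pathlib import Path


def read_head_commit(repo_path):
    """Return the commit hash HEAD points at, or None if it cannot be read as a loose ref."""
    git_dir = Path(repo_path) / ".git"
    head = (git_dir / "HEAD").read_text().strip()

    if not head.startswith("ref: "):
        # Detached HEAD: the file contains the commit hash directly.
        return head

    # Symbolic ref, e.g. "ref: refs/heads/master" -> .git/refs/heads/master
    ref_file = git_dir / head[len("ref: "):]
    if ref_file.is_file():
        return ref_file.read_text().strip()

    # No loose ref file: packed refs or a branch without commits; not resolved here.
    return None
```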
If you don't have Git available for some reason, but you have the git repo (.git folder is found), you can fetch the commit hash from .git/refs/heads/[branch].
For example, I've used the following quick-and-dirty Python snippet run at the repository root to get the commit id:
git_head = '.git\\HEAD'

# Open .git\HEAD file:
with open(git_head, 'r') as git_head_file:
    # Contains e.g. ref: refs/heads/master if on "master"
    git_head_data = str(git_head_file.read())

# Open the correct file in .git\refs\heads\[branch]
git_head_ref = '.git\\%s' % git_head_data.split(' ')[1].replace('/', '\\').strip()

# Get the commit hash ([:7] used to get "--short")
with open(git_head_ref, 'r') as git_head_ref_file:
    commit_id = git_head_ref_file.read().strip()[:7]
I had a problem similar to the OP's, but in my case I'm delivering the source code to my client as a zip file and, although I know they will have python installed, I cannot assume they will have git. Since the OP didn't specify the operating system or whether git is installed, I think I can contribute here.
To get only the hash of the commit, Naelson Douglas's answer was perfect, but to have the tag name, I'm using the dulwich python package. It's a simplified git client in python.
After installing the package with pip install dulwich --global-option="--pure" one can do:
from dulwich import porcelain

def get_git_revision(base_path):
    return porcelain.describe(base_path)

r = get_git_revision("PATH OF YOUR REPOSITORY's ROOT FOLDER")
print(r)
I've just run this code in one repository here and it showed the output v0.1.2-1-gfb41223, similar to what is returned by git describe, meaning that I'm 1 commit after the tag v0.1.2 and the 7-digit hash of the commit is fb41223.
It has some limitations: currently it doesn't have an option to show if a repository is dirty and it always shows a 7-digit hash, but there's no need to have git installed, so one can choose the trade-off.
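If the individual parts of such a label are needed, a small parser along these lines could split it. This is a sketch under the assumption that the label has the <tag>-<commits>-g<hash> shape described above; the names Described and parse_describe are hypothetical, and a bare tag or bare hash is reported as None rather than guessed at.

```python
import re
from typing import NamedTuple, Optional


class Described(NamedTuple):
    tag: str
    commits_since_tag: int
    short_hash: str


# <tag>-<number of commits after the tag>-g<abbreviated hash>
_DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(label: str) -> Optional[Described]:
    """Split a describe-style label into its parts, or return None if it does not match."""
    match = _DESCRIBE_RE.match(label.strip())
    if match is None:
        return None
    return Described(match.group("tag"), int(match.group("count")), match.group("sha"))
```

For the label quoted above, parse_describe("v0.1.2-1-gfb41223") yields the tag v0.1.2, a count of 1 and the hash fb41223.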
Edit: in case of errors in the command pip install due to the option --pure (the issue is explained here), pick one of the two possible solutions:
pip install urllib3 certifi && pip install dulwich --global-option="--pure"
pip install dulwich. This will install some platform dependent files in your system, but it will improve the package's performance.
If you are like me:
Then (this will not work in an interactive shell because the shell doesn't define the current file path; replace BASE_DIR with your current file path):
import os
import raven
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
print(raven.fetch_git_sha(BASE_DIR))
That's it.
I was looking for another solution because I wanted to migrate to sentry_sdk and leave raven but maybe some of you want to continue using raven for a while.
Here was the discussion that got me to this Stack Overflow question.
So using the code of raven without raven is also possible (see discussion) :
from __future__ import absolute_import

import os.path

__all__ = ('fetch_git_sha',)


def fetch_git_sha(path, head=None):
    """
    >>> fetch_git_sha(os.path.dirname(__file__))
    """
    if not head:
        head_path = os.path.join(path, '.git', 'HEAD')

        with open(head_path, 'r') as fp:
            head = fp.read().strip()

        if head.startswith('ref: '):
            head = head[5:]
            revision_file = os.path.join(
                path, '.git', *head.split('/')
            )
        else:
            return head
    else:
        revision_file = os.path.join(path, '.git', 'refs', 'heads', head)

    if not os.path.exists(revision_file):
        # Check for Raven .git/packed-refs' file since a `git gc` may have run
        # https://git-scm.com/book/en/v2/Git-Internals-Maintenance-and-Data-Recovery
        packed_file = os.path.join(path, '.git', 'packed-refs')
        if os.path.exists(packed_file):
            with open(packed_file) as fh:
                for line in fh:
                    line = line.rstrip()
                    if line and line[:1] not in ('#', '^'):
                        try:
                            revision, ref = line.split(' ', 1)
                        except ValueError:
                            continue
                        if ref == head:
                            return revision

    with open(revision_file) as fh:
        return fh.read().strip()
I named this file versioning.py and I import "fetch_git_sha" where I need it passing file path as argument.
Hope it will help some of you ;)
git rev-parse HEAD from the command line. The output syntax should be obvious. – Zeeland
subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD']).decode('ascii').strip() after having import subprocess – Decibel