How to get Git log with short stat in one line?
The following command prints these lines to the console:

git log --pretty=format:"%h;%ai;%s" --shortstat
ed6e0ab;2014-01-07 16:32:39 +0530;Foo
 3 files changed, 14 insertions(+), 13 deletions(-)

cdfbb10;2014-01-07 14:59:48 +0530;Bar
 1 file changed, 21 insertions(+)

5fde3e1;2014-01-06 17:26:40 +0530;Merge Baz
772b277;2014-01-06 17:09:42 +0530;Qux
 7 files changed, 72 insertions(+), 7 deletions(-)

I'd like the above output to be displayed like this instead:

ed6e0ab;2014-01-07 16:32:39 +0530;Foo;3;14;13
cdfbb10;2014-01-07 14:59:48 +0530;Bar;1;21;0
5fde3e1;2014-01-06 17:26:40 +0530;Merge Baz;0;0;0
772b277;2014-01-06 17:09:42 +0530;Qux;7;72;7

This will be consumed by a report that can parse semicolon-separated values. In other words, the text "\n 3 files changed, 14 insertions(+), 13 deletions(-)" (newline included) should be converted to 3;14;13 (without the newline). One corner case is a line like "5fde3e1;2014-01-06 17:26:40 +0530;Merge Baz", which is not followed by a stat line. In that case I want ;0;0;0.

Overall, the goal is to analyze file change stats over a period of time. I read the git log documentation but couldn't find any format option that renders the output this way. The best I could come up with is the command above.

So any command or shell script which can generate the expected format would be of great help.

Thanks!

Paralytic answered 15/1, 2014 at 12:28 Comment(0)
H
19
git log --oneline --pretty="@%h" --stat | grep -v \| | tr "\n" " " |  tr "@" "\n"

This will show something like this:

a596f1e   1 file changed, 6 insertions(+), 3 deletions(-) 
4a9a4a1   1 file changed, 6 deletions(-) 
b8325fd   1 file changed, 65 insertions(+), 4 deletions(-) 
968ef81   1 file changed, 4 insertions(+), 5 deletions(-) 
Hayes answered 24/11, 2016 at 3:46 Comment(1)
FYI, --pretty overrides --oneline, so there's no need to specify both. Also, --stat outputs the file names, which then have to be removed with grep -v \|; using --shortstat instead avoids having another line item to parse. So the whole thing can be simplified to: git log --pretty="@%h" --shortstat | tr "\n" " " | tr "@" "\n"
- Dextroamphetamine
C
10

This is, unfortunately, impossible to achieve using only git log. One has to use other scripts to compensate for something most people aren't aware of: some commits don't have stats, even if they are not merges.

I have been working on a project that converts git log to JSON and to get it done I had to do what you need: get each commit, with stats, in one line. The project is called Gitlogg and you're welcome to tweak it to your needs: https://github.com/dreamyguy/gitlogg

Below is the relevant part of Gitlogg, that will get you close to what you'd like:

git log --all --no-merges --shortstat --reverse --pretty=format:'commits\tcommit_hash\t%H\tcommit_hash_abbreviated\t%h\ttree_hash\t%T\ttree_hash_abbreviated\t%t\tparent_hashes\t%P\tparent_hashes_abbreviated\t%p\tauthor_name\t%an\tauthor_name_mailmap\t%aN\tauthor_email\t%ae\tauthor_email_mailmap\t%aE\tauthor_date\t%ad\tauthor_date_RFC2822\t%aD\tauthor_date_relative\t%ar\tauthor_date_unix_timestamp\t%at\tauthor_date_iso_8601\t%ai\tauthor_date_iso_8601_strict\t%aI\tcommitter_name\t%cn\tcommitter_name_mailmap\t%cN\tcommitter_email\t%ce\tcommitter_email_mailmap\t%cE\tcommitter_date\t%cd\tcommitter_date_RFC2822\t%cD\tcommitter_date_relative\t%cr\tcommitter_date_unix_timestamp\t%ct\tcommitter_date_iso_8601\t%ci\tcommitter_date_iso_8601_strict\t%cI\tref_names\t%d\tref_names_no_wrapping\t%D\tencoding\t%e\tsubject\t%s\tsubject_sanitized\t%f\tcommit_notes\t%N\tstats\t' |
  sed '/^[ \t]*$/d' |               # remove blank lines, including those that contain only whitespace
  tr '\n' 'ò' |                     # convert newlines/line-breaks to a character, so we can manipulate it without much trouble
  tr '\r' 'ò' |                     # convert carriage returns to a character, so we can manipulate it without much trouble
  sed 's/tòcommits/tòòcommits/g' |  # because some commits have no stats, we have to create an extra line-break to make `paste -d ' ' - -` consistent
  tr 'ò' '\n' |                     # bring back all line-breaks
  sed '{
      N
      s/[)]\n\ncommits/)\
  commits/g
  }' |                              # some rogue mystical line-breaks need to go down to their knees and beg for mercy, which they're not getting
  paste -d ' ' - -                  # collapse lines so that the `shortstat` is merged with the rest of the commit data, on a single line

Note that I've used the tab character ( \t ) to separate fields, since ; may appear in the commit message.

Another important part of this script is that each line must begin with a unique string (in this case it's commits). That's because our script needs to know where the line begins. In fact, everything that comes after the git log command is there to compensate for the fact that some commits might not have stats.
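To illustrate what can be done with one of those collapsed lines, here is a minimal Python sketch (not Gitlogg's actual code). It assumes the line starts with the commits marker and then alternates key and value, tab-separated, with the shortstat text as the last value. The function names and the output keys files_changed, insertions, deletions and impact are my own choices for the example.

```python
import json
import re


def first_int(pattern, text):
    """Return the first captured number for pattern in text, or 0 if absent."""
    match = re.search(pattern, text)
    return int(match.group(1)) if match else 0


def parse_collapsed_line(line):
    """Convert one 'commits<TAB>key<TAB>value...' line into a dict."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] != "commits":
        raise ValueError("expected the line to start with the 'commits' marker")
    # After the marker, fields alternate: key, value, key, value, ...
    record = dict(zip(fields[1::2], fields[2::2]))
    # The last value is the shortstat text, or blank for commits without stats.
    stats = record.pop("stats", "")
    record["files_changed"] = first_int(r"(\d+) files? changed", stats)
    record["insertions"] = first_int(r"(\d+) insertions?\(\+\)", stats)
    record["deletions"] = first_int(r"(\d+) deletions?\(-\)", stats)
    # Same definition as in the feature list: insertions - deletions.
    record["impact"] = record["insertions"] - record["deletions"]
    return record


# Shortened, made-up example line with only two of the many fields.
line = (
    "commits\tcommit_hash_abbreviated\ta596f1e\tsubject\tFix parser"
    "\tstats\t  1 file changed, 6 insertions(+), 3 deletions(-)"
)
print(json.dumps(parse_collapsed_line(line)))
```

A tab inside a value would shift the key/value pairing, so this only holds as long as the fields themselves contain no tabs.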

But it strikes me that what you want to achieve is to have commits neatly outputted in a format you can reliably consume. Gitlogg is perfect for that! Some of its features are:

  • Parse the git log of multiple repositories into one JSON file.
  • Introduced repository key/value.
  • Introduced files changed, insertions and deletions keys/values.
  • Introduced impact key/value, which represents the cumulative changes for the commit (insertions - deletions).
  • Sanitise double quotes " by converting them to single quotes ' on all values that allow or are created by user input, like subject.
  • Nearly all the pretty=format: placeholders are available.
  • Easily include / exclude which keys/values will be parsed to JSON by commenting out/uncommenting the available ones.
  • Easy to read code that's thoroughly commented.
  • Script execution feedback on console.
  • Error handling (since path to repositories needs to be set correctly).

Success: the JSON was parsed and saved.

Error 001: path to repositories does not exist.

Error 002: path to repositories exists, but is empty.

Claudetta answered 28/5, 2016 at 10:27 Comment(0)
K
7

Combining all the answers above, here are my 2 cents in case anyone is looking:

echo "commit id,author,date,comment,changed files,lines added,lines deleted" > res.csv 
git log --since='last year'  --date=local --all --pretty="%x40%h%x2C%an%x2C%ad%x2C%x22%s%x22%x2C" --shortstat | tr "\n" " " | tr "@" "\n" >> res.csv
sed -i -E 's/ files? changed//g; s/ insertions?\(\+\)//g; s/ deletions?\(-\)//g' res.csv

Either save it to a git-logs-into-csv.sh file or just copy/paste it into the console.

I think it's relatively self-explanatory, but just in case:

  • --all takes logs from all branches
  • --since limits the log to commits more recent than the given date
  • --shortstat - to get some idea what was done in the commit
Kmeson answered 5/7, 2019 at 20:19 Comment(2)
sed -i gives an error on macOS. Here is how I modified the script:
echo "commit id,author,date,comment,changed files,lines added,lines deleted" > res.csv
git log --since='last 35 days' --date=local --all --pretty="%x40%h%x2C%an%x2C%ad%x2C%x22%s%x22%x2C" --shortstat | tr "\n" " " | tr "@" "\n" >> res.csv
cat res.csv | sed -E 's/ files changed//g' | sed -E 's/ file changed//g' | \
  sed -E 's/ insertions?//g' | sed -E 's/ deletions?//g' | sed -E 's/\(\+\)//g' | sed -E 's/\(-\)//g' > commits.csv
rm res.csv
cat commits.csv
- Heliocentric
This almost works, but it breaks if there is an @ in a commit message, as that will cause a line break. - Fishery
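One way around the @ problem from the comment above is to pick separators that are very unlikely to appear in a commit message, and to let a CSV library do the quoting. The sketch below is only an illustration and not part of the original script: it assumes the log was produced with the ASCII control characters 0x1e (start of record) and 0x1f (end of field) in the format string, as shown in the comment at the top, and the function names are made up.

```python
# Assumed input, produced by something like:
#   git log --pretty=format:"%x1e%h%x1f%an%x1f%ad%x1f%s%x1f" --shortstat
import csv
import io
import re

RECORD_SEP = "\x1e"  # emitted by %x1e at the start of every commit
FIELD_SEP = "\x1f"   # emitted by %x1f after every field


def count(pattern, text):
    """Return the number captured by pattern, or 0 when the part is missing."""
    match = re.search(pattern, text)
    return int(match.group(1)) if match else 0


def log_to_csv(text):
    """Convert the separator-delimited log text into CSV with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["commit id", "author", "date", "comment",
                     "changed files", "lines added", "lines deleted"])
    for record in text.split(RECORD_SEP):
        if not record.strip():
            continue  # nothing before the first separator
        # The fifth piece is whatever follows the subject: the stat line or nothing.
        commit, author, date, subject, stats = record.split(FIELD_SEP)
        writer.writerow([
            commit, author, date, subject,
            count(r"(\d+) files? changed", stats),
            count(r"(\d+) insertions?\(\+\)", stats),
            count(r"(\d+) deletions?\(-\)", stats),
        ])
    return out.getvalue()
```

Because each count is looked up by name, a commit with only deletions keeps its number in the deletions column, and the csv module quotes subjects that contain commas or double quotes.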
H
5

git doesn't support stat info with plain --format, which is a shame :( But it's easy to script around. Here's my quick and dirty solution, which should be quite readable:

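#!/bin/bash

# Reads one log entry from stdin (hash, date, summary, then --numstat lines
# up to a blank line) and prints it as one semicolon-separated line.
format_log_entry ()
{
    read -r commit
    read -r date
    read -r summary
    local statnum=0
    local add=0
    local rem=0
    local statline added removed
    while true; do
        read -r statline
        if [ -z "$statline" ]; then break; fi
        # numstat lines look like "<added><TAB><removed><TAB><path>"
        read -r added removed _ <<< "$statline"
        # binary files report "-" instead of a number; count them as 0
        [ "$added" = "-" ] && added=0
        [ "$removed" = "-" ] && removed=0
        ((statnum += 1))
        ((add += added))
        ((rem += removed))
    done
    if [ -n "$commit" ]; then
        echo "$commit;$date;$summary;$statnum;$add;$rem"
    else
        exit 0
    fi
}

while true; do
    format_log_entry
done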

I'm sure it can be scripted better, but hey, it's both quick AND dirty ;)

usage:

$ git log --pretty=format:"%h%n%ai%n%s" --numstat | ./script

Please note that the format you specified is not bulletproof. A semicolon can appear in the commit summary, which will break the number of fields in such a line. You can either move the summary to the end of the line or escape it somehow. How do you want to handle it?
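As a small illustration of the "move the summary to the end" option: if the summary is the last field, the consumer can split on the first five semicolons only and leave the rest of the line alone. This is a hypothetical Python sketch for the consuming side; the field order (hash, date, three counts, summary) and the function name are assumptions, not something the script above produces as written.

```python
def split_record(line, field_count=6):
    """Split a semicolon-separated record whose last field is the free-text summary."""
    # maxsplit keeps any further semicolons inside the final field.
    return line.split(";", field_count - 1)


fields = split_record("ed6e0ab;2014-01-07 16:32:39 +0530;3;14;13;Fix foo; then bar")
print(fields[-1])
```

This relies on the first five fields never containing a semicolon, which holds for an abbreviated hash, an ISO-like date and plain numbers.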

Haywire answered 15/1, 2014 at 17:26 Comment(0)
A
4

This is one approach with awk.

awk 'BEGIN{FS="[,;]"; OFS=";"} /;/ {a=$0} /^ /{gsub(/[a-z()+ -]/,""); gsub(",",";"); print a,$0}'

For the given input it returns:

ed6e0ab;2014-01-07 16:32:39 +0530;Foo;3;14;13
cdfbb10;2014-01-07 14:59:48 +0530;Bar;1;21
772b277;2014-01-06 17:09:42 +0530;Qux;7;72;7

This still does not handle lines like 5fde3e1;2014-01-06 17:26:40 +0530;Merge Baz, which are not followed by a line such as 3 files changed, 14 insertions(+), 13 deletions(-).
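For comparison, the missing-stat-line case is fairly easy to cover if the same idea is written in Python instead of awk. This is a rough sketch rather than a drop-in replacement: it assumes the input is exactly the output of the command in the question, and the function name is made up. Every commit line starts out with 0;0;0 and the counts are only overwritten when a stat line follows.

```python
import re

# Matches lines such as " 3 files changed, 14 insertions(+), 13 deletions(-)".
# The insertion and deletion parts are optional because git omits zero counts.
STAT_RE = re.compile(
    r"^ (\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?$"
)


def flatten_log(text):
    """Return one 'hash;date;subject;files;added;deleted' string per commit."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator between commits
        match = STAT_RE.match(line.rstrip())
        if match and rows:
            # Stat line: replace the default counts on the most recent commit.
            rows[-1][1:] = [int(group or 0) for group in match.groups()]
        else:
            # Commit line: start with zero counts in case no stat line follows.
            rows.append([line, 0, 0, 0])
    return [";".join(str(part) for part in row) for row in rows]


sample = """ed6e0ab;2014-01-07 16:32:39 +0530;Foo
 3 files changed, 14 insertions(+), 13 deletions(-)

cdfbb10;2014-01-07 14:59:48 +0530;Bar
 1 file changed, 21 insertions(+)

5fde3e1;2014-01-06 17:26:40 +0530;Merge Baz
772b277;2014-01-06 17:09:42 +0530;Qux
 7 files changed, 72 insertions(+), 7 deletions(-)
"""
print("\n".join(flatten_log(sample)))
```

For the sample from the question this should give the four requested lines, including ;0;0;0 for Merge Baz and a trailing ;0 for Bar.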

Advise answered 15/1, 2014 at 12:46 Comment(2)
OK... I'm no awk expert, but I'm getting the following text at the end: ";1;10+);10-)", basically an extra +) and -). I'm sure this can be fixed, but I'm not sure how. - Paralytic
Maybe you need to escape + and these symbols in the gsub() function. In my awk it is not necessary. - Advise
A
2

Following up on @user2461539's answer, this parses the output into columns. It works with more complex columns like "Subject" too. Hack away to choose your own suitable delimiters. The subject line currently needs to be truncated, as it cuts off the other columns when it overflows.

#!/bin/bash
# assumes "_Z_Z_Z_" and "_Y_Y_" "_X_X_" as unused characters 
# Truncate subject line sanitized (%f) or not (%s) to 79 %<(79,trunc)%f
echo commit,author_name,time_sec,subject,files_changed,lines_inserted,lines_deleted>../tensorflow_log.csv;
git log --oneline --pretty="_Z_Z_Z_%h_Y_Y_\"%an\"_Y_Y_%at_Y_Y_\"%<(79,trunc)%f\"_Y_Y__X_X_"  --stat    \
    | grep -v \| \
    | sed -E 's/@//g' \
    | sed -E 's/_Z_Z_Z_/@/g' \
    |  tr "\n" " "   \
    |  tr "@" "\n" |sed -E 's/,//g'  \
    | sed -E 's/_Y_Y_/, /g' \
    | sed -E 's/(changed [0-9].*\+\))/,\1,/'  \
    | sed -E 's/(changed [0-9]* deleti.*-\)) /,,\1/' \
    | sed -E 's/insertion.*\+\)//g' \
    | sed -E 's/deletion.*\-\)//g' \
    | sed -E 's/,changed/,/' \
    | sed -E 's/files? ,/,/g'  \
    | sed -E 's/_X_X_ $/,,/g'  \
    | sed -E 's/_X_X_//g'>>../tensorflow_log.csv
Antonio answered 26/3, 2017 at 3:7 Comment(0)
M
0

I put something like this in my ~/.bashrc:

function git-lgs() {
   git --no-pager log --numstat --format=%ai "$1" | sed ':a;N;$!ba;s/\n\n/\t/g' | sed 's/\(\t[0-9]*\t*[0-9]*\).*/\1/'
}

Where git-lgs's argument is the filename for which you want to display the log.

Mitchmitchael answered 1/10, 2016 at 5:18 Comment(0)
I
0

Format the log output into a tab-separated form so it's more parseable, include shortstat, then pipe the output through awk and rearrange it:

git log -c --shortstat --first-parent --format="commit%x09%H%x09%cI%x09%sd"| \
  awk -v FS="\t" '/^commit/ { commit=$2; timestamp=$3; summary=$4 } /^ [0-9]+ files? changed/ { stats=$1; printf "https://github.com/timabell/gitopolis/commit/%s\t%s\t%-55s\t\"%s\"\n", commit, timestamp, stats, summary}'

Explanation

Format string

  • commit literal string for awk to find
  • %x09 hex code for a tab. (Tabs are added with %x09 here to avoid copy-paste problems, but you can use ctrl-v followed by tab to insert literal tabs in your terminal)
  • %H sha1 of commit
  • %x09 tab
  • %cI iso format commit date
  • %x09 tab
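  • %s commit summary (the d that follows %s in the format string is not part of the placeholder and looks like a typo; git prints it literally, which is why every summary in the example output below ends in an extra d)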

Awk arguments

  • -v FS="\t" changes Field Separator to tab

Awk command

  • /^commit/ { commit=$2; timestamp=$3; summary=$4 } is triggered when commit line is consumed and assigns field 2,3 & 4 to named variables
  • /^ [0-9]+ files? changed/ is triggered when the stats line is consumed
  • { is the start of what to run on that match
    • stats=$1; assigns the first field (the entire stat line) to named variable stats, the semicolon ends the statement
    • printf "https://github.com/timabell/gitopolis/commit/%s\t%s\t%-55s\t\"%s\"\n", commit, timestamp, stats, summary outputs a formatted line
      • %s inserts a string from the list of variables
      • %-55s pads the string to make everything line up nicely
      • \t inserts a tab character (nicer alignment than space, also more easily parsed by further tooling)
      • \" escapes a quote so we can quote the summary
      • \n newline to start the next record in the output
  • } is the end of what to run on that match

Example

output of shortstat

commit  58fb0e1d111f8131c58eb244bc6c3ae6cff42886        2023-05-05T16:54:58+01:00       Add product hunt badged

 1 file changed, 11 insertions(+), 1 deletion(-)
commit  66c0d44676d0f7c3f88444ac1877c1bd00a75a31        2023-04-26T10:43:14+01:00       cargo upgraded

 1 file changed, 5 insertions(+), 5 deletions(-)
commit  ea90a1a7efc448834cb15cf12c0d9a7f31e95350        2023-04-26T10:32:43+01:00       cargo updated

 1 file changed, 150 insertions(+), 95 deletions(-)

output of awk

https://github.com/timabell/gitopolis/commit/58fb0e1d111f8131c58eb244bc6c3ae6cff42886   2023-05-05T16:54:58+01:00        1 file changed, 11 insertions(+), 1 deletion(-)        "Add product hunt badged"
https://github.com/timabell/gitopolis/commit/66c0d44676d0f7c3f88444ac1877c1bd00a75a31   2023-04-26T10:43:14+01:00        1 file changed, 5 insertions(+), 5 deletions(-)        "cargo upgraded"
https://github.com/timabell/gitopolis/commit/ea90a1a7efc448834cb15cf12c0d9a7f31e95350   2023-04-26T10:32:43+01:00        1 file changed, 150 insertions(+), 95 deletions(-)     "cargo updated"
Inference answered 26/5, 2023 at 9:58 Comment(0)
K
0

This Python script could help you out, assuming you have already logged the output to a file in this format.

Git Log Command

git log --pretty=format:"| %h | %ad | %s | " --date=short --all --decorate --shortstat --author="Your-github-username" > commit_log.txt

Expected format

| d0c23db2 | 2023-11-03 | feat: added tests for formatos xml |
 2 files changed, 58 insertions(+), 3 deletions(-)

Now the python code

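def read_file():
    # Read the log produced by the git command above.
    with open("commit_log.txt", "r") as f:
        data = f.read()

    items = clean_empty(data.split("\n"))

    # Pair each commit line with the shortstat line that follows it.
    # This assumes every commit has a stat line; a commit without one
    # (such as a merge) would shift the pairing.
    new = []
    for i in range(0, len(items) - 1, 2):
        new.append("| ".join([items[i], items[i + 1] + " |"]))
        new.append("\n")

    with open("commits.txt", "w") as f:
        f.writelines(new)


def clean_empty(items: list) -> list:
    # Build a new list: removing entries while iterating would skip elements.
    return [i for i in items if i != ""]


read_file()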

Result

| d0c23db2 | 2023-11-03 | feat: added tests for formatos xml |feat: added tests for formatos xml |  2 files changed, 58 insertions(+), 3 deletions(-) |

| d1c3aaa0 | 2023-11-03 | feat: added tests for formatos txt and formatos json |feat: added tests for formatos txt and formatos json |  4 files changed, 130 insertions(+), 7 deletions(-) |

Hope it helps

Kellykellyann answered 3/11, 2023 at 15:40 Comment(0)
