Can you get the number of lines of code from a GitHub repository?
H

23

874

In a GitHub repository you can see “language statistics”, which displays the percentage of the project that’s written in a language. It doesn’t, however, display how many lines of code the project consists of. Often, I want to quickly get an impression of the scale and complexity of a project, and the count of lines of code can give a good first impression. 500 lines of code implies a relatively simple project, 100,000 lines of code implies a very large/complicated project.

So, is it possible to get the lines of code written in the various languages from a GitHub repository, preferably without cloning it?


The question “Count number of lines in a git repository” asks how to count the lines of code in a local Git repository, but:

  1. You have to clone the project, which could be massive. Cloning a project like Wine, for example, takes ages.
  2. You would count lines in files that wouldn’t necessarily be code, like i18n files.
  3. If you count just (for example) Ruby files, you’d potentially miss massive amounts of code in other languages, like JavaScript. You’d have to know beforehand which languages the project uses. You’d also have to repeat the count for every language the project uses.

All in all, this is potentially far too time-intensive for “quickly checking the scale of a project”.

Honeywell answered 12/11, 2014 at 7:26 Comment(11)
Do you want the lines in all revisions or just the latest revision?Jaquelin
@Schwern: Didn't really think about that. The latest commit of the master branch, I suppose.Honeywell
@Abizern: Is that a valid reason for closing a question? I'm trying to find that in the guidelines. My plan was to ask on SO first. If that proved futile, I'd ask Github customer support and post their information as an answer here.Honeywell
@Abizern: See on-topic. It says you can ask questions about "software tools commonly used by programmers".Honeywell
@Honeywell 1 I've solved with git clone --depth 1. As for 2 and 3, I suspect there is software out there which can do the analysis for you, and you can do a lot of guessing based on file extensions, but I'm having a hell of a time coming up with a good search term to find said software. Maybe you need to ask another question.Jaquelin
@Honeywell Ah ha! github.com/github/linguistJaquelin
My apologies however there are far more answers than the accepted one on that page :)Wellspoken
There's an online tool at codetabs.com/count-loc/count-loc-online.html, haven't tried if it's any good.Barmen
count-loc is good, but limited to repos of <500mb. It errors out on my own company's open source database repo. :/Leastways
The fact is, GitHub doesn't provide any APIs through which you can get the LOC of your GitHub files. What you can do instead, if you have a Sonar server integrated with your GitHub, is use Sonar's APIs to get the LOC of your files.Aiaia
In Windows this command works as well: (Get-ChildItem -Recurse -File | Get-Content | Measure-Object -Line).LinesConvolvulus
H
48

Not currently possible on GitHub.com or through their APIs

I have talked to customer support and confirmed that this can not be done on github.com. They have passed the suggestion along to the Github team though, so hopefully it will be possible in the future. If so, I'll be sure to edit this answer.

Meanwhile, Rory O'Kane's answer is a brilliant alternative based on cloc and a shallow repo clone.

Honeywell answered 14/11, 2014 at 11:34 Comment(1)
Not directly, but their Statistics API has all the data you need to calculate it yourself. See my answer below for a quick script that does this.Town
B
716

You can run something like

git ls-files | xargs wc -l

Which will give you the total count!

You can also add filters, for example to count only the JavaScript files. Anchor the pattern with $ so that files such as .json are not matched as well.

git ls-files | grep '\.js$' | xargs wc -l
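
As several comments on this answer point out, xargs may split a long file list across several wc runs (so the final "total" only covers the last batch), and file names containing spaces break the pipeline. Here is a minimal sketch that sidesteps both, assuming a POSIX shell and an xargs that supports -0. The function names are made up for illustration, and the comment filter is a crude one that only suits languages with # comments:

```shell
# Read NUL-delimited file names on stdin and print one grand total of lines.
# Concatenating with cat first means wc reports a single sum, even if xargs
# has to split the list across several cat invocations.
count_lines_nul() {
  xargs -0 cat | wc -l
}

# Same input, but skip blank lines and lines that start with '#'
# (a crude filter that only suits languages with '#' comments).
count_code_lines_nul() {
  xargs -0 cat | grep -c -v -E '^[[:space:]]*(#|$)'
}
```

From inside a clone, this would be used as git ls-files -z | count_lines_nul, or as git ls-files -z -- '*.js' | count_lines_nul to restrict the count to JavaScript files. Binary files that are tracked by Git are still counted, so filtering by extension remains a good idea.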
Bordeaux answered 14/1, 2018 at 21:15 Comment(24)
The short answer to the question (finding this number using github) is No. Your approach is the second best alternative, specially since we can filter out whatever files we need to count out.Lifelike
If you want to filter, e.g., Python code: git ls-files | grep '\.py' | xargs wc -l.Dictation
I was doing xargs to wc -l all files manually then use awk to sum the column, OMG this is so much easier.Misdoubt
This simple approach includes comments in files. Comments and blank lines are not always considered "lines of code".Durtschi
Well, documentation is a huge part of code. Where would you draw the line really if you kick out comments. What about comments that contain code info like params, what about comments that disable ESLint for the next line — what about lines which are 80% comments after some code. See where I am going with this.Bordeaux
Excellent answer. I'm not a bash expert, but I've had luck using sed with this to ignore blank lines and comments (in this example, ruby comments, which are prepended by #). git ls-files | xargs sed "/^\s*#/d;/^\s*$/d" | wc -lBaucis
line-count.herokuapp.com seems to not recognize a large number of languages, including Haskell.Scrappy
This method includes assets like pictures and documents. How do we specify multiple file extensions?Ulster
@AhmadAwais An irrelevant question : How did you create the terminal interface graphics ? green and purple background for commands is very beatifulBenumb
This method does not work. xargs wc -l does not pass the entire file list to a single wc invocation - for a large repository, it will split the list of files into smaller lists (to avoid exceeding maximum command length restrictions), and the last "total" will only be the total of the last wc. If you scroll up you'll see other "total" lines. From man xargs: "-n number Set the maximum number of arguments taken from standard input for each invocation of utility...The current default value for number is 5000." So if you have more than 5000 files, the result will be incorrect.Sarcocarp
if you want the total you have to grep total and sum them because xargs can issue the command more than onceSpermatophyte
git ls-files --exclude-standard -- ':!:**/*.[pjs][npv]g' ':!:**/*.ai' ':!:.idea' ':!:**/*.eslintrc' ':!:package-lock.json' | xargs wc -l to exclude common file types that should ~probably~ be ignored from Neonexus here -> gist.github.com/mandiwise/dc53cb9da00856d7cdbbDegeneration
I do agree with @LoganPickup, but still 1 vote for this approach, rather than having to install a new package just to count the lines. To avoid the last line "total", we could use "| tail -n+1", to get rid of > 5000 files I think we still could solve it by command lineBoggs
Definitely the short answer is the best.Brewery
@sam no worries, it's my theme called Shades of Purple you can install any version of it from ShadesOfPurple.pro/more (what you see above is iTerm2 and Zsh themes).Bordeaux
What terminal is that and may I know the colour theme and zsh theme?Stenotype
Note: link doesn't work as of recentKwa
Only js files inside of src folder git ls-files | grep 'src' | grep '\.js' | xargs wc -lRecrystallize
Searching for multiple extensions: git ls-files | egrep '\.go|\.ts' | xargs wc -l.Hallucinogen
I ended up using git ls-files | xargs -I {} find "{}" -type f -exec wc -l {} \; because I have a lot of file names with spaces in them and the original command git ls-files | xargs wc -l chokes on spaces in file names.Meilen
Search only interested file types: git ls-files | egrep -i '\.(vue|ts|js)$' | xargs wc -lDunseath
powershell: $totalLines = 0 $lineCounts = git ls-files | Where-Object { $_ -match '\.java$' } | ForEach-Object { $filePath = $_ if (Test-Path $filePath) { $lineCount = (Get-Content $filePath).Count $totalLines += $lineCount [PSCustomObject]@{Lines=$lineCount; File=$filePath} } else { Write-Warning "Could not find file at path $filePath" } } $lineCounts | Format-Table -AutoSize Write-Host "Total Lines: $totalLines"Arriviste
bc <<< $(git ls-files | grep '\.go$' | xargs wc -l | grep total | sed 's/total//g' | tr '\n' '+' | sed 's/\+$//g')Strafe
For window users the command "git ls-files | xargs wc -l" won't work so try this one: This will give you the numbers of lines of each file: git ls-files | foreach { Get-Content $_ | Measure-Object -Line | Select Lines } This will give you the sum of it: git ls-files | foreach { Get-Content $_ | Measure-Object -Line | Select -ExpandProperty Lines } | Measure-Object -Sum | Select -ExpandProperty SumCreosote
T
458

A shell script, cloc-git

You can use this shell script to count the number of lines in a remote Git repository with one command:

#!/usr/bin/env bash
git clone --depth 1 "$1" temp-linecount-repo &&
  printf "('temp-linecount-repo' will be deleted automatically)\n\n\n" &&
  cloc temp-linecount-repo &&
  rm -rf temp-linecount-repo

Installation

This script requires CLOC (“Count Lines of Code”) to be installed. cloc can probably be installed with your package manager – for example, brew install cloc with Homebrew. There is also a docker image published under mribeiro/cloc.

You can install the script by saving its code to a file cloc-git, running chmod +x cloc-git, and then moving the file to a folder in your $PATH such as /usr/local/bin.

Usage

The script takes one argument, which is any URL that git clone will accept. Examples are https://github.com/evalEmpire/perl5i.git (HTTPS) or git@github.com:evalEmpire/perl5i.git (SSH). You can get this URL from any GitHub project page by clicking “Clone or download”.

Example output:

$ cloc-git https://github.com/evalEmpire/perl5i.git
Cloning into 'temp-linecount-repo'...
remote: Counting objects: 200, done.
remote: Compressing objects: 100% (182/182), done.
remote: Total 200 (delta 13), reused 158 (delta 9), pack-reused 0
Receiving objects: 100% (200/200), 296.52 KiB | 110.00 KiB/s, done.
Resolving deltas: 100% (13/13), done.
Checking connectivity... done.
('temp-linecount-repo' will be deleted automatically)


     171 text files.
     166 unique files.                                          
      17 files ignored.

http://cloc.sourceforge.net v 1.62  T=1.13 s (134.1 files/s, 9764.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Perl                           149           2795           1425           6382
JSON                             1              0              0            270
YAML                             2              0              0            198
-------------------------------------------------------------------------------
SUM:                           152           2795           1425           6850
-------------------------------------------------------------------------------

Alternatives

Run the commands manually

If you don’t want to bother saving and installing the shell script, you can run the commands manually. An example:

$ git clone --depth 1 https://github.com/evalEmpire/perl5i.git
$ cloc perl5i
$ rm -rf perl5i

Linguist

If you want the results to match GitHub’s language percentages exactly, you can try installing Linguist instead of CLOC. According to its README, you need to gem install linguist and then run linguist. I couldn’t get it to work (issue #2223).

Theron answered 12/3, 2015 at 14:43 Comment(10)
The original question specified without cloning the repo.Piccoloist
@Piccoloist My script doesn’t clone the whole repo; it passes --depth 1 to only download the most recent commit. For most repos, this avoids the original question’s concern about cloning taking too long.Mier
@RoryO'Kane Can we use cloc to get the lines of code in a GitHub repository without cloning the repo to our machine (that is, online)? The cloc-git script given above also clones the project before it starts counting the number of lines.Johen
@KasunSiyambalapitiya Sorry, I don’t know of any online website that runs cloc for you. In order for cloc to count lines in code, your computer has to download that code, though only temporarily. Note that even web browsers are technically downloading web pages when you visit them; they just save them to memory instead of to disk.Mier
@RoryO'Kane Yeah, that's fine, but there will be a problem when the repo is too big. Anyway, is there a way to get the output of cloc into an array or some variable in Bash?Johen
@KasunSiyambalapitiya Since the answer to that question could be complicated, you should ask that in a new question. Standalone questions can have comments of their own and multi-line answers.Mier
How to apply this answer if I want # lines from particular time in history and not from the latest commit?Inconformity
@Inconformity Follow the “Run the commands manually” section, but increase the --depth limit or remove it altogether so that the commit from history will be downloaded, not just the latest commit. Then before running cloc, run git checkout b25fb1, where b25fb1 is the SHA, branch name, or tag name of the commit in history you want to count the lines of.Mier
This seems to be the most useful answer. However, I think that the alternative solution with manually running the commands should be at the top as it is the most straight-forward one.Pottage
Might seem obvious, but if you have the code on your local computer already, there is no need to clone again and you can just run cloc on the repo.Duration
C
222

I created GLOC, an extension for the Google Chrome browser, which works for public and private repos.

Counts the number of lines of code of a project from:

  • project detail page
  • user's repositories
  • organization page
  • search results page
  • trending page
  • explore page

Cola answered 23/2, 2017 at 19:30 Comment(25)
upvoted although it doesn't seem to work for private repositoriesDefamatory
@MichailMichailidis Thank you for your suggestion. I'll fix it.Cola
Thanks! also for github.com/AngularClass/angular2-webpack-starter I was getting negative number of lines yesterdayDefamatory
It's a horrible case. I'll check it soon. Thank you for help @MichailMichailidisCola
@ArtemSolovev just tried your plugin - it's pretty good :) Not a huge fan of the gradient but the functionality is there.. so thanks!Unidirectional
@Unidirectional actually i love the gradient since it makes it easier to understand that the page is being modified by an extension.Benzoate
@Taurus my comment was not meant as a CR - from a usability standpoint the gradient does the job (for the reason you mentioned) I meant that I am not a fan of the chosen colors but that's just my (subjective) opinion. Cheers :)Unidirectional
@Unidirectional One question, What does "CR" mean ? I searched it up but "Carriage Return" and "Copyright" did not make sense, i am digesting it as "Criticism". My humble opinion, but again, i believe that the outlandish colors do a great job of expressing what i mentioned, however, i do agree that another choice of color would look more appealing and probably express the point just as well.Benzoate
doesn't work for repos in private organizations that I am a member of, any chance to fix that?Bussard
@hellyale sure. in a few weeksCola
@hellyale It works for private repos too. Update extension. There is more new features to useCola
@Taurus design was changed todayCola
@Artem_Solovev Love it!Benzoate
I guess this just count lines, not lines of code. Compared to SonarQubes counting of loc, this is factor 2-3 bigger...Allaround
It would be nice to show number of lines in each repo in the repo listRooke
Look at the second screen @IgorCovaCola
@ArtemSolovev this work only after refresh (F5), if you just move from Overview to Repositories - then the number of rows for each repo is not shownRooke
@IgorCova I know it. To add this functionality I need to sacrifice 10KB for one lineCola
@IgorCova If you know more optimal method welcome to PRCola
@IgorCova Now it works how you proposed) 13KB not so big deal so i moved back jQueryCola
@ArtemSolovev, do you have any plan on adding support for counting SLOC instead of LOC? thanksIdolla
@ShihabShahriarKhan Hi man. By the end of the Sept. it will be released. You can subscrube to this issue github.com/artem-solovev/gloc/issues/104Cola
wow, feels great to see such a prompt reply from the creator, thanks againIdolla
might want to add a disclaimer that you're the author (pretty sure it's required by SO)Bordiuk
@Bordiuk thank you, just mentioned that. Have a good day )Cola
T
100

If you go to the graphs/contributors page, you can see a list of all the contributors to the repo and how many lines they've added and removed.

Unless I'm missing something, subtracting the aggregate number of lines deleted from the aggregate number of lines added among all contributors should yield the total number of lines of code in the repo. (EDIT: it turns out I was missing something after all. Take a look at orbitbot's comment for details.)

UPDATE:

This data is also available in GitHub's API. So I wrote a quick script to fetch the data and do the calculation:

'use strict';

async function countGithub(repo) {
    const response = await fetch(`https://api.github.com/repos/${repo}/stats/contributors`)
    const contributors = await response.json();
    const lineCounts = contributors.map(contributor => (
        contributor.weeks.reduce((lineCount, week) => lineCount + week.a - week.d, 0)
    ));
    const lines = lineCounts.reduce((lineTotal, lineCount) => lineTotal + lineCount);
    window.alert(lines);
}

countGithub('jquery/jquery'); // or count anything you like

Just paste it in a Chrome DevTools snippet, change the repo and click run.

Disclaimer (thanks to lovasoa):

Take the results of this method with a grain of salt, because for some repos (sorich87/bootstrap-tour) it results in negative values, which might indicate there's something wrong with the data returned from GitHub's API.

UPDATE:

Looks like this method to calculate total line numbers isn't entirely reliable. Take a look at orbitbot's comment for details.
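
To make the arithmetic behind this method concrete, here is a small sketch that sums additions minus deletions per contributor. It assumes a POSIX shell with awk, and net_lines is a name made up for illustration. The input pairs are the per-contributor totals quoted in the comments for sorich87/bootstrap-tour, which is the repo where the result comes out negative:

```shell
# Net line count from "added deleted" pairs, one contributor per line on stdin.
net_lines() {
  awk '{ net += $1 - $2 } END { print net + 0 }'
}

# Per-contributor totals quoted in the comments for sorich87/bootstrap-tour.
printf '%s\n' '31349 18169' '72669 87594' '26774 27695' '1211 428' | net_lines
```

The negative total is not an arithmetic slip: as the comments explain, the additions and deletions are aggregated over the whole commit history, so the difference need not equal the number of lines currently in the repo.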

Town answered 22/8, 2015 at 9:3 Comment(9)
Right. But in some cases where the project is a large open-source community project, this sort of count isn't feasible.Sarmentose
@Sarmentose Definitely. However, this data is also available in GitHub's API, so you can write a script to calculate the total number of lines pretty easily. I updated my answer with a quick script that I just wrote up.Town
It would be simpler to use the code_frequency API. Giving: fetch("https://api.github.com/repos/jquery/jquery/stats/code_frequency").then(x=>x.json()).then(x=>alert(x.reduce((total,changes)=>total+changes[1]+changes[2],0)))Furtive
Hmmm... Interesting: test your code on sorich87/bootstrap-tour . The result is negative.Furtive
@Furtive You're right, it looks like there's something wrong with the data the API is returning for that repo. I did a quick approximation manually using the data on the contributors page and the result does look like it'll end up being negative (31349 - 18169 + 72669 - 87594 + 26774 - 27695 + 1211 - 428 = -1883). I tried your version with code_frequency and it also seems to return a negative value, but a different one. =/Town
The data in the add/remove section does as noted not add up to the total lines of code in the repository at a specific time, it is an aggregate of lines added or removed per user over the whole commit history. The data content is not faulty, it's just different to what you assumed it would be.Fiacre
@Fiacre I might be missing something, but shouldn't the aggregate of all lines added by each user subtracted by the aggregate of all lines deleted by each user over the entire commit history be equal to the current number of lines in the repo? I also really can't think of a scenario where that number can be negative (a repo with more lines deleted than added over its lifetime??).Town
@Town I think you're disregarding that lines added/removed in one commit can be the same as other commits, f.e. when merging branches etc. which still count towards the same total. Additionally, f.e. the Github contributions stats for user profiles are only counted from the default branch or gh-pages, so there might be something similar going on for the commit/line stats: help.github.com/articles/… . Also note that the user profile stats only count the previous year, but I think that the commit stats on the graph page are permanent.Fiacre
@Fiacre Thanks, the results make a lot more sense now. I've updated the post.Town
J
50

You can clone just the latest commit using git clone --depth 1 <url> and then perform your own analysis using Linguist, the same software Github uses. That's the only way I know you're going to get lines of code.

Another option is to use the API to list the languages the project uses. It doesn't give them in lines but in bytes. For example...

$ curl https://api.github.com/repos/evalEmpire/perl5i/languages
{
  "Perl": 274835
}

Though take that with a grain of salt, that project includes YAML and JSON which the web site acknowledges but the API does not.
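
If all you need is an order of magnitude, the byte count can be turned into a very rough line estimate. The sketch below assumes an average of about 40 bytes per line, which is only the rule of thumb suggested in the comments on this answer and will vary with coding style. estimate_lines is a hypothetical helper name:

```shell
# Very rough lines-of-code estimate from a byte count.
# $1 = bytes reported by the languages API
# $2 = assumed average bytes per line (defaults to 40, a rule of thumb only)
estimate_lines() {
  echo $(( $1 / ${2:-40} ))
}

estimate_lines 274835   # byte count from the response above, default divisor
```

For perl5i this gives a figure in the high six thousands, in the same ballpark as the 6382 lines of Perl that cloc reports for the same repository elsewhere on this page.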

Finally, you can use code search to ask which files match a given language. This example asks which files in perl5i are Perl. https://api.github.com/search/code?q=language:perl+repo:evalEmpire/perl5i. It will not give you lines, and you have to ask for the file size separately using the returned url for each file.

Jaquelin answered 12/11, 2014 at 7:32 Comment(8)
Cool, didn't know about that. Can you confirm that it can't be done on the Github website, though?Honeywell
I can't confirm it, but I don't see anything in the API or on the Github web site that will give you lines. It's all bytes or percentages. What's your rationale for doing it through the API instead of cloning?Jaquelin
Ok, thanks for the info though. I'll ask Github support.Honeywell
Linguist looks cool, but how do you get it to show you lines of code though? It looks like it shows bytes by default, just like the API.Honeywell
@Honeywell Dunno, you might have to patch it.Jaquelin
According to the CLOC tool, the evalEmpire/perl5i repository currently has 6382 lines of Perl, compared to 272319 bytes reported by the GitHub API call you give. That’s 43 bytes per line. So perhaps a good rule of thumb is to divide the API’s returned numbers by 40 to get lines of code.Mier
@RoryO'Kane Since everyone's coding style is different, some's lines are longer, while others' are shorter. It's not very accurate to do that.Aplasia
This method is wrong now. I tested it and found that it gives a much larger value.Caughey
D
40

From @Tgr's comment, there is an online tool: https://codetabs.com/count-loc/count-loc-online.html

LOC counting example for strimzi/strimzi-kafka-operator repository

Derain answered 20/5, 2020 at 11:47 Comment(1)
Too bad it doesn't work for big repositories like Git.Folium
A
25

You can use tokei:

cargo install tokei
git clone --depth 1 https://github.com/XAMPPRocky/tokei
tokei tokei/

Output:

===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 BASH                    4           48           30           10            8
 JSON                    1         1430         1430            0            0
 Shell                   1           49           38            1           10
 TOML                    2           78           65            4            9
-------------------------------------------------------------------------------
 Markdown                4         1410            0         1121          289
 |- JSON                 1           41           41            0            0
 |- Rust                 1           47           38            5            4
 |- Shell                1           19           16            0            3
 (Total)                           1517           95         1126          296
-------------------------------------------------------------------------------
 Rust                   19         3750         3123          119          508
 |- Markdown            12          358            5          302           51
 (Total)                           4108         3128          421          559
===============================================================================
 Total                  31         6765         4686         1255          824
===============================================================================

Tokei has support for badges:

Count Lines

[![](https://tokei.rs/b1/github/XAMPPRocky/tokei)](https://github.com/XAMPPRocky/tokei)

By default the badge will show the repo's LoC(Lines of Code), you can also specify for it to show a different category, by using the ?category= query string. It can be either code, blanks, files, lines, comments.

Count Files

[![](https://tokei.rs/b1/github/XAMPPRocky/tokei?category=files)](https://github.com/XAMPPRocky/tokei)

Anaclinal answered 8/4, 2021 at 15:24 Comment(1)
Or, once tokei is installed, you can simply navigate to the project directory and run it there. Example: $ tokei ./src /*this would give you the number of lines in the src folder*/Bulimia
C
17

Hey all this is ridiculously easy...

  1. Create a new branch from your first commit
  2. When you want to find out your stats, create a new PR from main
  3. The PR will show you the number of changed lines - as you're doing a PR from the first commit all your code will be counted as new lines

And the added benefit is that if you don't approve the PR and just leave it in place, the stats (No of commits, files changed and total lines of code) will simply keep up-to-date as you merge changes into main. :) Enjoy.

Crackbrain answered 2/11, 2020 at 22:37 Comment(3)
but what if the first commit contains 10000lines, then this number doesn't show that 10000lines right?Robb
If you can afford to ignore first commit then this is a great quick way to check. +1Gaye
If you do this backwards and open a PR deleting all your code, then the number of lines deleted will be the total lines in the project (minus ignored files). Just do yourself a favor and don't merge it.Coeternal
B
14

You can use the GitHub API to get the SLOC, as in the following function:

function getSloc(repo, tries = 3) {

    // repo is the repo's path with a leading slash, e.g. "/jquery/jquery"
    if (!repo) {
        return Promise.reject(new Error("No repo provided"));
    }

    // GitHub's API may return an empty object the first time it is accessed,
    // so retry a limited number of times (the default of 3 is arbitrary), then stop
    if (tries <= 0) {
        return Promise.reject(new Error("Too many tries"));
    }

    const url = "https://api.github.com/repos" + repo + "/stats/code_frequency";

    return fetch(url)
        .then(x => x.json())
        // each entry is [week, additions, deletions]; deletions are negative numbers
        .then(x => x.reduce((total, changes) => total + changes[1] + changes[2], 0))
        .catch(err => getSloc(repo, tries - 1));
}

Personally, I made a Chrome extension which shows the number of SLOC on both the GitHub project list and the project detail page. You can also set your personal access token to access private repositories and bypass the API rate limit.

You can download from here https://chrome.google.com/webstore/detail/github-sloc/fkjjjamhihnjmihibcmdnianbcbccpnn

Source code is available here https://github.com/martianyi/github-sloc

Bruiser answered 29/3, 2017 at 10:15 Comment(4)
For the chrome extension how is SLOC determined? All file types? Exclude specific directories?Rocket
@BrettReinhard It's based on the number of additions and deletions per week, I think it includes all files.Bruiser
Doesn't that just return the number of changes in the last week?Wheedle
@Johannes'fish'Ziemke No, it returns every weekBruiser
L
13

Open terminal and run the following:

curl -L "https://api.codetabs.com/v1/loc?github=username/reponame"
Lysin answered 13/6, 2020 at 1:57 Comment(3)
Unfortunately, this does not work for private repos.Steepen
Does not work any longer. API responds with "Moved Permanently" for any repo.Hennessey
@Hennessey it's still working for me. Mind you repo has to be public. You can try their UI(codetabs.com/count-loc/count-loc-online.html) to be sureLysin
F
12

Firefox add-on Github SLOC

I wrote a small Firefox add-on that prints the number of lines of code on GitHub project pages: Github SLOC

Furtive answered 14/1, 2016 at 14:46 Comment(4)
Great plugin, very helpful! Do you know if it's possible to make it work with private repos? It seems to be only showing LOC on public repos.Pasadena
The link is dead and after searching manually, it seems, that sadly this plugin doesn't exist anymore.Unfavorable
There's a request up for making GLOC available for Firefox too, and the developer seems open to the idea: github.com/artem-solovev/gloc/issues/23Intramundane
@Intramundane It's done now: addons.mozilla.org/en-US/firefox/addon/glocOsber
C
11
npm install sloc -g
git clone --depth 1 https://github.com/vuejs/vue/
sloc ".\vue\src" --format cli-table
rm -rf ".\vue\"

Instructions and Explanation

  1. Install sloc, a command-line tool, from npm (Node.js needs to be installed).
npm install sloc -g
  2. Clone a shallow copy of the repository (a faster download than a full clone).
git clone --depth 1 https://github.com/facebook/react/
  3. Run sloc and specify the path that should be analyzed.
sloc ".\react\src" --format cli-table

sloc supports formatting the output as a cli-table, as JSON, or as CSV. Regular expressions can be used to exclude files and folders (further information on npm).

  4. Delete the repository folder (optional).

PowerShell: rm -r -force ".\react\" or on Mac/Unix: rm -rf ./react/

Screenshots of the executed steps:

[Screenshot: sloc output as a cli-table]

[Screenshot: sloc output without arguments]

It is also possible to get details for every file with the --details option:

sloc ".\react\src" --format cli-table --details     
Corinacorine answered 2/10, 2019 at 19:26 Comment(3)
This doesn't appear to work for R files like .R or .RmdAbshire
@Abshire It should work. R is documented as a supported language npmjs.com/package/sloc#supported-languages Otherwise create an issue on github github.com/flosse/sloc/issuesCorinacorine
You might also want to to try out SCC: github.com/boyter/sccCorinacorine
D
10

You could use ghloc.vercel.app, which lets you count lines in any public GitHub repository.

Dede answered 3/7, 2023 at 17:48 Comment(1)
Just what I was looking for. Easy to use with quick LOC count. Bonus that it also provides info on existence of common repo items (aka "health")Sparkie
P
8

If the question is "can you quickly get NUMBER OF LINES of a github repo", the answer is no as stated by the other answers.

However, if the question is "can you quickly check the SCALE of a project", I usually gauge a project by looking at its size. Of course the size will include deltas from all active commits, but it is a good metric as the order of magnitude is quite close.

E.g.

How big is the "docker" project?

In your browser, enter api.github.com/repos/ORG_NAME/PROJECT_NAME i.e. api.github.com/repos/docker/docker

In the response hash, you can find the size attribute:

{
    ...
    "size": 161432,
    ...
}

This should give you an idea of the relative scale of the project. The number seems to be in KB, but the checkout on my computer is actually smaller, although the order of magnitude is consistent. (161432 KB ≈ 161 MB, while du -s -h docker reports 65 MB.)
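
If you would rather do this from a terminal than a browser, something like the following should work. It is a rough sketch: repo_size_kb is a made-up helper name, and the sed pattern assumes the API returns pretty-printed JSON with one key per line (if you have jq installed, jq .size is more robust).

```shell
# Read a GitHub "repos" API response on stdin and print the value of its
# "size" field (reported by the API in KB, as far as I can tell).
# Assumption: the JSON is pretty-printed, one key per line.
# (repo_size_kb is a hypothetical helper name.)
repo_size_kb() {
  sed -n 's/^[[:space:]]*"size"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' |
    head -n 1
}

# Example (needs network access; unauthenticated requests are rate limited):
#   curl -s https://api.github.com/repos/docker/docker | repo_size_kb
```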

Parahydrogen answered 31/3, 2015 at 0:22 Comment(0)
N
7

Pipe the per-file line counts to sort to order the files by line count: git ls-files | xargs wc -l | sort -n
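
One caveat with the one-liner: file names containing spaces break it, and in a very large repo xargs may split the file list into several wc runs, each with its own "total" line. If you only need the grand total, a variant like this avoids both problems. It is a sketch, and count_tracked_lines is just an illustrative name.

```shell
# Print the total number of lines across all files tracked by Git in the
# current repository. NUL-delimited names (-z / -0) keep file names with
# spaces intact, and piping everything through one wc gives a single total.
# Note: this counts every tracked file, including non-code and binary files.
# (count_tracked_lines is a hypothetical helper name.)
count_tracked_lines() {
  git ls-files -z | xargs -0 cat | wc -l | tr -d ' '
}

# Example, run from inside a cloned repository:
#   count_tracked_lines
```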

Nonmetal answered 5/2, 2020 at 4:7 Comment(1)
Even gives me a total at the bottom, this is by far the easiest and fastest way.Drawstring
O
7

A lot of answers here, some overly complicated. Here is a simple approach for 2023:

git ls-files > list.txt && cloc --list-file=list.txt

This writes the names of the files tracked in your Git repo to list.txt, then runs cloc on that list, so files excluded by .gitignore are not counted.

You will need to install cloc.

Note that this method requires the cloned repo on your system - not exactly what the original poster was asking for.

Ontology answered 17/5, 2023 at 14:35 Comment(2)
The premise of the original question was without cloning the repository, i.e. without having it available locally. You've entirely skipped that part in your answer, assuming that the repository is already available locally.Honeywell
Yup, I made a note about this.Ontology
B
6

This is easy if you are using VS Code and you clone the project first. Just install the Lines of Code (LOC) VS Code extension and then run LineCount: Count Workspace Files from the Command Palette.

The extension shows summary statistics by file type and it also outputs result files with detailed information by each folder.

Bicarbonate answered 21/9, 2021 at 17:35 Comment(0)
L
3

There is another online tool that counts lines of code for public and private repos without having to clone/download them - https://klock.herokuapp.com/

Lh answered 23/7, 2020 at 16:32 Comment(2)
Looked promising but very strange that you have to sign up for it.Plead
I think it is because it doesn't want to exceed the API request limit from one account, so it asks everyone to login so it's counted towards their own account. But "This application will be able to read and write all public and private repository data." is not a proportional risk to ask people to take.Hennessey
B
3

None of the answers here satisfied my requirements. I only wanted to use existing utilities. The following script will use basic utilities:

  • Git
  • GNU or BSD awk
  • GNU or BSD sed
  • Bash

Get total lines added to a repository (subtracts lines deleted from lines added).

#!/bin/bash
git diff --shortstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD | \
sed 's/[^0-9,]*//g' | \
awk -F, '!($2 > 0) {$2="0"};!($3 > 0) {$3="0"}; {print $2-$3}'

Get lines of code filtered by file type, for known source-code extensions (e.g. *.py; add more extensions as needed). The patterns are quoted so that Git, not the shell, expands them across the whole repository.

#!/bin/bash
git diff --shortstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD -- '*.py' '*.java' '*.js' | \
sed 's/[^0-9,]*//g' | \
awk -F, '!($2 > 0) {$2="0"};!($3 > 0) {$3="0"}; {print $2-$3}'

4b825dc642cb6eb9a060e54bf8d69288fbee4904 is the id of the "empty tree" in Git and it's always available in every repository.
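
If you want a breakdown per file type rather than one total, the same empty-tree diff can be run with --numstat and grouped by file extension. This is a rough sketch: loc_by_extension is a hypothetical helper name, and grouping by extension is only an approximation of "language".

```shell
# Read `git diff --numstat` output on stdin (added<TAB>deleted<TAB>path) and
# print the added-line total per file extension, largest first.
# (loc_by_extension is a hypothetical helper name.)
loc_by_extension() {
  awk -F'\t' '
    $1 != "-" {                       # numstat reports binary files as "-"
      name = $3
      sub(/.*\//, "", name)           # drop the directory part of the path
      n = split(name, parts, ".")
      ext = (n > 1) ? parts[n] : "(none)"
      total[ext] += $1
    }
    END { for (ext in total) printf "%s\t%d\n", ext, total[ext] }
  ' | sort -k2,2nr
}

# Example, run from inside a cloned repository:
#   git diff --numstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD | loc_by_extension
```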

Bridlewise answered 26/10, 2021 at 23:5 Comment(0)
C
2

shields.io has a badge that can count up all the lines for you here. Here is an example of what it looks like counting the Raycast extensions repo:

https://img.shields.io/tokei/lines/github/raycast/extensions

Christychristye answered 19/3, 2022 at 4:44 Comment(0)
D
1

You can use Sourcegraph, an open source search engine for code. It can connect to your GitHub account and index the content, and the admin section then shows the number of lines of code indexed.

Dasyure answered 23/10, 2022 at 7:56 Comment(0)
Q
0

I made an npm package specifically for this purpose. It provides a CLI tool that takes the directory path and the folders/files to ignore.

It goes like this:

npm i -g @quasimodo147/countlines

This gives you the countlines command in your terminal.

Then you can run countlines . node_modules build dist

Quinacrine answered 9/7, 2022 at 19:29 Comment(0)