List files not tracked by Git LFS

I'm initializing a new Git repo with a huge pile of files. The repo is using Git LFS. I want to ensure that I've told LFS to track all files that should be handled, before I make my first commit.

I see that git lfs ls-files will list all the files that ARE tracked by LFS. However, (a) I want the opposite: all files in the repo that aren't tracked by LFS (and aren't excluded by .gitignore), and (b) this command only works after you have committed files.

Does anyone have some git-fu or Ubuntu-fu to list all the files in the repo that aren't ignored and aren't matched by the track patterns in the various .gitattributes files that Git LFS uses?


The closest I've come is this command, which lists files in the repo over 100 kB; I then manually scan the results and hope that I have them all covered by a tracking pattern.

find . -type f -exec du -Sha -t 100000 {} +
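A variant of the same idea that skips anything matched by .gitignore, as a minimal sketch. It assumes a POSIX shell, an xargs that supports -0, and that it is run inside the work tree; big_files is a hypothetical helper name, not a Git command.

```shell
# big_files (hypothetical helper): print "<bytes><TAB><path>" for every
# non-ignored file whose size is at least the given threshold (default 100000).
big_files() {
    min_bytes="${1:-100000}"
    # -c: tracked/staged files, -o: untracked files,
    # --exclude-standard: honor .gitignore and related exclude files,
    # -z: NUL-separated paths so spaces in names are safe
    git ls-files -co --exclude-standard -z |
        xargs -0 sh -c '
            min=$1; shift
            for f in "$@"; do
                [ -f "$f" ] || continue            # skip deleted or non-regular entries
                size=$(( $(wc -c < "$f") ))        # size in bytes
                if [ "$size" -ge "$min" ]; then
                    printf "%s\t%s\n" "$size" "$f"
                fi
            done
        ' sh "$min_bytes"
}
```

Running big_files 100000 | sort -rn puts the largest candidates first. It still leaves the manual comparison against the tracking patterns.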
Trehalose answered 22/3, 2017 at 22:29 Comment(0)

Even though the question concerns files that have not been committed, let me suggest a solution that lists the files tracked by Git but not by Git LFS after a commit. Concatenate the list of files tracked by Git (git ls-files) with the list of files tracked by Git LFS (git lfs ls-files | cut -d' ' -f3-), then keep only the entries that appear exactly once in the combined list:

{ git ls-files && git lfs ls-files | cut -d' ' -f3-; } | sort | uniq -u
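To see why this works, here is a toy sketch of the same pipeline on two hand-written lists. The file names and the helper name only_in_one are made up for illustration, and each list is assumed to contain no duplicates of its own.

```shell
# only_in_one (hypothetical helper): print lines that occur exactly once
# across the two input files.
only_in_one() {
    # Concatenate both lists, sort so identical lines become adjacent,
    # then let uniq -u drop every line that appears more than once.
    cat "$1" "$2" | sort | uniq -u
}

# git.txt plays the role of `git ls-files`,
# lfs.txt the role of the paths reported by `git lfs ls-files`.
demo_dir=$(mktemp -d)
printf '%s\n' README.md data/big.bin src/main.c > "$demo_dir/git.txt"
printf '%s\n' data/big.bin > "$demo_dir/lfs.txt"
only_in_one "$demo_dir/git.txt" "$demo_dir/lfs.txt"
```

data/big.bin appears in both lists, so it is dropped. README.md and src/main.c appear once each and are printed.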

You can then amend the commit (git rm --cached followed by git commit --amend) if you notice a file that has sneaked in.

At the pre-commit stage, proceeding by watching the untracked files list and successively using git lfs track and git add should be quite safe.

Beware that empty files are not considered LFS objects by the specification, so they will not be listed by git lfs ls-files.
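Going by that note, an empty file matched by a tracking pattern would show up in the uniq -u output even though nothing is wrong. A minimal sketch for spotting such files so they can be discounted; empty_tracked_files is a hypothetical helper name, and an xargs that supports -0 is assumed.

```shell
# empty_tracked_files (hypothetical helper): print every file in the index
# that exists in the work tree and has a size of zero bytes.
empty_tracked_files() {
    git ls-files -z |
        xargs -0 sh -c '
            for f in "$@"; do
                # -f: regular file exists; ! -s: its size is zero
                if [ -f "$f" ] && [ ! -s "$f" ]; then
                    printf "%s\n" "$f"
                fi
            done
        ' sh
}
```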

Sad answered 14/9, 2017 at 9:43 Comment(3)
The range in the cut command (-f3) should be -f3- in case there's a space in one of the paths. - Lontson
A git-alias version of this that works regardless of where you are in the directory structure of a git repository: `lfs-untracked = "!_() { ((git ls-files | egrep \"^${GIT_PREFIX}\") && (git lfs ls-files ${GIT_PREFIX:+-I ${GIT_PREFIX}} | cut -d' ' -f3-)) | sort | uniq -u ; }; _"` - Antilogarithm
Sorted by file size: `({ git ls-files && git lfs ls-files | cut -d' ' -f3-; } | sort | uniq -u) | xargs stat -c '%s %n' | numfmt --to=iec | sort -h` - Centurial

Let me give some ideas:

To get a list of all files in your repository:

find . -path ./.git -prune -o -type f -print > all.txt

To get a list of all files that will be tracked by LFS:

set -f; for f in $(grep 'filter=lfs' .gitattributes | cut -d ' ' -f 1); do find . -name "$f"; done > lfs.txt

To get a list of all files that will NOT be tracked by LFS:

grep -F -x -v -f lfs.txt all.txt > non-lfs.txt
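The steps above approximate Git's pattern matching with find -name. As an alternative sketch, you can ask Git itself which files resolve to filter=lfs, which also works before the first commit and respects .gitignore. It assumes the tracking patterns were written by git lfs track (so matching files get filter=lfs); non_lfs_files is a hypothetical helper name, and paths with unusual characters may come out C-quoted.

```shell
# non_lfs_files (hypothetical helper): list non-ignored files whose
# "filter" attribute does not resolve to "lfs".
non_lfs_files() {
    # -c: tracked/staged, -o: untracked, --exclude-standard: honor .gitignore
    git ls-files -co --exclude-standard |
        # prints one line per path: "<path>: filter: <value>"
        git check-attr --stdin filter |
        # drop everything LFS would handle
        grep -v ': filter: lfs$' |
        # strip the attribute suffix, leaving only the path
        sed 's/: filter: [^:]*$//'
}
```

Running non_lfs_files in the repository root before the first git add gives the list to review. An empty result means every non-ignored file is covered by a tracking pattern.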
Invincible answered 23/3, 2017 at 12:0 Comment(0)

You can use git-lfs-migrate in "info" mode, e.g. like this to list non-LFS files larger than 100 kB:

git lfs migrate info --everything --pointers=ignore --above=100kB --top=10
  • --everything examines all refs (branches etc.), not just the current branch.
  • --pointers=ignore skips files that have already been converted to LFS. This only speeds up the operation a bit; otherwise the tool also looks up the size of every existing LFS object.
  • --above=100kB lists only files larger than 100 kB.
  • --top=10 shows the top 10 entries (default: 5).
Cymbal answered 29/7 at 7:33 Comment(0)