How do I find files that do not end with a newline/linefeed?
P

15

45

How can I list the names of normal text (.txt) files that don't end with a newline?

e.g.: list (output) this filename:

$ cat a.txt
asdfasdlsad4randomcharsf
asdfasdfaasdf43randomcharssdf
$ 

and don't list (output) this filename:

$ cat b.txt
asdfasdlsad4randomcharsf
asdfasdfaasdf43randomcharssdf

$
Primero answered 7/1, 2011 at 23:1 Comment(5)
Are you just looking for a wide display of files from a folder down? Your question is not very clear from the example above.Ettie
What does "normal txt" mean? Are you talking about files that end with a blank line (\n\n) or just files that end with a newline? You could use od -c filename to print an unambiguous representation of the file.Shiflett
Just to emphasize: newline is not the same as blank line. A newline is a single character - it delimits what we see as "lines". A blank line is simply a "line" with no characters, typically 2 consecutive newline characters with nothing in-between, or the first line in a file that begins with a newline. Some people call lines consisting of only whitespace "blank" lines as well, and reserve the term "empty line" for 2 consecutive newline characters. You should be clear about what you want.Soriano
Note that in the example you posted, the first file does end with a newline, and the second ends with two newlines.Ubangi
normal text files (according to POSIX) always end with a newline. also consider the two comments aboveOchone
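To make the distinction raised in these comments concrete, here is a minimal sketch that builds three sample files and prints their last two bytes with od -c, as suggested above. The file names and the last_two_bytes helper are made up for illustration, and mktemp -d is assumed to be available (it is common but not part of POSIX).

```shell
#!/bin/sh
# Print the last two bytes of a file in od's unambiguous notation.
last_two_bytes() {
    tail -c 2 "$1" | od -An -c
}

# Hypothetical sample files covering the three cases discussed above.
dir=$(mktemp -d)
printf 'one\ntwo'     > "$dir/no_newline.txt"   # last byte is "o"
printf 'one\ntwo\n'   > "$dir/one_newline.txt"  # ends with one newline
printf 'one\ntwo\n\n' > "$dir/blank_line.txt"   # ends with a blank line

for f in "$dir"/*.txt; do
    printf '%s: %s\n' "${f##*/}" "$(last_two_bytes "$f")"
done
rm -rf "$dir"
```

A file that ends with a blank line shows two \n entries, a newline-terminated file shows one, and a file without a final newline shows none in the last position.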
C
42

Use pcregrep, a Perl Compatible Regular Expressions version of grep which supports a multiline mode using -M flag that can be used to match (or not match) if the last line had a newline:

pcregrep -LMr '\n\Z' .

In the above example we are saying to search recursively (-r) in the current directory (.), listing files that don't match (-L) our multiline (-M) regex, which looks for a newline at the end of a file ('\n\Z').

Changing -L to -l would list the files that do end with a newline.

pcregrep can be installed on MacOS with the homebrew pcre package: brew install pcre

Copyhold answered 19/12, 2013 at 17:25 Comment(6)
I should point out that the answer given by @dennis-williamson also fails for files that have spaces in their names. At least it did for me.Copyhold
I added a set of missing quotes in my answer that should take care of that problem.Colporteur
Just a note for future readers: this pcregrep command is only correct for files that do not contain empty lines. Counterexample: printf "a\n\nb" | pcregrep -M '\n$' - will print a (and thus running it with -L will print nothing).Sela
Use \Z instead of $ (i.e. pcregrep -LMr '\n\Z' .) to avoid the issue @Sela mentioned.Spirituous
And in case you need to add a newline to them: pcregrep -LMr '\n\Z' . | xargs sed -i -e '$a\'Moneybags
I don't understand why people always use short options in answers... I get it, it is way faster to type, and maybe also to memorize, but it is also way harder to understand, and people end up copy-pasting commands without even knowing what they are doingSoothsay
C
41

OK, it's my turn, I'll give it a try:

find . -type f -print0 | xargs -0 -L1 bash -c 'test "$(tail -c 1 "$0")" && echo "No new line at end of $0"'
Cuda answered 5/9, 2014 at 13:17 Comment(6)
I don't like this dot you editors added to my answer. I use GNU find here. Which implementations of find still don't support omitting the path in 2020?Cuda
BSD find, which is what is in macOS, requires the path to be specified.Rhyme
Sad that there are distros lagging behind :,-( In the meantime I bet there are distros that have no find at all :,-(Cuda
@JulienPalard The path is not optional in the current (IEEE Std 1003.1-2017) Posix standard so it's not really lagging behind. Giving no path could mean use the current directory or use my home directory or use the root or whatever on different platforms. Giving an error when not given a path is fully compliant.Curculio
worked perfectly for me on macosZenger
This can also be executed with find . -type f -exec bash -c 'test "$(tail -c 1 "{}")" && echo "No new line at end of {}"' \; which avoids using xargs as find's -exec generally makes it unnecessary.Heartwarming
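The reason the tail -c 1 test works is that command substitution strips trailing newlines, so a file whose last byte is a newline yields an empty string. Below is a minimal sketch of the same idea in plain sh. The function names are made up for illustration, and the file name is passed to the inner shell as an argument instead of being pasted into its code via {}, which is an assumption about what is safer for unusual filenames, not something the answer above requires.

```shell
#!/bin/sh
# Succeeds when the file is non-empty and its last byte is not a newline.
# $(...) strips trailing newlines, so a final "\n" leaves an empty string.
lacks_final_newline() {
    [ -n "$(tail -c 1 "$1")" ]
}

# Hypothetical helper: list such files under a directory (default: ".").
# Paths reach the inner shell as arguments, never as part of its code.
list_unterminated() {
    find "${1:-.}" -type f -exec sh -c '
        for f do
            if [ -n "$(tail -c 1 "$f")" ]; then
                printf "%s\n" "$f"
            fi
        done' sh {} +
}
```

Calling list_unterminated /some/dir prints one path per line. Empty files are not reported, because tail outputs nothing for them.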
K
20

If you have ripgrep installed:

rg -Ul '[^\n]\z'

That regular expression matches any character which is not a newline, and then the end of the file. Multi-line mode (-U) must be enabled to match on line terminators.

Kopple answered 14/1, 2020 at 12:33 Comment(0)
C
12

Give this a try:

find . -type f -exec sh -c '[ -z "$(sed -n "\$p" "$1")" ]' _ {} \; -print

It will print filenames of files that end with a blank line. To print files that don't end in a blank line change the -z to -n.

Colporteur answered 7/1, 2011 at 23:27 Comment(5)
The answers that use for ... find ... do will fail if there are filenames that contain spaces.Colporteur
you are correct about for.. find.. mywiki.wooledge.org/BashPitfalls#for_i_in_.24.28ls_.2A.mp3.29Shiflett
Does not work for me. With -z it prints nothing, with -n it prints all files.Frumentaceous
@AlexeyInkin: If you create a file using this command does the find (with the -z test) output its name? echo -e 'foo\n' > extranewline Note that the command is intended to find files that end with a blank line (the last two characters in the file are two newlines).Colporteur
@DennisWilliamson yes, it does. But that file has two newlines at the end. The original question was about one newline. I use Ubuntu.Frumentaceous
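Since this thread keeps mixing up "ends with a newline" and "ends with a blank line", here is a small sketch that tells the cases apart by looking at the last two bytes in hex. The classify_ending name and its output labels are invented for this example.

```shell
#!/bin/sh
# Classify how a file ends: empty, no-newline, newline, or blank-line.
classify_ending() {
    # Hex-dump the last two bytes, e.g. "6f0a" for "o\n"; 0a is a newline.
    bytes=$(tail -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    case $bytes in
        '')   echo empty ;;
        0a0a) echo blank-line ;;
        *0a)  echo newline ;;   # includes a file that is just "\n"
        *)    echo no-newline ;;
    esac
}
```

Hex output is used instead of od -c so that a file ending in a literal backslash followed by n is not mistaken for one ending in a newline.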
V
6

If you are using 'ack' (http://beyondgrep.com) as an alternative to grep, you can just run this:

ack -v '\n$'

It finds all lines that don't match (-v) a newline at the end of the line.

Viviennevivify answered 3/8, 2016 at 16:53 Comment(1)
Easy, simple solution. Add '-l' to get just the matched files and not the line.Rigorism
F
4

The best oneliner I could come up with is this:

git grep --cached -Il --null '' | xargs -0 -n1 bash -c 'if test "$(tail -c 1 "$0")"; then echo "No new line at end of $0"; exit 1; fi'

This uses git grep, because in my use case I want to ensure that files committed to a git branch have ending newlines. The --null and -0 options keep filenames that contain spaces intact.

If this is required outside of a git repo, you can of course just use grep instead.

grep -RIl --null '' . | xargs -0 -n1 bash -c 'if test "$(tail -c 1 "$0")"; then echo "No new line at end of $0"; exit 1; fi'

Why do I use grep? Because you can easily filter out binary files with -I.

Then comes the usual xargs/tail combination found in other answers, with the addition of exiting with 1 if a file has no newline. So this can be used in a pre-commit git hook or CI.

Fredella answered 24/1, 2019 at 14:27 Comment(1)
Nice work. But git grep version, (tail) gives error if the filename contains space in it.Smallclothes
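For use in a hook or CI job, the check can be wrapped in a function that reads NUL-delimited paths and fails if any file lacks a final newline. This is only a sketch: check_final_newlines is a made-up name, and it assumes an xargs that supports -0 (GNU and BSD do, POSIX does not require it).

```shell
#!/bin/sh
# Reads NUL-delimited paths on stdin, reports every file that lacks a
# final newline, and returns non-zero if at least one was found.
check_final_newlines() {
    xargs -0 sh -c '
        status=0
        for f do
            if [ -n "$(tail -c 1 "$f")" ]; then
                printf "No new line at end of %s\n" "$f"
                status=1
            fi
        done
        exit "$status"' sh
}

# Intended use (assumption: run from inside a git work tree):
#   git grep --cached -Il --null '' | check_final_newlines || exit 1
```

Unlike a version that exits on the first offending file per invocation, this reports every offending file in a batch before failing, which tends to be friendlier in CI logs.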
A
3

This should do the trick:

#!/bin/bash

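# Note: the unquoted $(find ...) is split on whitespace, so paths with spaces break
for file in $(find "$1" -type f -name "*.txt")
do
        # Count whether the last line of the file is empty (1) or not (0)
        nlines=$(tail -n 1 "$file" | grep -c '^$')
        if [ "$nlines" -eq 1 ]
                then echo "$file"
        fi
done

Call it this way: ./script dir

E.g. ./script /home/user/Documents/ lists all .txt files under /home/user/Documents whose last line is empty, i.e. files that end with \n\n.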

Ammonite answered 7/1, 2011 at 23:26 Comment(1)
The first improvement is to put IFS=$'\n' before for. This makes it handle files with spaces. The second improvement is to replace $nlines -eq 1 with $nlines -eq 0, because the author needs filenames that don't end with a newline.Pustulate
B
3

This example

  • Works on macOS (BSD) and GNU/Linux
  • Uses standard tools: find, grep, sh, file, tail, od, tr
  • Supports paths with spaces

One-liner:

find . -type f -exec sh -c 'file -b "{}" | grep -q text' \; -exec sh -c '[ "$(tail -c 1 "{}" | od -An -a | tr -d "[:space:]")" != "nl" ]' \; -print

More readable version

  • Find under current directory
    • Regular files
    • That 'file' (brief mode) considers text
    • Whose last byte (tail -c 1) is not represented by od's named character "nl"
    • And print their paths
#!/bin/sh
find . \
    -type f \
    -exec sh -c 'file -b "{}" | grep -q text' \; \
    -exec sh -c '[ "$(tail -c 1 "{}" | od -An -a | tr -d "[:space:]")" != "nl" ]' \; \
    -print

Finally, a version with a -f flag to fix the offending files (requires bash).

#!/bin/bash
# Finds files without final newlines
# Pass "-f" to also fix those files
fix_flag="$([ "$1" == "-f" ] && echo -true || echo -false)"
find . \
    -type f \
    -exec sh -c 'file -b "{}" | grep -q text' \; \
    -exec sh -c '[ "$(tail -c 1 "{}" | od -An -a | tr -d "[:space:]")" != "nl" ]' \; \
    -print \
    $fix_flag \
    -exec sh -c 'echo >> "{}"' \;
Blest answered 6/5, 2021 at 21:54 Comment(0)
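The check-then-optionally-fix idea can also be written as a single function that passes the mode and the paths to the inner shell as arguments. This is a sketch under some assumptions: scan_tree is a made-up name, and grep -I (supported by GNU and BSD grep, not required by POSIX) stands in for the file -b test used above to skip binary files.

```shell
#!/bin/sh
# Usage: scan_tree [-f] [dir]
# Prints text files lacking a final newline; with -f also appends one.
scan_tree() {
    fix=no
    if [ "$1" = "-f" ]; then
        fix=yes
        shift
    fi
    find "${1:-.}" -type f -exec sh -c '
        fix=$1; shift
        for f do
            # grep -I treats binary files as non-matching, so they are skipped
            if grep -Iq "" "$f" && [ -n "$(tail -c 1 "$f")" ]; then
                printf "%s\n" "$f"
                if [ "$fix" = yes ]; then echo >> "$f"; fi
            fi
        done' sh "$fix" {} +
}
```

Running scan_tree . first and scan_tree -f . afterwards gives a dry run before anything is modified.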
Q
2

This is kludgy; someone surely can do better:

for f in `find . -name '*.txt' -type f`; do
    if test `tail -c 1 "$f" | od -c | head -n 1 | tail -c 3` != \\n; then
        echo $f;
    fi
done

N.B. this answers the question in the title, which is different from the question in the body (which is looking for files that end with \n\n I think).

Questionnaire answered 7/1, 2011 at 23:22 Comment(0)
I
2

Since your question has the perl tag, I'll post an answer which uses it:

find . -type f -name '*.txt' -exec perl check.pl {} +

where check.pl is the following:

#!/bin/perl 

use strict;
use warnings;

foreach (@ARGV) {
    open(FILE, $_);

    seek(FILE, -2, 2);

    my $c;

    read(FILE,$c,1);
    if ( $c ne "\n" ) {
        print "$_\n";
    }
    close(FILE);
}

This perl script just opens the files passed as parameters, one at a time, and reads only the next-to-last character; if it is not a newline character, it prints out the filename, else it does nothing.

Infielder answered 7/1, 2011 at 23:58 Comment(1)
What if the last character is not a newline (of course it's not a valid text file)?Colporteur
T
2

Most solutions on this page do not work for me (FreeBSD 10.3 amd64). Ian Will's OSX solution almost always works, but it is pretty difficult to follow : - (

There is also an easy solution that almost always works (where $f is the file):

sed -i '' -e '$a\' "$f"

The major problem with the sed solution is that it never gives you the opportunity to just check (without appending a newline).

Both of the above solutions fail for DOS files. I think the most portable/scriptable solution is probably the easiest one, which I developed myself : - )

Here is that elementary sh script, which combines file, dos2unix/unix2dos and tail. $f is quoted throughout so that paths containing spaces work.

if file "$f" | grep 'ASCII text' > /dev/null; then
    if file "$f" | grep 'CRLF' > /dev/null; then
        # Both converters are needed for the DOS round trip
        type dos2unix > /dev/null && type unix2dos > /dev/null || exit 1
        dos2unix "$f"
        last="$(tail -c1 "$f")"
        [ -n "$last" ] && echo >> "$f"
        unix2dos "$f"
    else
        last="$(tail -c1 "$f")"
        [ -n "$last" ] && echo >> "$f"
    fi
fi

Hope this helps someone.

Tempe answered 31/7, 2017 at 0:44 Comment(0)
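If dos2unix and unix2dos are not available, a different approach is to append the terminator in the style the file already uses. The sketch below does not follow the script above: it skips the conversion round trip and simply treats a file as DOS-style when any line ends in a carriage return. append_matching_newline is a made-up name.

```shell
#!/bin/sh
# Append a final line terminator only if it is missing, matching the
# style already used in the file (CRLF if any line ends in CR, else LF).
append_matching_newline() {
    # Empty files and newline-terminated files are left untouched.
    [ -n "$(tail -c 1 "$1")" ] || return 0
    cr=$(printf '\r')
    if grep -q "$cr\$" "$1"; then
        printf '\r\n' >> "$1"
    else
        printf '\n' >> "$1"
    fi
}
```

A one-line DOS file with no terminator at all contains no CR, so it would get a plain LF; that limitation is inherent in guessing the style from the content.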
L
1

Another option:

$ find . -name "*.txt" -print0 | xargs -0I {} bash -c '[ -z "$(tail -n 1 {})" ] && echo {}'
Labors answered 7/1, 2011 at 23:38 Comment(2)
Thank you so much, this is the only example in this thread that actually works (on OSX)Guib
...actually, this doesn't seem to find the right filesGuib
G
1

This example works for me on OSX (many of the above solutions did not)

for file in `find . -name "*.java"`
do
  result=`od -An -tc -j $(( $(ls -l $file  | awk '{print $5}') - 1 )) $file`
  last_char=`echo $result | sed 's/ *//'`
  if [ "$last_char" != "\n" ]
  then
    #echo "Last char is .$last_char."
    echo $file
  fi
done
Guib answered 10/9, 2015 at 14:6 Comment(0)
S
0

Here is another example using a few bash built-in commands, which:

  • lets you filter by extension (e.g. | grep '\.md$' keeps only the .md files)
  • lets you pipe through more grep commands to extend the filter (e.g. | grep -v '\.git' to exclude the files under .git)
  • gives you the full power of grep's options for further filters or inclusions

The code basically iterates (for) over all the files matching your chosen criteria (grep), and if the last character of a file is not a newline (-n "$(tail -c -1 "$file")"), it prints the file name (echo "$file").

The verbose code:

for file in $(find . | grep '\.md$')
do
    if [ -n "$(tail -c -1 "$file")" ]
    then
        echo "$file"
    fi
done

A bit more compact:

for file in $(find . | grep '\.md$')
do
    [ -n "$(tail -c -1 "$file")" ] && echo "$file"
done

and, of course, the 1-liner for it:

for file in $(find . | grep '\.md$'); do [ -n "$(tail -c -1 "$file")" ] && echo "$file"; done
Schluter answered 19/5, 2021 at 20:37 Comment(0)
H
0

I think this is the most understandable script:

for FN in `find . -type f` ; do if [[ `cat "$FN" | tail -c 1 | xxd -p` != '0a' ]] ; then echo "$FN" ; fi ; done
Herzberg answered 31/8, 2023 at 11:17 Comment(0)
