In Perl, how to remove ^M from a file?
F

11

39

I have a script that is appending new fields to an existing CSV, however ^M characters are appearing at the end of the old lines so the new fields end up on a new row instead of the same one. How do I remove ^M characters from a CSV file using Perl?

Fjeld answered 16/3, 2009 at 14:48 Comment(1)
Use binmode(STDIN, ":crlf") or PERLIO=:unix:crlf (see [https://mcmap.net/q/409039/-properly-detect-line-endings-of-a-file-in-perl]).Noblenobleman
F
15

I found out you can also do this:

$line=~ tr/\015//d;
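
For readers unfamiliar with the octal escape, here is a minimal sketch (the sample line is made up) showing that on ASCII-based platforms such as Unix and Windows `\015` and `\r` name the same character, and that `tr///d` returns how many characters it deleted:

```perl
use strict;
use warnings;

# \015 is the octal escape for carriage return, the same character as \r
# on ASCII-based platforms.
my $same = ("\015" eq "\r") ? 1 : 0;

# Hypothetical sample line with a DOS line ending.
my $line = "a,b,c\r\n";

# tr with /d deletes every listed character and returns the number removed.
my $removed = ($line =~ tr/\015//d);

print "same=$same removed=$removed line=$line";
```

The return value can be handy for reporting how many carriage returns a file contained.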
Fjeld answered 16/3, 2009 at 17:36 Comment(1)
not as readable as \r - anyone looking at that (or yourself in a year's time) would be glad of a comment stating what it doesNoli
D
53

^M is carriage return. You can do this:

$str =~ s/\r//g;
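
Applied to the situation in the question, this might look roughly like the sketch below. The helper name `append_field` and the sample data are assumptions made for illustration, and the code does no CSV quoting; it only shows that removing the carriage return before appending keeps the new field on the same row.

```perl
use strict;
use warnings;

# Hypothetical helper: clean the old line, append one field, and end the
# row with a plain Unix newline.
sub append_field {
    my ($line, $field) = @_;
    $line =~ s/\r//g;   # drop every carriage return (the ^M)
    chomp $line;        # drop the trailing newline
    return "$line,$field\n";
}

print append_field("id,name\r\n", "email");
```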
Duaneduarchy answered 16/3, 2009 at 14:51 Comment(0)
W
29

Or a 1-liner:

perl -p -i -e 's/\r\n$/\n/g' file1.txt file2.txt ... filen.txt
Woodwind answered 16/3, 2009 at 16:36 Comment(3)
It's so easy to remember this one as Perl Pie.Pilocarpine
On windows passing *.txt with this command does not work. It gives: Can't open *.txt: Invalid argument. Anyone?Acidulous
No need for global 'g' as '$' matches only end of line.Cadence
S
9

Slightly unrelated, but to remove ^M from the command line using Perl, do this:

perl -p -i -e "s/\r\n/\n/g" file.name
Summerlin answered 16/3, 2009 at 17:45 Comment(0)
U
6

I prefer a more general solution that works with either DOS or Unix input. Assuming the input comes from STDIN (or from files named on the command line, which <> also reads):

while (defined(my $ln = <>))
  {
    chomp($ln);
    chop($ln) if ($ln =~ m/\r$/);

    # filter and write
  }
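
The same loop can be tried out without any input file by wrapping it in a function and feeding it an in-memory filehandle. This is only a sketch; the name `read_clean_lines` and the sample data are assumptions, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from a filehandle and return the
# lines without their endings, whether the input used LF or CRLF.
sub read_clean_lines {
    my ($fh) = @_;
    my @lines;
    while (defined(my $ln = <$fh>)) {
        chomp($ln);                   # remove the trailing \n
        chop($ln) if $ln =~ m/\r$/;   # remove a leftover \r, if any
        push @lines, $ln;
    }
    return @lines;
}

# In-memory filehandle with mixed line endings, for demonstration only.
my $data = "dos,line\r\nunix,line\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";
my @clean = read_clean_lines($fh);
close($fh);

print "$_\n" for @clean;
```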
Unciform answered 5/8, 2013 at 15:34 Comment(0)
P
3

This one liner replaces all the ^M characters:

dos2unix <file-name>

You can call this from inside Perl or directly on your Unix prompt.

Pamphylia answered 3/8, 2012 at 23:4 Comment(0)
S
2

To convert DOS style to UNIX style line endings:

while (my $line = <FILEHANDLE>) {
   $line =~ s/\r\n$/\n/;
}

Or, to remove UNIX and/or DOS style line endings:

while (my $line = <FILEHANDLE>) {
   $line =~ s/\r?\n$//;
}
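
To make the difference between the two substitutions concrete, here is a small sketch on a made-up sample line: the first keeps a newline, the second leaves none.

```perl
use strict;
use warnings;

my $dos = "x,y\r\n";   # hypothetical sample line

(my $converted = $dos) =~ s/\r\n$/\n/;   # CRLF becomes LF
(my $stripped  = $dos) =~ s/\r?\n$//;    # line ending removed entirely

printf "converted has %d chars, stripped has %d chars\n",
    length $converted, length $stripped;
```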
Shoddy answered 16/3, 2009 at 14:51 Comment(2)
wouldn't that remove the newlines, too?Costplus
I guess that depends on your goal. I edited to show both strategies.Shoddy
K
1

This is what solved my problem. ^M is a carriage return, and it can be easily avoided in a Perl script.

while(<INPUTFILE>)
{
     chomp;
     chop($_) if ($_ =~ m/\r$/);
}
Kaffiyeh answered 17/3, 2016 at 6:57 Comment(1)
Does that remove ^M from a CSV file? Changing the input file? Does it create some output file that will not have them?Heisenberg
H
0

Little script I have for that. A modification of it helped to filter out some other non-printable characters in cross-platform legacy files.

#!/usr/bin/perl
# run this as
# convert_dos2unix.pl < input_file > output_file
undef $/;
$_ = <>;
s/\r//g;
print;
Herold answered 18/9, 2016 at 4:28 Comment(0)
V
0

A Perl command to convert DOS line endings to Unix line endings, keeping a backup of the original file:

perl -pi.bak -e 's/\r\n/\n/g' filename

This command rewrites filename with Unix line endings and keeps the original file as filename.bak.
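
If the conversion has to happen inside a larger script rather than on the command line, the `$^I` variable offers the same in-place behaviour as `-i`. The following is a sketch under assumptions: it targets a Unix-like system, and the temporary file, its contents, and the `slurp_raw` helper exist only for demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create a throwaway DOS-style file to work on (demonstration only).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sample.csv";
open(my $out, '>:raw', $file) or die "Cannot write $file: $!";
print {$out} "a,b\r\nc,d\r\n";
close($out);

{
    # $^I is the in-script counterpart of -i; @ARGV feeds the <> operator.
    local ($^I, @ARGV) = ('.bak', $file);
    while (<>) {
        s/\r\n/\n/;
        print;   # goes to the rewritten file, not the terminal
    }
}

# Hypothetical helper: read a whole file back without any layer translation.
sub slurp_raw {
    my ($path) = @_;
    open(my $in, '<:raw', $path) or die "Cannot read $path: $!";
    local $/;
    my $content = <$in>;
    close($in);
    return $content;
}

my $new_content = slurp_raw($file);
my $old_content = slurp_raw("$file.bak");
print "removed ", length($old_content) - length($new_content),
    " carriage returns\n";
```

Localizing `$^I` and `@ARGV` inside a block keeps the in-place mode from leaking into the rest of the script.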

Vaniavanilla answered 17/12, 2021 at 18:32 Comment(0)
F
-1

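In vi, press : to get the command prompt.

Then type s/Control-VControl-M//g (prefix it with % to apply it to every line rather than only the current one).

Control-V and Control-M here mean the actual key combinations: press them, do not type the words out.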

Featherstone answered 16/3, 2009 at 17:45 Comment(1)
It's a bad idea to include non-printing characters like carriage return verbatim in source code like this. Far better to use the \r escape that is (a) easy to see and (b) won't get lost if the source is reformatted.Bismuth
