Difference between CR LF, LF and CR line break types
S

10

1162

I'd like to know the difference (with examples if possible) between CR LF (Windows), LF (Unix) and CR (Macintosh) line break types.

Seducer answered 12/10, 2009 at 4:47 Comment(7)
Very similar, but not an exact duplicate. \n is typically represented by a linefeed, but it's not necessarily a linefeed.Eggert
CR and LF are ASCII and Unicode control characters while \r and \n are abstractions used in certain programming languages. Closing this question glosses over fundamental differences between the questions and perpetuates misinformation.Eggert
@AdrianMcCarthy It's a problem with the way close votes act as answers in a way; an answer claiming the two were the same could be downvoted and then greyed out as very, very wrong, but it only takes 4 agreeing votes (comparable to upvotes) to have a very wrong close happen, with no way to counter the vote until after it's happened.Snafu
This formulation of the question is admittedly better, but it is still for all practical purposes the same question.Immethodical
@JukkaK.Korpela: No, it really isn't. \n doesn't mean the same thing in all programming languages.Eggert
@AdrianMcCarthy, what operating system puts "CR" and/or "LF" at the end of a line to indicate that it is the end of the line? I have never heard of any. If you are saying that CR and LF are representations of the codes for carriage return and line feed then that makes sense and "\r" and "\n" are also representations.Copyread
@user34660: I don't understand your question. '\r' maps to a carriage return control character (typically abbreviated to CR) in many systems. '\n', however, does not necessarily represent a linefeed control character, as explained in my answer on the linked question. It's exactly this distinction that makes this question not a duplicate of the linked question.Eggert
F
466

It's really just about which bytes are stored in a file. CR is the control code for carriage return (from the days of typewriters), and LF is likewise the control code for line feed. The names just refer to the bytes that are placed as end-of-line markers.
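To make "which bytes are stored in a file" concrete, here is a minimal Python sketch. The helper name `raw_bytes_for` is made up for illustration; it relies on the `newline` argument of the built-in `open()`, which translates `"\n"` on write, and then reads the file back in binary mode so no translation hides the result.

```python
import os
import tempfile

def raw_bytes_for(newline: str) -> bytes:
    """Write two short lines with the given terminator and return the raw file bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline=... tells Python what to write in place of each "\n"
        with open(path, "w", newline=newline, encoding="ascii") as f:
            f.write("ab\ncd\n")
        # Binary mode shows exactly what ended up on disk
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

for name, nl in [("LF", "\n"), ("CR", "\r"), ("CRLF", "\r\n")]:
    print(name, raw_bytes_for(nl))
```

The text written is the same in all three cases; only the end-of-line bytes differ, and the CRLF file is two bytes longer.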

There is way more information, as always, on Wikipedia.

Fetal answered 12/10, 2009 at 4:52 Comment(5)
I think it's also useful to mention that CR is the escape character \r and LF is the escape character \n. In addition, Wikipedia:Newline.Screening
In simple words, CR and LF are just end of line and new line, according to this link. Is this correct?Shelbashelbi
@shaijut CR stands for Carriage Return. That was what returned the carriage on typewriters. So, mostly correct.Aynat
The superior LFCR option is sadly missing. Its benefit is that by doing the Line Feed first, the Selectric golfball can't smear the just printed line with still fresh ink upon executing the Carriage Return </s>Hypercorrect
Actually, it's not a typewriter but "teletype", old computer client terminals with mechanical print heads and paper, where CR/LF were required for computers to behave properly. If you just did CR, you would have a bunch of characters on top of each other on a paper. If you just did LF, your text lines would slowly migrate to the right on the paper. CR/LF were required for proper teletype based computing. An old Star Trek game would dump "F<LF><BEL><BEL>I<LF><BEL><BEL>R<LF><BEL><BEL>I<LF><BEL><BEL>N<LF><BEL><BEL>G<LF><BEL><BEL>" diagonally down the page.Betulaceous
N
1110

CR and LF are control characters, respectively coded 0x0D (13 decimal) and 0x0A (10 decimal).

They are used to mark a line break in a text file. As you indicated, Windows uses two characters, the CR LF sequence; Unix (and macOS starting with Mac OS X 10.0) uses only LF; and the classic Mac OS (before 10.0) used CR.

An apocryphal historical perspective:

As indicated by Peter, CR = Carriage Return and LF = Line Feed, two expressions that have their roots in old typewriters / TTYs. LF moved the paper up (but kept the horizontal position identical) and CR brought back the "carriage" so that the next character typed would be at the leftmost position on the paper (but on the same line). CR+LF did both, i.e., it prepared the machine to type a new line. As time went by, the physical semantics of the codes no longer applied, and as memory and floppy disk space were at a premium, some OS designers decided to use only one of the characters; they just didn't communicate very well with one another ;-)

Most modern text editors and text-oriented applications offer options or settings that automatically detect a file's end-of-line convention and display the file accordingly.
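As a rough idea of how such detection could work, here is a small Python sketch. It is not how any particular editor does it; the function name `detect_line_ending` and its return labels are made up for illustration. It counts CRLF pairs first and then treats the remaining CR and LF bytes as lone terminators.

```python
def detect_line_ending(data: bytes) -> str:
    """Return 'CRLF', 'LF', 'CR', 'mixed', or 'none' for the given bytes."""
    crlf = data.count(b"\r\n")
    lone_cr = data.count(b"\r") - crlf  # CRs that are not part of a CRLF pair
    lone_lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    counts = {"CRLF": crlf, "LF": lone_lf, "CR": lone_cr}
    present = [name for name, n in counts.items() if n > 0]
    if not present:
        return "none"
    if len(present) > 1:
        return "mixed"
    return present[0]

print(detect_line_ending(b"one\r\ntwo\r\n"))  # CRLF
print(detect_line_ending(b"one\ntwo\n"))      # LF
print(detect_line_ending(b"one\rtwo\r"))      # CR
print(detect_line_ending(b"one\r\ntwo\n"))    # mixed
```

Real tools usually go further, for example by picking the majority convention in a mixed file rather than just reporting "mixed".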

Nonconcurrence answered 12/10, 2009 at 4:52 Comment(8)
so actually Windows is the only OS that uses these characters properly, Carriage Return, followed by a Line Feed.Carduaceous
Would it be accurate, then, to say that a text file created on Windows is the most compatible of the three i.e. the most likely to display on all three OS subsets?Giltzow
@Hashim it might display properly but trying to run a textual shell script with carriage returns will usually result in an errorForthcoming
In simple words, CR and LF are just end of line and new line, according to this link. Is this correct?Shelbashelbi
I've found that some Windows-style files (CR+LF) can display with double newlines on other systems. Presumably the editor that displays the text supports both Carriage Return and Line Feed as newline delimiters, and as such may create 2 lines where 1 was intended. So while CR+LF might be the most compatible, I don't think it is without issue.Shied
Rolf - that statement assumes that keeping old terminology/technology in new technology is correct. CRLF = 2 bytes. CR = 1, LF = 1. With as often as they are used, that actually translates to a huge amount of data. Once again, Windows has chosen to be different from the entirety of the *NIX world.Angus
Is there a technical reason (for typewriters) why LF CR is unheard of?Urushiol
@Urushiol The short answer is that CR takes a long time on a teleprinter. Putting CR before LF gives the carriage time to get back to the other side. Sometimes even with CRLF you'd have to send NULs to give it more time.Monadism
B
619

This is a good summary I found:

The Carriage Return (CR) character (0x0D, \r) moves the cursor to the beginning of the line without advancing to the next line. This character is used as a new line character in Commodore and early Macintosh operating systems (Mac OS 9 and earlier).

The Line Feed (LF) character (0x0A, \n) moves the cursor down to the next line without returning to the beginning of the line. This character is used as a new line character in Unix-based systems (Linux, Mac OS X, etc.)

The End of Line (EOL) sequence (0x0D 0x0A, \r\n) is actually two ASCII characters, a combination of the CR and LF characters. It moves the cursor both down to the next line and to the beginning of that line. This sequence is used as a new line marker in most other non-Unix operating systems including Microsoft Windows, Symbian and others.

Source
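For a quick way to see all three conventions side by side, here is a short Python sketch. It uses the standard `str.splitlines()`, which recognizes CR, LF and CRLF (among other separators), and contrasts it with splitting on one exact sequence. The sample strings are made up for illustration.

```python
# One string that mixes all three conventions
mixed = "one\r\ntwo\nthree\rfour"

# splitlines() treats CR, LF and CRLF each as a single line break
print(mixed.splitlines())      # ['one', 'two', 'three', 'four']

# Splitting on the exact CRLF sequence only finds the Windows-style break
print(mixed.split("\r\n"))     # ['one', 'two\nthree\rfour']

# LF followed by CR is not a CRLF pair, so it counts as two separate breaks
print("a\n\rb".splitlines())   # ['a', '', 'b']
```

The last line also hints at why text can show up with doubled blank lines when a program treats CR and LF as independent line breaks.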

Byington answered 12/10, 2009 at 4:54 Comment(4)
The "vertical tab" character moves the cursor down and keeps the position in the line, not the LF character. The LF is EOL.Executory
@TaylorLeese Are \r\n and \n\r the same?Nod
Thanks for highlighting: Classic MacOS: CR = \r, Unix and MacOS: LF = \n, Windows: CRLF = \r\n.Akiko
@Nod Developers will often split a string or perform other operations with the exact sequence \r\n, so \n\r would not match. Also, one would think that text editors treat the two characters as one sequence and don't separately go "oh, now I have to go down one line" and "oh, now I have to move to the front". Were it so, then yes, you could freely swap the order around.Rhoda
R
234

Summarized succinctly:

Carriage Return (Mac pre-OS X)

  • CR
  • \r
  • ASCII code 13

Line Feed (Linux, Mac OS X)

  • LF
  • \n
  • ASCII code 10

Carriage Return and Line Feed (Windows)

  • CRLF
  • \r\n
  • ASCII code 13 and then ASCII code 10

If you see ASCII codes in a strange format, they are merely the numbers 13 and 10 in a different radix/base, usually base 8 (octal) or base 16 (hexadecimal).

ASCII chart
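To illustrate the point about radix, this small Python sketch prints the same two codes in decimal, octal and hexadecimal. The helper name `describe` is made up for illustration; it only uses the built-in `ord()` and format specifiers.

```python
def describe(ch: str) -> str:
    """Show a character's code in decimal, octal and hexadecimal."""
    code = ord(ch)
    return f"decimal {code}, octal {code:o}, hex 0x{code:02X}"

print("CR:", describe("\r"))  # CR: decimal 13, octal 15, hex 0x0D
print("LF:", describe("\n"))  # LF: decimal 10, octal 12, hex 0x0A
```

So an octal 15 or a hex 0D in a dump is still just CR, and octal 12 or hex 0A is still just LF.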

Radiocarbon answered 31/8, 2016 at 22:7 Comment(3)
The \r and \n only work in some programming languages, although they seem to be universal among programming languages that use backslash to indicate special characters.Thalassic
@Thalassic yes, backslash is the commonly designated character to "escape" what follows it.Radiocarbon
And RISCOS uses \n\rCorpuz
P
53

Jeff Atwood has a blog post about this: The Great Newline Schism

Here is the essence from Wikipedia:

The sequence CR+LF was in common use on many early computer systems that had adopted teletype machines, typically an ASR33, as a console device, because this sequence was required to position those printers at the start of a new line. On these systems, text was often routinely composed to be compatible with these printers, since the concept of device drivers hiding such hardware details from the application was not yet well developed; applications had to talk directly to the teletype machine and follow its conventions. The separation of the two functions concealed the fact that the print head could not return from the far right to the beginning of the next line in one-character time. That is why the sequence was always sent with the CR first. In fact, it was often necessary to send extra characters (extraneous CRs or NULs, which are ignored) to give the print head time to move to the left margin. Even after teletypes were replaced by computer terminals with higher baud rates, many operating systems still supported automatic sending of these fill characters, for compatibility with cheaper terminals that required multiple character times to scroll the display.

Pairoar answered 20/1, 2010 at 19:53 Comment(10)
+1 It is by this simple understanding that I always remember in what order the combination comes. Even today we can still see this mechanical logic in any inkjet printer (I love to understand since I hate to learn). My other memory tricks are: "mac? Return to sender" and "NewLineFeed" (to remember that NL===LF and to remember the \n, since CR already has the R in its abbreviation)Sudanic
I'm dubious of the claim that dividing the process of going to the next line into two control codes was necessary for timing. I don't doubt that there were timing issues, but serial communication always had some buffering and flow control. In terms of the actual driver hardware, then, yes, I could imagine there were some timing issues to overcome that may have been solved by the equivalent of adding NULs to fill the time to return the print head to the margin, but I'd like to see better citations before believing that's the reason CR and LF were distinct operations.Eggert
"I'm dubious ... two control codes was necessary for timing". That's not what it says. It says that the extra CRs and NULs are here for giving time for it to come back, not the original CR LF.Nonlinearity
@Adrian Will you take personal experience? 1) In my old teletype days, the printer we used required <CR><CR><LF> - so of course I experimented with just one <CR>. I sent <CR><LF>A after a long line, and you could hear the A being printed before the carriage fully returned.Jemie
@Adrian 2) Don't forget, this was in the electro-mechanical era, where each character did exactly one function. We often emphasised a word by printing the line, then sending <CR><CR> and typing the correct number of spaces, then re-printing the same word: a primitive form of bolding.Jemie
@Adrian 3) And finally, this was using Baudot (or Murray code), not ASCII. Five data bits, between one start bit and one-and-a-half stop bits. How can you have half a bit? By waiting half a bit time before starting to send the next character, to give the print head time to return to center.Jemie
@Adrian And how can you store the full alphabet, plus numbers, plus punctuation in only five data bits? You can't. You need to 'shift' the whole carriage from a bank of letters to a bank of figures and back, which you did by sending either <FigureShift> or <LetterShift> codes. It was always fun if one of those codes was corrupted - the rest of the data was in the wrong bank - but you got good at reading it anyway! Just don't give it to the recipient like that: retype it before forwarding...Jemie
@JulienRousseau: The first bold sentence suggests that splitting the operation helps with the timing, as does the claim that the CR (which is the slower operation) comes first. If you read the context of the paragraph quoted from an old version of the Wikipedia article, you'll find additional claims that using CR+LF was for timing. That's what I find specious.Eggert
@John Burger: The ASR 33, the teletype used to support the now out-of-date Wikipedia quote, used ASCII. Having separate commands for line feed and carriage return was useful for approximating effects like bold, underlining, and letters with accents. The old version of the Wikipedia article, quoted in part here, dubiously claimed that representing a new line with CR+LF was done intentionally because the carriage return operation was slow. That's the part I'm skeptical of. Given the existence of CR and LF, combining them is a natural way to represent a newline.Eggert
procedural paradigm strikes again, causing a mess for everyone and everyone in the future.Radiocarbon
S
18

CR - ASCII code 13

LF - ASCII code 10.

Theoretically, CR returns the cursor to the first position (on the left). LF feeds one line, moving the cursor one line down. This is how in the old days you controlled printers and text-mode monitors.
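To see what that means in practice, here is a toy Python model of a print head. It is only a sketch: the function name `render` is made up, and overstriking is simplified so that a later character replaces an earlier one in the same column. CR resets the column, LF advances the row, and nothing else happens implicitly.

```python
def render(stream: str) -> list:
    """Simulate a print head: CR resets the column, LF advances the row."""
    rows = [[]]
    row = col = 0
    for ch in stream:
        if ch == "\r":
            col = 0                      # back to the left margin, same row
        elif ch == "\n":
            row += 1                     # down one row, same column
            if row == len(rows):
                rows.append([])
        else:
            line = rows[row]
            while len(line) <= col:      # pad with spaces up to the head position
                line.append(" ")
            line[col] = ch               # simplified overstrike: replace
            col += 1
    return ["".join(r) for r in rows]

print(render("ab\r\ncd"))  # ['ab', 'cd']    CR+LF: a proper new line
print(render("ab\ncd"))    # ['ab', '  cd']  LF only: text drifts to the right
print(render("ab\rcd"))    # ['cd']          CR only: text is overprinted
```

On such a device, LF alone gives the "staircase" effect and CR alone prints over the same line, which is why both were needed.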

These characters are usually used to mark end of lines in text files. Different operating systems used different conventions. As you pointed out, Windows uses the CR/LF combination while pre-OS X Macs use just CR and so on.

Selfoperating answered 12/10, 2009 at 4:55 Comment(0)
W
11

CR and LF are a special set of characters that help us format our code.

  1. CR (\r) stands for CARRIAGE RETURN. It puts the cursor at the beginning of a line, but it doesn't create a new line. This is how classic Mac OS works (not applicable today unless you are dealing with old files).

  2. LF (\n) stands for LINE FEED. It creates a new line, but it doesn't put the cursor at the beginning of that line. The cursor stays back at the end of the last line. This is how Unix (including macOS) and Linux work.

  3. CRLF (\r\n) creates a new line as well as puts the cursor at the beginning of the new line. This is how we see it in Windows OS.

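Git normally stores LF line endings in the repository. So when we use Git on Windows with line-ending conversion enabled (for example through the core.autocrlf setting), it may show a warning like "CRLF will be replaced by LF" and convert CRLF into LF, so that the code stays compatible across systems.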

NB: Don't worry...see this less as a warning and more as a notice thing.
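For a feel of what such a conversion amounts to, here is a minimal Python sketch of line-ending normalization on raw bytes. This is not Git's implementation, and the function names `to_lf` and `to_crlf` are made up for illustration. Unlike the CRLF-to-LF conversion described above, `to_lf` also maps lone CRs, which is an extra assumption of this sketch.

```python
def to_lf(data: bytes) -> bytes:
    """Convert CRLF (and, as an extra assumption, lone CR) line endings to LF."""
    # Replace CRLF first so its CR is not converted a second time
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def to_crlf(data: bytes) -> bytes:
    """Convert any line ending to CRLF by normalizing to LF first."""
    return to_lf(data).replace(b"\n", b"\r\n")

windows_text = b"first\r\nsecond\r\n"
print(to_lf(windows_text))              # b'first\nsecond\n'
print(to_crlf(b"first\nsecond\n"))      # b'first\r\nsecond\r\n'
```

Normalizing to LF before converting to CRLF keeps the operation safe to repeat: running `to_crlf` on text that is already CRLF leaves it unchanged.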

Wheelhorse answered 21/6, 2022 at 15:26 Comment(1)
Re "This is how MAC OS works": Only the old Macs (Classic Mac OS).Longways
J
8

The sad state of "record separators" or "line terminators" is a legacy of the dark ages of computing.

Now, we take it for granted that anything we want to represent is in some way structured data and conforms to various abstractions that define lines, files, protocols, messages, markup, whatever.

But once upon a time this wasn't exactly true. Applications built in control characters and device-specific processing. The brain-dead systems that required both CR and LF simply had no abstraction for record separators or line terminators. The CR was necessary in order to get the teletype or video display to return to column one and the LF (today, NL, same code) was necessary to get it to advance to the next line. I guess the idea of doing something other than dumping the raw data to the device was too complex.

Unix and Mac actually specified an abstraction for the line end, imagine that. Sadly, they specified different ones. (Unix, ahem, came first.) And naturally, they used a control code that was already "close" to S.O.P.

Since almost all of our operating software today is a descendant of Unix, Mac, or Microsoft operating software, we are stuck with the line ending confusion.

Jeter answered 12/10, 2009 at 5:10 Comment(0)
U
7

Systems based on ASCII or a compatible character set use either LF (Line feed, 0x0A, 10 in decimal) or CR (Carriage return, 0x0D, 13 in decimal) individually, or CR followed by LF (CR+LF, 0x0D 0x0A). These characters are based on printer commands: the line feed indicated that one line of paper should feed out of the printer, and a carriage return indicated that the printer carriage should return to the beginning of the current line.

Here are the details.

Underpants answered 12/10, 2009 at 4:52 Comment(0)
G
1

NL is derived from EBCDIC NL = 0x15 which would logically compare to CRLF 0x0D 0x0A ASCII... This becomes evident when physically moving data from mainframes to midrange. Colloquially (as only arcane folks use EBCDIC), NL has been equated with either CR or LF or CRLF.

Griner answered 16/6, 2017 at 0:45 Comment(0)
