What is the value of '\n' under C compilers for old Mac OS?

Background:

In versions of Mac OS up to version 9, the standard representation for text files used an ASCII CR (carriage return) character, value decimal 13, to mark the end of a line.

Mac OS 10, unlike earlier releases, is UNIX-like, and uses the ASCII LF (line feed) character, value decimal 10, to mark the end of a line.

The question is, what are the values of the character constants '\n' and '\r' in C and C++ compilers for Mac OS releases prior to OS X?

There are (at least) two possible approaches that could have been taken:

  1. Treat '\n' as the ASCII LF character, and convert it to and from CR on output to and input from text streams (similar to the conversion between LF and CR-LF on Windows systems); or
  2. Treat '\n' as the ASCII CR character, which requires no conversion on input or output.

There would be some potential problems with the second approach. One is that code that assumes '\n' is LF could fail. (Such code is inherently non-portable anyway.) The other is that there still needs to be a distinct value for '\r', and on an ASCII-based system CR is the only sensible value. And the C standard doesn't permit '\n' == '\r' (thanks to mafso for finding the citation, 5.2.2 paragraph 3), so some other value would have to be used for '\r'.
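
As a rough illustration of the first approach (not taken from any actual Mac OS compiler's library), a runtime could keep '\n' as LF in memory and translate only at the text-stream boundary. The function names below are hypothetical, and the sketch assumes an ASCII-based execution character set.

```c
#include <stddef.h>

/* ASCII codes named in the question; the enum names are made up for this sketch. */
enum { ASCII_LF = 10, ASCII_CR = 13 };

/* Output direction: every in-memory LF is stored as CR. */
void text_out_translate(unsigned char *buf, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == ASCII_LF) {
            buf[i] = ASCII_CR;
        }
    }
}

/* Input direction: every stored CR is delivered to the program as LF. */
void text_in_translate(unsigned char *buf, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == ASCII_CR) {
            buf[i] = ASCII_LF;
        }
    }
}
```

In this sketch a CR that is already present in memory is also written as CR, so a round trip through a file turns it into LF. A real library would have to decide how to handle that case.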

What is the output of this C program when compiled and executed under Mac OS N, for N less than 10?

#include <stdio.h>
int main(void) {
    printf("'\\n' = %d\n", '\n');
    printf("'\\r' = %d\n", '\r');
    if ('\n' == '\r') {
        printf("Hmm, this could be a problem\n");
    }
}

The question applies to both C and C++. I presume the answer would be the same for both.

The answer could also vary from one C compiler to another -- but I would hope that compiler implementers would have maintained consistency with each other.

To be clear, I am not asking what representation old releases of Mac OS used to represent end-of-line in text files. My question is specifically and only about the values of the constants '\n' and '\r' in C or C++ source code. I'm aware that printing '\n' (whatever its value is) to a text stream causes it to be converted to the system's end-of-line representation (in this case, ASCII CR); that behavior is required by the C standard.

Garey answered 31/7, 2014 at 18:4 Comment(29)
added mac-classic tag...Swanskin
AFAIK, the values of \n and \r have always been ASCII newline and ASCII carriage return -- 0x0A and 0x0D. But I gather that early (pre-*nix) versions of MacOs used more like a DOS/Windows line terminator -- \r\n in sequence.Longobard
@GradyPlayer: Thanks. Deleted osx tag (automatically changed from "macos", which is what I typed).Garey
I can't provide the exact output, but I am pretty sure old macs (and some other oses) used carriage returns (\r) to create a new line: en.wikipedia.org/wiki/NewlineDoner
@HotLicks: No, I'm fairly sure that old Mac OS used just \r, not \r\n, to mark end-of-line in files. But my question is about the constants in C or C++ code.Garey
if you are using ascii, then the codes will be the same, as for the appropriate line endings it changes a lot depending on context... there are possible scenarios where "\n", "\r" or "\r\n" could be appropriate.Swanskin
The constants have not changed, since they are mnemonically tied to the ASCII character names. (Though of course there may have been, somewhere in history, a bastardized C compiler that mapped them differently.)Longobard
of course if you were using a different character encoding it would be different... UTF-8 or MacRoman are the same, as they keep the core of ASCII as the sameSwanskin
@HotLicks: '\n' is tied to the term "newline", which does not appear in ASCII. UNIX has a convention of using LF (line feed) to mark the end of a line.Garey
Are you sure your test program covers your question entirely? What if printf("bla\n"); actually prints a carriage return, but printf("%d\n", '\n'); doesn't give the ASCII value of the carriage return?Xiaoximena
Well, there isn't a \l because you couldn't tell it from \1 -- \n was used instead, and the "newline" moniker was hung on it.Longobard
Booting up my old iMac now... I'll have an answer soon. (Mac OS 8.6)Grainy
@ouah: Yes, the test program covers the question I'm actually asking. I'm aware that printf("bla\n") will print an ASCII CR character, because that's what the OS uses to mark end-of-line in text files.Garey
@HotLicks: That's an interesting explanation. Can you cite a source for it?Garey
Understand that the whole concept is tied to the Teletype and its kin. The carriage-return caused the print head to fly back to the left margin, and the line-feed caused the platen to advance the paper one line. These were physical operations that didn't care what software was being used. *nix systems, when you routed output to a TTY, would scan the outgoing stream and insert line-feed after each carriage-return. Most other systems expected the user to supply both characters.Longobard
Classic Mac used \r as the line separator in text files. Since the C I/O library is responsible for transforming between filesystem and in-memory representation, I'd guess \ns were serialized as \r.Vitia
@KeithThompson - I read it somewhere once.Longobard
To address how standard conforming '\n' = '\r' would be: Seems unchanged since C89… C11 5.2.2 p3: “Each of these escape sequences shall produce a unique implementation-defined value […]”. So, no, it's not conforming. But what about a character set using 0xd for '\n' and 0xa for '\r'?Adrian
I'm vaguely recalling that, when they built the first Apple machines, Jobs and Wozniak got the CR/LF thing backwards in their CRT-based TTY substitute, and had to fudge it in the software. I suppose this confusion could have been carried forward into the early Macs, but it became untenable when "portable" C programs started being passed around.Longobard
@HotLicks The way I remember it, old macs used '\r' precisely in the same way as UNIX systems use '\n'. There was never a DOS-like combination in use. Microsoft users always annoyed us with that extra character - the mac users with the line-feed, the unix users with the carriage-return...Closet
The thing that is being ignored by some is that the printf and scanf family functions didn't just pass any \n and \r that occurred in their first argument (as opposed to converted arguments) through unmodified. So when I wrote K&R-style utilities, they produced files that worked on the machine they were run on and needed translation to work properly on a unix or dos machine. And as @Xiaoximena says, you can't test that with a numeric comparison of the values of the character literals or by printing the results after a %d conversion.Dugaid
@dmckee: printf and scanf work as if they repeatedly called fputc or fgetc respectively. The following are all equivalent: printf("\n"), printf("%s", "\n"), printf("%c", '\n') and fputc('\n', stdout). Conversion of '\n' (whatever it is) to and from the system's end-of-line marker happens for I/O to any text stream. That conversion is not what I'm asking about. I am specifically asking about the values of '\n' and '\r' -- and printf("%d\n", '\n') does answer that question.Garey
The compiler's representation of the constant '\n' and the runtime library's decision of which character to translate to an end-of-line representation do have to agree for the implementation to be conforming. (For UNIX-like systems that's trivial, since '\n' is 10, i.e. LF, which is the system's end-of-line representation).Garey
Fair enough. There just seems to be a lot of noise in the comments from people who apparently never had to deal with the way these systems worked on the ground.Dugaid
@Adrian I don't think you are right. \n is still unique and different from memory than \r. It's just that the classic Mac C I/O library is responsible for translating between \r and \n when writing to or reading from a file.Vitia
@Adrian ah OK, sorry then.Vitia
On early teletypes, a carriage return issued near the right margin would have to be followed by a non-printable character in order to ensure that the print head could reach the left edge of the paper before the next printable character arrived (there was zero buffering). Requiring newlines to be punched on tape as a CR/LF sequence didn't add any real overhead. In cases where carriage-return speed wasn't an issue, many printers offered an option to advance the paper when given a CR alone; some others offered an option to perform a carriage return when given an LF alone.Megaphone
What would have been ideal would have been to have separate codes for CR only, LF only, and CR+LF. Having CR and LF both have one bit that was set, and having the CR+LF code set both bits, would have been no more complicated than what was actually done (if anything it would have been a few transistors cheaper), but a newline would still have had to be two characters (a CR+LF character and a NUL) so there would have been no advantage until buffered printers came along.Megaphone
@cmaster: The "DOS-like combination" predates DOS by well over a decade, going back to the ASR-33 teletype (1963). I think the discrepancy over which character should be considered the "newline" probably stems from the fact that on many terminals, a lone LF would move the cursor to the left side of the next line, but a CR was easier to type. Thus, if one copies stdin to a file and then outputs the file to the console, something will have to translate the typed CR into an LF. Macintosh stores the file as typed, while Unix stores it as it should be output.Megaphone
F
45

The values of the character constants \r and \n were exactly the same in Classic Mac OS environments as they were everywhere else: \r was CR was ASCII 13 (0x0d); \n was LF was ASCII 10 (0x0a). The only thing that was different on Classic Mac OS was that \r was used as the "standard" line ending in text editors, just like \n is used on UNIX systems, or \r\n on DOS and Windows systems.

Here's a screenshot of a simple test program running in Metrowerks CodeWarrior on Mac OS 9, for instance:

[Screenshot: example program running in CodeWarrior]

Keep in mind that Classic Mac OS systems didn't have a system-wide standard C library! Functions like printf() were only present as part of compiler-specific libraries like SIOUX for CodeWarrior, which implemented C standard I/O by writing output to a window with a text field in it. As such, some implementations of standard file I/O may have performed some automatic translation between \r and \n, which may be what you're thinking of. (Many Windows systems do similar things for \r\n if you don't pass the "b" flag to fopen(), for instance.) There was certainly nothing like that in the Mac OS Toolbox, though.
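
To see what a given C library actually stores for '\n', one option is to write a short string in text mode and in binary mode and then read the file back in binary mode. The sketch below uses only standard C I/O; the function names and the file name in the usage note are hypothetical, and nothing here is specific to CodeWarrior or SIOUX.

```c
#include <stdio.h>

/* Writes `text` to `path` using the given fopen mode ("w" or "wb"), then
   reads the file back in binary mode so no translation hides the stored
   bytes. Returns the number of bytes copied into `out`, or -1 on error. */
long stored_bytes(const char *path, const char *mode, const char *text,
                  unsigned char *out, size_t cap) {
    FILE *f = fopen(path, mode);
    if (f == NULL) {
        return -1;
    }
    if (fputs(text, f) == EOF) {
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0) {
        return -1;
    }

    f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    size_t n = fread(out, 1, cap, f);
    fclose(f);
    return (long)n;
}

/* Prints the stored bytes in decimal so CR (13) and LF (10) are visible. */
void show_stored_bytes(const char *path, const char *mode, const char *text) {
    unsigned char buf[64];
    long n = stored_bytes(path, mode, text, buf, sizeof buf);
    printf("mode \"%s\":", mode);
    for (long i = 0; i < n; i++) {
        printf(" %d", buf[i]);
    }
    printf("\n");
}
```

Calling show_stored_bytes("probe.txt", "w", "foo\n") and then show_stored_bytes("probe.txt", "wb", "foo\n") shows whether the two modes differ. On a UNIX-like system both print 102 111 111 10; what a particular classic Mac OS library would print is exactly the open question here.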

Forestforestage answered 31/7, 2014 at 19:7 Comment(15)
Do you have a reference for this? BTW, any conforming hosted C implementation must include the full standard library, including printf; such an implementation needn't be provided by the OS.Garey
Does this mean stdout (in text mode) emitted 13 on both fputc('\n') [translated due to text mode] and fputc('\r')?Beutler
@chux Talking about what character is "emitted" is a little messy here because classic Mac OS didn't have true file descriptors — some standard I/O implementations display output in graphical form only! Even when I/O went to a text editor, there's no guarantee those would preserve control characters.Forestforestage
It might be useful to try to find classic-mac ports of unix tools for accessing portable binary and text files and see what the code did.Snapback
@duskwuff: But stdio can open files (fopen) and which bytes are written there is clearly testable.Iodometry
@R.. If you can give me an example of what you want to test, I can see about running that later. The machine I've got is pretty unstable, though, and getting even this test to run (and getting the screenshot back off the machine) was kind of a pain, so I'd rather not dink around too much.Forestforestage
@duskwuff: I think determining how text file translations occur is explicitly outside the scope of the question. The question is about the actual value of '\n' which you established well. I already upvoted your answer and I think it should be the accepted answer.Iodometry
One thing to remember in the old days was that basically you didn't have a shell (unless you were using the one that came with code warrior, which was just a program that wired a text view control to their std lib io) and you didn't ever use printf, you almost always dealt with things like writing to a file with some Mac Toolbox call that used Pascal Strings anyway... I am so glad that you are still able to run CodeWarrior!Swanskin
Having never been a Mac programmer, I still have questions. How many compilers were available for the Mac? Were they all the same in their interpretation of \r and \n? Which compiler was considered standard?Selfgovernment
@MarkRansom There were several compilers available. Turbo C from Borland, MPW C and CodeWarrior were the common ones. There were others, including a couple of gcc ports (a 1.37 with its own goofy IDE, and an early version 2 that I never had the kit to run). The ones that could be run under MPW all did the same thing.Dugaid
This answer is correct. There was also, however, a variation of the standard library available that interpreted files opened in text mode. That is, if the file contained a \r (13) you'd get a \n (10) when you read it in text mode. On output, you'd write a \r (13) and it would actually be written to disk as a \n (10).Eliezer
Is there no terminal in classic Mac OS?Towardly
@LưuVĩnhPhúc None whatsoever! It is a very different environment from what you're probably familiar with.Forestforestage
What a strange OS! Even Windows has a place for you to type commandsTowardly
@LưuVĩnhPhúc: well there was Apple's MPW which was a wonderful environment for developers - you could use it like a shell (or terminal), but every "session" was also an editable text document - I miss it.Gobbet
S
5

I've done a search and found this page with an old discussion where especially the following can be found:

The Metrowerks MacOS implementation goes a step further by reversing the significance of CR and LF with regard to the '\r' and '\n' escapes in i/o involving a file, but not in any other context. This means that if you open a FILE or fstream in text mode, every '\r' will be output there as an LF as well as every '\n' being output as CR, and the same is true of input - the escape-to-ASCII-binary correspondences are reversed. They are not reversed however in memory, e.g. with sprintf() to a buffer or with a std::stringstream. I find this confusing and, if not non-standard, at least worse than other implementations.

It turns out there is a workaround with MSL - if you open the file in binary mode then '\n' always == LF and '\r' always == CR. This is what I wanted but in getting this information I also got a lot of justification from folks over there that this was the "standard" way to get what I wanted, when I feel like this is more like a workaround for a bug in their implementation. After all, CR and LF are 7-bit ASCII values and I'd expect to be able to use them in a standard way with a file opened in text mode.

(An answer makes clear that this is indeed not a violation of the standard.)

So obviously there was at least one implementation which used \n and \r with the usual ASCII values, but translated them in (non-binary) file output (by just exchanging them).
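
The swapping behavior described in the quoted discussion can be sketched as a single byte-level mapping. This is only an illustration under the assumption that the description is accurate; it is not MSL source code, and both function names are hypothetical.

```c
#include <stddef.h>

/* Swaps ASCII CR (13) and LF (10); every other value passes through. */
int swap_cr_lf(int c) {
    if (c == 10) {
        return 13;
    }
    if (c == 13) {
        return 10;
    }
    return c;
}

/* Applies the swap to a whole buffer, as a text-mode stream might do on
   its way to or from the file. Binary mode would skip this step. */
void swap_buffer(unsigned char *buf, size_t n) {
    for (size_t i = 0; i < n; i++) {
        buf[i] = (unsigned char)swap_cr_lf(buf[i]);
    }
}
```

Because the mapping is its own inverse, applying it on output and again on input returns the original bytes, so both '\n' and '\r' survive a text-mode round trip.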

Soak answered 31/7, 2014 at 21:19 Comment(2)
Assuming that '\n' == 10 (ASCII LF), writing '\n' to a text stream would have to translate it to CR, since that's the system's end-of-line marker. Additionally translating '\r' to LF makes some sense, I suppose, and opening a file in binary mode would inhibit the translation. (This is all standard C stuff). So under Metrowerks, the output of the program in my question would be '\n' = 10 '\r' = 13. As far as my question is concerned, that's consistent with duskwuff's answer.Garey
@KeithThompson The compiler I used to illustrate my answer was Metrowerks CodeWarrior!Forestforestage
M
2

On older Mac compilers, the roles of \r and \n were reversed: We had '\n' == 13 and '\r' == 10, while today '\n' == 10 and '\r' == 13. Great fun during the transition phase. Write a '\n' to a file with an old compiler, read the file with a new compiler, and get a '\r' (of course, both times you actually had a number 13).
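
Whatever a given compiler did, code that has to read files from both eras can avoid the issue by working on raw bytes and using the numeric values directly instead of '\n' and '\r'. The following is a minimal sketch with a hypothetical function name; it assumes the data was read from a file opened in binary mode.

```c
#include <stddef.h>

/* Copies `in` to `out`, rewriting CR (13), LF (10) and the CR LF pair as a
   single LF. `out` must have room for at least `n` bytes.
   Returns the number of bytes written to `out`. */
size_t normalize_line_endings(const unsigned char *in, size_t n,
                              unsigned char *out) {
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == 13) {
            /* Treat CR LF as one line ending by skipping the LF. */
            if (i + 1 < n && in[i + 1] == 10) {
                i++;
            }
            out[j++] = 10;
        } else {
            out[j++] = in[i];
        }
    }
    return j;
}
```

Since the comparison uses 13 and 10 rather than the escape sequences, the result does not depend on which values a particular compiler assigns to '\r' and '\n'.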

Morales answered 31/7, 2014 at 18:27 Comment(18)
I don't think this answers OP's question. OP is not asking about text file translations but the actual value of the expression '\n'.Iodometry
but this was done by convention I presume, right? to allow for code compatibility rather than C language adherence? I checked MacRoman, and \n (LF) is still 0xa and \r (CR) is still 0xdSwanskin
I think you have that backwards; '\n' == 10 and '\r' == 13 would be the modern version.Garey
The problem of reading and writing to the file would also occur, if I/O functions did the conversions (as in Windows with \n to \r\n). Just to be sure: Does '\n' == 13 evaluate to 1 on such a system?Adrian
Can you tell when that switch happened? Because I don't remember '\n' being anything other than the line feed character back then, but that might have to do with the time I started programming on the mac.Closet
@R..: Apart from the apparent reversal I mentioned in my previous comment, this does seem to answer the question I asked; it specifically says '\n' == 13. (The last part of the answer is about text file translations, which isn't what I asked about.) Waiting for clarification from gnasher729.Garey
This is not true. \r is always CR is ASCII 13; \n is always LF is ASCII 10. Classic Mac OS just used \r as the standard line ending instead of \n, just like DOS/Windows use the \r\n sequence.Forestforestage
(Where by "is always", of course I only mean that to apply to the common ASCII systems: UNIX, classic and modern Mac OS, and Windows. EBCDIC systems and so on are a completely different matter!)Forestforestage
@duskwuff: But we have an assertion that under Classic Mac OS, '\n' == 13. I think we need to see the actual output of the program in my question to settle this.Garey
@KeithThompson I'll boot up my old Mac in a few minutes and get a screenshot. :)Forestforestage
@duskwuff When you experiment, note that text mode IO operations do conversions. So try saving "foo\n" to files opened in both text and binary modes and see what bytes get actually written.Globin
@KeithThompson After endless wrestling with old computers and old file formats… added a screenshot to my answer!Forestforestage
@duskwuff has provided screenshots indicating that '\n' == 10 and '\r' == 13; presumably '\n' would be translated to CR on output. Your answer seems to imply that the values of the '\n' and '\r' literals were reversed, which would be inconsistent with the evidence of the screenshots. Could it be that different C compilers behaved differently?Garey
@KeithThompson I can't rule that out, as I can't remember how to build an executable under MPW, and I don't have any other compilers available. :) As far as I'm aware, though, CodeWarrior was not unusual in this respect.Forestforestage
Fixed the numbers; sorry about that.Morales
The CodeWarrior C compiler switched when they started producing MacOS X code. Actually, the same compiler could produce code for pre-MacOS X and for MacOS X, and '\n' would be translated differently.Morales
@gnasher729: What exactly do you mean by "'\n' would be translated differently"? There are two translations occurring: the compile-time translation of the constant '\n' in source code, and the run-time translation of that value to the system's end-of-line marker on output (and vice versa on input). The first is the only thing I'm asking about. Do you assert that the output of the program in my question under CodeWarrior would be '\n' = 13 '\r' = 10? If so, then CodeWarrior behaved quite differently from what we see in duskwuff's screenshot (and I think he used CodeWarrior).Garey
@KeithThompson Yes, that's what I used. (Added a bit to my answer to make that clear.)Forestforestage
S
1

C-language specification:

5.2.2
...
2 Alphabetic escape sequences representing nongraphic characters in the execution character set are intended to produce actions on display devices as follows:
...
\n (new line) Moves the active position to the initial position of the next line.
\r (carriage return) Moves the active position to the initial position of the current line.

so \n represents the appropriate char in that character encoding... in ASCII is the LF char

Swanskin answered 31/7, 2014 at 18:20 Comment(6)
I don't believe this answers the question. That section also describes the intended behavior of \f and \v; few systems actually behave as described.Garey
@KeithThompson I am looking for a document that supports a swap of char values... by the compilers... I believe that they worked this way, I have some old Classic programs that seem to enforce this idea, but I can't run them right now.Swanskin
@GradyPlayer: if you have some executable files from that period you could examine your strings with a hex viewer. I'm pretty sure a modern one won't silently translate \r back to \n again ;-)Unblinking
I think this does answer the question... We didn't really run portable C programs on the Mac, they were originally written in Pascal and then later C/C++, but we didn't even have a console... so the concept of a newline was basically just in the native controls and in text files, and I remember every mac program reading \n, \r, and \r\n as a newline... where would printf even print?Swanskin
@GradyPlayer: The question was about the actual values of the '\n' and '\r' character constants, not about how I/O behaves. duskwuff has been able to compile and run the portable C program from my question and produce a screenshot.Garey
@KeithThompson That is in SIOUX, which is actually how I learned C, but I think the invariant holds \n == lf... if you output to a file "hello\nthere\rworld\r\n..." it would depend on what application opened the file on how it was interpreted, but it would be 4 lines if you opened it in BBEdit...Swanskin
K
1

I don't have an old Mac compiler to check if they follow this, but the numeric value of '\n' should be the same as the ASCII new line character (given that those compilers used ASCII compatible encoding as the execution encoding, which I believe they did). '\r' should have the same numeric value as the ASCII carriage return.

The library or OS functions that handle writing text mode files are responsible for converting the numeric value of '\n' to whatever the OS uses to terminate lines. The numeric values of these characters at runtime are determined entirely by the execution character set.

Thus, since we're still using ASCII-compatible execution encodings, the numeric values today should be the same as they were with classic Mac compilers.

Kellda answered 31/7, 2014 at 18:40 Comment(15)
ASCII does not imply that the LF character is a "newline" character; in fact ASCII has no newline character. A conforming C implementation for old Mac OS could have '\n' == 13 && '\r' == 10 (with no conversion needed on input or output) or it could have '\n' == 10 && '\r' == 13 (and convert LF to CR on text output and CR to LF on text input). The C and ASCII standards by themselves do not answer my question, which is about which of the two (or more) valid choices was made by the authors of C and C++ compilers for classic Mac OS.Garey
That would be a conforming implementation, but it would mean the encoding wasn't ascii. The ascii character doesn't need to be named 'newline' to mean the same thing. The question boils down entirely to 'what is the execution encoding used by classic Mac computers.'Kellda
How so? What in the ASCII standard implies that LF is the correct character to use for newline?Garey
Well, I'm sure I could find a rationale, but I can't find a copy of ANSI Document X3.4-1986 (R1997). 'line feed' certainly causes printing on the current line to end.Kellda
And how closely do existing C implementations follow the ASCII standard? Printing a formfeed or vertical tab to a terminal (say, an xterm window) typically does nothing, for example. I don't think we can use the requirements of the ASCII standard to infer how C compilers implement the '\n' character constant.Garey
Obviously what an ascii control character actually does is up to the mechanism that's reading it, but in my experience console programs do implement most of the ascii control characters and C implementations do map the escapes to the sensible character. \a causes a sound to play (or the screen to flash if the console program is configured that way), \b does move the cursor back one character, \f clears the screen, \r does start printing from the beginning of the line without also moving down to a new line, \v moves down several lines, \t does produce a horizontal tab.Kellda
It might depend on what sort of device your program is emulating, VT100 or whatever.Kellda
I'm not sure I've ever seen a terminal emulator clear the screen on \f. I've just tried xterm, Gnome terminal, lxterm, xfce4-terminal, and PuTTY; none of them behave that way.Garey
@KeithThompson I'm getting that behavior on PuTTY now.Kellda
Correction: I haven't been able to try PuTTY. By "that behavior" do you mean that the screen is cleared?Garey
Yeah. Well, scrolled down to where the next printed line appears at the top of the window.Kellda
There are different areas: First, the compiler must decide what the values of '\n' and '\r' are. That value has changed from pre-MacOS X to MacOS X compilers. Next, when printing '\n' as a char to a text file (not a binary file), printf must replace that char value with the OS-dependent correct character(s). What the correct characters for a line separator are has also changed from MacOS to MacOS X. But third, if you want to write a utility today that translates Windows-style text files (cont'd)Morales
to MacOS 9 style text files, then you really can't rely on what printf does, and what the compiler says '\n' is; you just have to know that Windows typically used two specific chars and MacOS 9 used another char.Morales
@gnasher729: Your first point, the values of '\n' and '\r', is the only thing I'm actually asking about, and the screenshot in duskwuff's answer indicates that those values have not changed from pre-MacOS X to MacOS X compilers. I'd be very interested in seeing evidence, preferably in the form of copy-and-pasted or screen-captured output of the program in my question, that indicates otherwise (which would imply that different pre-MacOS X compilers behaved differently).Garey
ASCII code 10 is "linefeed". Traditionally, it would advance paper without resetting the carriage position, or on display terminals it would move the cursor to the next line without resetting it to the left edge. While some terminals had an option to automatically advance paper upon receipt of a carriage return, someone (perhaps Digital Equipment Corporation) came up with the brilliant observation that resetting the carriage without advancing paper was useful more often than advancing paper without resetting the carriage, and thus implemented LF as a double-duty code.Megaphone
