Does C++20 mandate source code being stored in files?

A slightly strange question; however, if I remember correctly, C++ doesn't require source code to be stored in a file system.

A compiler that scans handwritten papers via a camera would be a conforming implementation, although not a very practical one.

However, C++20 now adds std::source_location with file_name(). Does this imply that source code must always be stored in a file?

Komsomol answered 18/8, 2019 at 20:40 Comment(11)
This has been in C since forever - __FILE__. Class source_location just allows you to get it at function call site.Diary
Can't you give filename to your handwritten papers?Psittacosis
I think it is an implementation detail whether the source code is in files, or something else. If the compiler can be fed source code through stdin, the source could be in a database.Peria
Good point, I forgot about preprocessor alreadyKomsomol
My example may be a bit off, but if you use some on-the-fly compiler, such as TCC, you can always supply some human-readable source name for the sake of error reporting even though you compile directly from memory. That is, having a "file name" does not imply being stored as a file at all.Decuple
Surely it's the implementation files such as <iostream> that may not be files (if you see what I mean), not the files written by developers?Procaine
If you run the OS in a container, chances are that there are no separate physical "files", however exactly you care to define those in this complicated world. Windows allegedly wanted to migrate to a database file system for Windows 8 (but scrapped that), which could conceivably be in the cloud, so there would be no files whatsoever. The compiler likely "opens" something having some properties of a "file" (hey, I can read and close!), but even for that there is no guarantee in an integrated environment or an interpreter where everything may be in memory. These terms are purely conceptual.Sacrosanct
@Psittacosis such as "<stdin>"? ☻Purdy
@mirabilos: "<stdin>" seems more appropriate for source provided by stdin. "Handwritten paper 01" might be more appropriate in such a case.Psittacosis
Your question is a little circular. What is a file ? What is a file system ?Consubstantial
There was the "everything is a file" philosophy, and now "why must source be in a file"...Everara

No, source code doesn't have to come from a file (nor go to a file).

You can compile (and link) C++ completely within a pipe, putting your compiler in the middle, e.g.

generate_source | g++ -o- -xc++ - | do_something_with_the_binary

and it's been like that for decades.

The introduction of std::source_location in C++20 doesn't change this state of affairs. It's just that some code will not have a well-defined source location (or it may be well-defined, but not very meaningful). Actually, I'd say that the insistence on defining std::source_location using files is a bit myopic... although in fairness, it's just a macro-less equivalent of __FILE__ and __LINE__ which already exist in C++ (and C).

@HBv6 notes that if you print the value of __FILE__ when compiling using GCC from the standard input stream:

echo -e '#include <iostream>\n int main(){std::cout << __FILE__ ;}' | g++ -xc++  -

running the resulting executable prints <stdin>.

Source code can even come from the Internet.

@Morwenn notes that this code:

#include <https://raw.githubusercontent.com/Morwenn/poplar-heap/master/poplar.h>

// Type your code here, or load an example.
void poplar_sort(int* data, size_t size) {
    poplar::make_heap(data, data + size);
    poplar::sort_heap(data, data + size);
}

works on GodBolt (but won't work on your machine - no popular compiler supports this.)

Are you a language lawyer? OK, so let's consult the standard.

The question of whether C++ program sources need to come from files is not answered clearly in the language standard. Looking at a draft of the C++17 standard (n4713), section 5.1 [lex.separate] reads:

  1. The text of the program is kept in units called source files in this document. A source file together with all the headers (20.5.1.2) and source files included (19.2) via the preprocessing directive #include, less any source lines skipped by any of the conditional inclusion (19.1) preprocessing directives, is called a translation unit.

So, the source code is not necessarily kept in a file per se, but in a "unit called a source file". But then, where do the includes come from? One would assume they come from named files on the filesystem... but that too is not mandated.

At any rate, std::source_location does not seem to change this wording in C++20 or to affect its interpretation.

Placeeda answered 18/8, 2019 at 21:30 Comment(18)
That pipe is a "source file" for the purposes of the standard.Farnesol
@melpomene: The units are just called source files, it doesn't say that they actually have to be source files. But I'll edit the answer to include this.Placeeda
Are we in Humpty Dumpty land, where words mean whatever we want them to mean? We're going to call them source files, even though they might not actually be what computer programmers normally mean by that phrase?Tray
The standard also says "kept", yet in the pipe example the text of the program isn't even kept anywhere, it's generated dynamically on the fly. It would probably have been better if it said something like "If the text of the program is in a named file, source_location::file_name contains the name of that file."Tray
@Barmar: The answer seems to be "Yes, we are." But hey - don't shoot the messenger, I didn't write the standard :-)Placeeda
"I'd say that the insistence on defining source_location using files is somewhat short-sighted or myopic" while I see your point, this is the exact kind of mindset that's kept C++ ages behind other languages. That is the reason we don't have a C++ repository to manage and pull libraries from, why we don't have standard stack trace, or #pragma once standardised, or even a file system library (until now), or a memory model until c++11 and the list goes on and on. [...]Patrilineal
[...] I understand there are difficulties defining these in a language meant to be implemented on any platform present or future, but still, they are keeping C++ from becoming a truly modern language, despite all of the recent efforts. Imho not having a library manager is just inexcusable at this point.Patrilineal
@bolov: That's even more incendiary than what I said... also, comments on this answer are not the appropriate venue for this discussion. I also disagree at least in part with your claims.Placeeda
"We're going to call them source files, even though they might not actually be...?" -- if they're stored somewhere, and they're referenced by names, they're files in a filesystem. Whether that filesystem is ext2, NTFS, WebDAV or IMAP is irrelevant. The filesystem is just an abstraction over a name : byte-sequence mapping.Prosser
@RogerLipscombe: The source in a pipe is stored somewhere - in memory; and - you can't reference it by name.Placeeda
Just tried this with GCC: "echo '#include <stdio.h>\nint main(){printf("%s\\n", __FILE__); return 1;}' | gcc -o test -xc -" (without quotes). When executed, it prints out <stdin>.Unpaid
That's a bonus feature of gcc, not C++.Prosser
@RogerLipscombe: Does it contradict the standard?Placeeda
I don't see how it's a "bonus feature". It's a compliant result of a standard feature.Cranium
Here is a funny thing about terms and names and concepts in standards (and sciences): they're usually atomic. That is, "source file" is not necessarily a "file" that is "source"; in fact, the term "file" may simply not be defined. Compare with numbers in maths: there is no such thing as just a "number", only "natural number", "rational number", "real number", etc.Bethelbethena
@Joker_vD: Well, maybe, but it doesn't say source-file or "source file" in quotes. Still, good point.Placeeda
The ability of Godbolt's compiler explorer to include files over the internet when specifying an URL in a #include directive is another example of file mapping that bypasses the traditional filesystem. The implementation is certainly a hack, but the standard definition seems to allow it.Aikoail
@Morwenn: Link to an example?Placeeda

Even before C++20, the standard has had:

__FILE__

The presumed name of the current source file (a character string literal).

The definition is the same for source_location::file_name.

As such, there has not been a change in regard to support for file system-less implementations in C++20.

The standard doesn't exactly define what "source file" means, so whether it refers to a file system may be up to interpretation. Presumably, it could be conforming for an implementation to produce "the handwritten note that you gave to me just then" if that indeed identifies the "source file" in that implementation of the language.
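To make the "presumed name" idea concrete: as far as I can tell, one reason the name is only "presumed" is the #line directive, which lets the program text itself override both the line number and the name. A minimal sketch (the name "handwritten paper 01" and the helper functions are invented for this example; nothing by that name needs to exist anywhere):

```cpp
#include <cassert>
#include <string>

// From here on, the implementation is told to presume line 17 of a "file"
// called "handwritten paper 01" (a made-up name, not a real path).
#line 17 "handwritten paper 01"
int presumed_line() { return __LINE__; }
std::string presumed_file() { return __FILE__; }

// Both macros now reflect the #line directive, not any real file on disk.
void check_presumed_location() {
    assert(presumed_line() == 17);
    assert(presumed_file() == "handwritten paper 01");
}
```

So even with an ordinary compiler reading an ordinary file, the reported name is just a string that need not correspond to anything in a file system.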


In conclusion: Yeah, sources are referred to as "files" by the standard, but what a "file" is and whether a file system is involved is unspecified.

Infarct answered 18/8, 2019 at 20:48 Comment(7)
What does "presumed" mean in this context? Can there be an ambiguity?Polk
I'm wondering, does my initial claim (not stored in file) contain a mistake, or is there a behavior for that case?Komsomol
@Polk I don't know the exact intention behind the "presumed" qualification in the rule, but I presume :) that it is a clarification that the file name doesn't need to be absolute or canonical; a name relative to the compiler's perspective is sufficient. I could be wrong.Infarct
I can just see scanner-c++ returning "Left-cabinet, third-drawer, fourth red-tabbed folder, page 17".Phosgene
FWIW, in the POSIX sense, a pipe (or any other file-ish thing) is a "file" - as such, stdin/stdout are "files", just not disk files etc. in this sense.Eblis
@Yksisarvinen: The Committee often makes allowances for situations where obscure implementations might have good reasons to do something contrary to commonplace behavior. In so doing, it relies upon compiler writers to judge whether their customers would find the commonplace behavior more or less useful than some alternative. The fact that such things are left to implementers' judgment may be viewed as an "ambiguity", but it's a deliberate one, since good compiler writers will know more about their customers' needs than the Committee ever could.Tudela
@dmckee ... in a disused lavatory with a sign on the door saying "Beware of the Leopard."Renounce
