Why does/did C allow implicit function and typeless variable declarations?

Why is it sensible for a language to allow implicit declarations of functions and typeless variables? I get that C is old, but allowing declarations to be omitted, defaulting to int() (or int in the case of variables), doesn't seem so sane to me, even back then.

So, why was it originally introduced? Was it ever really useful? Is it actually (still) used?

Note: I realise that modern compilers give you warnings (depending on which flags you pass them), and you can suppress this feature. That's not the question!


Example:

int main() {
  static bar = 7; // defaults to "int bar"
  return foo(bar); // defaults to a "int foo()"
}

int foo(int i) {
  return i;
}
Rimester answered 6/8, 2012 at 19:54 Comment(18)
That feature has been removed from the language. Unfortunately, many compilers still accept it by default. – Pesthole
Possible duplicate of Concept of "auto" keyword in c – Underglaze
It doesn't as of C99. Modern compilers don't emit a warning as you say; they emit an error. You aren't using a "modern" compiler. I'm guessing Visual Studio. – Daddylonglegs
@DanielFischer: Thanks, I wasn't aware of that (I'm behind on my standards reading). Anyway, I was mostly concerned with why it was introduced in the first place. – Rimester
Programmers are lazy, and int is a powerfully universal type. Being able to omit it lets you write lots of quick-and-dirty code very succinctly -- "brevity is the soul of wit", as William Shatner said. It's only recently that space (both disk and screen) has become so abundant and programs so large that clean, readable style is preferable to brevity. – Weatherworn
@0A0D: Using auto here is only to get the compiler to understand that bar is supposed to be a variable. I could have used static as well. – Rimester
My understanding is that "back then" C was "a better assembly", heavily borrowing from the underlying PDP architecture and mapping to it very neatly. Unfortunately, the language borrowed the scarce type structure as well. – Mackintosh
How is this not a real question? Seriously, what's going on? – Rimester
I agree. It will be closed by those over-eager fellows who maintain that this is subjective. It is not; there was obviously a reason behind making these rules a part of the standard. They would likely answer this in the form of "because it said so in the standard". IMO Stack Overflow is worse off for this sort of moderation. Oh well, voted to reopen. – Daddylonglegs
I agree, it's a real (and not uninteresting) question. But it's off topic and/or not constructive, I think. We could all only guess why. – Pesthole
@EdS., Daniel: I'm a terrible judge of what belongs on programmers.SE (I'm almost invariably wrong), but would this be a candidate? – Rimester
I think it's a better fit here. Just wait for it to be opened again. – Daddylonglegs
I'm not familiar with Programmers, but it may well be. Somebody should ask a mod over there. – Pesthole
It's been opened again, post away. – Daddylonglegs
The processor itself does not deal with types, only values and memory locations. – Plasticine
"We could all only guess why." -- Those who are familiar with the history or are willing to do research can do far better than guess. – Wizardry
"heavily borrowing from the underlying PDP architecture, and mapping to it very neatly" -- This is largely a myth. See "More History" in cm.bell-labs.com/who/dmr/chist.html – Wizardry
@bitmask: very good question. +1 from me. – Denature

See Dennis Ritchie's "The Development of the C Language": http://web.archive.org/web/20080902003601/http://cm.bell-labs.com/who/dmr/chist.html

For instance,

In contrast to the pervasive syntax variation that occurred during the creation of B, the core semantic content of BCPL—its type structure and expression evaluation rules—remained intact. Both languages are typeless, or rather have a single data type, the 'word', or 'cell', a fixed-length bit pattern. Memory in these languages consists of a linear array of such cells, and the meaning of the contents of a cell depends on the operation applied. The + operator, for example, simply adds its operands using the machine's integer add instruction, and the other arithmetic operations are equally unconscious of the actual meaning of their operands. Because memory is a linear array, it is possible to interpret the value in a cell as an index in this array, and BCPL supplies an operator for this purpose. In the original language it was spelled rv, and later !, while B uses the unary *. Thus, if p is a cell containing the index of (or address of, or pointer to) another cell, *p refers to the contents of the pointed-to cell, either as a value in an expression or as the target of an assignment.

This typelessness persisted in C until the authors started porting it to machines with different word lengths:

The language changes during this period, especially around 1977, were largely focused on considerations of portability and type safety, in an effort to cope with the problems we foresaw and observed in moving a considerable body of code to the new Interdata platform. C at that time still manifested strong signs of its typeless origins. Pointers, for example, were barely distinguished from integral memory indices in early language manuals or extant code; the similarity of the arithmetic properties of character pointers and unsigned integers made it hard to resist the temptation to identify them. The unsigned types were added to make unsigned arithmetic available without confusing it with pointer manipulation. Similarly, the early language condoned assignments between integers and pointers, but this practice began to be discouraged; a notation for type conversions (called `casts' from the example of Algol 68) was invented to specify type conversions more explicitly. Beguiled by the example of PL/I, early C did not tie structure pointers firmly to the structures they pointed to, and permitted programmers to write pointer->member almost without regard to the type of pointer; such an expression was taken uncritically as a reference to a region of memory designated by the pointer, while the member name specified only an offset and a type.

Programming languages evolve as programming practices change. In modern C and the modern programming environment, where many programmers have never written assembly language, the notion that ints and pointers are interchangeable may seem nearly unfathomable and unjustifiable.
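
To see how alien that mindset now looks, here is a minimal sketch (my own illustration, not Ritchie's) of the kind of int/pointer blurring that early, pre-standard C condoned; a modern compiler rejects or at least warns about nearly all of it:

/* Pre-standard flavour: an int and a pointer are both just machine words.
   The return type and the parameter type of deref both default to int. */
deref(p)
{
    return *(int *)p;  /* early code would even have omitted the cast */
}

int main(void)
{
    int x = 42;
    return deref(&x) - 42;  /* a pointer passed where an int is expected */
}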

Wizardry answered 6/8, 2012 at 20:26 Comment(5)
SRB suggested to DMR adding void as a type in C. "It saved the instruction loading register for return value." Plus, when SRB arrived, "C types were like PL/1, namely offsets from any base you liked". C changed to Algol68 types and a few other "A68isms" ended up in C... [Note: Algol68 had strong typing] – Lao
@Lao And there was his algol-influenced shell (research.swtch.com/shmacro). It's interesting that Algol60 -> CPL -> BCPL -> B -> C, and {Algol60, CPL} -> ALGOL68 -> C. Strachey's CPL was a very sophisticated language, and it's a pity that many of its excellent features were lost along the way (to reappear elsewhere, such as in Haskell and other functional languages). – Wizardry
I had spotted the "switch" stmts in C being similar to PL/I's "switch", and thought that was all. And I was surprised to find C previously had "PL/I's type offsets from any base". Interesting how C (a language designed to be tiny) managed to include influences from so many languages. Certainly BCPL warrants a peek. – Lao
@Lao CPL much more than BCPL ... it was designed by this guy: en.wikipedia.org/wiki/Christopher_Strachey – Wizardry
Link is broken. See web.archive.org/web/20080902003601/http://cm.bell-labs.com/who/… – Powwow

It's the usual story — hysterical raisins (aka 'historical reasons').

In the beginning, the big computers that C ran on (DEC PDP-11) had 64 KiB for data and code (later 64 KiB for each). There was a limit to how complex you could make the compiler and still have it run. Indeed, there was scepticism that you could write an O/S using a high-level language such as C, rather than needing to use assembler. So, there were size constraints. Also, we are talking a long time ago, in the early to mid 1970s. Computing in general was not as mature a discipline as it is now (and compilers specifically were much less well understood). Also, the languages from which C was derived (B and BCPL) were typeless. All these were factors.

The language has evolved since then (thank goodness). As has been extensively noted in comments and down-voted answers, in strict C99, implicit int for variables and implicit function declarations have both been removed. However, most compilers still recognize the old syntax and permit its use, with varying degrees of warning, to retain backwards compatibility, so that old source code continues to compile and run as it always did. C89 largely standardized the language as it was, warts (gets()) and all. This was necessary to make the C89 standard acceptable.

There is still old code around using the old notations — I spend quite a lot of time working on an ancient code base (circa 1982 for the oldest parts) which still hasn't been fully converted to prototypes everywhere (and that annoys me intensely, but there's only so much one person can do on a code base with multiple millions of lines of code). Very little of it still has 'implicit int' for variables; there are too many places where functions are not declared before use, and a few places where the return type of a function is still implicitly int. If you don't have to work with such messes, be grateful to those who have gone before you.
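
For the curious, here is a sketch of what converting one such function involves (an invented example, not from that code base; note that C23 finally removed K&R-style definitions altogether, so the "before" form may no longer even compile):

/* Before: K&R style, typical of early-1980s code. The return type
   defaults to int; parameter types are declared after the list. */
scale(x, factor)
int x, factor;
{
    return x * factor;
}

/* After: prototype style, with the declaration normally in a header. */
int scale(int x, int factor)
{
    return x * factor;
}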

Franchescafranchise answered 6/8, 2012 at 20:12 Comment(5)
Wouldn't the compiler get even bigger if you included these two features? Or were there no "proper" function declarations in C in those days? – Rimester
Don't forget that the prototype notation was not formally added to C until the C89 standard was released (and was adapted from C++, which had demonstrated that it was beneficial). That added to the complexity of C compilers. Prior to that, you might have written: main(){ static bar = 7; return foo(bar); } foo(i) { return i; } without the namby-pamby molly-coddling presented by prototypes and headers and so on. – Franchescafranchise
Ah yes, the size constraints... not only the size of the compiled code but the size of the source files as well. – Milly
I'd say the typelessness of BCPL (was B also typeless?) is likely the biggest reason. And it's definitely worth mentioning that "implicit int" was removed from the language by the C99 standard; static bar = 7; is now a syntax error. – Chicanery
In the beginning, C ran on the PDP-7, which had 8K of 18-bit words. The PDP-11 that they switched to only had 24K bytes, and they had to remain compatible with such machines even after they got PDP-11/45's with split I and D space. – Wizardry

Probably the best explanation for "why" comes from Dennis Ritchie's "The Development of the C Language":

Two ideas are most characteristic of C among languages of its class: the relationship between arrays and pointers, and the way in which declaration syntax mimics expression syntax. They are also among its most frequently criticized features, and often serve as stumbling blocks to the beginner. In both cases, historical accidents or mistakes have exacerbated their difficulty. The most important of these has been the tolerance of C compilers to errors in type. As should be clear from the history above, C evolved from typeless languages. It did not suddenly appear to its earliest users and developers as an entirely new language with its own rules; instead we continually had to adapt existing programs as the language developed, and make allowance for an existing body of code. (Later, the ANSI X3J11 committee standardizing C would face the same problem.)

Systems programming languages don't necessarily need types; you're mucking around with bytes and words, not floats and ints and structs and strings. The type system was grafted onto it in bits and pieces, rather than being part of the language from the very beginning. As C has moved from being primarily a systems programming language to a general-purpose programming language, it has become more rigorous in how it handles types. But, even though paradigms come and go, legacy code is forever. There's still a lot of code out there that relies on that implicit int, and the standards committee is reluctant to break anything that's working. That's why it took almost 30 years to get rid of it.

Palladous answered 6/8, 2012 at 20:27 Comment(0)

A long, long time ago, back in the K&R, pre-ANSI days, functions looked quite different than they do today.

add_numbers(x, y)
{
    return x + y;
}

int ansi_add_numbers(int x, int y); // modern, ANSI C

When you call an unprototyped function like add_numbers, there is an important difference in the calling conventions: all arguments are "promoted" when the function is called. So if you do this:

// no prototype for add_numbers
short x = 3;
short y = 5;
short z = add_numbers(x, y);

What happens is x is promoted to int, y is promoted to int, and the return type is assumed to be int by default. Likewise, if you pass a float it is promoted to double. These rules ensured that prototypes weren't necessary, as long as you got the right return type, and as long as you passed the right number and type of arguments.
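
This is where one of the most notorious pre-C99 bugs came from. As a small illustration (mine, not from the original answer): call sqrt without declaring it, and the compiler assumes it returns int, so the double it actually returns is misread:

#include <stdio.h>
/* Deliberately no #include <math.h>. Under pre-C99 rules the call below
   implicitly declares "int sqrt();", so the double actually returned is
   misinterpreted as an int, and this prints garbage instead of 1.414...
   (C99 and later reject the implicit declaration outright.) */

int main(void)
{
    double r = sqrt(2.0);
    printf("%f\n", r);
    return 0;
}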

Note that the syntax for prototypes is different:

// K&R style function
// number of parameters is UNKNOWN, but fixed
// return type is known (int is default)
add_numbers();

// ANSI style function
// number of parameters is known, types are fixed
// return type is known
int ansi_add_numbers(int x, int y);

A common practice back in the old days was to avoid header files for the most part, and just stick the declarations for the functions you used directly in your code:

void *malloc();

char *buf = malloc(1024);
if (!buf) abort();

Header files are accepted as a necessary evil in C these days, but old-timers didn't really like using them either, and modern C derivatives (Java, C#, etc.) have gotten rid of header files entirely.

Type safety

From what I understand about the old old days of pre-C, there wasn't always much of a static typing system. Everything was an int, including pointers. In this old language, the only point of function prototypes would be to catch arity errors.

So if we hypothesize that functions were added to the language first, and then a static type system was added later, this theory explains why prototypes are optional. This theory also explains why arrays decay to pointers when used as function arguments -- since in this proto-C, arrays were nothing more than pointers which get automatically initialized to point to some space on the stack. For example, something like the following may have been possible:

function()
{
    auto x[7];   /* x is a word automatically initialized to point at 7 cells of stack space */
    x += 1;      /* ...so it can be reassigned and stepped like any other pointer */
}

Citations

On typelessness:

Both languages [B and BCPL] are typeless, or rather have a single data type, the 'word,' or 'cell,' a fixed-length bit pattern.

On the equivalence of integers and pointers:

Thus, if p is a cell containing the index of (or address of, or pointer to) another cell, *p refers to the contents of the pointed-to cell, either as a value in an expression or as the target of an assignment.

Evidence for the theory that prototypes were omitted due to size constraints:

During development, he continually struggled against memory limitations: each language addition inflated the compiler so it could barely fit, but each rewrite taking advantage of the feature reduced its size.

Unearth answered 6/8, 2012 at 20:26 Comment(3)
Originally, you might well have written: add_numbers(x,y){return x+y;} (with no unnecessary spaces in that transcription; using 4 lines of code in source files). – Franchescafranchise
Make that char *malloc(); -- void * was also a late addition (as was plain void), making its way into the language only a little before the C89 standard. Horrible, aren't I? – Franchescafranchise
@JonathanLeffler: Fascinating. I take void * for granted these days, but it makes sense that it used to be char *, since the standard now says that void * and char * have to have the same representation. – Unearth

Some food for thought. (It's not an answer; we actually know the answer: it's permitted for backward compatibility.)

And people should look at a COBOL code base or Fortran 66 libraries before asking why it hasn't been cleaned up in 30 years or so!

gcc with its default flags does not emit any warnings.

With -Wall, gcc -std=c99 does emit the expected warnings:

main.c:2: warning: type defaults to ‘int’ in declaration of ‘bar’
main.c:3: warning: implicit declaration of function ‘foo’

The lint functionality built into modern gcc is showing its true colors here.

Interestingly, the modern clone of lint, the secure lint (I mean splint), gives only one warning by default.

main.c:3:10: Unrecognized identifier: foo
  Identifier used in code has not been declared. (Use -unrecog to inhibit
  warning)

The LLVM C compiler, clang, which like gcc has a built-in static analyser, emits both warnings by default.

main.c:2:10: warning: type specifier missing, defaults to 'int' [-Wimplicit-int]
  static bar = 7; // defaults to "int bar"
  ~~~~~~ ^
main.c:3:10: warning: implicit declaration of function 'foo' is invalid in C99
      [-Wimplicit-function-declaration]
  return foo(bar); // defaults to a "int foo()"
         ^

People used to think we wouldn't need backward compatibility for 80's stuff, and that all the old code would be cleaned up or replaced. But it turns out that's not the case: a lot of production code remains stuck in prehistoric, non-standard times.

EDIT:

I didn't look through the other answers before posting mine, so I may have misunderstood the intention of the poster. But the thing is, there was a time when you hand-compiled your code and used toggle switches to put the binary pattern into memory. You didn't need a "type system" then. Nor did the PDP machine in front of which Ritchie and Thompson posed like this:

Don't look at the beard, look at the "toggles", which I heard were used to bootstrap the machine.

[photo: Ritchie and Thompson at the PDP-11]

And also look at how they used to boot UNIX, in this paper from the Unix 7th Edition manual:

http://wolfram.schneider.org/bsd/7thEdManVol2/setup/setup.html

The point of the matter is that they didn't need much of a software layer to manage a machine with KB-sized memory. Knuth's MIX has 4000 words. You don't need all these types to program a MIX computer. You can happily compare an integer with a pointer on a machine like that.

I thought the reason why they did this was quite self-evident, so I focused on how much is left to be cleaned up.

Salchunas answered 6/8, 2012 at 20:33 Comment(4)
"we actually know the answer, its permitted for backward compatibility" -- That's only an answer if one gets no further than the title of the question rather than reading and understanding what the OP really wants to know.Wizardry
I didn't notice his EDIT. I guess i could expand my post reflecting the issue behind it.Salchunas
The edit was irrelevant to the point. Apparently you also didn't notice the other answers that actually address the issue.Wizardry
The PDP-11 toggles allowed entering binary values into memory; they served the same purpose as an EPROM loader. That it had toggle switches has no bearing at all on the question or on the value of type systems. I programmed the PDP-11 in C both before and after type safety was added, and type safety had all the benefits it has now. The amount of memory is irrelevant: the PDP-11 C compiler and UNIX operating system were complex software that greatly benefit from type safety. This non-answer reflects deep misunderstandings about types, programming languages, software systems, and computers.Wizardry
