Is C++ considered weakly typed? Why?
M

6

29

I've always considered C++ to be one of the most strongly typed languages out there.
So I was quite shocked to see Table 3 of this paper state that C++ is weakly typed.

Apparently,

C and C++ are considered weakly typed since, due to type-casting, one can interpret a field of a structure that was an integer as a pointer.

Is the existence of type casting all that matters? Does the explicit-ness of such casts not matter?

More generally, is it really generally accepted that C++ is weakly typed? Why?

Malevolent answered 5/11, 2014 at 9:23 Comment(13)
C++ isn't weakly typed, but you can subvert the type system if you want to. So one could argue that it isn't fully strongly typed.Economically
.. and if it allows you (type punning is tricky)Boogiewoogie
So how is C# considered strongly typed? Can't you do exactly the same thing with unsafe or by using the Marshal class?Malevolent
@Mehrdad I think "weakly typed" is a quite subjective term. "Strictly typed" and "statically typed" vs. "loosely typed" and "dynamically typed" are more objective, more precise words. From what I can tell, generally people use "weakly typed" as a diminutive-pejorative term which means "I don't like the notion of types in this language". It's sort of an argumentum ad hominem (or rather, argumentum ad linguam) for those who can't bring up professional-technical arguments against a particular language.Arceliaarceneaux
@TheParamagneticCroissant: I see. So what would "strictly typed" mean?Malevolent
Also the quote would suggest that it may be dynamically typed, rather than weakly typed.Economically
I believe you can do int x = false; because C++ allows conversion between bool and int (note that the compiler may WARN about it, but if it's not an error, then the language is weakly typed at least when it comes to bool and int combinations)Rios
@Mehrdad It also has slightly different interpretations; the generally accepted meaning is "the compiler generates errors if types don't match up". Another interpretation is that "there are no or few implicit conversions". Based on this, C++ can actually be considered a strictly typed language, and most often it is considered as such.Arceliaarceneaux
@Mehrdad Also, there are some programmers, especially beginners unfamiliar with a lot of languages, who don't intend to or can't make the distinction between "strict" and "static", "loose" and "dynamic", and conflate the two - otherwise orthogonal - concepts based on their limited experience (i. e. the correlation of dynamism and loose typing in popular scripting languages, for example). In reality, parts of C++ (virtual calls) impose the requirement that the type system be partially dynamic, but other things in the standard require that it be strict. Again, this is not a problem.Arceliaarceneaux
Bjarne Stroustrup's book mentions that C++ is a strongly typed language on page 2. Who would know better than him :-)Smaragdite
@MatsPetersson : I think it's worth pointing out that warnings can also be promoted to errors (-Werror with g++ for example).Blenny
That paper contradicts itself, and even the cited quote contains two false statements. First, "weakly typed" is a term they invented by inverting "strictly typed". Second, "one can interpret a field of a structure that was an integer as a pointer" is an oddly narrow case that almost never happens or gets used (and is generally undefined behaviour); it can only happen through an explicit cast that must dodge all the restrictions the language rules use to prohibit exactly that.Harping
They literally say that a gun is dangerous for its owner because the owner can commit suicide with it.Harping
P
39

That paper first claims:

In contrast, a language is weakly-typed if type-confusion can occur silently (undetected), and eventually cause errors that are difficult to localize.

And then claims:

Also, C and C++ are considered weakly typed since, due to type-casting, one can interpret a field of a structure that was an integer as a pointer.

This seems like a contradiction to me. In C and C++, the type-confusion that can occur as a result of casts will not occur silently -- there's a cast! This does not demonstrate that either of those languages is weakly-typed, at least not by the definition in that paper.

That said, by the definition in the paper, C and C++ may still be considered weakly-typed. There are, as noted in the comments on the question already, cases where the language supports implicit type conversions. Many types can be implicitly converted to bool, a literal zero of type int can be silently converted to any pointer type, there are conversions between integers of varying sizes, etc, so this seems like a good reason to consider C and C++ weakly-typed for the purposes of the paper.

For C (but not C++), there are also more dangerous implicit conversions that are worth mentioning:

int main() {
  int i = 0;
  void *v = &i;
  char *c = v;
  return *c;
}

For the purposes of the paper, that must definitely be considered weakly-typed. The reinterpretation of bits happens silently, and can be made far worse by modifying it to use completely unrelated types, which has silent undefined behaviour that typically has the same effect as reinterpreting bits, but blows up in mysterious yet sometimes amusing ways when optimisations are enabled.

In general, though, I think there isn't a fixed definition of "strongly-typed" and "weakly-typed". There are various grades: a language that is strongly-typed compared to assembly may be weakly-typed compared to Pascal. To determine whether C or C++ is weakly-typed, you first have to ask what you want weakly-typed to mean.

Preparatory answered 5/11, 2014 at 9:34 Comment(17)
+1 great point. But to answer the question you should also mention whether or not C++ is strongly typed by whatever the accepted definition is!Malevolent
@Mehrdad Agreed, and expanded my answer.Preparatory
Casts in C and C++ can occur silently, you know. Actually, both languages are full of subtle type traps. In my experience, most invocations of printf (and family) written by an average C/C++ programmer contain undefined behavior :)Duaneduarchy
@AndreyChernyakhovskiy No, they cannot. "Cast" means "explicit conversion" (or specifically, the syntax used to write an explicit conversion). There are implicit conversions, like I already noted in my answer, but they aren't called casts.Preparatory
@AndreyChernyakhovskiy: Unsafe conversions don't occur implicitly in C++, but they do in C (e.g. pointer conversions).Malevolent
@Mehrdad Ah, yes, I think you may have a good point there. I should add that to my answer.Preparatory
@hvd That's true, from the C/C++ language-lawyer point of view. However, even those of us who only program in C and C++ sometimes laxly use the word 'cast' with the meaning of 'conversion'.Duaneduarchy
@AndreyChernyakhovskiy If that's what the authors of the paper meant, then I agree that you have a good point, but the impression I got was that they were after void *p = ...; int i = (int)p;.Preparatory
Mehrdad, unfortunately, I have to disagree. int i; printf("%p, %i", &i, sizeof i);. There are two type system violations in this simple code.Duaneduarchy
@AndreyChernyakhovskiy: I guess I was talking about the language itself, not the standard library. If you look at it that way, you might as well claim that every language that allows calling C functions allows unsafe conversions.Malevolent
@Mehrdad, but you could write an implementation of printf yourself, and it would still have the same vulnerabilities. Therefore it is the language itself that provokes these type system violations.Duaneduarchy
@AndreyChernyakhovskiy: No. My implementation would have those vulnerabilities if it uses varargs, but va_arg performs an explicit conversion, not an implicit one.Malevolent
@Mehrdad, you couldn't avoid using varargs before C++11 to implement printf, could you?Duaneduarchy
For C++ (but not C), there are also more dangerous implicit conversions that are worth mentioning: Create an array of derived class (Derived a[10]), then pass it to a function that takes an array of base class (void f(Base x[]); f(a);). Watch it crash and burn. In other words, the implicit Derived * to Base * conversion is unsafe due to the existence of native arrays.Wateriness
I think one distinction that is not being made here (despite the fact that many people here are experienced and aware of it), is beyond the technical differences in C and C++ (legal conversions), there is a huge difference in the strength of typing in idiomatic code. For instance, C ellipses arguments are legal C++, but not idiomatic (replaced by variadic templates which are strongly typed). Another example: user-stateful callbacks in C are handled by function pointer taking void*. In C++ handled by function object/std::function. Generally void* usage in C++ is far, far more rare. Etc.Upheave
Your example is fine as far as the compiler is concerned, but it does not lead to an obvious failure.Arkhangelsk
Not to say it's wrong, but there are implicit casts in C and C++ (with different rules, by the way). In C++ they happen mainly with arithmetic types and bool. In C they may also happen with void* pointers. Implicit and contextual casts are well-defined.Harping
A
13

"weakly typed" is a quite subjective term. I prefer the terms "strictly typed" and "statically typed" vs. "loosely typed" and "dynamically typed", because they are more objective and more precise words.

From what I can tell, people generally use "weakly typed" as a diminutive-pejorative term which means "I don't like the notion of types in this language". It's sort of an argumentum ad hominem (or rather, argumentum ad linguam) for those who can't bring up professional or technical arguments against a particular language.

The term "strictly typed" also has slightly different interpretations; the generally accepted meaning, in my experience, is "the compiler generates errors if types don't match up". Another interpretation is that "there are no or few implicit conversions". Based on this, C++ can actually be considered a strictly typed language, and most often it is considered as such. I would say that the general consensus on C++ is that it is a strictly typed language.

Of course, we could take a more nuanced approach to the question and say that parts of the language are strictly typed (the majority of cases), while other parts are loosely typed (a few implicit conversions, e.g. arithmetic conversions, and the four kinds of explicit conversion).

Furthermore, some programmers, especially beginners who are familiar with only a few languages, don't intend to or can't make the distinction between "strict" and "static", or between "loose" and "dynamic". They conflate these otherwise orthogonal concepts based on their limited experience (usually the correlation of dynamism and loose typing in popular scripting languages).

In reality, parts of C++ (virtual calls) impose the requirement that the type system be partially dynamic, but other things in the standard require that it be strict. Again, this is not a problem, since these are orthogonal concepts.

To sum up: probably no language fits completely, perfectly into one category or another, but we can say which particular property of a given language dominates. In C++, strictness definitely does dominate.

Arceliaarceneaux answered 5/11, 2014 at 9:48 Comment(0)
E
5

Well, since the creator of C++, Bjarne Stroustrup, says in The C++ Programming Language (4th edition) that the language is strongly typed, I would take his word for it:

C++ programming is based on strong static type checking, and most techniques aim at achieving a high level of abstraction and a direct representation of the programmer’s ideas. This can usually be done without compromising run-time and space efficiency compared to lower-level techniques. To gain the benefits of C++, programmers coming to it from a different language must learn and internalize idiomatic C++ programming style and technique. The same applies to programmers used to earlier and less expressive versions of C++.

In this video lecture from 1994 he also states that the weak type system of C really bothered him, and that's why he made C++ strongly typed: The Design of C++, lecture by Bjarne Stroustrup

Elfish answered 20/6, 2019 at 14:44 Comment(0)
T
3

In contrast, a language is weakly-typed if type-confusion can occur silently (undetected), and eventually cause errors that are difficult to localize.

Well, that can happen in C++, for example:

#define _USE_MATH_DEFINES
#include <iostream>
#include <cmath>
#include <limits>

void f(char n) { std::cout << "f(char)\n"; }
void f(int n) { std::cout << "f(int)\n"; }
void g(int n) { std::cout << "g(int)\n"; }

int main()
{
    float fl = M_PI;   // silent conversion to float may lose precision

    f(8 + '0'); // potentially unintended treatment as int

    unsigned n = std::numeric_limits<unsigned>::max();
    g(n);  // potentially unintended treatment as int
}

Also, C and C++ are considered weakly typed since, due to type-casting, one can interpret a field of a structure that was an integer as a pointer.

Ummmm... not via any implicit conversion, so that's a silly argument. C++ allows explicit casting between types, but that's hardly "weak" - it doesn't happen accidentally/silently as required by the paper's own definition above.

Is the existence of type casting all that matters? Does the explicit-ness of such casts not matter?

Explicitness is a crucial consideration IMHO. Letting a programmer override the compiler's knowledge of types is one of the "power" features of C++, not some weakness. It's not prone to accidental use.

More generally, is it really generally accepted that C++ is weakly typed? Why?

No - I don't think it is accepted. C++ is reasonably strongly typed, and the leniencies that have historically caused trouble have been pruned back: for example, the implicit conversion from void* to other pointer types was removed, and finer-grained control was added through explicit casting operators and explicit constructors.

Tajuanatak answered 5/11, 2014 at 10:13 Comment(0)
T
1

In General:

There is some confusion around the subject. Some terms differ from book to book (not considering the internet here), and some may have changed over the years.

Below is what I've understood from the book "Engineering a Compiler" (2nd Edition).


1. Untyped Languages

Languages that have no types at all, for example assembly.


2. Weakly Typed Languages:

Languages that have a poor type system. The definition here is intentionally ambiguous.


3. Strongly Typed Languages:

Languages where each expression has an unambiguous type. These languages can be further categorised as:

  • A. Statically Typed: when every expression is assigned a type at compile time.
  • B. Dynamically Typed: when some expressions can only be typed at runtime.


What is C++ then?

Well, it's strongly typed for sure. And mostly it is statically typed. But as some expressions can only be typed at runtime, I guess it falls under the 3.B category.

PS1: A note from the book:

A strongly typed language, that can be statically checkable, might be implemented (for some reason) just with runtime checking.

PS2: The Third Edition was recently released

I don't own it, so I don't know whether anything has changed in this regard. But in general, the "Semantic Analysis" chapter has changed both its title and its position in the table of contents.

Titoism answered 26/11, 2022 at 12:12 Comment(0)
L
-1

Let me give you a simple example:

 if ( a + b )

C/C++ allows an implicit conversion from float to Boolean.

A strongly-typed language would not allow such an implicit conversion.

Liege answered 5/11, 2014 at 14:51 Comment(0)
