Seeking clarification on apparent contradictions regarding weakly typed languages

Asked 29/3, 2012 at 16:34 Answered 10/8, 2014 at 18:26

181

I think I understand strong typing, but every time I look for examples of weak typing I end up finding examples of programming languages that simply coerce/convert types automatically.

For instance, the article Typing: Strong vs. Weak, Static vs. Dynamic says that Python is strongly typed because you get an exception if you try to add an integer and a string:

Python

>>> 1 + "1"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsupported operand type(s) for +: 'int' and 'str'

However, such a thing is possible in Java and in C#, and we do not consider them weakly typed just for that.

Java

  int a = 10;
  String b = "b";
  String result = a + b;
  System.out.println(result);

C#

int a = 10;
string b = "b";
string c = a + b;
Console.WriteLine(c);

In another article, Weakly Type Languages, the author says that Perl is weakly typed simply because I can concatenate a string to a number and vice versa without any explicit conversion.

Perl

$a=10;
$b="a";
$c=$a.$b;
print $c; #10a

So the same example makes Perl weakly typed, but not Java and C#?

Gee, this is confusing.

The authors seem to imply that a language that prevents the application of certain operations on values of different types is strongly typed and the contrary means weakly typed.

Therefore, at some point I felt prompted to believe that a language that provides a lot of automatic conversions or coercions between types (as Perl does) may end up being considered weakly typed, whereas languages that provide only a few conversions may end up being considered strongly typed.

I am inclined to believe, though, that I must be wrong in this interpretation; I just do not know why or how to explain it.

So, my questions are:

What does it really mean for a language to be truly weakly typed?
Could you mention any good examples of weak typing that are not related to automatic conversion/automatic coercion done by the language?
Can a language be weakly typed and strongly typed at the same time?

Vermeil answered 29/3, 2012 at 16:34 Comment(11)

This baffles me too! As far as I understand (or think I understand), weakly typed languages are said to be 'type safe' and they have implicit type conversion. – Judiciary 29/3, 2012 at 16:38

Strong versus weak typing is all about type conversion (what else could it be about?) If you want an example of a "very" weak language, watch this: destroyallsoftware.com/talks/wat. – Trogon 29/3, 2012 at 16:41

@Wildduck All languages provide type conversions, but not all of them are considered weakly typed. My examples shown below demonstrate how programmers consider a language weakly typed based on the same examples that are possible on other languages considered strongly typed. As such my question still prevails. What is the difference? – Vermeil 29/3, 2012 at 16:55

The short answer, I think, is that "Typedness" is not a binary state. Java and C# are more strongly typed but not absolutely. – Jihad 29/3, 2012 at 16:56

I believe this is better suited for Software Engineering. – Faizabad 29/3, 2012 at 19:39

So, when one starts to build a type system, initially the memory is an untyped repository of data. The designer of the type system strives to make this system as strongly typed as possible, or as required by the design of the problem to be solved by it. The level of "strongly-typedness" is based on trade-offs in the design of this system. These trade-offs may lead to "weakly-typedness" created by design or as a consequence of other decisions. As such, absolute "weakly-typedness" or "strongly-typedness" may not be present in any existing programming language. – Vermeil 29/3, 2012 at 19:46

@edalorzo - Absolute "typedness" does exist. Python is completely strongly typed -- it never does a typecast unless you tell it to. JavaScript is completely weakly typed -- the only operation that depends on types at all is typeof (explicitly asking for the type of something). – Gospodin 30/3, 2012 at 2:1

@Brendan What about summing a float and an integer? Isn't the integer coerced into a float in Python? Would you say now that Python is not absolutely strongly typed? – Vermeil 30/3, 2012 at 4:43

@edalorzo yes, very much as Eric Lippert describes in the long answer. Although, the level striving obviously varies. – Jihad 30/3, 2012 at 10:1

Nice to have info alongside: does-untyped-also-mean-dynamically-typed-in-the-academic-cs-world? and typed-vs-strongly-typed-in-c-sharp? – Safeguard 24/7, 2014 at 2:7

I'm not familiar enough with Perl to know how its numerical addition operators work, but in a language called HyperTalk (circa 1990's), the expression "12"+"34" would yield "46", while "12"+"XYZ" would yield "12XYZ". That was definitely a weakly typed language [fortunately, it also allowed & to be used for string concatenation, and that operator never meant anything else]. I don't know how Perl would regard such things. – Globe 4/8, 2014 at 21:15

212

UPDATE: This question was the subject of my blog on the 15th of October, 2012. Thanks for the great question!

What does it really mean for a language to be "weakly typed"?

It means "this language uses a type system that I find distasteful". A "strongly typed" language by contrast is a language with a type system that I find pleasant.

The terms are essentially meaningless and you should avoid them. Wikipedia lists eleven different meanings for "strongly typed", several of which are contradictory. This indicates that the odds of confusion being created are high in any conversation involving the term "strongly typed" or "weakly typed".

All that you can really say with any certainty is that a "strongly typed" language under discussion has some additional restriction in the type system, either at runtime or compile time, that a "weakly typed" language under discussion lacks. What that restriction might be cannot be determined without further context.

Instead of using "strongly typed" and "weakly typed", you should describe in detail what kind of type safety you mean. For example, C# is a statically typed language and a type safe language and a memory safe language, for the most part. C# allows all three of those forms of "strong" typing to be violated. The cast operator violates static typing; it says to the compiler "I know more about the runtime type of this expression than you do". If the developer is wrong, then the runtime will throw an exception in order to protect type safety. If the developer wishes to break type safety or memory safety, they can do so by turning off the type safety system by making an "unsafe" block. In an unsafe block you can use pointer magic to treat an int as a float (violating type safety) or to write to memory you do not own. (Violating memory safety.)
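The "cast overrides the static checker, the runtime still protects type safety" idea can be sketched outside C# as well. The following is only a loose analogy in Python, under the assumption that a static checker such as mypy is in use: typing.cast performs no runtime check at all, so a wrong cast surfaces later, when the value is actually used. The function name shout is made up for illustration.

```python
from typing import cast


def shout(value: object) -> str:
    # cast() is a promise to the static checker; it does nothing at runtime.
    text = cast(str, value)
    # If the promise was wrong, the runtime still refuses the operation here.
    return text.upper()


print(shout("hi"))

try:
    shout(42)
except AttributeError as error:
    print("runtime caught it:", error)
```

Unlike a C# cast, the failure is raised at the point of use rather than at the cast itself, but in both cases a wrong claim about the type does not go unnoticed.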

C# imposes type restrictions that are checked at both compile-time and at runtime, thereby making it a "strongly typed" language compared to languages that do less compile-time checking or less runtime checking. C# also allows you to in special circumstances do an end-run around those restrictions, making it a "weakly typed" language compared with languages which do not allow you to do such an end-run.

Which is it really? It is impossible to say; it depends on the point of view of the speaker and their attitude towards the various language features.

Dirichlet answered 29/3, 2012 at 16:42 Comment(10)

But, Eric, that would imply the definition is based on taste and personal opinions, not on type theory. Is "weakly typed" then just a theoretical concept without any existing implementations? – Vermeil 29/3, 2012 at 16:46

@edalorzo: It is based on taste and personal opinions about (1) what aspects of type theory are relevant and which are irrelevant, and (2) whether a language is required to enforce or merely encourage type restrictions. As I pointed out, one could reasonably say that C# is strongly typed because it allows and encourages static typing, and one could just as reasonably say that it is weakly typed because it allows the possibility to violate type safety. – Dirichlet 29/3, 2012 at 17:2

Hmm, this is quite interesting. So, what you are saying is that ultimately we are talking about type safety, and we could say that a given language is more weakly typed than another, but there is no absolute definition under which a given language is weakly typed per se. What could we say of languages like machine language or assembly? Could we say that these are absolutely weakly typed? – Vermeil 29/3, 2012 at 17:4

@edalorzo: Correct. And as you will see shortly, every answer to this question is going to give a slightly different definition of what "strong" and "weak" mean to each answerer, further indicating that it is a matter of taste. There is no formal, clear definition of "strong" and "weak"; you always have to get clarification when someone uses those terms. – Dirichlet 29/3, 2012 at 17:4

@edalorzo: As for assembly, again, it is a matter of opinion. An assembly language compiler will not allow you to move a 64 bit double from the stack into a 32 bit register; it will allow you to move a 32 bit pointer to a 64 bit double from the stack into a 32 bit register. In that sense the language is "typesafe" -- it imposes a restriction on the legality of the program based on a type classification of data. Whether that restriction is "strong" or "weak" is a matter of opinion, but it is clearly a restriction. – Dirichlet 29/3, 2012 at 17:9

I think I see your point now, a truly weakly typed language would have to be totally untyped or monotyped, which in real life is practically impossible. As such, any language has certain definition of types, which are safe, and depending on the number of holes that the language provides to violate or manipulate its data or data types you may end up considering it more or less weakly typed, maybe even in certain contexts only. – Vermeil 29/3, 2012 at 17:13

@edalorzo: Correct. For example, the untyped lambda calculus is about as weakly typed as you can get. Every function is a function from a function to a function; any data can be passed to any function without restriction because everything is of "the same type". The validity of an expression in untyped lambda calculus depends only on its syntactic form, not on a semantic analysis that classifies certain expressions as having certain types. – Dirichlet 29/3, 2012 at 17:30
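For readers who, like me, have not met the untyped lambda calculus before, here is a rough sketch of the "everything is a function" idea using Python lambdas as a stand-in. The names zero, succ, add and to_int are illustrative only, and to_int steps outside the calculus just to make the results visible.

```python
# Church numerals: the number n is "apply f n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))


def to_int(n):
    """Leave the calculus: count how many times n applies its function."""
    return n(lambda k: k + 1)(0)


two = succ(succ(zero))
three = succ(two)

print(to_int(add(two)(three)))

# Nothing stops us from passing a "number" where a "function" is expected:
# applying two to two is perfectly valid and happens to compute 2 squared.
print(to_int(two(two)))
```

There is no type to violate here: numbers, functions and function arguments are all the same kind of thing, so any application is syntactically valid.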

I see, yes, the article I provided in the question does mention λ-calculus and pure Lisp as untyped or monotyped examples. Unfortunately I am not familiar with either of these. I will certainly dedicate more time to understanding them now. Your answer is quite interesting and certainly leaves me with a lot of homework. I will delve a bit into λ-calculus to understand the concept much better. I can see people seem to agree with your answer. – Vermeil 29/3, 2012 at 19:24

@Mark I would give him another +1 for predicting that everyone would provide different interpretations on the subject. This "weakly typing" seems to be a "mythical concept" or an "urban legend", everyone has seen it, but no one can prove it exists :-) – Vermeil 30/3, 2012 at 15:31

@EricLippert so what you told us was true, from a certain point of view. We're going to find that many of the truths we cling to depend greatly on our own point of view. :) – Reconstitute 14/1, 2013 at 15:13

As others have noted, the terms "strongly typed" and "weakly typed" have so many different meanings that there's no single answer to your question. However, since you specifically mentioned Perl in your question, let me try to explain in what sense Perl is weakly typed.

The point is that, in Perl, there is no such thing as an "integer variable", a "float variable", a "string variable" or a "boolean variable". In fact, as far as the user can (usually) tell, there aren't even integer, float, string or boolean values: all you have are "scalars", which are all of these things at the same time. So you can, for example, write:

$foo = "123" + "456";           # $foo = 579
$bar = substr($foo, 2, 1);      # $bar = 9
$bar .= " lives";               # $bar = "9 lives"
$foo -= $bar;                   # $foo = 579 - 9 = 570

Of course, as you correctly note, all of this can be seen as just type coercion. But the point is that, in Perl, types are always coerced. In fact, it's quite hard for a user to tell what the internal "type" of a variable might be: at line 2 in my example above, asking whether the value of $bar is the string "9" or the number 9 is pretty much meaningless, since, as far as Perl is concerned, those are the same thing. Indeed, it's even possible for a Perl scalar to internally have both a string and a numeric value at the same time, as is e.g. the case for $foo after line 2 above.

The flip side of all this is that, since Perl variables are untyped (or, rather, don't expose their internal type to the user), operators cannot be overloaded to do different things for different types of arguments; you can't just say "this operator will do X for numbers and Y for strings", because the operator can't (won't) tell which kind of values its arguments are.

Thus, for example, Perl has and needs both a numeric addition operator (+) and a string concatenation operator (.): as you saw above, it's perfectly fine to add strings ("1" + "2" == "3") or to concatenate numbers (1 . 2 == 12). Similarly, the numeric comparison operators ==, !=, <, >, <=, >= and <=> compare the numeric values of their arguments, while the string comparison operators eq, ne, lt, gt, le, ge and cmp compare them lexicographically as strings. So 2 < 10, but 2 gt 10 (but "02" lt 10, while "02" == 2). (Mind you, certain other languages, like JavaScript, try to accommodate Perl-like weak typing while also doing operator overloading. This often leads to ugliness, like the loss of associativity for +.)
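To illustrate the difference between "the operator picks the coercion" and "the operand types pick the operation", here is a small sketch in Python. It only approximates the real rules: to_number mimics Perl's numeric coercion roughly (leading numeric prefix, else 0), and js_plus is a simplified model of an overloaded +. All four function names are made up for this example.

```python
import re

_NUM_PREFIX = re.compile(r"\s*[+-]?(\d+\.?\d*|\.\d+)")


def to_number(value):
    """Roughly mimic Perl: use the leading numeric prefix of a string, else 0."""
    if isinstance(value, (int, float)):
        return value
    match = _NUM_PREFIX.match(value)
    if not match:
        return 0
    text = match.group(0).strip()
    return float(text) if "." in text else int(text)


def perl_add(a, b):
    """Plays the role of Perl's +: always numeric, whatever the operands."""
    return to_number(a) + to_number(b)


def perl_concat(a, b):
    """Plays the role of Perl's . operator: always string concatenation."""
    return str(a) + str(b)


def js_plus(a, b):
    """Simplified overloaded +: concatenate if either side is a string."""
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)
    return a + b


print(perl_add("123", "456"), perl_concat(1, 2))
# The overloaded version is no longer associative:
print(js_plus(js_plus(1, 2), "3"), js_plus(1, js_plus(2, "3")))
```

With separate operators the result never depends on how the operands happen to be represented, which is exactly what the overloaded version gives up.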

(The fly in the ointment here is that, for historical reasons, Perl 5 does have a few corner cases, like the bitwise logical operators, whose behavior depends on the internal representation of their arguments. Those are generally considered an annoying design flaw, since the internal representation can change for surprising reasons, and so predicting just what those operators do in a given situation can be tricky.)

All that said, one could argue that Perl does have strong types; they're just not the kind of types you might expect. Specifically, in addition to the "scalar" type discussed above, Perl also has two structured types: "array" and "hash". Those are very distinct from scalars, to the point where Perl variables have different sigils indicating their type ($ for scalars, @ for arrays, % for hashes)¹. There are coercion rules between these types, so you can write e.g. %foo = @bar, but many of them are quite lossy: for example, $foo = @bar assigns the length of the array @bar to $foo, not its contents. (Also, there are a few other strange types, like typeglobs and I/O handles, that you don't often see exposed.)

Also, a slight chink in this nice design is the existence of reference types, which are a special kind of scalars (and which can be distinguished from normal scalars, using the ref operator). It's possible to use references as normal scalars, but their string/numeric values are not particularly useful, and they tend to lose their special reference-ness if you modify them using normal scalar operations. Also, any Perl variable² can be blessed to a class, turning it into an object of that class; the OO class system in Perl is somewhat orthogonal to the primitive type (or typelessness) system described above, although it's also "weak" in the sense of following the duck typing paradigm. The general opinion is that, if you find yourself checking the class of an object in Perl, you're doing something wrong.

¹ Actually, the sigil denotes the type of the value being accessed, so that e.g. the first scalar in the array @foo is denoted $foo[0]. See perlfaq4 for more details.

² Objects in Perl are (normally) accessed through references to them, but what actually gets blessed is the (possibly anonymous) variable the reference points to. However, the blessing is indeed a property of the variable, not of its value, so that e.g. assigning the actual blessed variable to another one just gives you a shallow, unblessed copy of it. See perlobj for more details.

Sivas answered 30/3, 2012 at 8:31 Comment(0)

In addition to what Eric has said, consider the following C code:

void f(void* x);

f(42);
f("hello");

In contrast to languages such as Python, C#, Java or whatnot, the above is weakly typed because we lose type information. Eric correctly pointed out that in C# we can circumvent the compiler by casting, effectively telling it “I know more about the type of this variable than you”.

But even then, the runtime will still check the type! If the cast is invalid, the runtime system will catch it and throw an exception.

With type erasure, this doesn’t happen – type information is thrown away. A cast to void* in C does exactly that. In this regard, the above is fundamentally different from a C# method declaration such as void f(Object x).
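The same kind of erasure can be reproduced from Python through the standard ctypes module, which exposes C-style pointers. This is a sketch only, and the function name erase_and_reinterpret is invented for the example: an int32 is passed through a void pointer and read back as a float, and nothing checks that the claim is true.

```python
import ctypes


def erase_and_reinterpret(number: int) -> float:
    """Pass a 32-bit int through void* and read the same bytes as a float."""
    boxed = ctypes.c_int32(number)
    # Casting to void* throws away the knowledge that this points to an int32.
    untyped = ctypes.cast(ctypes.pointer(boxed), ctypes.c_void_p)
    # Nothing verifies this claim; the bytes are simply reinterpreted.
    as_float_ptr = ctypes.cast(untyped, ctypes.POINTER(ctypes.c_float))
    return as_float_ptr.contents.value


# 1065353216 is 0x3F800000, the IEEE 754 bit pattern of the float 1.0.
print(erase_and_reinterpret(1065353216))
```

No exception is raised at any point, which is the contrast with a checked cast.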

(Technically, C# also allows type erasure through unsafe code or marshalling.)

This is as weakly typed as it gets. Everything else is just a matter of static vs. dynamic type checking, i.e. of the time when a type is checked.

Mouthy answered 29/3, 2012 at 23:12 Comment(9)

+1 Good point, you now made me think of type erasure as a feature that can also imply "weakly-typedness". There is type erasure also in Java, and at runtime, the type system will let you violate constraints that the compiler would never approve. The C example is excellent to illustrate the point. – Vermeil 29/3, 2012 at 23:37

Agreed, there are layers to the onion, or the inferno. These seems a more significant definition of type weakness. – Jihad 30/3, 2012 at 10:20

@edalorzo I don’t think this is quite the same since even though Java allows you to circumvent the compiler, the runtime type system will still catch the violation. So the Java runtime type system is strongly typed in this regard (there are exceptions, e.g. where reflection can be used to circumvent access control). – Mouthy 30/3, 2012 at 10:35

@Konrad Interesting. In Java, all generic collections are non-reifiable. This means that through the use of legacy syntax you can fool the compiler into committing heap pollution, or through reflection you can fool the runtime type system into doing the same. In every case, type information is lost, and you could treat, for instance, a collection of integers as if it were of any other type. This would probably lead to errors, but these are ways to circumvent the strongly-typedness of Java. So, you are saying that you would not consider this weakness a case of weakly-typedness? – Vermeil 30/3, 2012 at 12:17

@edalorzo You can only circumvent the compiler this way, not the runtime system. It’s important to realise that languages such as Java and C# (and to a certain extent also C++) have a type system that is ensured twice: once at compile time, and once at runtime. void* breaks through both type checks. Generic type erasure doesn’t, it only circumvents the compile-time checks. It’s exactly like explicit casts (mentioned by Eric) in this regard. – Mouthy 30/3, 2012 at 12:20

@Konrad I got lost as to whether you were referring to C or to Java. In Java, type erasure discards most of the generic type information; at runtime there is no way for the type system to tell you that, e.g., a collection is a container of any particular type. As such, you can put something that is not an integer into a collection of integers through the back door (reflection). So, I am not sure why you would say that you can only fool the compiler. – Vermeil 30/3, 2012 at 12:38

@edalorzo Yes, as soon as reflection enters the picture, type safety goes overboard in Java. – Mouthy 30/3, 2012 at 12:41

@Konrad My point is that with reflection you typically circumvent information hiding or encapsulation (public, private, etc.), but you can rarely fool the type system with reflection. In fact, you can't do it in Java for reifiable types. Due to its design of generic types with type erasure, though, you can fool it for generic types. My confusion is, why should we consider weak typing something that can only happen at compile time? I would like to know your opinion: would you still not consider this weakness derived from type erasure a case of weak typing? – Vermeil 30/3, 2012 at 12:50

@edalorzo Re your confusion: we shouldn’t. The distinction is fluid. And yes, type erasure makes Java weakly typed in this regard. My point was that even with generic type erasure you still cannot circumvent the runtime type checks unless you also use reflection. – Mouthy 30/3, 2012 at 12:58

A perfect example comes from the Wikipedia article on Strong Typing:

Generally strong typing implies that the programming language places severe restrictions on the intermixing that is permitted to occur.

Weak Typing

a = 2
b = "2"

concatenate(a, b) # returns "22"
add(a, b) # returns 4

Strong Typing

a = 2
b = "2"

concatenate(a, b) # Type Error
add(a, b) # Type Error
concatenate(str(a), b) #Returns "22"
add(a, int(b)) # Returns 4

Notice that a weakly typed language can intermix different types without errors. A strongly typed language requires the input types to be the expected types. In a strongly typed language a type can be converted (str(a) converts an integer to a string) or cast (int(b)).
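The "Strong Typing" half of the pseudocode above can be made runnable in Python with a few lines. This is a minimal sketch, assuming concatenate and add are ordinary user-defined functions that insist on their operand types; neither is a built-in.

```python
def concatenate(a, b):
    """Join two strings; refuse anything else instead of coercing it."""
    if not (isinstance(a, str) and isinstance(b, str)):
        raise TypeError("concatenate() expects two strings")
    return a + b


def add(a, b):
    """Add two integers; refuse anything else instead of coercing it."""
    if not (isinstance(a, int) and isinstance(b, int)):
        raise TypeError("add() expects two integers")
    return a + b


a = 2
b = "2"

print(concatenate(str(a), b))  # explicit conversion, then concatenation
print(add(a, int(b)))          # explicit conversion, then addition

try:
    concatenate(a, b)
except TypeError as error:
    print("Type Error:", error)
```

The conversions still happen, but the caller has to ask for them explicitly.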

This all depends on the interpretation of typing.

Amaranth answered 29/3, 2012 at 19:25 Comment(2)

But this leads to the contradictory examples provided in the question. A Strongly Typed language may include implicit coercion that means either (or both) of your two "Type Error" examples are automatically converted to the relevant of the second two examples, but generally that language is still Strongly Typed. – Crosley 30/3, 2012 at 1:31

True. I guess you could say there are varying degrees of strong typing and weak typing. Implicit conversion could mean that the language is less strongly typed than a language that does not do implicit conversion. – Amaranth 30/3, 2012 at 14:48

I would like to contribute to the discussion with my own research on the subject. As others commented and contributed, I have been reading their answers and following their references, and I have found interesting information. As suggested, most of this would probably be better discussed in the Programmers forum, since it appears to be more theoretical than practical.

From a theoretical standpoint, I think the article by Luca Cardelli and Peter Wegner named On Understanding Types, Data Abstraction and Polymorphism has one of the best arguments I have read.

A type may be viewed as a set of clothes (or a suit of armor) that protects an underlying untyped representation from arbitrary or unintended use. It provides a protective covering that hides the underlying representation and constrains the way objects may interact with other objects. In an untyped system untyped objects are naked in that the underlying representation is exposed for all to see. Violating the type system involves removing the protective set of clothing and operating directly on the naked representation.

This statement seems to suggest that weak typing would let us access the inner structure of a type and manipulate it as if it were something else (another type). This is perhaps what we could do with unsafe code (mentioned by Eric) or with the type-erased pointers in C mentioned by Konrad.

The article continues...

Languages in which all expressions are type-consistent are called strongly typed languages. If a language is strongly typed its compiler can guarantee that the programs it accepts will execute without type errors. In general, we should strive for strong typing, and adopt static typing whenever possible. Note that every statically typed language is strongly typed but the converse is not necessarily true.

As such, strong typing means the absence of type errors, and I can only assume that weak typing means the contrary: the likely presence of type errors. At runtime or compile time? That seems irrelevant here.

Funny thing, as per this definition, a language with powerful type coercions like Perl would be considered strongly typed, because the system is not failing, but it is dealing with the types by coercing them into appropriate and well defined equivalences.

On the other hand, could I say that allowing ClassCastException and ArrayStoreException (in Java) and InvalidCastException and ArrayTypeMismatchException (in C#) would indicate a level of weak typing, at least at compile time? Eric's answer seems to agree with this.

In a second article, Typeful Programming, found through the references in one of the answers to this question, Luca Cardelli delves into the concept of type violations:

Most system programming languages allow arbitrary type violations, some indiscriminately, some only in restricted parts of a program. Operations that involve type violations are called unsound. Type violations fall in several classes [among which we can mention]:

Basic-value coercions: These include conversions between integers, booleans, characters, sets, etc. There is no need for type violations here, because built-in interfaces can be provided to carry out the coercions in a type-sound way.

As such, type coercions like those provided by operators could be considered type violations, but unless they break the consistency of the type system, we might say that they do not lead to a weakly typed system.

Based on this, none of Python, Perl, Java or C# is weakly typed.

Cardelli mentions two type violations that I would very well consider cases of truly weak typing:

Address arithmetic. If necessary, there should be a built-in (unsound) interface, providing the adequate operations on addresses and type conversions. Various situations involve pointers into the heap (very dangerous with relocating collectors), pointers to the stack, pointers to static areas, and pointers into other address spaces. Sometimes array indexing can replace address arithmetic.

Memory mapping. This involves looking at an area of memory as an unstructured array, although it contains structured data. This is typical of memory allocators and collectors.

These kinds of things, possible in languages like C (mentioned by Konrad) or through unsafe code in .NET (mentioned by Eric), would truly imply weak typing.
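As a rough illustration of what "looking at an area of memory as an unstructured array" means, here is a sketch that uses Python's standard struct module as a stand-in for raw memory. The record layout (a 32-bit integer followed by a 32-bit float, little-endian) is an arbitrary choice for the example.

```python
import struct

# Structured data: a 32-bit integer followed by a 32-bit float.
record = struct.pack("<if", 7, 1.0)

# Memory-mapping view: the same area seen as an unstructured array of bytes.
raw_bytes = list(record)

# Type violation: read the float's bytes back as if they were an integer.
as_two_ints = struct.unpack("<ii", record)

print(raw_bytes)
print(as_two_ints)
```

Nothing in the byte array remembers that its second half was a float, so any interpretation of it is accepted.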

I believe the best answer so far is Eric's, because the definition of these concepts is very theoretical, and when it comes to a particular language, the interpretations of all these concepts may lead to different debatable conclusions.

Vermeil answered 30/3, 2012 at 18:44 Comment(0)

Weak typing does indeed mean that a high percentage of types can be implicitly coerced, attempting to guess what the coder intended.

Strong typing means that types are not coerced, or at least coerced less.

Static typing means your variables' types are determined at compile time.

Many people have recently been confusing "manifestly typed" with "strongly typed". "Manifestly typed" means that you declare your variables' types explicitly.

Python is mostly strongly typed, though you can use almost anything in a boolean context, and booleans can be used in an integer context, and you can use an integer in a float context. It is not manifestly typed, because you don't need to declare your types (except for Cython, which isn't entirely Python, albeit interesting). It is also not statically typed.
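The three implicit conversions mentioned for Python, and the one it refuses, can be checked in a few lines. The helper names truthiness and int_plus_str are just illustrative.

```python
# bool is a subclass of int, so True behaves as 1 in arithmetic.
bool_in_int_context = True + 1

# An int operand is converted to float when mixed with a float.
int_in_float_context = 1 + 2.5


def truthiness(value):
    """Almost any object can be used where a boolean is expected."""
    return "truthy" if value else "falsy"


def int_plus_str():
    """int + str is not coerced; Python raises TypeError instead."""
    try:
        return 1 + "1"
    except TypeError:
        return "TypeError"


print(bool_in_int_context, int_in_float_context)
print(truthiness([]), truthiness("0"), int_plus_str())
```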

C and C++ are manifestly typed, statically typed, and only somewhat strongly typed: you declare your types and types are determined at compile time, but you can mix integers and pointers, or integers and doubles, or even cast a pointer to one type into a pointer to another type.

Haskell is an interesting example, because it is not manifestly typed, but it's also statically and strongly typed.

Pedigree answered 18/10, 2012 at 16:53 Comment(1)

+1 Because I like the coined term "manifestly typed", which categorizes languages like Java and C# where you have to explicitly declare types and distinguish them from other statically-type languages like Haskell and Scala where type inference plays an important role and this typically confuses people, as you say, and makes them believe these languages are dynamically-typed. – Vermeil 19/10, 2012 at 17:12

Strong vs. weak typing is not only about the continuum of how many of the values are coerced automatically by the language from one datatype to another, but also about how strongly or weakly the actual values are typed. In Python and Java, and mostly in C#, the values have their types set in stone. In Perl, not so much: there are really only a handful of different value types to store in a variable.

Let's open the cases one by one.

Python

In the Python example 1 + "1", the + operator calls __add__ for type int, giving it the string "1" as an argument. However, this results in NotImplemented:

>>> (1).__add__('1')
NotImplemented

Next, the interpreter tries the __radd__ of str:

>>> '1'.__radd__(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute '__radd__'

As that fails too, the + operator fails with the result TypeError: unsupported operand type(s) for +: 'int' and 'str'. As such, the exception does not say much about strong typing, but the fact that the + operator does not automatically coerce its arguments to the same type is a pointer to the fact that Python is not the most weakly typed language in the continuum.
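The __add__ / __radd__ fallback described above is easy to observe with a small class of your own. This is a minimal sketch; the class name Tally is hypothetical.

```python
class Tally:
    """A tiny wrapper that knows how to be added to an int from the right."""

    def __init__(self, count):
        self.count = count

    def __radd__(self, other):
        # Called for `other + self` after int.__add__ returned NotImplemented.
        if isinstance(other, int):
            return other + self.count
        return NotImplemented


# int does not know Tally, so its __add__ declines...
print((1).__add__(Tally(2)) is NotImplemented)

# ...and the interpreter falls back to Tally.__radd__.
print(1 + Tally(2))

# When both sides decline, the operator raises TypeError.
try:
    "1" + Tally(2)
except TypeError as error:
    print("TypeError:", error)
```

So the decision about mixing types is left to the operand types themselves rather than to a language-wide coercion rule.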

On the other hand, in Python 'a' * 5 is implemented:

>>> 'a' * 5
'aaaaa'

That is,

>>> 'a'.__mul__(5)
'aaaaa'

The fact that the operation differs by operand type requires some strong typing. However, the opposite, * coercing the values to numbers before multiplying, would still not necessarily make the values weakly typed.

Java

The Java example, String result = "1" + 1;, works only because, as a matter of convenience, the operator + is overloaded for strings. The Java compiler replaces the + sequence with the creation of a StringBuilder (see this):

String result = a + b;
// becomes something like
String result = new StringBuilder().append(a).append(b).toString();

This is rather an example of very static typing, with no actual coercion: StringBuilder has a method append(Object) that is specifically used here. The documentation says the following:

Appends the string representation of the Object argument.

The overall effect is exactly as if the argument were converted to a string by the method String.valueOf(Object), and the characters of that string were then appended to this character sequence.

Where String.valueOf then

Returns the string representation of the Object argument. [Returns] if the argument is null, then a string equal to "null"; otherwise, the value of obj.toString() is returned.

Thus this is a case of absolutely no coercion by the language: every concern is delegated to the objects themselves.

C#

According to the Jon Skeet answer here, operator + is not even overloaded for the string class - akin to Java, this is just convenience generated by the compiler, thanks to both static and strong typing.

Perl

As the perldata explains,

Perl has three built-in data types: scalars, arrays of scalars, and associative arrays of scalars, known as "hashes". A scalar is a single string (of any size, limited only by the available memory), number, or a reference to something (which will be discussed in perlref). Normal arrays are ordered lists of scalars indexed by number, starting with 0. Hashes are unordered collections of scalar values indexed by their associated string key.

Perl, however, does not have a separate data type for numbers, booleans, strings, nulls, undefineds, references to other objects, etc. It has just one type for all of these, the scalar type; 0 is a scalar value as much as "0" is. A scalar variable that was set as a string can really change into a number, and from then on behave differently from "just a string", if it is accessed in a numerical context. The scalar can hold anything in Perl; it is as much the object as it exists in the system. Whereas in Python the names just refer to the objects, in Perl the scalar values in the names are changeable objects. Furthermore, the object-oriented type system is glued on top of this: there are just 3 datatypes in Perl - scalars, lists and hashes. A user-defined object in Perl is a reference (that is, a pointer to any of the 3 previous) blessed to a package - you can take any such value and bless it to any class at any instant you want.

Perl even allows you to change the classes of values at whim - this is not possible in Python, where to create a value of some class you need to explicitly construct the value belonging to that class with object.__new__ or similar. In Python you cannot really change the essence of the object after creation; in Perl you can do pretty much anything:

package Foo;
package Bar;
package main;

my $val = 42;
# $val is now a scalar holding an integer
bless \$val, 'Foo';
# all references to $val now belong to class Foo
my $obj = \$val;
# now $obj refers to the SV stored in $val
# thus this prints: Foo=SCALAR(0x1c7d8c8)
print \$val, "\n";
# all references to $val now belong to class Bar
bless \$val, 'Bar';
# thus this prints Bar=SCALAR(0x1c7d8c8)
print \$val, "\n";
# we change the value stored in $val from a number to a string
$val = 'abc';
# yet the SV is still blessed: Bar=SCALAR(0x1c7d8c8)
print \$val, "\n";
# and of course, $obj now refers to a "Bar" even though
# at the time of copying it referred to a "Foo".
print $obj, "\n";

Thus the type identity is only weakly bound to the variable, and it can be changed on the fly through any reference. In fact, if you do

my $another = $val;

then \$another does not carry the class identity, even though \$val still yields the blessed reference.
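For contrast, here is a minimal Python sketch (the class name Foo and the helper try_reclass are hypothetical, chosen to mirror the Perl packages above). In Python, assignment binds a second name to the same object, so the class travels with the value, and a built-in value such as an int refuses to have its class reassigned. Ordinary user-defined classes do permit __class__ assignment in some cases, so this only illustrates the behaviour for int.

```python
class Foo:
    pass


def try_reclass(obj, new_class):
    """Try to change obj's class; return the error type name, or None on success."""
    try:
        obj.__class__ = new_class
    except TypeError as exc:
        return type(exc).__name__
    return None


val = 42
another = val  # binds a second name to the same int object

print(type(val) is type(another))  # True
print(try_reclass(val, Foo))       # TypeError
print(type(val).__name__)          # int
```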

TL;DR

There is much more to weak typing in Perl than just automatic coercions; it is more that the types of the values themselves are not set in stone, unlike in Python, which is a dynamically yet very strongly typed language. That Python raises TypeError on 1 + "1" is an indication that the language is strongly typed, even though the contrary behaviour of doing something useful, as in Java or C#, does not preclude them from being strongly typed languages.

Unaunabated answered 10/8, 2014 at 18:26 Comment(5)

This is totally confused. That Perl 5 variables don't have types doesn't have any bearing on values, which always have a type. – Melosa 2/6, 2016 at 23:12

@JimBalter well, yes, a value has a type in that it is a string or a number, and it can behave differently in some context depending on if the scalar variable contained a string or a number; but the value contained in a variable can change type by just accessing the variable, and since the value itself lives in the variable, the values themselves can be considered mutable between types. – Orna 3/6, 2016 at 6:35

Values don't change types -- that's incoherent; a value is always of a type. The value that a variable contains can change. A change from 1 to "1" is just as much a change in value as a change from 1 to 2. – Melosa 3/6, 2016 at 19:15

A weakly typed language such as Perl allows the former type of value change to occur implicitly depending on context. But even C++ allows such implicit conversions via operator definitions. Weak typing is a very informal property and really isn't a useful way to describe languages, as Eric Lippert pointed out. – Melosa 3/6, 2016 at 19:22

P.S. It can be shown that, even in Perl, <digits> and "<digits>" have different values, not just different types. Perl makes <digits> and "<digits>" appear to have the same value in most cases via implicit conversions, but the illusion is not complete; e.g. "12" | "34" is 36 whereas 12 | 34 is 46. Another example is that "00" is numerically equal to 00 in most contexts, but not in boolean context, where "00" is true but 00 is false. – Melosa 3/6, 2016 at 19:37

As many others have expressed, the entire notion of "strong" vs "weak" typing is problematic.

As an archetype, Smalltalk is very strongly typed -- it will always raise an exception if an operation between two objects is incompatible. However, I suspect few on this list would call Smalltalk a strongly-typed language, because it is dynamically typed.

I find the notion of "static" versus "dynamic" typing more useful than "strong" versus "weak." A statically-typed language has all the types figured out at compile-time, and the programmer has to explicitly declare if otherwise.

Contrast this with a dynamically-typed language, where typing is performed at run-time. This is typically a requirement for polymorphic languages, so that whether an operation between two objects is legal does not have to be decided by the programmer in advance.

In polymorphic, dynamically-typed languages (like Smalltalk and Ruby), it's more useful to think of a "type" as a "conformance to protocol." If an object obeys a protocol the same way another object does -- even if the two objects do not share any inheritance or mixins or other voodoo -- they are considered the same "type" by the run-time system. More correctly, an object in such systems is autonomous, and can decide if it makes sense to respond to any particular message referring to any particular argument.

Want an object that can make some meaningful response to the message "+" with an object argument that describes the colour blue? You can do that in dynamically-typed languages, but it is a pain in statically-typed languages.
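A minimal Python sketch of this "conformance to protocol" idea follows. The Colour and Paint classes and the rgb attribute are hypothetical names invented for illustration. The object receiving "+" decides for itself whether the argument is acceptable, and the two classes share no base class.

```python
class Colour:
    """Hypothetical colour type that decides for itself how to answer '+'."""

    def __init__(self, r, g, b):
        self.rgb = (r, g, b)

    def __add__(self, other):
        # Respond only if the argument follows the protocol (has an rgb attribute).
        other_rgb = getattr(other, "rgb", None)
        if other_rgb is None:
            return NotImplemented
        return Colour(*(min(255, a + b) for a, b in zip(self.rgb, other_rgb)))


class Paint:
    """Unrelated class: no shared base, but it follows the same rgb protocol."""

    def __init__(self, rgb):
        self.rgb = rgb


blue = Colour(0, 0, 255)
red_paint = Paint((255, 0, 0))
mixed = blue + red_paint
print(mixed.rgb)  # (255, 0, 255)
```

An argument that does not follow the protocol, as in blue + 1, is rejected at run-time with a TypeError.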

Mendelssohn answered 4/4, 2012 at 18:9 Comment(1)

I think that the concept of dynamic vs static typing is not under discussion. Although I have to say that I do not believe that polymorphism is in any way handicapped in statically typed languages. Ultimately, the type system verifies if a given operation is applicable to the given operands, whether at runtime or at compile time. Also, other forms of polymorphism, like parametric functions and classes, permit combining types in statically typed languages in ways that you described as very difficult when compared to dynamically typed ones, and even more nicely if type inference is provided. – Vermeil 4/4, 2012 at 18:30

I like @Eric Lippert's answer, but to address the question - strongly typed languages typically have explicit knowledge of the types of variables at each point of the program. Weakly typed languages do not, so they can attempt to perform an operation that may not be possible for a particular type. I think the easiest way to see this is in a function. C++:

void func(string a) {...}

The variable a is known to be of type string and any incompatible operation will be caught at compile time.

Python:

def func(a):
  ...

The variable a could be anything and we can have code that calls an invalid method, which will only get caught at runtime.
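As a minimal sketch of that point (the function body here is an assumption, since the original elides it): nothing about a is checked until the call actually runs, so the invalid method call only surfaces as a run-time exception.

```python
def func(a):
    # Nothing is checked until this line runs with a concrete object.
    return a.upper()


print(func("abc"))  # ABC

try:
    func(42)  # int has no upper() method
except AttributeError as exc:
    print("caught at runtime:", exc)
```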

Binturong answered 29/3, 2012 at 16:57 Comment(1)

I think you may be confusing dynamic typing vs static typing with strong typing vs weak typing. In both versions of your code, the runtime type system knows very well that a is a string. It is just that in the first case the compiler can tell you that, and in the second it can't. But this does not make either of these languages weakly typed. – Vermeil 29/3, 2012 at 17:2
