Is weak typing a performance increase or a decrease?

When writing interpreted languages, is it faster to have weak typing or strong typing?

I was wondering this because the faster dynamically typed interpreted languages out there (Lua, JavaScript), and in fact most interpreted languages, use weak typing.

But on the other hand, strong typing gives guarantees that weak typing does not. So are there optimization techniques possible with one that aren't possible with the other?


By strongly typed I mean no implicit conversions between types. For example, this would be illegal in a strongly typed language, but (possibly) legal in a weakly typed one: "5" * 2 == 10. JavaScript in particular is notorious for these type conversions.
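
To make the distinction concrete, here is a rough sketch in Python, which comes pretty close to strongly typed in this sense (the JavaScript behaviour is shown in comments):

    # Python (mostly) refuses implicit conversions between strings and numbers:
    try:
        "5" * 2 == 10          # False in Python: "5" * 2 is the string "55", not 10
        "5" + 2                # raises: no implicit str -> int conversion
    except TypeError as e:
        print(e)               # can only concatenate str (not "int") to str

    print(int("5") * 2 == 10)  # True -- the conversion is written out explicitly

    # JavaScript, by contrast, converts implicitly:
    #   "5" * 2  ===  10    ("5" is coerced to the number 5)
    #   "5" + 2  === "52"   (2 is coerced to the string "2")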

Castilian asked 10/3, 2012 at 13:9 Comment(15)
+1 An interesting question, though I fear it's hard to answer in general. I'm looking forward to gory details about interpreter optimizations.Maisey
Just to be clear, what do you mean by weak and strong typing? (I try to avoid using these terms for the reasons given here -- people use them in many incompatible ways.)Globule
Strongly typed as in no implicit conversions between types. For example, this would be illegal in a strongly typed language, but (possibly) legal in a weakly typed one: "5" * 2 == 10. JavaScript especially is notorious for these type conversions. EDIT: added it to the question.Castilian
Most modern JavaScript engines no longer interpret but compile directly into bytecode these days. So it's difficult to apply this question to that language.Diandiana
Jivings: why would the machine it's executing on (directly on the machine or a VM) matter for optimization techniques?Castilian
I assume the JavaScript would be compiled into the appropriate type so it's no longer weakly typed.Diandiana
are you sure that is what you mean by "strongly typed"? it's quite possible for implicit type conversions to happen in languages that most people would consider strongly typed - for example in java you can implicitly convert anything to a string, while in c you can convert between floats and ints. more usually when people are asking about performance they are concerned with whether the type of an object is known at compile time, which can allow the compiler to optimize (and this advantage is reduced by just-in-time compilation, which uses the runtime type)Redblooded
@andrewcooke: I'm sure this is what I mean by strongly typed; neither Java nor C is strongly typed. Yes, they are statically typed, but not strongly. (For example, in C you can convert between pointers to different types implicitly.) But really, I mean strongly typed (no implicit conversions) and not statically typed (the declaration of type in the code).Castilian
ah, ok. sorry for the misunderstanding.Redblooded
@Diandiana JS is still dynamically typed, so you can't really settle for any type during compilation (especially when compiling to bytecode - you sure you aren't talking about JIT compilers?). The best you can get is speculating that something will usually be of a certain type, and generate code based on that assumption. But you still need guards (conditionals checking that the assumption is indeed true) and a more general/less optimized version to fall back to (when it was incorrect). But none of this matters to this question, as it's due to dynamic typing, not due to weak/strong typing.Maisey
so which interpreted languages have strong typing?Redblooded
This question doesn't really make sense, in two ways. Firstly: the overhead of implicit conversions is incurred when those implicit conversions are performed. A program that doesn't perform any such implicit conversions will be the same whether or not the language supports them. So your question is like asking whether programming languages that have a built-in arctangent function are slower than languages that do not. Secondly: the implementation of implicit conversions is quite different in dynamically-typed languages from in statically-typed languages; so even if your question had a coherentHouseline
answer for the one case, there would be no sense trying to apply that answer to the other case. (And, for that matter, it's also sometimes quite different between different languages even within those groups.)Houseline
@andrewcooke: Which languages at all have strong typing by this definition? Offhand, the only ones I can think of are functional languages like Standard ML.Houseline
For example, Python comes pretty close to pure strong typing, with very few operators defined to work with mixed types. But JavaScript, on the other hand, has an enormous number of conversions defined between strings, numbers and arrays, introducing a lot of WTF moments. So I wondered whether there is perhaps a speed benefit to allowing that.Castilian

it seems to me that the question is going to be hard to answer with explicit examples because of the lack of "strongly typed interpreted languages" (using the definitions i understand from the question comments).

i cannot think of any language that is interpreted and does not have implicit conversions. and i think this for two reasons:

  1. interpreted languages tend not to be statically typed. i think this is because if you are going to implement a statically typed language then, historically, compilation is relatively easy and gives you a significant performance advantage.

  2. if a language is not statically typed then it is forced into having implicit conversions. the alternative would make life too hard for the programmer (they would have to keep track of types, invisible in the source, to avoid runtime errors).

so, in practice, all interpreted languages are weakly typed. but the question of a performance increase or decrease implies a comparison with some that are not. at least, it does if we want to get into a discussion of different, existing implementation strategies.

now you might reply "well, imagine one". ok. so then you are asking for the performance difference between code that detects the need for a conversion at runtime and code where the programmer has explicitly added the conversion. in other words, you are comparing dynamically detecting the need for a conversion against calling an explicit conversion function specified by the programmer.

on the face of it, detection is always going to add some overhead (in a [late-]compiled language that can be ameliorated by a jit, but you are asking about interpreters). but if you want fail-fast behaviour (type errors) then even the explicit conversion has to check types. so in practice i imagine the difference is relatively small.
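
a rough sketch of the two alternatives (python used only as a vehicle; the function names are made up for illustration):

    # weak-typing style: detect the mismatch at runtime and convert implicitly
    def weak_add(a, b):
        if isinstance(a, str) and isinstance(b, (int, float)):
            a = float(a)          # implicit conversion, chosen by the language
        elif isinstance(b, str) and isinstance(a, (int, float)):
            b = float(b)
        return a + b

    # strong-typing style: fail fast; the programmer writes the conversion
    def strong_add(a, b):
        if type(a) is not type(b):
            raise TypeError("cannot add %s and %s" % (type(a).__name__, type(b).__name__))
        return a + b

    print(weak_add("5", 2))             # 7.0 -- conversion detected and applied
    print(strong_add(float("5"), 2.0))  # 7.0 -- conversion written by the programmer

    # note that both versions inspect runtime types: the "strong" one still pays
    # for the check if it wants a type error rather than garbage, which is why
    # the difference in an interpreter is relatively small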

and this feeds back to the original point - since the performance cost of weak typing is low (given all the other constraints/assumptions in the question), and the usability costs of the alternative are high, most (all?) interpreted languages support implicit conversion.

[sorry if i am still not understanding. i am worried i am missing something because the question - and this answer - does not seem interesting...]

[edit: maybe a better way of asking the same(?) thing would be something like "what are the comparative advantages/disadvantages of the various ways that dynamic (late binding?) languages handle type conversion?" because i think there you could argue that python's approach is particularly powerful (expressive), while having similar costs to other interpreted languages (and the question avoids having to argue whether python or any other language is "weakly typed" or not).]
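
a tiny invented example of what i mean by python's approach being expressive: "conversion" is dispatch to methods the type itself defines, so mixed-type behaviour is extensible while the conversions themselves stay visible in the source:

    class Metres:
        def __init__(self, value):
            self.value = float(value)
        def __mul__(self, other):        # Metres * number: the type decides what mixing means
            return Metres(self.value * other)
        def __float__(self):             # only called by an explicit float(m)
            return self.value
        def __str__(self):               # picked up by print()/str()
            return "%s m" % self.value

    m = Metres(5)
    print(m * 2)         # 10.0 m
    print(float(m) + 1)  # 6.0 -- the conversion is explicit in the source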

Redblooded answered 10/3, 2012 at 14:58 Comment(6)
Python, as you surely know, has very few if any implicit conversions. Numeric types are promoted (in a one-way fashion, and always from one type to a more general one), and one might argue if/and/or/not implicitly convert to booleans (though one could also say all objects have a truth value and these simply get that value and process it - the fact that and/or return one of the operands supports this view). Other than that, I'm not aware of any implicit conversions. And I doubt Python is alone in that regard. So I dare challenge your reason #2.Maisey
by implicit conversion i mean implicit invocation of the methods like __float__ - see docs.python.org/reference/datamodel.html#specialnames (so __nonzero__ is what you are referring to with conversion to boolean).Redblooded
a really common example in python is conversion to iterator. that is all over the place in idiomatic python code, and is implemented in the same way (__iter__ iirc).Redblooded
__float__ is not called implicitly, it's only called from float AFAIK (perhaps during numeric promotion too, but we've already got that one). __iter__ is indeed called implicitly, but only in the context of for loops and {list,dict,set} comprehensions so it's extremely restricted and always called. For that reason, I don't really consider it an implicit conversion. And most of the "special methods" don't convert at all - they serve purposes like attribute access and operator overloading.Maisey
__str__ is another common one. and iirc __float__ is called by the default implementation of __add__. more generally, i am not sure what your point is - yes, these blur into message dispatch and so you can argue that it's not implicit type conversion but method invocation. and that's cool - it's exactly what i meant when i said python had a good solution and that arguing about exactly what is and isn't "weak typing" is largely pointless. so pointless that i will not do any more.Redblooded
My point is not that strongly/weakly typed is a useful distinction or that we should discuss what exactly counts as weak typing. I'm merely doubtful of your second reason (that dynamic typing makes implicit conversions necessary). On your other examples: __str__ is only called by str (it may be indirect, but it's definitely explicit). Arithmetic operators, as we both already stated, indeed include some implicit promotions/conversions.Maisey

With strongly typed I mean no implicit conversions between types.

"5" * 2 == 10

The problem is that "weak typing" is not a well-defined term, since there are two very different ways such "implicit conversions" can happen, which have pretty much the opposite effect on performance:

  • The "scripting language way": values have a runtime type and the language implicitly applies semantic rules to convert between types (such as formatting a binary number as a decimal string) when an operation calls for the different type. This will tend to decrease performance since it A) requires there to be type information at runtime and b) requires that this information be checked. Both of these requirements introduce overhead.
  • The "C way": at runtime, it's all just bytes. If you can convince the compiler to apply an operation that takes a 4 byte integer on a string, then depending on how exactly you do it, either the first 4 bytes of that string will simply be treated as if they were a (probably very large) integer, or you get a buffer overrun. Or demons flying out of your nose. This method requires no overhead and leads to very fast performance (and very spectacular crashes).
Gatepost answered 10/3, 2012 at 15:15 Comment(1)
There are actually three ways, since you left out the C way: at compile-time, expressions have types, and implicit conversions can be inserted to convert an expression from one type to another as needed. For example, if i is an int, then 5.0 * i will implicitly convert i from int to double. The decision to make this conversion is performed at compile-time. (What you describe as "the 'C way'" is not really the C way of performing implicit conversions; rather, it's the result of bypassing C's implicit conversions.)Houseline
