Is Python type safe?
P

9

49

According to Wikipedia

Computer scientists consider a language "type-safe" if it does not allow operations or conversions that violate the rules of the type system.

Since Python runtime checks ensure that type system rules are satisfied, we should consider Python a type safe language.

The same point is made by Jason Orendorff and Jim Blandy in Programming Rust:

Note that being type safe is independent of whether a language checks types at compile time or at run time: C checks at compile time, and is not type safe; Python checks at runtime, and is type safe.

Both sources separate the notion of static type checking from that of type safety.

Is that correct?

Pad answered 24/9, 2017 at 8:57 Comment(9)
Python is a duck-typed language: if it walks like a duck, sounds like a duck and looks like a duck, it's a duck. I wouldn't say it is type-safe. Check this thread: https://mcmap.net/q/24990/-what-is-duck-typing – Ctn
Yes, typing can be static or dynamic. A language can be type-safe or not type-safe. C is statically typed, but it isn't type-safe by any means. – Abele
@Vinny: I don't think duck typing is relevant to this discussion. Python looks type-safe to me: '1' + 2. JavaScript isn't type-safe, for example. – Hhd
@EricDuminil based on the statement from Wikipedia, a language is "type-safe" if it does not allow operations or conversions that violate the rules of the type system. Your example is performing auto-conversion. Try doing that on user-defined objects. Type-safe means you state the variable types, and you can't use them unless you explicitly override them. In Python you can do this: s = 'this is string'; s = 1 but you can't do it in Java. – Ctn
@Vinny: The example I provided for Python doesn't do auto-conversion: it raises an exception. Your examples only show the difference between static and dynamic typing, which isn't relevant to the question either. – Hhd
@EricDuminil gotcha. You are correct, I mixed up the two. Thanks – Ctn
@Vinny: No problem, I love being an insufferable know-it-all! :D – Hhd
chan: this is a simple example of unacceptable semantics / type variations in the same scope: s = 1; s = "hello". but when it comes to functions having untyped returns and untyped parameters, well, it adds insult to injury. furthermore, classes without enough visibility protection, implicitly impossible polymorphism, the list is way longer... nothing wrong with python, just don't use it in large projects, or if you do, be strict and ask for enough time to write redundant test cases (for all possible types). this never happens in production – Doornail
mixing types of the same identifier in the same scope is as bad as visual basic... but hey, there is more: threads, the global interpreter lock and other things way worse than type unsafety, which only guys like bill can tolerate and sell – Doornail
B
73

Many programmers will equate static type checking to type-safety:

  • "language A has static type checking and so it is type-safe"
  • "language B has dynamic type checking and so it is not type-safe"

Sadly, it's not that simple.

In the Real World

For example, C and C++ are not type-safe because you can undermine the type system via type punning. Also, the C/C++ language specifications extensively allow undefined behaviour (UB) rather than explicitly handling errors, and this has become the source of security exploits such as the stack smashing exploit and the format string attack. Such exploits shouldn't be possible in type-safe languages. Early versions of Java had a type bug in its generics that proved it was not completely type-safe.
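
To make the type-punning point concrete without leaving Python, here is a minimal sketch using the standard ctypes module, which exposes C-style unions. The class name FloatBits, its field names and the helper float_bits are made up for illustration. The idea is only that the same four bytes are written as one type and read back as another:

```python
import ctypes


class FloatBits(ctypes.Union):
    # Hypothetical union for illustration: both fields share the same 4 bytes.
    _fields_ = [
        ("as_float", ctypes.c_float),
        ("as_uint", ctypes.c_uint32),
    ]


def float_bits(value):
    """Store a float, then read the same memory back as an unsigned integer."""
    u = FloatBits()
    u.as_float = value
    return u.as_uint


# 1.0 as an IEEE 754 single-precision float has the bit pattern 0x3f800000.
print(hex(float_bits(1.0)))
```

No conversion happens here: the integer is simply whatever bits the float left behind, which is the kind of reinterpretation a type system is normally meant to rule out.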

Still today, for programming languages like Python, Java, C++, ... it's hard to show that these languages are completely type-safe because it requires a mathematical proof. These languages are massive and compilers/interpreters have bugs that are continually being reported and getting fixed.

[ Wikipedia ] Many languages, on the other hand, are too big for human-generated type safety proofs, as they often require checking thousands of cases. .... certain errors may occur at run-time due to bugs in the implementation, or in linked libraries written in other languages; such errors could render a given implementation type unsafe in certain circumstances.

In Academia

Type safety and type systems, while applicable to real-world programming, have their roots and definitions in academia, and so a formal definition of what exactly "type safety" is comes with difficulty, especially when talking about real programming languages used in the real world. Academics like to mathematically (formally) define tiny programming languages called toy languages. Only for these languages is it possible to show formally that they are type-safe (and prove that the operations are logically correct).

[ Wikipedia ] Type safety is usually a requirement for any toy language proposed in academic programming language research

For example, academics struggled to prove Java is type-safe, so they created a smaller version called Featherweight Java and proved in a paper that it is type-safe. Similarly, this Ph.D. paper by Christopher Lyon Anderson took a subset of JavaScript, called it JS0, and proved it was type-safe.

It's practically assumed that proper languages like Python, Java and C++ are not completely type-safe because they are so large. It's so easy for a tiny bug that would undermine the type system to slip through the cracks.

Summary

  • No, Python is probably not completely type-safe – nobody has proved it, and it's too hard to prove. You're more likely to find a tiny bug in the language that would demonstrate that it is not type-safe.
  • In fact, most programming languages are probably not completely type-safe, all for the same reasons (only toy academic ones have been proven to be).
  • You really shouldn't believe statically typed languages are necessarily type-safe. They are usually safer than dynamically typed languages, but to say that they are completely type-safe with certainty is wrong as there's no proof for this.

References: http://www.pl-enthusiast.net/2014/08/05/type-safety/ and https://en.wikipedia.org/wiki/Type_system

Blazer answered 24/9, 2017 at 10:19 Comment(8)
"It's practically assumed proper languages like ... c++ are not completely type-safe" - we can say for sure it is not type-safe by design, right? It is just a mistake? – Pad
Almost any dynamically typed language ensures type safety by dynamic type checks before accessing a variable's content. So these are type-safe. None of your references explains why statically typed languages are usually safer. I do not know any popular unsafe dynamically typed language, but at least two statically typed ones (C/C++). – Medieval
Your last statement in the summary is simply wrong. Type safety has nothing to do with static or dynamic typing. Python is not type-safe if you use ctypes. – Velarium
"undermine the type-system via Type punning." How does that undermine the type system? – Breastsummer
summary is, simply put, misleading (probably on purpose) and dead wrong. nothing is safe when a bad programmer (without proper engineering knowledge / formal education & experience) crafts flipping invoices cheap improvisations. beyond that, compiler is your friend. what compilers do is holy and saves you invaluable time to do proper unit and coverage testing. without those, the product is unsafe to be properly maintained and implicitly is as valuable as the barrier of entry in certain languages (ZERO). i do love python, do not take my words in a bad way! – Doornail
since you said that academia does not state precisely what type safety is all about, well, lemme help: a type is a certain set of values and the operations that can be applied to that set. a compiler is a process translating compilation units (generally written by humans, but not always) from one language to another. if the compiler verifies type integrity at all times and does not allow redefinition with different semantics in the very same scope, it is said to be type safe. duck typing is not type safe, because it happens at runtime – Doornail
furthermore, the python interpreter could be seen as an on-the-fly compiler, being able to compile and actually execute one line at a time, of course, in the scope/context (knowing what other types and values are available at any time, during that one line execution). it would have been enough to add the obligation of declaring a variable with its precise type before being able to use it (since python ONE) and it would have been better (the whole planet would have been better off). but the spirit was to reduce the barrier of entry to almost zero, so average joe could type 1 + 1 and go like 2 – Doornail
but hiring mediocre programmers (instead of software engineers) is the source of all evil, more than lack of type safety (i believe) – Doornail
D
25

Not in your wildest dreams.

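#!/usr/bin/python

counter = 100          # An integer assignment
miles   = 1000.0       # A floating point
name    = "John"       # A string

# print(...) with a single argument behaves the same on Python 2 and Python 3
print(counter)
print(miles)
print(name)

counter = "Mary had a little lamb"   # The same name now holds a string

print(counter)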

When you run that you see:

python p1.py
100
1000.0
John
Mary had a little lamb

You cannot consider any language "type safe" by any stretch of the imagination when it allows you to switch a variable's content from integer to string without any significant effort.

In the real world of professional software development what we mean by "type safe" is that the compiler will catch the stupid stuff. Yes, in C/C++ you can take extraordinary measures to circumvent type safety. You can declare something like this

union BAD_UNION
{
   long number;
   char str[4];
} data;

But the programmer has to go the extra mile to do that. We didn't have to go extra inches to butcher the counter variable in Python.

A programmer can do nasty things with casting in C/C++ but they have to deliberately do it; not accidentally.

The one place that will really burn you is class casting. When you declare a function/method with a base class parameter then pass in the pointer to a derived class, you don't always get the methods and variables you want because the method/function expects the base type. If you overrode any of that in your derived class you have to account for it in the method/function.

In the real world a "type safe" language helps protect a programmer from accidentally doing stupid things. It also protects the human species from fatalities.

Consider an insulin or infusion pump. Something that pumps limited amounts of life saving/prolonging chemicals into the human body at a desired rate/interval.

Now consider what happens when there is a logic path that has the pump stepper control logic trying to interpret the string "insulin" as the integer amount to administer. The outcome will not be good. Most likely it will be fatal.

Dikmen answered 3/11, 2020 at 12:7 Comment(0)
S
9

In Python, you'll get a runtime error if you use a variable of the wrong type in the wrong context. E.g.:

>>> 'a' + 1

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects

Since this check only happens at run time, and not before you run the program, Python is not a typesafe language (PEP-484 notwithstanding).

Sophiasophie answered 24/9, 2017 at 9:1 Comment(7)
It is still defined behavior which doesn't violate type system rules, isn't it? – Pad
There is no requirement, as far as I can tell, that the type checking can't happen at runtime. Indeed, as the Wikipedia article states, "Type enforcement can be static, catching potential errors at compile time, or dynamic, associating type information with values at run-time and consulting them as needed to detect imminent errors, or a combination of both." – Abele
Python doesn't get compiled – Backtrack
@whackamadoodle3000: all widely used Python implementations are compilers. – Velarium
If it doesn't let you violate the type system it is type-safe; it doesn't matter when that happens. Static type checking doesn't guarantee that it is type-safe (look at C and C++). – Titicaca
and the best example of "happening at run time" is simple: it is not you and/or qa who would catch it, but YOUR CUSTOMERS. lovely huh? – Doornail
the hints defined/described in pep-484 are great, but if you happen to read it, you will see that there are no checks even at runtime. so a static analyzer should be used, better than nothing anyway! – Doornail
M
9

Because nobody has said it yet, it's also worth pointing out that Python is a strongly typed language, which should not be confused with dynamically typed. Python defers type checking until the last possible moment, and a failed check usually results in an exception being raised. This explains the behavior Mureinik mentions. That having been said, Python also does automatic conversion often, meaning that it will attempt to convert an int to a float for an arithmetic operation, for example.
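
As a rough illustration of the difference between the two cases described above, the following sketch uses a hypothetical helper, try_add, that either returns the sum or reports the exception that was raised:

```python
def try_add(left, right):
    """Return left + right, or the name of the exception the addition raised."""
    try:
        return left + right
    except TypeError as exc:
        return type(exc).__name__


# int and float mix freely: the int operand is converted to float.
print(try_add(1, 2.0))

# str and int do not: Python raises instead of guessing a conversion.
print(try_add("a", 1))
```

The numeric conversion is a rule of the type system rather than a hole in it, which is why it can coexist with strong typing.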

You can enforce type safety in your programs manually by checking the types of inputs. Because everything is an object, you can always create classes that derive from base classes, and use the isinstance function to verify the type (at runtime of course). Python 3 has added type hints, but these are not enforced. And mypy has added static type checking to the language if you care to use it, but this does not guarantee type safety.
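
A minimal sketch of both points, assuming two hypothetical functions named double and checked_double: the first relies on annotations alone, the second adds a manual isinstance check.

```python
def double(amount: int) -> int:
    """Annotated as int -> int, but the annotations are not checked at run time."""
    return amount * 2


def checked_double(amount: int) -> int:
    """Same operation, guarded by a manual run-time type check."""
    # bool is a subclass of int, so it is rejected explicitly here.
    if isinstance(amount, bool) or not isinstance(amount, int):
        raise TypeError("expected int, got " + type(amount).__name__)
    return amount * 2


print(double("ab"))  # the hint is ignored, so the string is repeated
try:
    checked_double("ab")
except TypeError as exc:
    print("rejected:", exc)
```

A static checker such as mypy would be expected to flag the double("ab") call before the program runs, but the interpreter itself executes it without complaint.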

Memorialize answered 3/10, 2017 at 16:16 Comment(6)
deferring it to the last moment means: – Doornail
2. the last moment is in most cases experienced by customers (on an already deployed product, with special circumstances/issues) – Doornail
3. unit tests where you explicitly use dynamic metadata querying (isinstance) cannot cover any user defined data types which partially comply, syntactically, with a certain execution context, and here is why: simply because one cannot use isinstance against datatypes which do not exist at the time the test case and the call to isinstance are actually happening. and you go like what?!? well, imagine this: you manually add all sorts of isinstance checks to thousands of test cases today. but they will not be good if (in the future) somebody adds new compliant datatypes without adjusting all test cases – Doornail
mypy is not core python – Doornail
there should be an option to have the hints enforced, but like always, the "benevolent dictator" (not retired) did not see it this way. this also explains the complete failure of python 2 vs python 3 – Doornail
"last moment" is sometimes called "duck typing" by the community, and is generally considered a feature. Rather than enforcing types up-front (through types), exceptions are thrown at runtime when a method or operator is called on a type that does not implement it. In this way interfaces (see the abc module) do not need to be designed and implemented before the program is written, and legacy code can be extended or "monkey patched" in order to add only the required functionality to make the data fit the algorithm. – Memorialize
V
7

The Wikipedia article associates type safety with memory safety, meaning that the same memory area cannot be accessed as, e.g., both an integer and a string. In this way Python is type-safe. You cannot change the type of an object implicitly.
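
A small sketch of what this means in practice (the variable names are illustrative only): assigning a string to a name that used to hold an integer rebinds the name, while the integer object itself keeps its type.

```python
counter = 100
alias = counter                      # a second name for the same int object

counter = "Mary had a little lamb"   # rebinds the name; the int is untouched

print(type(alias).__name__, alias)
print(type(counter).__name__)

# Using the string as a number is an error, not a silent reinterpretation.
try:
    counter + alias
except TypeError as exc:
    print("TypeError:", exc)
```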

Velarium answered 24/9, 2017 at 9:6 Comment(1)
but you can change it explicitly, in the same scope, which reminds me of quick basic :-) – Doornail
D
1

in general, large scale / complex systems need type checking, first at compile time (static) and then at run time (dynamic). this is not academia, but rather a simple, common sense rule of thumb like "compiler is your friend". beyond the runtime performance implications, there are other major implications, as follows:

the 3 axes of scalability are:

  1. build time (ability to design and manufacture safe systems in time and budget)
  2. runtime (obvious)
  3. maintain time (ability to maintain (fix bugs) and extend existing systems in a safe manner, generally by refactoring)

the only way to do safe refactoring is to have everything fully tested (use test driven development or at least unit testing and at least decent coverage testing as well, this is not qa, this is development/r&d). what is not covered, will break and systems like that are rather garbage than engineering artifacts.

now let's say that we have a simple function, sum, returning the sum of two numbers. one can imagine doing unit testing on this function, based on the fact that both parameters and returned type are known. we are not talking about function templates, which boil down to the trivial example. please write a simple unit test on the same function called sum where both parameters and return type can literally be of any kind, they can be integers, floats, strings and/or any other kind of user defined types having the plus operator overloaded/implemented. how do you write such a simple test case?!? how complex does the test case need to be in order to cover every possible scenario?
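
for what it's worth, here is a rough sketch of what such a test tends to look like in practice. the names sum_two, CASES and run_cases are made up for illustration, and the list of cases is only a sample: it cannot enumerate every user defined type with a plus operator, which is exactly the difficulty raised above.

```python
from fractions import Fraction


def sum_two(a, b):
    """Hypothetical stand-in for the untyped sum function discussed above."""
    return a + b


# Each case is (a, b, expected). The list is a sample, not an exhaustive set.
CASES = [
    (1, 1, 2),
    (1.5, 2.5, 4.0),
    ("ab", "cd", "abcd"),
    ([1], [2], [1, 2]),
    (Fraction(1, 2), Fraction(1, 3), Fraction(5, 6)),
]


def run_cases(cases):
    """Return the cases whose actual result differs from the expected one."""
    return [(a, b, exp) for a, b, exp in cases if sum_two(a, b) != exp]


print(run_cases(CASES))  # an empty list means every sampled case passed
```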

complexity means cost. without proper unit testing and test coverage, there is no safe way to do any refactoring, so the product is maintenance garbage, if not immediately, then clearly in the long term, because performing any refactoring blind would be like driving a car without a driver's license, drunk as a skunk and, of course, without insurance.

go figure! :-)

Doornail answered 24/9, 2017 at 8:57 Comment(0)
D
0

assume that you have a function sum, taking two arguments. if the arguments are un-typed (can be anything) then... well... that is unacceptable for any serious software engineer working on real life large systems. here is why:

  1. the naive answer would be "compiler is your friend". despite being about 65 years old, this is true and hey, this is not only about having static types! ide(s) use compiler services for a lot of things which, for the average joe programmer, look like magic... (code completion, design time (editing) assistance, etc.)
  2. a more realistic reason consists in something completely unknown to developers without a strong background in computer science and, more so, in software engineering. there are 3 axes of scalability: a. design/write and deploy, b. runtime & c. maintain time, based on refactoring. which do you think is the most expensive one, being clearly recurring on any real life serious system? the third one (c). in order to satisfy (c), you need to do it safely. in order to do any safe refactoring, you need to have unit testing AND coverage testing (so you can estimate the level of coverage your unit testing suite provides) - remember, when something is not automatically tested, it will break (at run time, late in the cycle, at the customer's site, you name it) - SO, in order to have a decent product, you need to have decent unit testing and test coverage

now, let's get to our intellectually challenging function (sum). if sum(a,b) does not specify the types of a and b, there is no way to do decent unit testing. tests like assert sum(1,1) is 2 ARE A LIE, because they do not cover anything but assumed integer arguments. in real life, when a and b are type hermaphrodites, there is no way to write real unit tests against function sum! various frameworks even pretend to derive test coverage results from crippled test cases such as the one described above. that is (obviously) another LIE.

that's all i had to say! thanks for reading, the only reason i posted this is, perhaps, to make you think of this and, maybe (MAYBE..) one day to do software engineering...

Doornail answered 24/9, 2017 at 8:57 Comment(0)
P
0

We just had a big error in a piece of code. The error was because we had this:

    if sys.errno:
        my_favorite_files.append(sys.errno)

instead of this:

    if args.errno:
        my_favorite_files.append(sys.errno)

The aggressive casting of anything to Boolean, because it makes if statements easier, is something that I would not expect to find in a language that is type-safe.
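
To show the behavior being described, here is a small sketch (the helper name is_truthy and the sample values are chosen for illustration) of how an if statement treats values of several different types:

```python
def is_truthy(value):
    """Report how an if statement would treat the value."""
    return bool(value)


samples = [0, 1, "", "foo", [], [0], None, 0.0]
for sample in samples:
    print(repr(sample), "->", is_truthy(sample))
```

An explicit comparison such as "value is not None" or "len(items) > 0" states the intended test, at the cost of a little more typing.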

Profiteer answered 30/12, 2020 at 20:8 Comment(0)
E
0

This runs just fine with no exceptions and prints -10. Is x a string, integer, or boolean? Granted, most people would say they would never write code like this, but I've seen some really messy, buggy code in my career. Imagine having to inherit thousands of lines of Python code and all the bugs caused by weakly typed code like this.

x = "foo"
x = -10
try:
    if x:
        print(x)
    else:
        print ("false")
except:
    print("An exception occurred")
Epirus answered 30/4 at 6:5 Comment(0)
