How can I use Python for large scale development?

M

8

63

I would be interested to learn about large scale development in Python and especially in how do you maintain a large code base?

When you make incompatibility changes to the signature of a method, how do you find all the places where that method is being called. In C++/Java the compiler will find it for you, how do you do it in Python?
When you make changes deep inside the code, how do you find out what operations an instance provides, since you don't have a static type to lookup?
How do you handle/prevent typing errors (typos)?
Are UnitTest's used as a substitute for static type checking?

As you can guess I almost only worked with statically typed languages (C++/Java), but I would like to try my hands on Python for larger programs. But I had a very bad experience, a long time ago, with the clipper (dBase) language, which was also dynamically typed.

Marsiella answered 25/10, 2008 at 13:30 Comment(6)

I think these are all valid questions for small projects as well as large. – Polynices 25/10, 2008 at 13:33

It is a common illusion that if my code compiles then it is correct. Tests are a must-have to refactor your code whether your language is statically typed or not. – Smriti 25/10, 2008 at 17:15

You might be interested in reading this: infoq.com/articles/nuxeo_python_to_java – Estellestella 9/8, 2010 at 21:47

In Haskell, in my experience if the code compiles it is correct most of the time. (Super strict typing) – Incorrigible 2/1, 2014 at 1:16

@J.F.Sebastian , that's true, and unit tests are necessary in any language, but I find that static typing eliminates an entire class of tests that I've often seen in test suites for dynamically typed languages (in both small and large projects, mostly Python/JS). – Willetta 14/1, 2016 at 19:2

@sevko: You must have seen bad tests: you are doing it wrong if your tests can be replaced by a compiler. – Smriti 14/1, 2016 at 19:48

S

25

Since nobody pointed out pychecker, pylint and similar tools, I will: pychecker and pylint are tools that can help you find incorrect assumptions (about function signatures, object attributes, etc.) They won't find everything that a compiler might find in a statically typed language -- but they can find problems that such compilers for such languages can't find, too.

Python (and any dynamically typed language) is fundamentally different in terms of the errors you're likely to cause and how you would detect and fix them. It has definite downsides as well as upsides, but many (including me) would argue that in Python's case, the ease of writing code (and the ease of making it structurally sound) and of modifying code without breaking API compatibility (adding new optional arguments, providing different objects that have the same set of methods and attributes) make it suitable just fine for large codebases.

Sassan answered 25/10, 2008 at 15:5 Comment(2)

I still have my doubts about using Python in a large scale development, but these tools should be able to mitigate a lot of problems. – Marsiella 19/11, 2008 at 11:45

Yeah it's just about your opinion. There are huge projects built with Python and some of them are built for very high availability and reliability (Zope, Plone, ...). – Chorister 7/10, 2013 at 21:53

M

72

Don't use a screw driver as a hammer

Python is not a statically typed language, so don't try to use it that way.

When you use a specific tool, you use it for what it has been built. For Python, it means:

Duck typing : no type checking. Only behavior matters. Therefore your code must be designed to use this feature. A good design means generic signatures, no dependences between components, high abstraction levels.. So if you change anything, you won't have to change the rest of the code. Python will not complain either, that what it has been built for. Types are not an issue.
Huge standard library. You do not need to change all your calls in the program if you use standard features you haven't coded yourself. And Python come with batteries included. I keep discovering them everyday. I had no idea of the number of modules I could use when I started and tried to rewrite existing stuff like everybody. It's OK, you can't get it all right from the beginning.

You don't write Java, C++, Python, PHP, Erlang, whatever, the same way. They are good reasons why there is room for each of so many different languages, they do not do the same things.

Unit tests are not a substitute

Unit tests must be performed with any language. The most famous unit test library (JUnit) is from the Java world!

This has nothing to do with types. You check behaviors, again. You avoid trouble with regression. You ensure your customer you are on tracks.

Python for large scale projects

Languages, libraries and frameworks don't scale. Architectures do.

If you design a solid architecture, if you are able to make it evolves quickly, then it will scale. Unit tests help, automatic code check as well. But they are just safety nets. And small ones.

Python is especially suitable for large projects because it enforces some good practices and has a lot of usual design patterns built-in. But again, do not use it for what it is not designed. E.g : Python is not a technology for CPU intensive tasks.

In a huge project, you will most likely use several different technologies anyway. As a SGBD (French for DBMS) and a templating language, or else. Python is no exception.

You will probably want to use C/C++ for the part of your code you need to be fast. Or Java to fit in a Tomcat environment. Don't know, don't care. Python can play well with these.

As a conclusion

My answer may feel a bit rude, but don't get me wrong: this is a very good question.

A lot of people come to Python with old habits. I screwed myself trying to code Java like Python. You can, but will never get the best of it.

If you have played / want to play with Python, it's great! It's a wonderful tool. But just a tool, really.

Metagenesis answered 25/10, 2008 at 14:46 Comment(5)

Well put and thought provoking. Thank you. – Imaginable 28/1, 2010 at 22:30

Can you cite the source of that quote, "Languages, libraries and frameworks don't scale. Architectures do."? It's so well put I wanted to find out who said it originally. Was it you? – Inconstant 17/11, 2010 at 19:5

This is supposedly from Cal Henderson but since I have no certitude it is, I won't put it in the anwser. – Metagenesis 19/11, 2010 at 10:8

python3 has type hint, confict with duck type? – Innis 9/11, 2016 at 3:59

Python type hints will get structural typing support in the next version, making it compatible with duck typing. Right now you have to be explicit about the interfaces you use if you want to use type hints, by importing types and abc from collections and typing. – Metagenesis 12/11, 2016 at 15:2

F

40

I had some experience with modifying "Frets On Fire", an open source python "Guitar Hero" clone.

as I see it, python is not really suitable for a really large scale project.

I found myself spending a large part of the development time debugging issues related to assignment of incompatible types, things that static typed laguages will reveal effortlessly at compile-time. also, since types are determined on run-time, trying to understand existing code becomes harder, because you have no idea what's the type of that parameter you are currently looking at.

in addition to that, calling functions using their name string with the __getattr__ built in function is generally more common in Python than in other programming languages, thus getting the call graph to a certain function somewhat hard (although you can call functions with their name in some statically typed languages as well).

I think that Python really shines in small scale software, rapid prototype development, and gluing existing programs together, but I would not use it for large scale software projects, since in those types of programs maintainability becomes the real issue, and in my opinion python is relatively weak there.

Frida answered 25/10, 2008 at 14:1 Comment(6)

I have no idea why you were down voted for this insightful comment. Projects written in dynamic languages, without very extensive unit tests, once they reach a critical mass have a real danger of demonstrating this behaviour. The more magic (getattr, eval) there is, the worse the problem gets. – Athematic 25/10, 2008 at 14:8

looking in the repository of frets on fire (yeah open source) ~ 70 src-files > 15 test-src-files hm – Illustrational 25/10, 2008 at 14:21

Python scales well if you write it well.. Google, youtube, NASA and others use Python a lot (Youtube most extensively, I think). A majority of my work's film post-production pipeline is written in Python (render farm management, extensions to artist-tools, asset management, etc etc). – Diplocardiac 26/10, 2008 at 10:24

voted up because dynamic languages MAY exhibit those problems on large scale systems. But who says that large scale static-language systems does not exhibit terrible problems as well? C++ code can grow unmantainable. The key (and the hardest step) is to develop a suitable architecture for the project. – Uniaxial 8/7, 2009 at 18:26

I realize I'm 4 years late to the party, but I wonder how the answers here change if one uses one of the many good (and improving) Python IDEs which help address things like function signatures. – Kristin 29/6, 2013 at 18:6

Yeah eval is really bad for scaling Python, but getattr "magic" can sometimes seriously reduce the code size of a project if used wisely reducing your chance of getting errors that aren't type errors. For Python projects you can try editing the code while its running like stopped at a debug point, and you can try applying a global module decorator or sys.set_trace to log all function calls in a module. – Radiator 3/5, 2021 at 16:57

S

25

Since nobody pointed out pychecker, pylint and similar tools, I will: pychecker and pylint are tools that can help you find incorrect assumptions (about function signatures, object attributes, etc.) They won't find everything that a compiler might find in a statically typed language -- but they can find problems that such compilers for such languages can't find, too.

Python (and any dynamically typed language) is fundamentally different in terms of the errors you're likely to cause and how you would detect and fix them. It has definite downsides as well as upsides, but many (including me) would argue that in Python's case, the ease of writing code (and the ease of making it structurally sound) and of modifying code without breaking API compatibility (adding new optional arguments, providing different objects that have the same set of methods and attributes) make it suitable just fine for large codebases.

Sassan answered 25/10, 2008 at 15:5 Comment(2)

I still have my doubts about using Python in a large scale development, but these tools should be able to mitigate a lot of problems. – Marsiella 19/11, 2008 at 11:45

Yeah it's just about your opinion. There are huge projects built with Python and some of them are built for very high availability and reliability (Zope, Plone, ...). – Chorister 7/10, 2013 at 21:53

G

17

Here are some items that have helped me maintain a fairly large system in python.

Structure your code in layers. i.e separate biz logic, presentation logic and your persistence layers. Invest a bit of time in defining these layers and make sure everyone on the project is brought in. For large systems creating a framework that forces you into a certain way of development can be key as well.
Tests are key, without unit tests you will likely end up with an unmanagable code base several times quicker than with other languages. Keep in mind that unit tests are often not sufficient, make sure to have several integration/acceptance tests you can run quickly after any major change.
Use Fail Fast principle. Add assertions for cases you feel your code maybe vulnerable.
Have standard logging/error handling that will help you quickly navigate to the issue
Use an IDE( pyDev works for me) that provides type ahead, pyLint/Checker integration to help you detect common typos right away and promote some coding standards
Carefull about your imports, never do from x import * or do relative imports without use of .
Do refactor, a search/replace tool with regular expressions is often all you need to do move methods/class type refactoring.

Giguere answered 25/10, 2008 at 15:36 Comment(1)

And still, if tests are so crucial it means time spent in testing will increase and Fail Fast is almost the same as having static types. The refactoring even with help can become tedious or error prone because the certainty is missing. Sometimes I wish Python would allow to mix static and dynamic types. – Ecker 4/4, 2014 at 14:28

I

16

my 0.10 EUR:

i have several python application in 'production'-state. our company use java, c++ and python. we develop with the eclipse ide (pydev for python)

unittests are the key-solution for the problem. (also for c++ and java)

the less secure world of "dynamic-typing" will make you less careless about your code quality

BY THE WAY:

large scale development doesn't mean, that you use one single language!

large scale development often uses a handful of languages specific to the problem.

so i agree to the-hammer-problem :-)

PS: static-typing & python

Illustrational answered 25/10, 2008 at 13:44 Comment(1)

this is true for any dynamic language, as you have no compiler check for invalid variable name etc. – Flout 25/10, 2008 at 15:48

D

9

Incompatible changes to the signature of a method. This doesn't happen as much in Python as it does in Java and C++.

Python has optional arguments, default values, and far more flexibility in defining method signatures. Also, duck typing means that -- for example -- you don't have to switch from some class to an interface as part of a significant software change. Things just aren't as complex.

How do you find all the places where that method is being called? grep works for dynamic languages. If you need to know every place a method is used, grep (or equivalent IDE-supported search) works great.

How do you find out what operations an instance provides, since you don't have a static type to lookup?

a. Look at the source. You don't have the Java/C++ problem of object libraries and jar files to contend with. You don't need all the elaborate aids and tools that those languages require.

b. An IDE can provide signature information under many common circumstances. You can, easily, defeat your IDE's reasoning powers. When that happens, you should probably review what you're doing to be sure it makes sense. If your IDE can't reason out your type information, perhaps it's too dynamic.

c. In Python, you often work through the interactive interpreter. Unlike Java and C++, you can explore your instances directly and interactively. You don't need a sophisticated IDE.

Example:

  >>> x= SomeClass()
  >>> dir(x)

How do you handle/prevent typing errors? Same as static languages: you don't prevent them. You find and correct them. Java can only find a certain class of typos. If you have two similar class or variable names, you can wind up in deep trouble, even with static type checking.

Example:

class MyClass { }
class MyClassx extends MyClass { }

A typo with these two class names can cause havoc. ["But I wouldn't put myself in that position with Java," folks say. Agreed. I wouldn't put myself in that position with Python, either; you make classes that are profoundly different, and will fail early if they're misused.]

Are UnitTest's used as a substitute for static type checking? Here's the other Point of view: static type checking is a substitute for clear, simple design.

I've worked with programmers who weren't sure why an application worked. They couldn't figure out why things didn't compile; the didn't know the difference between abstract superclass and interface, and the couldn't figure out why a change in place makes a bunch of other modules in a separate JAR file crash. The static type checking gave them false confidence in a flawed design.

Dynamic languages allow programs to be simple. Simplicity is a substitute for static type checking. Clarity is a substitute for static type checking.

Disposable answered 25/10, 2008 at 17:21 Comment(2)

-1. The statically-typed languages cause havoc at compile-time for certain category of errors (typos, type-mismatch etc), but the dynamically-typed languages (such as Python) delay this until runtime, often after several days of running. THAT is PAIN. Early detection of problems is far helpful for developers. – Goncourt 5/8, 2013 at 12:26

Advicing grep for refactoring is a bit unsatisfactionary. I can imagine you can keep false positive and false negatives which would not be very nice. – Ecker 4/4, 2014 at 14:30

H

3

My general rule of thumb is to use dynamic languages for small non-mission-critical projects and statically-typed languages for big projects. I find that code written in a dynamic language such as python gets "tangled" more quickly. Partly that is because it is much quicker to write code in a dynamic language and that leads to shortcuts and worse design, at least in my case. Partly it's because I have IntelliJ for quick and easy refactoring when I use Java, which I don't have for python.

Hadji answered 25/10, 2008 at 15:26 Comment(1)

You do now :) – Switchboard 25/5, 2014 at 19:58

T

2

The usual answer to that is testing testing testing. You're supposed to have an extensive unit test suite and run it often, particularly before a new version goes online.

Proponents of dynamically typed languages make the case that you have to test anyway because even in a statically typed language conformance to the crude rules of the type system covers only a small part of what can potentially go wrong.

Thrilling answered 25/10, 2008 at 14:24 Comment(0)

Don't use a screw driver as a hammer

Unit tests are not a substitute

Python for large scale projects

As a conclusion

Recommended topics

Hot tags