Automatically simplifying/refactoring Python code (e.g. for loops -> list comprehension)? [closed]

In Python, I really enjoy how concise an implementation can be when using list comprehensions. I love writing concise list comprehensions like this:

myList = [1, 5, 11, 20, 30, 35] #input data
bigNumbers = [x for x in myList if x > 10]

However, I often encounter more verbose implementations like this:

myList = [1, 5, 11, 20, 30, 35] #input data
bigNumbers = []
for i in xrange(0, len(myList)):
    if myList[i] > 10:
        bigNumbers.append(myList[i])

When a for loop only looks through one data structure (e.g. myList[]), there is usually a straightforward list comprehension statement that is equivalent to the loop.
With this in mind, is there a refactoring tool that converts verbose Python loops into concise list comprehension statements?


Previous StackOverflow questions have asked for advice on transforming loops into list comprehension. But, I have yet to find a question about automatically converting loops into list comprehension expressions.


Motivation: There are numerous ways to answer the question "what does it mean for code to be clean?" Personally, I find that making code concise and getting rid of some of the fluff tends to make code cleaner and more readable. Naturally there's a line in the sand between "concise code" and "incomprehensible one-liners." Still, I often find it satisfying to write and work with concise code.

Lythraceous answered 25/1, 2013 at 7:48 Comment(6)
That's what I would do too. Unfortunately, I see a lot of unnecessarily verbose code like this. Are there refactoring tools that would replace xrange(0, len(myList)) with enumerate(myList)? This would be especially useful when trying to clean up someone else's code, or trying to convert some messy code into something that's usable in a tutorial. - Lythraceous
@AshwiniChaudhary, or just use for elem in myList: - Forgotten
What is Pythonic? "for i in range(len(seq)):"? No. Use "for obj in seq:". - Kele
@Lythraceous I did some googling but can't seem to find anything that modifies the source so that while loops are changed into for loops or list comprehensions. Everything was related to profiling and static code analysis. - Muscovado
I love writing concise code too, but I also want to keep it readable. It's not hard to convert a loop into a list comprehension by hand; I wouldn't trust a tool to do it. - Twofold
One way of doing that automatically would be to use urllib to post a question on SO, wait 2-3 minutes and then download the answer. - Pyroelectric

2to3 is a refactoring tool that can perform arbitrary refactorings, as long as you can specify them with a syntactic pattern. The pattern you might want to look for is this:

VARIABLE1 = []
for VARIABLE2 in EXPRESSION1:
    if EXPRESSION2:
        VARIABLE1.append(EXPRESSION3)

Provided that none of the expressions refers to VARIABLE1, this can be refactored safely to:

VARIABLE1 = [EXPRESSION3 for VARIABLE2 in EXPRESSION1 if EXPRESSION2]

In your specific example, this would give:

bigNumbers = [myList[i] for i in xrange(0, len(myList)) if myList[i] > 10]
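To make the pattern match concrete, here is a minimal sketch of the same rewrite. Note that it is not a 2to3 fixer: it uses the standard-library ast module instead (lib2to3 has since been removed from the standard library, as of Python 3.13), and it assumes Python 3.9+ for ast.unparse, so range stands in for xrange. The function names are made up for illustration. It only handles the exact shape shown above and gives up on anything else.

```python
import ast


def _as_comprehension(assign, loop):
    """Return `assign` rewritten as a list comprehension, or None if no match."""
    # VARIABLE1 = []
    if not (isinstance(assign, ast.Assign) and len(assign.targets) == 1
            and isinstance(assign.targets[0], ast.Name)
            and isinstance(assign.value, ast.List) and not assign.value.elts):
        return None
    # for VARIABLE2 in EXPRESSION1:  (one `if` in the body, no else clauses)
    if not (isinstance(loop, ast.For) and not loop.orelse
            and len(loop.body) == 1 and isinstance(loop.body[0], ast.If)):
        return None
    cond = loop.body[0]
    if cond.orelse or len(cond.body) != 1:
        return None
    # VARIABLE1.append(EXPRESSION3)
    stmt = cond.body[0]
    call = stmt.value if isinstance(stmt, ast.Expr) else None
    name = assign.targets[0].id
    if not (isinstance(call, ast.Call) and isinstance(call.func, ast.Attribute)
            and call.func.attr == "append"
            and isinstance(call.func.value, ast.Name)
            and call.func.value.id == name
            and len(call.args) == 1 and not call.keywords
            and not isinstance(call.args[0], ast.Starred)):
        return None
    # Refuse if the list being built is referenced while it is being built.
    for expr in (loop.target, loop.iter, cond.test, call.args[0]):
        if any(isinstance(n, ast.Name) and n.id == name for n in ast.walk(expr)):
            return None
    assign.value = ast.ListComp(
        elt=call.args[0],
        generators=[ast.comprehension(target=loop.target, iter=loop.iter,
                                      ifs=[cond.test], is_async=0)])
    return assign


def _rewrite_block(stmts):
    """Merge each matching (assignment, loop) pair in a statement list."""
    out, i = [], 0
    while i < len(stmts):
        merged = _as_comprehension(stmts[i], stmts[i + 1]) if i + 1 < len(stmts) else None
        if merged is None:
            out.append(stmts[i])
            i += 1
        else:
            out.append(merged)
            i += 2
    return out


def loops_to_comprehensions(source):
    """Apply the rewrite to every statement block in `source`."""
    tree = ast.parse(source)
    for node in list(ast.walk(tree)):
        for field in ("body", "orelse", "finalbody"):
            stmts = getattr(node, field, None)
            if isinstance(stmts, list):
                setattr(node, field, _rewrite_block(stmts))
    return ast.unparse(ast.fix_missing_locations(tree))


source = """\
myList = [1, 5, 11, 20, 30, 35]
bigNumbers = []
for i in range(0, len(myList)):
    if myList[i] > 10:
        bigNumbers.append(myList[i])
"""
print(loops_to_comprehensions(source))
```

One caveat with this sketch: ast.unparse drops comments and original formatting, which a 2to3 fixer would preserve. Also, in Python 3 the loop variable is no longer bound after a comprehension, so code that reads i after the loop would break.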

Then, you can have another refactoring that replaces xrange(0, N) with xrange(N), and another one that replaces

[VARIABLE1[VARIABLE2] for VARIABLE2 in xrange(len(VARIABLE1)) if EXPRESSION1]

with

[VARIABLE3 for VARIABLE3 in VARIABLE1 if EXPRESSION1PRIME]
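As a rough sketch of these last two rewrites (again using the standard-library ast module rather than a 2to3 fixer, and assuming Python 3.9+ for ast.unparse), something like the following could work. All function and class names are hypothetical, and "x" as the element name is just a default parameter. It treats range(0, N) like range(N), and it leaves the code alone when the index is still needed on its own or when the new name already appears inside the comprehension.

```python
import ast
import copy


def _sequence_and_index(gen):
    """Return (sequence, index) names if gen is `for i in range(len(seq))`."""
    call = gen.iter
    if not (isinstance(gen.target, ast.Name) and isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name)
            and call.func.id in ("range", "xrange") and not call.keywords):
        return None
    args = call.args
    # Treat range(0, N) the same as range(N).
    if (len(args) == 2 and isinstance(args[0], ast.Constant)
            and type(args[0].value) is int and args[0].value == 0):
        args = args[1:]
    if len(args) != 1:
        return None
    inner = args[0]
    if not (isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name)
            and inner.func.id == "len" and not inner.keywords
            and len(inner.args) == 1 and isinstance(inner.args[0], ast.Name)):
        return None
    return inner.args[0].id, gen.target.id


class _ElementForSubscript(ast.NodeTransformer):
    """Replace every load of `seq[index]` with the name `element`."""

    def __init__(self, seq, index, element):
        self.seq, self.index, self.element = seq, index, element

    def visit_Subscript(self, node):
        if (isinstance(node.ctx, ast.Load)
                and isinstance(node.value, ast.Name) and node.value.id == self.seq
                and isinstance(node.slice, ast.Name) and node.slice.id == self.index):
            return ast.Name(id=self.element, ctx=ast.Load())
        return self.generic_visit(node)


def _uses(name, nodes):
    """True if `name` occurs anywhere inside the given AST nodes."""
    return any(isinstance(n, ast.Name) and n.id == name
               for node in nodes for n in ast.walk(node))


def index_to_element(source, element="x"):
    """Rewrite [seq[i] ... for i in range(len(seq)) ...] to iterate over seq."""
    tree = ast.parse(source)
    for comp in [n for n in ast.walk(tree) if isinstance(n, ast.ListComp)]:
        if len(comp.generators) != 1:
            continue
        gen = comp.generators[0]
        found = _sequence_and_index(gen)
        if found is None or _uses(element, [comp]):
            continue  # not our pattern, or the new name would clash
        seq, index = found
        swap = _ElementForSubscript(seq, index, element)
        parts = [swap.visit(copy.deepcopy(e)) for e in [comp.elt, *gen.ifs]]
        if _uses(index, parts):
            continue  # the index is still needed, so leave the code alone
        comp.elt, gen.ifs = parts[0], parts[1:]
        gen.target = ast.Name(id=element, ctx=ast.Store())
        gen.iter = ast.Name(id=seq, ctx=ast.Load())
    return ast.unparse(ast.fix_missing_locations(tree))


print(index_to_element(
    "big = [myList[i] for i in range(0, len(myList)) if myList[i] > 10]"))
# big = [x for x in myList if x > 10]
print(index_to_element("ys = [a[i] + i for i in range(len(a))]"))
# ys = [a[i] + i for i in range(len(a))]   (unchanged: i is still used)
```

The sketch only inspects syntax. It cannot tell whether range or len has been rebound, or whether indexing and iterating the sequence really agree.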

There are several problems with this refactoring:

  • EXPRESSION1PRIME must be EXPRESSION1 with all occurrences of VARIABLE1[VARIABLE2] replaced by VARIABLE3. This is possible with 2to3, but requires explicit code to do the traversal and replacement.
  • EXPRESSION1PRIME must then contain no further occurrences of VARIABLE2. This can also be checked with explicit code.
  • One needs to come up with a name for VARIABLE3. You have chosen x; there is no reasonable way to have this done automatically. You could choose to recycle VARIABLE2 (i.e. i) for that, but that may be confusing, as it suggests that i is still an index. It might work to pick a synthetic name, such as VARIABLE1_VARIABLE2 (i.e. myList_i), and check that it is not used otherwise.
  • One needs to be sure that VARIABLE1[VARIABLE2] yields the same as you get when using iter(VARIABLE1). It's not possible to do this automatically.
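A small illustration of the last point, using made-up data: a dict whose keys happen to be 0..n-1 supports both forms, yet indexing and iteration yield different things, so the rewrite would silently change the result.

```python
# Hypothetical data: a dict whose keys happen to be 0, 1, 2.
names = {0: "zero", 1: "one", 2: "two"}

by_index = [names[i] for i in range(len(names))]  # looks up each key: values
by_iter = [x for x in names]                      # iterating a dict yields keys

print(by_index)  # ['zero', 'one', 'two']
print(by_iter)   # [0, 1, 2]
```

The same mismatch can arise with any class whose __getitem__ and __iter__ disagree, which is why a purely syntactic tool cannot verify this condition.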

If you want to learn how to write 2to3 fixers, take a look at Lennart Regebro's book.

Insanity answered 25/1, 2013 at 14:56 Comment(2)
whoa dude what's up with the all caps? that's a bit crazy looking lol - Villose
I was hoping that the caps use is self-evident: they are placeholders, and the rest is concrete syntax. - Maida
