NOTICE! The code will be broken down and not properly indented in relation to each of it's parts, so I recommend taking a look at the code in the question itself / itertools documentation (same code) as well.
It's been over 7 years since this was asked. Wow. I've been interested in this myself, and the explanations above while being very helpful didn't really hit the spot for me, so here's the summarization I made for myself.
Since I finally managed to understand it (or atleast I think I do), I thought it might be beneficial to post this "version" of an explanation, in case there are more like me. so Let's start.
def combinations(iterable, r):
pool = tuple(iterable)
n = len(pool)
In this first section, Simply make a tuple of the iterable and get the length of the iterable. These will be useful later.
if r > n:
return
indices = list(range(r))
yield tuple(pool[i] for i in indices)
This is also quite straight forward - If the length of the needed combination is bigger than our pool of elements, we cannot construct a valid combination (you cannot make a combination of 5 elements from 4), therefore we simply stop the execution with a return statement. We also generate the first combination (the first r elements from our iterable).
This next part is slightly more complex, so read carefully.
while True:
for i in reversed(range(r)):
if indices[i] != n - (r - i):
break
"""
The job of the while loop is to increment the indices one after
the other, and print out all the possible element combinations based
off all the possible valid indice combinations.
This for loop's job is to make sure we never over-increment any values.
In order for us to not run into any errors, the incremention of
the last element of the indice list must stop when it reaches one-less
than the length of our element list, otherwise we'll run into an index error
(trying to access an indice out of the list range).
How do we do that?
The range function will give us values cascading down from r-1 to 0
(r-1, r-2, r-3, ... , 0)
So first and foremost, the (r-1)st indice must not be greater than (n-1)
(remember, n is the length of our element pool), as that is the largest indice.
We can then write
Indices[r - 1] < n - 1
Moreover, because we'll stop incrementing the r-1st indice when we reach it's
maximum value, we must also stop incrementing the (r-2)nd indice when we reach
it's maximum value. What's the (r-2)nd indice maximum value?
Since we'll also be incrementing the (r-1)st indice based on the
(r-2)nd indice, and because the maximum value of the (r-1)st
indice is (n-1), plus we want no duplicates, the maximum value the
(r-2)nd indice can reach would be (n-2).
This trend will continue. more generally:
Indices[r - k] < n - k
Now, since r - k is an arbitrary index generated by the reversed range function,
namely (i), we can substitute:
r - k = i -----> k = r - i
Indices[r - k] < n - k -----> Indices[i] < n - (r - i)
That's our limit - for any indice i we generate, we must break the
increment if this inequality { Indices[i] < n - (r - i) } is no longer
true.
(In the documentation it's written as (Indice[i] != i + n - r), and it
means the exact same thing. I simply find this version easier to visualize
and understand).
"""
else:
return
"""
When our for loop runs out - which means we've gone through and
maximized each element in our indice list - we've gone through every
single combination, and we can exit the function.
It's important to distinct that the else statement is not linked to
the if statement in this case, but rather to the for loop. It's a
for-else statement, meaning "If you've finished iterating through the
entire loop, execute the else statement".
"""
If we did manage to break out of the for loop, this means that we can safely increment our indice to get the next combination (the first line below).
The for loop below makes sure that every time we start on a new index,
we reset the other indexes back to their smallest possible value,
so as to not miss any combinations.
for example, if we were to not do that, then once we reached a point
where we had to move on, say we had (0, 1, 2, 3, 4) and the combination
indexes were (0, 1, 4), when we'd move on and increment 1 to 2, the last
index will remain the same - 4, and we'll miss out on (0, 2, 3), only
registering (0, 2, 4) as a valid combination. Instead, after we increment
(1 -> 2), we update the latter indices based on that : (4 -> 3), and when we
run the while loop again we'd increment 3 back to 4 (refer
to the previous section).
Notice that we never increment the previous indices, as to not create
duplicates.
And finally, for each iteration, the yield statement generates the element combination corresponding to the current indice combination.
indices[i] += 1
for j in range(i+1, r):
indices[j] = indices[j-1] + 1
yield tuple(pool[i] for i in indices)
And just as the documentation states, because we're dealing with positions, a unique combination is unique based on the elements' locations in the iterable, not their value.