Get Row Position instead of Row Index from iterrows() in Pandas

Asked 23/5, 2018 at 9:52 Answered 22/5 at 12:35

Solved python python-3.x pandas for-loop

I'm new to Stack Overflow. I have done some research but have not found a satisfying answer.

I understand that I can get a row's index by using df.iterrows() to iterate through a DataFrame. But what if I want to get the row position instead of the row index? What method can I use?

Example code that I'm working on is below:

import pandas as pd

df = pd.DataFrame({'month': ['Jan', 'Feb', 'March', 'April'],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})

df = df.set_index('month')

for idx, value in df.iterrows():
    print(idx)

How can I get an output of:

0
1
2
3

Thanks!

Undone answered 23/5, 2018 at 9:52 Comment(2)

you can get it by df.index – Freer 23/5, 2018 at 10:2

I think the question you should ask is "how do I answer my question without using df.iterrows()" – Boleslaw 23/5, 2018 at 10:37

If you need the row number instead of the index, you should:

Use enumerate for a counter within the loop.
Avoid extracting the index; see the options below.

Option 1

In most situations, for performance reasons, you should try to use df.itertuples instead of df.iterrows. You can specify index=False so that the first element is not the index.

for idx, row in enumerate(df.itertuples(index=False)):
    print(idx, row)  # replace with your own processing

df.itertuples returns a namedtuple for each row.
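
As a minimal sketch of Option 1 on the frame from the question: the list name `positions` is only there for illustration, and the columns are read as attributes of the namedtuple.

```python
import pandas as pd

# Same frame as in the question, indexed by month.
df = pd.DataFrame({'month': ['Jan', 'Feb', 'March', 'April'],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]}).set_index('month')

positions = []  # illustrative only: collects the row positions seen
for pos, row in enumerate(df.itertuples(index=False)):
    # With index=False the namedtuple holds only the columns (year, sale).
    positions.append(pos)
    print(pos, row.year, row.sale)
```

The first line printed should be `0 2012 55`, with the counter running from 0 to 3 regardless of what the index labels are.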

Option 2

Use df.iterrows. This is more cumbersome, as you need to unpack and discard an unused variable. It is also less efficient than itertuples.

for idx, (_, row) in enumerate(df.iterrows()):
    print(idx, row['sale'])  # replace with your own processing

Russian answered 23/5, 2018 at 11:46 Comment(0)

Simply use enumerate:

for idx, (_, value) in enumerate(df.iterrows()):
    print(idx)

Mulkey answered 23/5, 2018 at 9:55 Comment(0)

You can use get_loc on df.index:

for idx, value in df.iterrows():
    print(idx, df.index.get_loc(idx))

Output:

Jan 0
Feb 1
March 2
April 3
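
One caveat worth checking before relying on this: as far as I know, get_loc returns a plain integer only when the label is unique in the index. The sketch below uses two made-up indexes (`unique_idx` and `dup_idx` are illustrative names) to show the difference.

```python
import pandas as pd

# Unique labels: get_loc gives back a single integer position.
unique_idx = pd.Index(['Jan', 'Feb', 'March', 'April'])
print(unique_idx.get_loc('March'))  # 2

# Repeated labels: the result is no longer a single integer
# (a slice or a boolean mask, depending on the index).
dup_idx = pd.Index(['Jan', 'Feb', 'Jan'])
loc = dup_idx.get_loc('Jan')
print(type(loc), loc)
```

If the index may contain duplicates, the enumerate approach from the other answers is the safer way to get positions.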

Stclair answered 23/5, 2018 at 11:52 Comment(0)

You can use df.index (an attribute, not a method). For a DataFrame that still has its default index, it is a RangeIndex object, a range-like iterable that supports iteration and many of the operations that a pandas Series supports:

>>> df.index
RangeIndex(start=0, stop=4, step=1)
>>> 
>>> list(df.index)
[0, 1, 2, 3]
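
Note that the frame in the question calls set_index('month'), so there df.index holds the month labels rather than a RangeIndex. A minimal sketch, assuming that frame, of getting positions independently of the labels:

```python
import pandas as pd

df = pd.DataFrame({'month': ['Jan', 'Feb', 'March', 'April'],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]}).set_index('month')

labels = list(df.index)            # index labels after set_index
positions = list(range(len(df)))   # row positions, whatever the labels are

# Pair each position with its label.
for pos, label in enumerate(df.index):
    print(pos, label)
```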

Superposition answered 23/5, 2018 at 9:56 Comment(0)

The simplest answer, using a manual counter:

current_run_counter = 0
for df_index, row in df.iterrows():
    current_run_counter += 1
    print("{0}/{1}".format(current_run_counter, len(df)))
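
The same 1-based progress counter can also be written with enumerate and its start argument, which avoids the manual increment. This is a sketch on the question's frame; the `progress` list is only there to make the result easy to inspect.

```python
import pandas as pd

df = pd.DataFrame({'month': ['Jan', 'Feb', 'March', 'April'],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]}).set_index('month')

progress = []  # illustrative only
for run, (label, row) in enumerate(df.iterrows(), start=1):
    # run counts 1, 2, 3, ... while label is the index value.
    progress.append("{0}/{1}".format(run, len(df)))

print(progress)
```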

Nuclease answered 22/5 at 12:35 Comment(0)
