Convert column to row in Python Pandas

I have the following Python pandas dataframe:

     fruits | numFruits
---------------------
0  | apples |   10
1  | grapes |   20
2  |  figs  |   15

I want:

                 apples | grapes | figs
-----------------------------------------
Market 1 Order |    10  |   20   |  15

I have looked at pivot(), pivot_table(), transpose and unstack(), and none of them seems to give me this. I'm a pandas newbie, so all help is appreciated.

Lodger answered 25/1, 2017 at 21:18 Comment(1)
If you are interested in the performance difference, check this question. - Michelson

You need set_index followed by a transpose with T:

print (df.set_index('fruits').T)
fruits     apples  grapes  figs
numFruits      10      20    15
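
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the question's data, and the variable name `wide` is only for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the data shown in the question.
df = pd.DataFrame({"fruits": ["apples", "grapes", "figs"],
                   "numFruits": [10, 20, 15]})

# Move the labels into the index, then swap rows and columns.
wide = df.set_index("fruits").T

print(wide)
```

The row label is still `numFruits` and the column axis keeps the name `fruits`, which is why the renaming step is needed for the exact layout requested.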

If you also need the row labelled 'Market 1 Order' and the 'fruits' axis name removed, it is a bit more involved:

print (df.rename(columns={'numFruits':'Market 1 Order'})
         .set_index('fruits')
         .rename_axis(None).T)
                apples  grapes  figs
Market 1 Order      10      20    15
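
To see what each step of that chain contributes, the sketch below splits it in two. The intermediate name `indexed` is only for illustration; the frame is rebuilt from the question's data.

```python
import pandas as pd

df = pd.DataFrame({"fruits": ["apples", "grapes", "figs"],
                   "numFruits": [10, 20, 15]})

# Rename the value column first: after the transpose it becomes the row label.
indexed = df.rename(columns={"numFruits": "Market 1 Order"}).set_index("fruits")
print(indexed.index.name)   # the index still carries the name 'fruits'

# rename_axis(None) clears that name, so no 'fruits' header appears after .T
result = indexed.rename_axis(None).T
print(result.columns.name)
print(result)
```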

Another, faster solution is to use numpy.ndarray.reshape:

print (pd.DataFrame(df.numFruits.values.reshape(1,-1), 
                    index=['Market 1 Order'], 
                    columns=df.fruits.values))

                apples  grapes  figs
Market 1 Order      10      20    15
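
A runnable sketch of the same idea, with the reshape shown as a separate step. It uses `to_numpy()` instead of `.values`, which assumes pandas 0.24 or later; the names `values` and `row` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"fruits": ["apples", "grapes", "figs"],
                   "numFruits": [10, 20, 15]})

values = df["numFruits"].to_numpy()   # 1-D array, shape (3,)
row = values.reshape(1, -1)           # 2-D array, shape (1, 3); -1 lets NumPy infer the width

# Build the one-row frame directly from the 2-D array.
result = pd.DataFrame(row,
                      index=["Market 1 Order"],
                      columns=df["fruits"].to_numpy())
print(result)
```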

Timings:

#[30000 rows x 2 columns] 
df = pd.concat([df]*10000).reset_index(drop=True)    
print (df)


In [55]: %timeit (pd.DataFrame([df.numFruits.values], ['Market 1 Order'], df.fruits.values))
1 loop, best of 3: 2.4 s per loop

In [56]: %timeit (pd.DataFrame(df.numFruits.values.reshape(1,-1), index=['Market 1 Order'], columns=df.fruits.values))
The slowest run took 5.64 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 424 µs per loop

In [57]: %timeit (df.rename(columns={'numFruits':'Market 1 Order'}).set_index('fruits').rename_axis(None).T)
100 loops, best of 3: 1.94 ms per loop
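
Outside IPython, a rough equivalent of these timings can be sketched with the standard `timeit` module. The function names and the smaller frame (3,000 rows rather than 30,000) are my own choices to keep the run short, and absolute numbers will differ between pandas versions and machines.

```python
import timeit
import pandas as pd

base = pd.DataFrame({"fruits": ["apples", "grapes", "figs"],
                     "numFruits": [10, 20, 15]})
# 1,000 copies give 3,000 rows; the timings above used 10,000 copies.
df = pd.concat([base] * 1000).reset_index(drop=True)

def list_of_arrays():
    # Passes a list holding one 1-D array.
    return pd.DataFrame([df.numFruits.values], ["Market 1 Order"], df.fruits.values)

def reshape_2d():
    # Passes a ready-made 2-D array.
    return pd.DataFrame(df.numFruits.values.reshape(1, -1),
                        index=["Market 1 Order"], columns=df.fruits.values)

def set_index_transpose():
    return (df.rename(columns={"numFruits": "Market 1 Order"})
              .set_index("fruits").rename_axis(None).T)

timings = {}
for fn in (list_of_arrays, reshape_2d, set_index_transpose):
    runs = 5
    timings[fn.__name__] = timeit.timeit(fn, number=runs) / runs
    print(f"{fn.__name__}: {timings[fn.__name__] * 1e3:.3f} ms per call")
```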
Michelson answered 25/1, 2017 at 21:20 Comment(3)
Hi. In this question there are only 3 columns. What if we have 10 columns and need to retain 8 of them, using just the other 2 to reshape the data? - June
It seems you need pivot_table, but without data it is hard to answer. Maybe the best option is to create a new question with sample data, the desired output, and what you have tried (your code). - Michelson
I just created a new question, please review. - June
pd.DataFrame([df.numFruits.values], ['Market 1 Order'], df.fruits.values)

                apples  grapes  figs
Market 1 Order      10      20    15

Refer to jezrael's enhancement of this concept. df.numFruits.values.reshape(1, -1) is more efficient.

Melanochroi answered 25/1, 2017 at 21:22 Comment(6)
@Michelson That was me being sloppy: pandas has overhead figuring out that I had passed a list of arrays. It is much simpler to give it the 2-D array in the first place, as you did. - Melanochroi
@Michelson The concept is the same. I'll edit my answer to point to your update of it. - Melanochroi
@Michelson Also, I think that overhead is small for larger arrays... maybe. - Melanochroi
I was a bit confused, but I think it is super now. Thank you. - Michelson
@Michelson Nope! I'm wrong... for larger arrays, it's even worse. - Melanochroi
I created a new question asking for an explanation; I hope you get a nice answer ;) - Michelson

You can use the pandas transpose API as follows:

df.transpose()

Here df is your pandas DataFrame.
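
As a quick check on the question's data (rebuilt by hand here), the sketch below shows what a bare transpose returns. The variable name `plain` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"fruits": ["apples", "grapes", "figs"],
                   "numFruits": [10, 20, 15]})

# A bare transpose turns both original columns into rows and
# uses the old integer index (0, 1, 2) as the column labels.
plain = df.transpose()
print(plain)
```

The fruit names end up as a row of data rather than as column headers, so to get the layout requested in the question the names would still need to go into the index first, for example with set_index('fruits').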

Ainu answered 30/5, 2019 at 5:27 Comment(0)
