Can you prevent automatic alphabetical order of df.append()?

Asked 2/1, 2015 at 22:18 Answered 8/9, 2016 at 10:22

I am trying to append data to a log where the order of columns isn't in alphabetical order but makes logical sense, ex.

Org_Goals_1  Calc_Goals_1  Diff_Goals_1   Org_Goals_2 Calc_Goals_2 Diff_Goals_2

I am running through several calculations based on different variables and logging the results through appending a dictionary of the values after each run. Is there a way to prevent the df.append() function to order the columns alphabetically?

Pierian answered 2/1, 2015 at 22:18 Comment(0)

Seems you have to reorder the columns after the append operation:

In [25]:
# assign the appended dfs to merged
merged = df1.append(df2)
# create a list of the columns in the order you desire
cols = list(df1) + list(df2)
# assign directly
merged.columns = cols
# column order is now as desired
merged.columns
Out[25]:
Index(['Org_Goals_1', 'Calc_Goals_1', 'Diff_Goals_1', 'Org_Goals_2', 'Calc_Goals_2', 'Diff_Goals_2'], dtype='object')

example:

In [26]:

df1 = pd.DataFrame(columns=['Org_Goals_1','Calc_Goals_1','Diff_Goals_1'], data = randn(5,3))
df2 = pd.DataFrame(columns=['Org_Goals_2','Calc_Goals_2','Diff_Goals_2'], data=randn(5,3))
merged = df1.append(df2)
cols = list(df1) + list(df2)
merged.columns = cols
merged
Out[26]:
   Org_Goals_1  Calc_Goals_1  Diff_Goals_1  Org_Goals_2  Calc_Goals_2  \
0     0.028935           NaN     -0.687143          NaN      1.528579   
1     0.943432           NaN     -2.055357          NaN     -0.720132   
2     0.035234           NaN      0.020756          NaN      1.556319   
3     1.447863           NaN      0.847496          NaN     -1.458852   
4     0.132337           NaN     -0.255578          NaN     -0.222660   
0          NaN      0.131085           NaN     0.850022           NaN   
1          NaN     -1.942110           NaN     0.672965           NaN   
2          NaN      0.944052           NaN     1.274509           NaN   
3          NaN     -1.796448           NaN     0.130338           NaN   
4          NaN      0.961545           NaN    -0.741825           NaN   

   Diff_Goals_2  
0           NaN  
1           NaN  
2           NaN  
3           NaN  
4           NaN  
0      0.727619  
1      0.022209  
2     -0.350757  
3      1.116637  
4      1.947526

The same alpha sorting of the columns happens with concat also so it looks like you have to reorder after appending.

EDIT

An alternative is to use join:

In [32]:

df1.join(df2)
Out[32]:
   Org_Goals_1  Calc_Goals_1  Diff_Goals_1  Org_Goals_2  Calc_Goals_2  \
0     0.163745      1.608398      0.876040     0.651063      0.371263   
1    -1.762973     -0.471050     -0.206376     1.323191      0.623045   
2     0.166269      1.021835     -0.119982     1.005159     -0.831738   
3    -0.400197      0.567782     -1.581803     0.417112      0.188023   
4    -1.443269     -0.001080      0.804195     0.480510     -0.660761   

   Diff_Goals_2  
0     -2.723280  
1      2.463258  
2      0.147251  
3      2.328377  
4     -0.248114

Fiasco answered 2/1, 2015 at 22:56 Comment(4)

join does what you want I think except it aligns on index which may or may not be what you want though – Fiasco 2/1, 2015 at 23:3

I actually used this : Log=Log.append(Summary) Log.columns=col_list However, the values don't seem to correspond to the correct column name now... it's as if the titles are in the right position but the data isn't.. – Pierian 2/1, 2015 at 23:30

@AlexisPerez I see your problem, join seems to preserve the column name and data order – Fiasco 2/1, 2015 at 23:34

@AlexisPerez I think something subtle is happening here as normally reordering the column order would also reorder the data but this is not happening, it's as if the NaN columns are causing some confusion in the sorting of the columns – Fiasco 2/1, 2015 at 23:37

Actually, I found "advanced indexing" to work quite well

df2=df.ix[:,'order of columns']

Pierian answered 3/1, 2015 at 0:18 Comment(2)

I was going to try this as this is another way or re-ordering the columns but didn't try due to the other methods failing, it is odd why direct assignment produces funny results though IMO – Fiasco 3/1, 2015 at 0:29

@alexisperez I would just add (for clarity) that your answer's dummy text would be better if it was changed to df2=df.ix[:, ['order','of','columns']]. I.e., supplying a list instead of a string. Still up-voted your answer for brevity though. – Duwe 28/11, 2016 at 17:56

As I see it, the order is lost, but when appending, the original data should have the correct order. To maintain that, assuming Dataframe 'alldata' and dataframe to be appended data 'newdata', appending and keeping column order as in 'alldata' would be:

alldata.append(newdata)[list(alldata)]

(I encountered this problem with named date fields, where 'Month' would be sorted between 'Minute' and 'Second')

Poacher answered 8/9, 2016 at 10:22 Comment(0)

Recommended topics

Hot tags