Is there an easy way to group columns in a Pandas DataFrame?

I am trying to use Pandas to represent motion-capture data, which has T measurements of the (x, y, z) locations of each of N markers. For example, with T=3 and N=4, the raw CSV data looks like:

T,Ax,Ay,Az,Bx,By,Bz,Cx,Cy,Cz,Dx,Dy,Dz
0,1,2,1,3,2,1,4,2,1,5,2,1
1,8,2,3,3,2,9,9,1,3,4,9,1
2,4,5,7,7,7,1,8,3,6,9,2,3

This is really simple to load into a DataFrame, and I've learned a few tricks that are easy (converting marker data to z-scores, or computing velocities, for example).

One thing I'd like to do, though, is convert the "flat" data shown above into a format that has a hierarchical index on the column (marker), so that there would be N columns at level 0 (one for each marker), and each one of those would have 3 columns at level 1 (one each for x, y, and z).

  A     B     C     D
  x y z x y z x y z x y z
0 1 2 1 3 2 1 4 2 1 5 2 1
1 8 2 3 3 2 9 9 1 3 4 9 1
2 4 5 7 7 7 1 8 3 6 9 2 3

I know how to do this by loading the flat file and then manipulating the Series objects directly, perhaps by using append or by building a new DataFrame with a manually created MultiIndex.

As a Pandas learner, it feels like there must be a way to do this with less effort, but it's hard to discover. Is there an easier way?

import pandas as pd
from io import StringIO  # Python 3; on Python 2 this was `from StringIO import StringIO`

a = pd.read_csv(StringIO('T,Ax,Ay,Az,Bx,By,Bz,Cx,Cy,Cz,Dx,Dy,Dz\n'
                         '0,1,2,1,3,2,1,4,2,1,5,2,1\n'
                         '1,8,2,3,3,2,9,9,1,3,4,9,1\n'
                         '2,4,5,7,7,7,1,8,3,6,9,2,3'))
a.set_index('T', inplace=True)  # use the time column as the row index

>> a
   Ax  Ay  Az  Bx  By  Bz  Cx  Cy  Cz  Dx  Dy  Dz
T
0   1   2   1   3   2   1   4   2   1   5   2   1
1   8   2   3   3   2   9   9   1   3   4   9   1
2   4   5   7   7   7   1   8   3   6   9   2   3

a.columns = pd.MultiIndex.from_tuples([(c[0], c[1]) for c in a.columns])

>> a
   A        B        C        D
   x  y  z  x  y  z  x  y  z  x  y  z
T
0  1  2  1  3  2  1  4  2  1  5  2  1
1  8  2  3  3  2  9  9  1  3  4  9  1
2  4  5  7  7  7  1  8  3  6  9  2  3
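The tuple approach relies on every marker name being a single character. As a rough sketch of a variant that should also cope with longer marker names, the column labels can be split with the `.str` accessor instead. It assumes the last character of each label is always the axis (x, y or z); the level names `marker` and `axis` and the variable names are made up for illustration and do not come from the question.

```python
import pandas as pd
from io import StringIO

csv_text = (
    "T,Ax,Ay,Az,Bx,By,Bz,Cx,Cy,Cz,Dx,Dy,Dz\n"
    "0,1,2,1,3,2,1,4,2,1,5,2,1\n"
    "1,8,2,3,3,2,9,9,1,3,4,9,1\n"
    "2,4,5,7,7,7,1,8,3,6,9,2,3\n"
)
df = pd.read_csv(StringIO(csv_text), index_col="T")

# Assumption: the last character of each label is the axis and
# everything before it is the marker name.
df.columns = pd.MultiIndex.from_arrays(
    [df.columns.str[:-1], df.columns.str[-1]],
    names=["marker", "axis"],  # hypothetical level names
)

# Selecting on level 0 returns the x, y, z columns of one marker.
marker_a = df["A"]

# A cross-section on level 1 returns one coordinate for every marker.
all_x = df.xs("x", axis=1, level="axis")
```

With the columns grouped this way, `df["A"]` gives a three-column frame for marker A, and `df.xs("x", axis=1, level="axis")` gives the x coordinate of all four markers, which is convenient for per-marker or per-axis computations such as the z-scores and velocities mentioned in the question.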
