I'm trying to figure out if there is a good way to manage units in my pandas data. For example, I have a DataFrame
that looks like this:
   length (m)  width (m)  thickness (cm)
0         1.2        3.4             5.6
1         7.8        9.0             1.2
2         3.4        5.6             7.8
Currently, the measurement units are encoded in column names. Downsides include:
- column selection is awkward -- df['width (m)'] vs. df['width']
- things will likely break if the units of my source data change
If I wanted to strip the units out of the column names, is there somewhere else that the information could be stored?
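For concreteness, here is a minimal sketch of what stripping the units might look like. It assumes every unit sits in parentheses at the end of the column name. split_units is a hypothetical helper (not a pandas function), and DataFrame.attrs is documented as experimental, so this is only one possible place to keep the mapping, not a settled answer.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "length (m)": [1.2, 7.8, 3.4],
    "width (m)": [3.4, 9.0, 5.6],
    "thickness (cm)": [5.6, 1.2, 7.8],
})

def split_units(frame):
    """Hypothetical helper: move a trailing '(unit)' out of each column name."""
    pattern = re.compile(r"^(?P<name>.*?)\s*\((?P<unit>[^)]*)\)$")
    names, units = [], {}
    for col in frame.columns:
        match = pattern.match(col)
        if match:
            names.append(match["name"])
            units[match["name"]] = match["unit"]
        else:
            names.append(col)  # no unit suffix: keep the column name as-is
    out = frame.copy()
    out.columns = names
    out.attrs["units"] = units  # experimental pandas feature; an assumption here
    return out

clean = split_units(df)
print(clean.columns.tolist())
print(clean.attrs["units"])
```

With this, clean['width'] works for selection and the units live in clean.attrs['units'], but nothing here checks or converts units during arithmetic.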
df.units = pd.Series({'length': 'm', 'width': 'm', 'thickness': 'cm'})) -- This may be dangerous though. – Caban
Table and units module, you can move over from a DataFrame to an Astropy Table (atab = astropy.table.Table.from_pandas(df)), and then give each column a unit (e.g. atab['length'].unit = astropy.units.m). I can post an MWE if you are interested; it looks too messy as a comment with lots of code. – Horripilate
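On the df.units idea from the first comment, here is a small sketch of one way it can go wrong, as far as I understand pandas' behavior: an ad hoc attribute lives only on that one Python object, so it is not carried over to objects derived from it. The column names and values are just the ones from the question.

```python
import warnings
import pandas as pd

df = pd.DataFrame({"length": [1.2, 7.8, 3.4], "width": [3.4, 9.0, 5.6]})

# pandas emits a UserWarning when a list-like is assigned to a new attribute
# name, since it could be mistaken for column creation; silence it here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    df.units = pd.Series({"length": "m", "width": "m"})

has_units_before = hasattr(df, "units")        # set on this object only
has_units_after = hasattr(df.copy(), "units")  # a copy is a new object

print(has_units_before, has_units_after)
```

So any operation that returns a new DataFrame would likely drop the units silently, which is probably what "dangerous" refers to.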