pandas - change time object to a float?
B

4

8

I have a field for call length in my raw data which is listed as an object, such as: 00:10:30 meaning 10 minutes and 30 seconds. How can I convert this to a number like 10.50?

I keep getting errors. If I convert the fields with pd.datetime, then I can't do an .astype('float'). In Excel, I just multiply the timestamp by 1440 and it outputs the number value I want to work with (Timestamp * 24 * 60).

Bobbie answered 5/2, 2015 at 20:15 Comment(1)
Or, better yet, use a timedelta.Orchestrion
O
11

You can use time deltas to do this more directly:

In [11]: s = pd.Series(["00:10:30"])

In [12]: s = pd.to_timedelta(s)

In [13]: s
Out[13]:
0   00:10:30
dtype: timedelta64[ns]

In [14]: s / pd.offsets.Minute(1)
Out[14]:
0    10.5
dtype: float64
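
The same idea applied to a whole column might look like the sketch below. The column name call_length and the sample values are made up for illustration, and errors="coerce" is just one possible way to handle malformed strings (they become NaT, then NaN).

```python
import pandas as pd

# Hypothetical sample data; "call_length" is an assumed column name.
df = pd.DataFrame({"call_length": ["00:10:30", "01:02:03", "not a time"]})

# Parse the strings as durations; unparseable values become NaT.
durations = pd.to_timedelta(df["call_length"], errors="coerce")

# Dividing by a one-minute Timedelta yields float minutes (NaN for NaT).
df["minutes"] = durations / pd.Timedelta(minutes=1)
print(df)
```

Dividing by pd.Timedelta(hours=1) or pd.Timedelta(seconds=1) instead would give hours or seconds.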
Orchestrion answered 5/2, 2015 at 20:30 Comment(2)
If you do this on a column of timestamp strings that hasn't been converted to datetime, you get ValueError: only leading negative signs are allowed (I'm using pandas version 0.25.3). I tried converting to datetime first and got /opt/anaconda3/envs/incident_finder/lib/python3.7/site-packages/pandas/util/_decorators.py:208: FutureWarning: Passing datetime64-dtype data to TimedeltaIndex is deprecated, will raise a TypeError in a future version return func(*args, **kwargs)Sanfred
In this case, you can just do s = pd.to_timedelta(pd.to_datetime(s))Glaab
E
2

I would convert the string to a datetime and then use the dt accessor to access the components of the time and generate your minutes column:

In [16]:

df = pd.DataFrame({'time':['00:10:30']})
df['time'] = pd.to_datetime(df['time'])
df['minutes'] = df['time'].dt.hour * 60 + df['time'].dt.minute + df['time'].dt.second/60
df
Out[16]:
                 time  minutes
0 2015-02-05 00:10:30     10.5
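
If the date that to_datetime attaches is a concern, one possible variation (a sketch, not the only way) is to pass an explicit format and subtract midnight of whatever date was attached, so the result does not depend on that date at all:

```python
import pandas as pd

df = pd.DataFrame({"time": ["00:10:30"]})

# An explicit format avoids guessing how the strings should be parsed.
parsed = pd.to_datetime(df["time"], format="%H:%M:%S")

# Subtract midnight of the same day to get the elapsed time as a timedelta.
elapsed = parsed - parsed.dt.normalize()

df["minutes"] = elapsed.dt.total_seconds() / 60
print(df)
```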
Electrotherapeutics answered 5/2, 2015 at 20:28 Comment(3)
I really hate this behaviour of to_datetime (to take today's date) - IMO this is almost never what you want. :(Orchestrion
@AndyHayden it would be preferable to pass a default date or just have no date at all; as it is, you have to perform a second pass on the data to fix the dateElectrotherapeutics
I think it's just too forgiving here, to_datetime just never wants to raise (no matter how bad the data, it finds a way!!)Orchestrion
S
1

There is probably a better way of doing this, but this will work.

from datetime import datetime

my_time = datetime.strptime('00:10:30', '%H:%M:%S')
zero_time = datetime.strptime('00:00:00', '%H:%M:%S')
x = my_time - zero_time  # a datetime.timedelta
x.seconds  # duration in seconds; divide by 60 for minutes

Out[25]: 630
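
To get decimal minutes as asked in the question, a small extension of the snippet above could use total_seconds(), which, unlike .seconds, also counts any whole days in the difference:

```python
from datetime import datetime

my_time = datetime.strptime("00:10:30", "%H:%M:%S")
zero_time = datetime.strptime("00:00:00", "%H:%M:%S")

# total_seconds() returns the full duration as a float.
minutes = (my_time - zero_time).total_seconds() / 60
print(minutes)
```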
Smolensk answered 5/2, 2015 at 20:25 Comment(0)
B
0

To easily convert timedelta values to a numeric format, you can use pd.to_numeric, which gives the underlying nanosecond counts:

s = pd.to_timedelta(["00:10:30"])  # interprets the string correctly to 'timedelta64[ns]'
pd.to_numeric(s) / 1E9 / 60  # converts to minutes
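
For an actual pd.Series, a variation that avoids the manual nanosecond arithmetic might be the .dt.total_seconds() accessor. This is a sketch with made-up sample values:

```python
import pandas as pd

# Hypothetical sample data.
s = pd.Series(["00:10:30", "00:01:15"])

# Parse to timedelta64[ns], then convert seconds to minutes.
minutes = pd.to_timedelta(s).dt.total_seconds() / 60
print(minutes)
```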
Breakfast answered 24/7 at 12:16 Comment(0)