Extract date and time from pandas timestamp
Asked Answered
M

5

43

I have a timestamp column where the timestamp is in the following format

2016-06-16T21:35:17.098+01:00

I want to extract date and time from it. I have done the following:

import datetime as dt

df['timestamp'] = df['timestamp'].apply(lambda x : pd.to_datetime(str(x)))

df['dates'] = df['timestamp'].dt.date

This worked for a while. But suddenly it does not.

If I again do df['dates'] = df['timestamp'].dt.date I get the following error

Can only use .dt accessor with datetimelike values

Luckily, I have saved the data frame with dates in the csv but I now want to create another column time in the format 23:00:00.051

EDIT

From the raw data file (15 million samples), the timestamp column looks like following (first 5 samples):

            timestamp

0           2016-06-13T00:00:00.051+01:00
1           2016-06-13T00:00:00.718+01:00
2           2016-06-13T00:00:00.985+01:00
3           2016-06-13T00:00:02.431+01:00
4           2016-06-13T00:00:02.737+01:00

After the following command

df['timestamp'] = df['timestamp'].apply(lambda x : pd.to_datetime(str(x)))

the timestamp column looks like with dtype as dtype: datetime64[ns]

0    2016-06-12 23:00:00.051
1    2016-06-12 23:00:00.718
2    2016-06-12 23:00:00.985
3    2016-06-12 23:00:02.431
4    2016-06-12 23:00:02.737

Then finally

df['dates'] = df['timestamp'].dt.date

0           2016-06-12
1           2016-06-12
2           2016-06-12
3           2016-06-12
4           2016-06-12

EDIT 2

Found the mistake. I had cleaned the data and saved the data frame in a csv file, so I don't have to do the cleaning again. When I read the csv, the timestamp dtype changes to object. Now how do I fix this?

Melanite answered 23/9, 2016 at 13:33 Comment(12)
well it means you have some duff values so you can force these duff values to NaT: df['timestamp'] = pd.to_datetime(df['timestamp'], errors='coerce') then you can get rid of these values using dropna and then you can call dt.date as beforeRattoon
df.timestamp.isnull().sum() returns 0Melanite
Sorry but unless you post your raw data and code with errors then this becomes a hypothetical posturing exercise which wastes timeRattoon
If checking with summing null values, you assume that incorrect values are null/interpreted as null. I'd doubt thatIng
related question: #33365555Outrank
I have posted the code to reproduce. I will try your method too now.Melanite
@chintans OT, instead of df['timestamp'].apply(lambda x : pd.to_datetime(str(x))), consider pd.to_datetime(df['timestamp']).Phuongphycology
Is it because you have imported datetime as dt and then calling the dt accessor attribute resulting in it being overshadowed and thereby resulting in the error?Marilynnmarimba
That seems to be the mistake. I think @Ami 's solution will work. It just takes a lot of time to apply it to 15 million samples :)Melanite
OK, found the mistake. I had cleaned the data and saved the data frame in a csv file, so I dont have to do the cleaning again. When I read the csv, the timestamp dtype changes to object. Now how do I fix this?Melanite
@chintans look at the converters argument for pd.read_csv...Satisfied
@chintans To speed up the conversion, specify the format of your datetime strings --- see this question.Habited
G
55

Do this first:

df['time'] = pd.to_datetime(df['timestamp'])

Before you do your extraction as usual:

df['dates'] = df['time'].dt.date
Giraudoux answered 15/6, 2018 at 15:0 Comment(0)
P
24

If date is in string form then:

import datetime

# this line converts the string object in Timestamp object
df['DateTime'] = [datetime.datetime.strptime(d, "%Y-%m-%d %H:%M") for d in df["DateTime"]]

# extracting date from timestamp
df['Date'] = [datetime.datetime.date(d) for d in df['DateTime']] 

# extracting time from timestamp
df['Time'] = [datetime.datetime.time(d) for d in df['DateTime']] 

If the object is already in the Timestamp format then skip the first line of code.

%Y-%m-%d %H:%M this means your timestamp object must be in the form like 2016-05-16 12:35:00.

Pigeontoed answered 16/11, 2017 at 12:41 Comment(0)
N
3

You can use pandas built-in to_datetime object for this

df['timestamp'] = pd.to_datetime(df['timestamp'])
df['date'] = df['timestamp'].dt.date
df['time'] = df['timestamp'].dt.time
Nevus answered 25/5, 2022 at 14:26 Comment(0)
J
1

When you are importing your csv, then use parse_dates parameter of pandas.read_csv(). For example, to import a column utc_datetime as datetime:

parse_dates = ['utc_datetime']
df = pandas.read_csv('file.csv', parse_dates=parse_dates)

To extract date from timestamp, use numpy instead of pandas:

df['utc_date'] = numpy.array(df['utc_datetime'].values, dtype='datetime64[D]')

Numpy datetime operations are significantly faster than pandas datetime operations.

Jacqulynjactation answered 21/6, 2021 at 18:50 Comment(0)
V
0

This are simple lines that works for me right now

# To extract the date
df["date"] = pd.to_datetime(df["timestamp"]).dt.strftime("%d-%m-%Y")

# Extract Time
df["time"] = pd.to_datetime(df["timestamp"]).dt.time

Source DataCamp.

Violet answered 24/5, 2024 at 19:31 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.