I have a value, 38142, that I need to convert into a date using Python. If I enter this number in Excel, right-click the cell and choose Format Cells, the value is displayed as 04/06/2004, and I need the same result in Python. How can I achieve this?
The offset in Excel is the number of days since 1900/01/01, with 1
being the first of January 1900, so add the number of days as a timedelta to 1899/12/31:
from datetime import datetime, timedelta

def from_excel_ordinal(ordinal: float, _epoch0=datetime(1899, 12, 31)) -> datetime:
    if ordinal >= 60:
        ordinal -= 1  # Excel leap year bug, 1900 is not a leap year!
    return (_epoch0 + timedelta(days=ordinal)).replace(microsecond=0)
You have to adjust the ordinal by one day for any date after 1900/02/28; Excel has inherited a leap year bug from Lotus 1-2-3 and treats 1900 as a leap year. The code above returns datetime(1900, 2, 28, 0, 0) for both 59 and 60 to correct for this, with fractional values in the range [59.0, 61.0) all being a time between 00:00:00.0 and 23:59:59.999999 on that day.
The above also supports serials with a fraction to represent time, but since Excel doesn't support microseconds those are dropped.
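As a quick check of the behaviour described above, here is a minimal sketch that repeats the same logic so it runs on its own and prints a few boundary values. The day/month/year format string is an assumption based on how the question's Excel displays the date.

```python
from datetime import datetime, timedelta

def from_excel_ordinal(ordinal: float) -> datetime:
    # Same logic as the function above, repeated so this snippet is self-contained.
    if ordinal >= 60:
        ordinal -= 1  # skip Excel's nonexistent 1900-02-29
    return (datetime(1899, 12, 31) + timedelta(days=ordinal)).replace(microsecond=0)

# Assumption: the question's Excel shows dates as day/month/year.
print(from_excel_ordinal(38142).strftime("%d/%m/%Y"))  # 04/06/2004

print(from_excel_ordinal(1))   # 1900-01-01 00:00:00
print(from_excel_ordinal(59))  # 1900-02-28 00:00:00
print(from_excel_ordinal(60))  # 1900-02-28 00:00:00 (Excel itself shows 1900-02-29)
print(from_excel_ordinal(61))  # 1900-03-01 00:00:00
```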
datetime(1899, 12, 31) + timedelta(ordinal - (ordinal > 59)) – Boehmer
The _epoch0 parameter is there to cache the value as a local variable, nothing more. This helps avoid having to create it for each call, or to have to look up a global (slightly slower). – Kamilah

from datetime import datetime, timedelta
def from_excel_ordinal(ordinal, epoch=datetime(1900, 1, 1)):
    # Adapted from above, thanks to @Martijn Pieters
    if ordinal >= 60:  # ">= 60" rather than "> 59", so times on serial 59 are left alone
        ordinal -= 1  # Excel leap year bug, 1900 is not a leap year!
    inDays = int(ordinal)
    frac = ordinal - inDays
    inSecs = int(round(frac * 86400.0))  # day fraction rounded to whole seconds
    return epoch + timedelta(days=inDays - 1, seconds=inSecs)  # epoch is day 1

excelDT = 42548.75001  # Float representation of 27/06/2016 6:00:01 PM in Excel format
pyDT = from_excel_ordinal(excelDT)
The above answer is fine for just a date value, but here I extend the above solution to include the time and return a datetime value as well.
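To see how the fractional part of a serial maps to a time of day, here is a minimal standard-library sketch. The helper name `split_serial` is hypothetical, and it assumes serials of 61 or more (dates from 1900-03-01 onward) in the 1900 date system, so that 1899-12-30 can serve as day zero. It also shows that `timedelta` accepts a float number of days directly, in which case the sub-second part is kept unless you strip it.

```python
from datetime import datetime, timedelta

# Assumption: serial >= 61, so Excel's nonexistent 1900-02-29 is already
# accounted for and 1899-12-30 acts as day zero.
EPOCH = datetime(1899, 12, 30)

def split_serial(serial: float) -> datetime:
    """Hypothetical helper: whole days plus the day fraction rounded to seconds."""
    days = int(serial)
    seconds = int(round((serial - days) * 86400))
    return EPOCH + timedelta(days=days, seconds=seconds)

print(split_serial(42548.75001))  # 2016-06-27 18:00:01
print(split_serial(42548.5))      # 2016-06-27 12:00:00

# Passing the float straight to timedelta keeps the fraction of a second,
# so the result is truncated (not rounded) once microseconds are dropped.
raw = EPOCH + timedelta(days=42548.75001)
print(raw.replace(microsecond=0))  # 2016-06-27 18:00:00
```

Rounding to whole seconds and truncating microseconds can differ by one second, as the first and last lines show; which one matches Excel's display depends on the cell format.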
timedelta() does this for you when days is a floating point value. – Kamilah

I would recommend the following:
import pandas as pd

def convert_excel_time(excel_time):
    # 1899-12-30 is day 0 for serials >= 61; it absorbs Excel's fake 1900-02-29
    return pd.to_datetime('1899-12-30') + pd.to_timedelta(excel_time, 'D')
Or
import datetime

def xldate_to_datetime(xldate):
    # Day 0 is 1899-12-30 for serials >= 61, because Excel counts a nonexistent 1900-02-29
    temp = datetime.datetime(1899, 12, 30)
    delta = datetime.timedelta(days=xldate)
    return temp + delta
This is adapted from https://gist.github.com/oag335/9959241
I came to this question when trying to do the same as above, but for entire columns within a df. I made this function, which did it for me:
import pandas as pd
from datetime import datetime, timedelta
import copy as cp

def xlDateConv(df, *cols):
    fin = cp.deepcopy(df)  # work on a copy so the caller's df is untouched
    for col in cols:
        tempDt = []
        for i in range(len(fin[col])):
            # Day 0 is 1899-12-30 for serials >= 61 (Excel's fake 1900-02-29)
            tempDate = datetime(1899, 12, 30)
            delta = timedelta(float(fin[col].iloc[i]))
            tempDt.append(pd.to_datetime(tempDate + delta))
        fin[col] = tempDt
    return fin
Note that you need to pass each column name as its own quoted string argument, which can most likely be improved (a list of columns as input, for instance). Also, it returns a copy of the original df (it doesn't change the original).
Btw, partly inspired by this (https://gist.github.com/oag335/9959241).
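As one possible improvement along those lines, the per-row loop can be replaced by a vectorized call that takes a list of columns. This is only a sketch, not the author's method: the function name `xl_date_conv` and the sample column names are hypothetical, it relies on `pd.to_datetime` with `unit="D"` and `origin="1899-12-30"`, and it assumes numeric serials of 61 or more (dates from 1900-03-01 onward) in the 1900 date system.

```python
import pandas as pd

def xl_date_conv(df: pd.DataFrame, cols: list) -> pd.DataFrame:
    """Return a copy of df with the given Excel-serial columns converted to datetimes."""
    out = df.copy()
    for col in cols:
        # origin 1899-12-30 absorbs both the 1-based count and Excel's fake 1900-02-29
        out[col] = pd.to_datetime(out[col], unit="D", origin="1899-12-30")
    return out

# Hypothetical sample data
df = pd.DataFrame({"opened": [38142, 42548], "closed": [38143, 42549]})
converted = xl_date_conv(df, ["opened", "closed"])
print(converted["opened"].iloc[0])  # 2004-06-04 00:00:00
```

Like the function above, this leaves the original df unchanged and returns a converted copy.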
If you are working with Pandas this could be useful:
import xlrd
import datetime as dt

def from_excel_datetime(x):
    return dt.datetime(*xlrd.xldate_as_tuple(x, datemode=0))

df['date'] = df.excel_date.map(from_excel_datetime)
If the dates seem to be about 4 years off, the workbook probably uses the 1904 date system, so try datemode=1 instead.
:param datemode: 0: 1900-based, 1: 1904-based.
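To illustrate why picking the wrong date system shifts results by about four years, here is a minimal standard-library sketch of the two epochs. The helper name `serial_to_datetime` and its `date1904` flag are hypothetical (they are not part of xlrd), and the 1900 branch assumes serials of 61 or more.

```python
from datetime import datetime, timedelta

# Day-zero epochs for the two systems. The 1900 value is only valid for
# serials >= 61 (assumption), because of Excel's nonexistent 1900-02-29.
EPOCH_1900 = datetime(1899, 12, 30)
EPOCH_1904 = datetime(1904, 1, 1)

def serial_to_datetime(serial: float, date1904: bool = False) -> datetime:
    """Hypothetical helper mirroring the idea behind datemode 0 and 1."""
    epoch = EPOCH_1904 if date1904 else EPOCH_1900
    return epoch + timedelta(days=serial)

# The two epochs are 1462 days apart, so the same calendar day is stored
# as serials that differ by 1462.
print((EPOCH_1904 - EPOCH_1900).days)                    # 1462
print(serial_to_datetime(38142))                         # 2004-06-04 00:00:00
print(serial_to_datetime(38142 - 1462, date1904=True))   # 2004-06-04 00:00:00
```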
I had the same problem and then I used this function (adapted from https://gist.github.com/OmarArain/9959241):
import datetime

def xldate_to_datetime(xldate):
    xldate = int(xldate)  # keep only the date part of the serial
    # Day 0 is 1899-12-30 for serials >= 61, because Excel counts a nonexistent 1900-02-29
    temp = datetime.datetime(1899, 12, 30)
    delta = datetime.timedelta(days=xldate)
    return temp + delta
And then I applied it to my dataframe:
df['column_date'] = df['column_date'].apply(xldate_to_datetime)
This is going to be the simplest solution yet: openpyxl has a built-in just for that:
from openpyxl.utils.datetime import from_excel

excel_ordinal = 38142
python_datetime = from_excel(excel_ordinal)
print(python_datetime, type(python_datetime))  # 2004-06-04 00:00:00 <class 'datetime.datetime'>
It takes care of the 1900-02-29 bug by itself.
Just make sure you understand which date system you're working with – read more here.
date.fromordinal() does. But that number is missing a digit then. – Kamilah