How to convert a given ordinal number (from Excel) to a date

7

34

I have the value 38142 and need to convert it to a date using Python. If I enter this number in Excel, right-click the cell, and choose Format Cells, the value is displayed as 04/06/2004. I need the same result in Python. How can I achieve this?

Borras answered 1/4, 2015 at 9:24 Comment(5)
That's a weird ordinal; are you sure 04/06/2004 is correct? If the value 38142 stands for days then that'd be an offset from either 1993/12/25 or 1993/10/27 depending on what you interpret as the month. – Kamilah
Formula to convert date to number suggests it should be a number of days since 1900/01/01, which is what date.fromordinal() does. But that number is missing a digit then. – Kamilah
My file has this value. I don't know whether it is an ordinal or not; my client says it is, and told me that "if you want to find the actual date, just do Format Cells in Excel for the given value", which is how I get this date. @MartijnPieters – Borras
Yeah, it is indeed an ordinal, but there's a bug in Excel which caused me to discount my initial theory. – Kamilah
Related, older question: How to convert a python datetime.datetime to excel serial date number – Shahjahanpur
49

The offset in Excel is the number of days since 1900/01/01, with 1 being the first of January 1900, so add the number of days as a timedelta to 1899/12/31:

from datetime import datetime, timedelta

def from_excel_ordinal(ordinal: float, _epoch0=datetime(1899, 12, 31)) -> datetime:
    if ordinal >= 60:
        ordinal -= 1  # Excel leap year bug, 1900 is not a leap year!
    return (_epoch0 + timedelta(days=ordinal)).replace(microsecond=0)

You have to adjust the ordinal by one day for any date after 1900/02/28; Excel inherited a leap year bug from Lotus 1-2-3 and treats 1900 as a leap year. The code above returns datetime(1900, 2, 28, 0, 0) for both 59 and 60 to correct for this, with fractional values in the range [59.0, 61.0) all mapping to a time between 00:00:00.0 and 23:59:59.999999 on that day.

The above also supports serials with a fraction to represent time, but since Excel doesn't support microseconds those are dropped.
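As a quick check, the sketch below repeats the function and runs it on the value from the question, on the serials around the leap year bug, and on one fractional serial. The sample inputs are picked for illustration only.

```python
from datetime import datetime, timedelta

def from_excel_ordinal(ordinal: float, _epoch0=datetime(1899, 12, 31)) -> datetime:
    if ordinal >= 60:
        ordinal -= 1  # skip Excel's nonexistent 1900-02-29
    return (_epoch0 + timedelta(days=ordinal)).replace(microsecond=0)

print(from_excel_ordinal(38142))     # 2004-06-04 00:00:00
print(from_excel_ordinal(59))        # 1900-02-28 00:00:00
print(from_excel_ordinal(60))        # 1900-02-28 00:00:00
print(from_excel_ordinal(61))        # 1900-03-01 00:00:00
print(from_excel_ordinal(38142.75))  # 2004-06-04 18:00:00
```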

Kamilah answered 1/4, 2015 at 9:41 Comment(6)
@Krish: the bug is popularized by Joel Spolsky: My First BillG Review – Boehmer
Are you sure the epoch is not December 31, 1899? datetime(1899, 12, 31) + timedelta(ordinal - (ordinal > 59)) – Boehmer
@J.F.Sebastian I stuck to the documentation for Excel here; it makes little difference here to subtract one relative to 1900-01-01. – Kamilah
makes no sense to have _epoch as a parameter if we hard code the ordinal check for being > 59. – Antonomasia
@FinanceGuyThatCantCode: The _epoch parameter is there to cache the value as a local variable, nothing more. This helps avoid having to create it for each call, or to have to look up a global (slightly slower). – Kamilah
@MartijnPieters - I see - hence the leading _ in the _epoch parameter. I have not done that before. I guess it is a known PEP thingy as a function parameter then to be thought of sort of like a static local variable in C++ or something. – Antonomasia
7
from datetime import datetime, timedelta

def from_excel_ordinal(ordinal, epoch=datetime(1900, 1, 1)):
    # Adapted from above, thanks to @Martijn Pieters 

    if ordinal >= 60:  # 59.x is still a time on 1900-02-28, so only shift from 60 on
        ordinal -= 1  # Excel leap year bug, 1900 is not a leap year!
    inDays = int(ordinal)
    frac = ordinal - inDays
    inSecs = int(round(frac * 86400.0))

    return epoch + timedelta(days=inDays - 1, seconds=inSecs) # epoch is day 1

excelDT = 42548.75001           # Float representation of 27/06/2016  6:00:01 PM in Excel format  
pyDT = from_excel_ordinal(excelDT)

The answer above is fine for a plain date value; here I extend it to include the time of day and return a full datetime value.

Endpaper answered 27/10, 2016 at 23:25 Comment(1)
There is no need to split out the days and the seconds; timedelta() does this for you when days is a floating point value. – Kamilah
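To illustrate the comment above: timedelta accepts a fractional days argument and normalizes it into whole days and seconds on its own. A small sketch, with a sample value chosen for illustration:

```python
from datetime import timedelta

# 0.75 of a day is 18 hours; timedelta splits this out itself.
delta = timedelta(days=42548.75)
print(delta.days, delta.seconds)  # 42548 64800
```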
2

I would recommend the following:

import pandas as pd

def convert_excel_time(excel_time):

    return pd.to_datetime('1900-01-01') + pd.to_timedelta(excel_time,'D')

Or

import datetime

def xldate_to_datetime(xldate):
    temp = datetime.datetime(1900, 1, 1)
    delta = datetime.timedelta(days=xldate)
    return temp+delta

The second function is taken from https://gist.github.com/oag335/9959241

Acrylyl answered 11/4, 2019 at 20:30 Comment(2)
xldate_to_datetime(44000) gives 2020-06-20, whereas the correct answer is 2020-06-18. – Taddeusz
@PoornaPrudhvi is correct; the base date should be 1899-12-30. One day offset because we should be adding to Dec 31 and another day offset b/c of the leap year bug mentioned in the accepted answer. – Haggai
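Following the two comments above, here is a minimal sketch of the same function with the base date moved to 1899-12-30. It assumes the 1900 date system and serials of 61 or more (dates from 1900-03-01 onward), because the fixed two-day shift does not hold for earlier serials.

```python
import datetime

def xldate_to_datetime(xldate):
    # Assumption: 1900 date system, serial >= 61.
    base = datetime.datetime(1899, 12, 30)
    return base + datetime.timedelta(days=xldate)

print(xldate_to_datetime(44000))  # 2020-06-18 00:00:00
print(xldate_to_datetime(38142))  # 2004-06-04 00:00:00
```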
1

I came to this question while trying to do the same thing, but for entire columns of a DataFrame. I wrote this function, which did it for me:

import pandas as pd
from datetime import datetime, timedelta
import copy as cp

def xlDateConv(df, *cols):
    fin = cp.deepcopy(df)  # work on a copy so the caller's df is untouched
    for col in cols:
        tempDt = []
        for value in fin[col]:
            # Base is 1899-12-30, not 1900-01-01: serial 1 is 1900-01-01 and
            # Excel also counts a nonexistent 1900-02-29 (holds for serials >= 61).
            tempDate = datetime(1899, 12, 30)
            delta = timedelta(float(value))
            tempDt.append(pd.to_datetime(tempDate + delta))

        fin[col] = tempDt
    return fin

Note that each column name has to be passed as its own quoted (string) argument, which can most likely be improved (a list of columns as input, for instance). Also, the function returns a copy of the original df and does not change the original.

Btw, partly inspired by this (https://gist.github.com/oag335/9959241).
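The note above suggests taking a list of columns. One possible sketch of that, using pandas' own vectorized pd.to_datetime with the unit and origin parameters instead of a Python loop, is shown here. The function name convert_excel_columns and the sample frame are made up for this illustration, and the 1899-12-30 origin assumes the 1900 date system with serials of 61 or more.

```python
import pandas as pd

def convert_excel_columns(df, cols):
    """Return a copy of df with the given Excel serial columns as datetimes."""
    out = df.copy()
    for col in cols:
        # Assumption: 1900 date system, serials >= 61.
        out[col] = pd.to_datetime(out[col], unit="D", origin="1899-12-30")
    return out

df = pd.DataFrame({"start": [38142, 44000], "name": ["a", "b"]})
converted = convert_excel_columns(df, ["start"])
print(converted["start"].tolist())
```

The original frame keeps its numeric column, since the conversion is applied to a copy.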

Ready answered 24/5, 2019 at 13:50 Comment(1)
Thanks a lot for thisObeng
1

If you are working with pandas, this could be useful:

    import xlrd
    import datetime as dt
    
    def from_excel_datetime(x):
        return dt.datetime(*xlrd.xldate_as_tuple(x, datemode=0))
    
    df['date'] = df.excel_date.map(from_excel_datetime)

If the dates come out about four years off, the workbook may use the 1904 date system; try datemode=1.

:param datemode: 0: 1900-based, 1: 1904-based.
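For context on the two datemode values: in the 1904 date system, serial 0 is 1904-01-01, so the same date has a serial that is 1462 days smaller than in the 1900 system. The sketch below handles both systems with the standard library only. The function name from_excel_serial and the date1904 flag are invented for this illustration and are not part of xlrd.

```python
from datetime import datetime, timedelta

def from_excel_serial(serial, date1904=False):
    """Convert an Excel serial to a datetime (illustrative helper)."""
    if date1904:
        # 1904 system: serial 0 is 1904-01-01, no leap year bug to correct.
        return datetime(1904, 1, 1) + timedelta(days=serial)
    if serial >= 60:
        serial -= 1  # 1900 system: skip the nonexistent 1900-02-29
    return datetime(1899, 12, 31) + timedelta(days=serial)

print(from_excel_serial(38142))                        # 2004-06-04 00:00:00
print(from_excel_serial(38142 - 1462, date1904=True))  # 2004-06-04 00:00:00
```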

Disquiet answered 2/12, 2020 at 5:11 Comment(0)
0

I had the same problem and then I used this function: (source: https://gist.github.com/OmarArain/9959241)

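import datetime
def xldate_to_datetime(xldate):
    xldate = int(xldate)  # drops any time-of-day fraction
    # Base is 1899-12-30, not 1900-01-01: serial 1 is 1900-01-01 and Excel
    # also counts a nonexistent 1900-02-29 (holds for serials >= 61).
    temp = datetime.datetime(1899, 12, 30)
    delta = datetime.timedelta(days=xldate)
    return temp+delta

And then I applied it to my dataframe:

df['column_date'] = df['column_date'].apply(xldate_to_datetime)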
Nitrification answered 20/5, 2023 at 20:8 Comment(0)
0

This is going to be the simplest solution yet: openpyxl has a built-in just for that:

from openpyxl.utils.datetime import from_excel

excel_ordinal = 38142
python_datetime = from_excel(excel_ordinal)

print(python_datetime, type(python_datetime)) # 2004-06-04 00:00:00 <class 'datetime.datetime'>

It takes care of the 1900-02-29 bug by itself. Just make sure you understand which date system you're working with – read more here.

Reticulum answered 9/3 at 7:31 Comment(0)
