Pandas: resample a dataframe to match a DatetimeIndex of a different dataframe
Asked Answered
F

3

9

I have a two time series in separate pandas.dataframe, the first one - series1 has less entries and different start datatime from the second - series2:

index1 = pd.date_range(start='2020-06-16 23:16:00', end='2020-06-16 23:40:30', freq='1T')
series1 = pd.Series(range(len(index1)), index=index1)
index2 = pd.date_range('2020-06-16 23:15:00', end='2020-06-16 23:50:30', freq='30S')
series2 = pd.Series(range(len(index2)), index=index2)

How can I resample series2 to match the DatetimeIndex of series1?

Flanders answered 9/8, 2020 at 11:24 Comment(1)
What operation you want to do when resampling? doing a mean, get the min, max, ...?Kotta
V
8

Use reindex:

series2.reindex(series1.index)

Output:

2020-06-16 23:16:00     2
2020-06-16 23:17:00     4
2020-06-16 23:18:00     6
2020-06-16 23:19:00     8
2020-06-16 23:20:00    10
2020-06-16 23:21:00    12
2020-06-16 23:22:00    14
2020-06-16 23:23:00    16
2020-06-16 23:24:00    18
2020-06-16 23:25:00    20
2020-06-16 23:26:00    22
2020-06-16 23:27:00    24
2020-06-16 23:28:00    26
2020-06-16 23:29:00    28
2020-06-16 23:30:00    30
2020-06-16 23:31:00    32
2020-06-16 23:32:00    34
2020-06-16 23:33:00    36
2020-06-16 23:34:00    38
2020-06-16 23:35:00    40
2020-06-16 23:36:00    42
2020-06-16 23:37:00    44
2020-06-16 23:38:00    46
2020-06-16 23:39:00    48
2020-06-16 23:40:00    50
Freq: T, dtype: int64
Vernitavernoleninsk answered 12/8, 2020 at 14:55 Comment(3)
reindex is the best for this case :) +1Osteoarthritis
ValueError: cannot reindex on an axis with duplicate labelsTowery
@Tobbey, one of your series has multiple indexes. You can't use reindex with duplicate index labels.Vernitavernoleninsk
S
3

Wouldn't a simple resample yield the results are looking for?

series2.resample('T').first()

If your goal is to merge the resampled timestamp back to the first dataset, you could do that as follows:

dt_map = {}
for group_label, group_series in series2.resample('T'):
    dt_map.update({x:group_label for x in group_series.index})

# Overwrite the index
series2.index = series2.index.map(dt_map)

Note: If you want to perform an aggregate function, stick with the first option.

Swarey answered 12/8, 2020 at 15:17 Comment(0)
H
2

IIUC, this is what you need:

series2[series2.index.isin(series1.index)]
Hydrography answered 9/8, 2020 at 12:34 Comment(2)
Nice idea! But in this solution I should first make sure that the two time series are synchronized... still a good solution - thanks!Flanders
It's the same result as @scottboston. Nothing particularly wrong with this, but syntax is not as simple.Tatyanatau

© 2022 - 2024 — McMap. All rights reserved.