I have a pandas dataframe that looks like below.
Key Name Val1 Val2 Timestamp
101 A 10 1 01-10-2019 00:20:21
102 A 12 2 01-10-2019 00:20:21
103 B 10 1 01-10-2019 00:20:26
104 C 20 2 01-10-2019 14:40:45
105 B 21 3 02-10-2019 09:04:06
106 D 24 3 02-10-2019 09:04:12
107 A 24 3 02-10-2019 09:04:14
108 E 32 2 02-10-2019 09:04:20
109 A 10 1 02-10-2019 09:04:22
110 B 10 1 02-10-2019 10:40:49
Starting from the earliest timestamp, that is, '01-10-2019 00:20:21', I need to create time bins of 10 seconds each and assign same group number to all the rows having timestamp fitting in a time bin. The output should look as below.
Key Name Val1 Val2 Timestamp Group
101 A 10 1 01-10-2019 00:20:21 1
102 A 12 2 01-10-2019 00:20:21 1
103 B 10 1 01-10-2019 00:20:26 1
104 C 20 2 01-10-2019 14:40:45 2
105 B 21 3 02-10-2019 09:04:06 3
106 D 24 3 02-10-2019 09:04:12 4
107 A 24 3 02-10-2019 09:04:14 4
108 E 32 2 02-10-2019 09:04:20 4
109 A 10 1 02-10-2019 09:04:22 5
110 B 10 1 02-10-2019 10:40:49 6
First time bin: '01-10-2019 00:20:21' to '01-10-2019 00:20:30', Next time bin: '01-10-2019 00:20:31' to '01-10-2019 00:20:40', Next time bin: '01-10-2019 00:20:41' to '01-10-2019 00:20:50', Next time bin: '01-10-2019 00:20:51' to '01-10-2019 00:21:00', Next time bin: '01-10-2019 00:21:01' to '01-10-2019 00:21:10' and so on.. Based on these time bins, 'Group' is assigned for each row. It is not mandatory to have consecutive group numbers(If a time bin is not present, it's ok to skip that group number).
I have generated this using for loop, but it takes lot of time if data is spread across months. Please let me know if this can be done as a pandas operation using a single line of code. Thanks.
pandas.DataFrame
has acut()
function used for binning. Give this a try. – Ugric