Suppose I have a table consisting of entries like
ID Arrival Date Arrival City Departure Date Departure City
1 Jun 27 2015 Berlin Jun 20 2015 Paris
1 Jul 1 2015 Rome Jun 29 2015 Berlin
1 Jul 30 2015 Vienna Jul 15 2015 Rome
2 Jun 28 2015 Prague Jun 23 2015 Vienna
2 Jul 1 2015 Rome Jun 29 2015 Prague
2 Jul 30 2015 Vienna Jul 15 2015 Moscow
...
and for each ID I want to join this data on itself such that observations with subsequent Departure Date
and Arrival Date
are grouped pairwise - i.e. a departure is paired with the previous arrival for each ID.
In the example above (where the observations are sorted for convenience) the 2nd row would be appended to the 1st, the 3rd to the 2nd, the 5th to the 4th and the 6th to the 5th (thus producing 4 rows with fields ID Arrival Date Arrival City Departure Date Departure City Arrival Date2 Arrival City2 Departure Date2 Departure City2
).
There could potentially be more than three departures for each ID so a general approach is required. Also please note that there can be holes in the data where Arrival City
and Departure City
does not match - e.g. the Arrival City
of the 5th row is not the Departure City
of the 6th row but they should still be merged. In fact a major goal is to get a better view of how many holes there are in the data.
join
. Also, based on your sample data, please show how the final result should look like. – Nabsql-server-2012
orsql-server-2014
, give a look atLEAD()
analytics function. This is exactly what you're looking for. – Promontory