This question will seek multiple approaches LET/LAMBDA
VBA UDF
and Power Query Function
, so there will be no single right answer, but a solicitation of approaches to be used as references.
Scott raised a question here about unpivoting a complex table that contains blocks of data instead of individual data points. The basic idea is illustrated in this table:
Jan | Jan | Jan | Jan | Feb | Feb | Feb | Feb | Mar | Mar | Mar | Mar | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
State | City | Pressure | Temp | Humidity | CO2 | Pressure | Temp | Humidity | CO2 | Pressure | Temp | Humidity | CO2 |
Georgia | Atlanta | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
Massachusetts | Boston | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 |
Texas | Dallas | 97 | 98 | 99 | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 |
Louisiana | Jonesboro | 145 | 146 | 147 | 148 | 149 | 150 | 151 | 152 | 153 | 154 | 155 | 156 |
California | San Francisco | 193 | 194 | 195 | 196 | 197 | 198 | 199 | 200 | 201 | 202 | 203 | 204 |
The data for each city is in blocks of four columns containing Pressure, Temperature, Humidity and CO2 (or PTHC). We want to unpivot the PTHC blocks of values according to their month by the State and City. Here is the desired output:
State | City | month | Pressure | Temp | Humidity | CO2 |
---|---|---|---|---|---|---|
Georgia | Atlanta | Jan | 1 | 2 | 3 | 4 |
Georgia | Atlanta | Feb | 5 | 6 | 7 | 8 |
Georgia | Atlanta | Mar | 9 | 10 | 11 | 12 |
Massachusetts | Boston | Jan | 49 | 50 | 51 | 52 |
Massachusetts | Boston | Feb | 53 | 54 | 55 | 56 |
Massachusetts | Boston | Mar | 57 | 58 | 59 | 60 |
Texas | Dallas | Jan | 97 | 98 | 99 | 100 |
Texas | Dallas | Feb | 101 | 102 | 103 | 104 |
Texas | Dallas | Mar | 105 | 106 | 107 | 108 |
Louisiana | Jonesboro | Jan | 145 | 146 | 147 | 148 |
Louisiana | Jonesboro | Feb | 149 | 150 | 151 | 152 |
Louisiana | Jonesboro | Mar | 153 | 154 | 155 | 156 |
California | San Francisco | Jan | 193 | 194 | 195 | 196 |
California | San Francisco | Feb | 197 | 198 | 199 | 200 |
California | San Francisco | Mar | 201 | 202 | 203 | 204 |
The order of the rows is not important, so long as they are complete - i.e. the output could be sorted by month, city, state, ... it does not matter. The output does not need to be a dynamic array that spills - i.e. in the case of a Power Query function, it clearly would not be.
It can be assumed that the PTHC block is always consistent, i.e.
- it never skips a field value, e.g. PTHC PTC PTHC...
- it never changes order, e.g. PTHC PCHT
The months are always presented in groups that are equally sized to the block (in this example, 4, so there will be four Jan columns, Feb columns, etc.). e.g. if there are 7 months, there will be 7 PTHC blocks or 28 columns of data.
However, the pattern of months can also be interleaved such that the months will increment and the PTHC block will be grouped (i.e. PPP TTT HHH CCC) like this:
Jan | Feb | Mar | Jan | Feb | Mar | Jan | Feb | Mar | Jan | Feb | Mar | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
State | City | Pressure | Pressure | Pressure | Temp | Temp | Temp | Humidity | Humidity | Humidity | CO2 | CO2 | CO2 |
The UDF would also have to accommodate more or less than 4 fields inside the block. The use of Months and PTHC are just illustrations, the attribute that represents months in this example will always be a single row (although a multi-row approach would be an interesting question - but a new and separate one). The attribute that represents the field values PTHC will also be a single row.
I will propose a LET function based on Scott's question, but there certainly can be better approaches and both VBA and Power Query have their own strengths. The objective is to create a collection of working approaches.
#"Removed Columns"
for most applications, but bc %Hum is a different data type, we cannot stop there - so your final repivoting is difficult and necessary. Nice job. As I see how you did it, I think hardcoding is inevitable i.e. PQ function = impossible. – Amulet