Excel - Aggregate dynamically values in a Table using array formulas (MAP, BYROW, CHOOSECOLS, TEXTJOIN, LAMBDA, FILTER, HSTACK) in a single Cell
Asked Answered
P

6

6

I would like to aggregate, merge, compact some values as shown below

The input is a table and the output is dynamic array formulas on a single Cell and has a goal to aggregate other columns based on unique values of the first column

My input is a Table (named TAB)

Name Service Type
02HPP002NZ 1:Basic unix
02HPP002NZ 1:Basic unix
02HPP002NZ 3: Relevant win
02HPP009O4 3: Relevant nt
02HPP001L7 2: Standard unix
02HPP001L7 2: Standard linux
02HPP009O4 1:Basic nt

Desired output using array formulas on a single cell E2

Name Service Type
02HPP002NZ 1:Basic / 3: Relevant unix / win
02HPP009O4 3: Relevant / 1:Basic nt
02HPP001L7 2: Standard unix / linux

The formula I used is bellow:

=LET(
    Names; TAB[Name];
    uNames; UNIQUE(Names);
    aggfunc; LAMBDA(colIdx; MAP(uNames; LAMBDA(x; BYCOL(CHOOSECOLS(TAB;colIdx); LAMBDA(col;TEXTJOIN(" / "; TRUE; UNIQUE(FILTER(col; Names=x))))))));
    HSTACK(
        uNames;
        aggfunc(2);
        aggfunc(3)
    )
)

As you notice, I hardcode "aggfunc(x)" (x is the number of the column of my table to return). I would like to use a single function aggfunc(numCols) and numCols is an array of the number of columns. And the formula should be like the following but unfortunately it doesn't work. It gives an error "#CALC!"

=LET(
    Names; TAB[Name];
    uNames; UNIQUE(Names);
    numCols; SEQUENCE(COLUMNS(TAB[#Data]));
    aggfunc; LAMBDA(colIdx; MAP(uNames; LAMBDA(x; BYCOL(CHOOSECOLS(TAB;colIdx); LAMBDA(col;TEXTJOIN(" / "; TRUE; UNIQUE(FILTER(col; Names=x))))))));
    HSTACK(
        uNames;
        aggfunc(numCols)
    )
)

An alternative formula that use BYROW instead of MAP

=LET(
     a; TAB[Name];
     b; TAB[Type];
     c; TAB[Service];
     ua; UNIQUE(a);
     HSTACK(
       ua;
       BYROW(ua; LAMBDA(x; TEXTJOIN(" / ";TRUE;UNIQUE(FILTER(b;a=x)))));
       BYROW(ua; LAMBDA(x; TEXTJOIN(" / ";TRUE;UNIQUE(FILTER(c;a=x)))))
     )
 )

I have also tried to use PowerQuery but using group by and then add custom columns but as I have a multiple columns, it is very painful to achieve my goal. If it may be interesting for someone, below the steps:

  1. Select any cell in the table
  2. Select the menu Data then From Table/Range (in the left side)
  3. It opens the PowerQuery Editor, you need now to select the column Name (the first column)
  4. In the menu Transform, select Group By. A new window is open, and I fill the fields as shown bellow
  • New column name: AGGNAME
  • Operation: All Rows
  • Click OK
  1. Now you have a table with Two columns with titles Name and AGGNAME (Table)
  2. In the menu Add Column, select Custom Column. A new window is open and I enter the following values
  • New column name: Type
  • Custom column formula: =Text.Combine(List.Transform(List.Distinct([AGGNAME][#"Type"]), Text.From), " / ")
  1. You have now 3 columns "Name", "AGGNAME" and "Type".
  2. You need to repeat the Step 6 and change the name Type (in the formula) according to the name of your column.
  3. Once you have inserted all needed new columns, you have to delete the column AGGNAME
  4. Now go to the menu Home and click on Close & Load. And it will create a new table with the same output as I need.

Using PowerQuery becomes a painful if the names of the input table change or if the Table has multiple columns.

My version of Office 365 does not yet include the function "GROUPBY" (Version 2406 Build 16.0.17726.20206) 64-bit.

Petula answered 14/9, 2024 at 15:51 Comment(0)
W
3

One approach could be using MAKEARRAY:

=LET(uNames,UNIQUE(Table1[Name]),
     MAKEARRAY(ROWS(uNames),COLUMNS(Table1),
        LAMBDA(rw,cl,
               ARRAYTOTEXT(UNIQUE(FILTER(INDEX(Table1,,cl),Table1[Name]=INDEX(uNames,rw)))))))

PS I used Tables. Change these into TAB for your reference. I'm unable to rename table names using the app version of Excel

Also if you want the separator to be / change ARRAYTOTEXT to TEXTJOIN.

Walczak answered 15/9, 2024 at 12:45 Comment(4)
I changed ARRAYTOTEXT to TEXTJOIN as you suggested and it works fine also, thank you @Walczak :)Petula
Thanks. I see in the comments you wanted to use a named lambda and be able to type column index numbers as arguments. You could easily adopt this and use CHOOSECOLSWalczak
Yes, I am currently working on to make it more flexible. Personally, among all the responses, your formula is the most clear and straightforward. ThanksPetula
Thanks (apart from GROUPBY, which is not available to you yet)Walczak
G
3

Try this formula:

=LET(
    n,Tab[Name],nUnique,UNIQUE(n),
    agg, LAMBDA(colIdx,n_,TEXTJOIN("/",TRUE,UNIQUE(FILTER(CHOOSECOLS(Tab,colIdx),n=n_)))),
    DROP(
        REDUCE("",nUnique,LAMBDA(r,n_,VSTACK(r,HSTACK(n_,agg(2,n_),agg(3,n_)))))
        ,1)
    )
Goldfarb answered 14/9, 2024 at 16:12 Comment(3)
Thank you @ike, your formula is similar to mine. I've tried to add a new column in the input table and unfortunately, the new column is not appeared as output. I mean, with my formula, I have to add manually "aggfunc(4)" and with yours this ";agg(4;n_)"Petula
yes, you would have to add ;add(4;n_)Goldfarb
I’m sorry to down vote your suggestion as it doesn’t answer my goal. The formula should return dynamically all the input columns and not hardcode them manually as the table may have many more columns. Thank you for your time.Petula
L
3
=LET(
    aggFunc; LAMBDA(tbl; criteriaSpec; retCols;
        LET(
            header; TAKE(tbl; 1);
            data; DROP(tbl; 1);
            criteriaSpec; TEXTSPLIT(criteriaSpec; "=");
            title; INDEX(criteriaSpec; 1; 1);
            toFind; INDEX(criteriaSpec; 1; 2);
            titleCol; XMATCH(title; header);
            criteria; IF(
                toFind = "";
                SEQUENCE(ROWS(data); ; TRUE; 0);
                CHOOSECOLS(data; titleCol) = toFind
            );
            uFound; UNIQUE(FILTER(CHOOSECOLS(data; titleCol); criteria));
            aggCol; LAMBDA(col; TEXTJOIN(" / "; TRUE; UNIQUE(col)));
            aggForTitle; LAMBDA(acc; titleVal;
                VSTACK(
                    acc;
                    LET(
                        filtered; FILTER(
                            CHOOSECOLS(data; retCols);
                            CHOOSECOLS(data; titleCol) = titleVal
                        );
                        HSTACK(titleVal; BYCOL(filtered; aggCol))
                    )
                )
            );
            REDUCE(
                CHOOSECOLS(header; titleCol; retCols);
                uFound;
                aggForTitle
            )
        )
    );
    aggFunc(TAB_34[#All]; "Name="; {2; 3})
)

Updated result

Labradorite answered 14/9, 2024 at 18:3 Comment(6)
Hello @nkalvi, it works fine. I can add a new column and it appeared in the output. Thanks a lot for your time!Petula
You're most welcome! and thank you for providing data and steps - it made it easier to start experimenting. I added a minor modification, please see whether it is helpful.Labradorite
I need to make it more flexible and using a "named formula" . but the idea is let's say I have a formula named "Compact" and it has parameters like this Compact(TAB;"Type=nt";"Service";...). I modified your first code like this (I don't yet analyse your second one) =LAMBDA(TAB;Criteria;retCol1; LET( Title;CHOOSECOLS(TEXTSPLIT(Criteria;"=");1); iTitle; MATCH(Title; TAB[#Headers];0); ToFind;CHOOSECOLS(TEXTSPLIT(Criteria;"=");2); Names;CHOOSECOLS(TAB;iTitle); uNames; IF(ToFind="";UNIQUE(Names);ToFind); numCols;{2;3}; ... (1rst part)Petula
``` aggfunc; LAMBDA(acc;name; VSTACK(acc;HSTACK(BYCOL(CHOOSECOLS(TAB; numCols); LAMBDA(col;TEXTJOIN(" / ";TRUE;UNIQUE(SORT(FILTER(col; Names = name)))))))) ); HSTACK( VSTACK(CHOOSECOLS(TAB[#Headers]; iTitle); uNames); REDUCE(CHOOSECOLS(TAB[#Headers]; numCols); uNames; aggfunc) ) ))(TAB;"Type=win";"Environment")``` As you may notice the first parameter is the name of the Table, the second is a criteria, the third is the title of the column to return (at the moment I hardcoded them numCols;{2;3} but I'm working on it.Petula
Posted a quick draft - please let me know whether it is helpful.Labradorite
Fantastic, it works fine. I updated your first code and now you can chose any column to filter or to make unique, and chose the aggregated column(s) to return. You're welcome if you want to optimize it. Many thanks for your time and have a good night :)Petula
I
3

Retrieve Unique Filtered Columns

  • The result is the same as in the OP plus an extra unique (Names) column to the left i.e. if the goal is to stack all columns, you don't need to stack the unique one anymore.
  • The first line contains the parameters, then follow two helper functions and the unique names. Finally, your aggregate function takes like a single parameter expecting an array of the columns to be returned.

Down the Rabbit Hole

=LET(data,TAB,uniques,TAB[Name],cols,SEQUENCE(COLUMNS(data)),delim," / ",
    GetCell,LAMBDA(cdata,cfilter,delim,BYCOL(cdata,LAMBDA(c,
        TEXTJOIN(delim,,UNIQUE(FILTER(c,cfilter)))))),
    GetCol,LAMBDA(data,udata,cdata,delim,
        MAP(udata,LAMBDA(m,GetCell(cdata,data=m,delim)))),
    u,UNIQUE(uniques),
    aggfunc,LAMBDA(cols,DROP(REDUCE("",cols,LAMBDA(cc,c,
        HSTACK(cc,GetCol(uniques,u,CHOOSECOLS(data,c),delim)))),,1)),
    HSTACK(u,aggfunc(cols)))

Screenshot of the Data and the Result

Inoperable answered 14/9, 2024 at 19:40 Comment(2)
thank you @VBasic2008, I slightly modified your code with this cols;SEQUENCE(COLUMNS(data)-1;;2) to avoid repeating 2 times the first column and it works as expected. ThanksPetula
You can achieve the same by leaving the sequence as is and using aggfunc(cols)) instead of the last line.Inoperable
E
3

Looks like you can use:

screenshot

Formula in E2:

=GROUPBY(A2:A8,B2:C8,LAMBDA(x,TEXTJOIN(" / ",,UNIQUE(x))),,0)

Notice how both service and type get grouped.

Enwrap answered 15/9, 2024 at 10:11 Comment(3)
Hello @JvdV, I forgot to mention that unfortunately my current version of Office 365 does not yet include this function (Version 2406 Build 16.0.17726.20206) 64-bit. The reason why I am here to create a new one. Thank you for your time :)Petula
You posted yesterday on Excel BI an 'instance' using the ARRAYTOTEXT ETA. Do you have any idea why it wouldn't work? I have the GROUPBY function (this works), I'm on the Insider channel, and I have the most recent updates.Inoperable
Quite possibly the eta isn't yet rolled out on current channel @VBasic2008. It should work of you'd write it inside an actual LAMBDAEnwrap
W
3

One approach could be using MAKEARRAY:

=LET(uNames,UNIQUE(Table1[Name]),
     MAKEARRAY(ROWS(uNames),COLUMNS(Table1),
        LAMBDA(rw,cl,
               ARRAYTOTEXT(UNIQUE(FILTER(INDEX(Table1,,cl),Table1[Name]=INDEX(uNames,rw)))))))

PS I used Tables. Change these into TAB for your reference. I'm unable to rename table names using the app version of Excel

Also if you want the separator to be / change ARRAYTOTEXT to TEXTJOIN.

Walczak answered 15/9, 2024 at 12:45 Comment(4)
I changed ARRAYTOTEXT to TEXTJOIN as you suggested and it works fine also, thank you @Walczak :)Petula
Thanks. I see in the comments you wanted to use a named lambda and be able to type column index numbers as arguments. You could easily adopt this and use CHOOSECOLSWalczak
Yes, I am currently working on to make it more flexible. Personally, among all the responses, your formula is the most clear and straightforward. ThanksPetula
Thanks (apart from GROUPBY, which is not available to you yet)Walczak
P
0

Based on @nkalvi suggestion, I improve the formula to make it more flexible (I added more columns

Flexible version

The formula on K1 cell is:

    =LAMBDA(TAB;Criteria;retCol1;
LET(
    Title;CHOOSECOLS(TEXTSPLIT(Criteria;"=");1);
    iTitle; MATCH(Title; TAB[#Headers];0);
    ToFind;CHOOSECOLS(TEXTSPLIT(Criteria;"=");2);
    Names;CHOOSECOLS(TAB;iTitle);
    uNames; IF(ToFind="";UNIQUE(Names);ToFind);
    numCols; MATCH(retCol1; TAB[#Headers];0);
    aggfunc; LAMBDA(acc;name;
        VSTACK(
            acc;
            HSTACK(
                BYCOL(
                    CHOOSECOLS(TAB; numCols);
                    LAMBDA(col;TEXTJOIN(" / ";TRUE;UNIQUE(SORT(FILTER(col; Names = name))))))))
    );
    HSTACK(
        VSTACK(CHOOSECOLS(TAB[#Headers]; iTitle); uNames);
        REDUCE(CHOOSECOLS(TAB[#Headers]; numCols); uNames; aggfunc)
    )
))(TAB;I2;G2#)

Now, here are some steps to create a "Named formula":

  1. Go to the menu "Formulas", click on "Name Manager" then click the button "New..."
  2. Insert "COMPACT" as Name
  3. Copy all codes above except (TAB;I2;G2#) and paste it on "Refers to:" field + OK + Close

Now if you put the formula =COMPACT(TAB;I2;G2#) on the cell K2 (the result is shown in the screenshot.

Explanation of the parameters:

  • 1st: name of the Table
  • 2nd: Criteria ([Column_Name]=[Value_to_find], if [Value_to_find] is omitted then it will return all unique values related to the Column_Name
  • 3rd: is a range of names of the columns to return

PS: in the formula K9, I use =COMPACT(TAB;I8;G8#) but I need to wrap all the formula to avoid duplicate, so it becomes =UNIQUE(COMPACT(TAB;I8;G8#))

Thank you to everyone and if you have an idea to optimize the formula, you'll be welcome :)

Petula answered 14/9, 2024 at 21:2 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.