How to create all possible pair combinations without duplicates in Google Sheets?
Asked Answered
P

7

1

How to perform iteration over excel/google sheets cells to get pairwise combinations?

"string1"

"string2"
"string3"
...
"string10"

I'm looking at writing a function that can iterate over these strings to create the following:

"string1, string2" 
"string1, string 3" 
...
"string 1, string 10" 
"string 2, string 3" 
...
"string 2, string 10" 
"string3, string 4" 
... ... 
"string9 string10".

Is this possible in google sheets?

Popular answered 21/12, 2017 at 9:54 Comment(4)
Probably. Other than that you haven't given enough information to figure out what you want. What have you tried so far? Where are "string 1", etc stored? Do you always want just combination of two strings returned? Do the strings include the quote marks?... plus other questions I haven't thought of.Liberality
related: #5045637Hourigan
@Dave, please edit the question to make it more understandable, this may be a good question if you say why you need it, and give an example with 3 strings instead of 10. Also, I suggest renaming it to: Find all possible pairs in list. Vote up!Hourigan
I have also voted it up because it was a good challenge but question could be improved.Greeting
H
2

It is a hard task for native functions. Try a script and use it as a custom function:

function getTournament(teams_from_range)

    {
      // teams_from_range -- 2D Array  
      var teams = [];
      // convert to list
      teams_from_range.forEach(function(row) { row.forEach(function(cell) { teams.push(cell); } ); } );
      return getTournament_(teams);
    }
    
    
    function getTournament_(teams)
    {
      var start = 0;
      var l = teams.length;
      var result = [], game = [];
      
      // loop each value
      for (var i = 0; i < l; i++)
      {
        // loop each value minus current
        start++;
        for (var ii = start; ii < l; ii++)
        {
          game = []
          game.push(teams[i]);
          game.push(teams[ii]);  
          result.push(game);
        }  
      }
      
      return result;
    
    }

Usage:

=getTournament(A1:A10)

Hourigan answered 21/12, 2017 at 13:28 Comment(0)
G
2

I have to agree with @Max that it is difficult with native functions, or at least long-winded, but it is possible in Google Sheets

=ArrayFormula(query({if((row(A:A)<=counta(A:A)^2)*(int((row(A:A)-1)/counta(A:A))<mod((row(A:A)-1),counta(A:A))),
vlookup(int((row(A:A)-1)/counta(A:A)),{row(A:A)-1,A:A},2)&vlookup(mod((row(A:A)-1),counta(A:A)),{row(A:A)-1,A:A},2),"")},"select Col1 where Col1<>''"))

enter image description here

Note 1 - method

Using a list of 10 strings as an example.

(1) Add a column to number the strings from 0 to 9 using

{row(A:A)-1,A:A}

(2) Use the row number in a VLOOKUP to get the first string of the pair with

vlookup(int((row(A:A)-1)/counta(A:A)),{row(A:A)-1,A:A},2)

Row number-1 int((row(A:A)-1)/counta(A:A))  String

0            0                              String1

1            0                              String1

...

9            0                              String1

10           1                              String2

...

20           2                              String3

...

99           9                              String10

(3) Use the row number in a VLOOKUP to get the second string of the pair with

vlookup(mod((row(A:A)-1),counta(A:A)),{row(A:A)-1,A:A},2)

Row number-1  mod((row(A:A)-1),counta(A:A)) String

0             0                             String1

1             1                             String2

2             2                             String3

...

9             9                             String10

10            0                             String1

11            1                             String2

...

99            9                             String10

Note that the list will include unwanted pairs like String1String1 and String2String1.

(4) Set unwanted pairs to "" with if condition

if((row(A:A)<=counta(A:A)^2)*(int((row(A:A)-1)/counta(A:A))

Note 1 Using a filter to remove unwanted pairs as suggested by @Max Makhrov would be shorter.

(5) Use Query to remove blank rows.

Note 2 - limitation on number of rows

Because redundant pairs are generated then removed, this method requires N^2 rows to be in the sheet where N is the number of strings rather than N*(N-1)/2 which is the number of distinct pairs of N objects. Conversely, the maximum number of strings s which can be processed this way for a sheet with N rows is floor(sqrt(N)), e.g. for a sheet with 1,000 rows s=floor(sqrt(1000))=31.

Note 3 - a possible way to avoid generating redundant pairs

One way of visualising what I have tried to do is as follows, where the array elements represent output rows (A:A) and the row and column headers indicate corresponding values which are used as lookups to get pairs like (string 1, string 1), (string 1 string 2) etc.

enter image description here

It is fairly easy to do the mapping from output rows to lookup values using integer division and the MOD function as above.

What we would really like to do is to get non-redundant pairs like this

enter image description here

but then how would you map from output rows 1-10 to pairs of lookup values 1-5 ?

I hope to show that this is possible with a bit of maths providing (at least in principle) a way to get the N(N-1)/2 non-redundant pairs straight away without first generating all N^2 pairs.

The count S of cells in rows 1 to r of the upper triangular part above is the total count N(N-1)/2 minus the count in the rows below it (N-r)(N-r-1)/2

enter image description here

This can be re-arranged as follows

enter image description here

This is a quadratic in r so we can solve it using the regular formula

enter image description here

to give

enter image description here

So the row is given by the ceiling of the above formula for r.

The number (say T) at the end of row r is given by substituting the ceiling of r back into the second equation above

enter image description here

and finally the column corresponding to S is given by

enter image description here

Now define a named range N whose value is

=counta(A:A)

and a named range M whose value is

=2*N-1

Then finally the formula you need to select stringA (the row r of the matrix) is

=iferror(ArrayFormula(vlookup(ceiling((M-sqrt(M^2-8*row(A:A)))/2,1),{row(A:A),A:A},2)),"")

and the formula you need to select stringB (the column c of the matrix) is

=iferror(ArrayFormula(vlookup(N+row(A:A)-(M*CEILING((M-SQRT(M^2-8*row(A:A)))/2,1)-CEILING((M-SQRT(M^2-8*row(A:A)))/2,1)^2)/2,{row(A:A),A:A},2)),"")

enter image description here

where columns D and E are just included for testing purposes.

Then it only remains to combine the two formulas into one column if desired.

Greeting answered 21/12, 2017 at 20:21 Comment(3)
Nice! I spent an hour to understand. You may add 2 notes. Note 1 how you did it: found all possible combinations and excluded unnecessary rows. Note 2 about the rows needed. The solution works if the sheet has N^2 rows, so for 100 values you need 10K rows! My shorter version of a formula: =query(filter(vlookup(int((row(A:A)-1)/counta(A:A)),{row(A:A)-1,A:A},2)&vlookup(mod((row(A:A)-1),counta(A:A)),{row(A:A)-1,A:A},2),int((row(A:A)-1)/counta(A:A))<mod((row(A:A)-1),counta(A:A))),"select Col1 where Col1<>''")Hourigan
Sorry it was lacking explanation! Yes I think your use of filter is a better way to do it. It is true that it needs N^2 rows instead of N*(N-1)/2 or 10K instead of 4500 in the case of 100 values. I would also add that if you were doing 'home and away' games in a tournament it would be N^2 instead of N*(N-1) so not very different. Will add to answer laterGreeting
Correction, N*(N-1)/2 would be 4950 for 100 values.Greeting
R
2
=ARRAYFORMULA(SPLIT(SORT(
 TRANSPOSE(SPLIT(CONCATENATE(REPT(UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))&","&
 TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))), (
 UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))<=
 TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))))*
 REGEXMATCH(CONCATENATE(","&SUBSTITUTE(TEXTJOIN(",",1,A2:A),",",",,")&","&CHAR(9)),"(,"&
 UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))&",[^\t]*,"&
 TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),","))))&",)|(,"&
 TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),","))))&",[^\t]*,"&
 UNIQUE(TRANSPOSE(SPLIT(JOIN(",",TEXTJOIN(",",1,A2:A)),",")))&",)"))&CHAR(9)),CHAR(9)))),","))

0

for more see: https://mcmap.net/q/332275/-generate-all-possible-combinations-for-columns-cross-join-or-cartesian-product

Repine answered 23/2, 2019 at 2:44 Comment(3)
can you do this for combinations of 3?Sech
that makes duplicates - I want YOUR SOLUTION adapted to groups of 3 not 2... that page has MANY solutions many of which are not as efficient as what you have here...Sech
@OMNIADesignandMarketing try: https://mcmap.net/q/332283/-how-can-i-permutate-a-list-with-itself-3-times-over-or-make-combinations-of-3-from-a-single-list-without-duplicates-in-each-combo-on-google-sheetsRepine
T
2

Given the advantage of newer functions like REDUCE, it is possible to loop. This is basically a implementation of two loops mentioned here. The second one - one minus the first one.

=QUERY(LAMBDA(mrg,REDUCE({"Round Robin"&CHAR(10)&"(no order)",HYPERLINK("https://stackoverflow.com/a/73901481","-TheMaster")},SEQUENCE(ROWS(mrg)-1,1,0),LAMBDA(a_,c_,{a_;LAMBDA(rg,REDUCE({"",""},OFFSET(rg,1,0,ROWS(rg)-1),LAMBDA(a,c,{a;{INDEX(rg,1),c}})))(OFFSET(mrg,c_,0,ROWS(mrg)-c_))})))(A1:A5),"where Col1 is not null")

where A1:A5 is the list of 5 players.

Round Robin (no order/rotate) -TheMaster
Player 1 Player 2
Player 1 Player 3
Player 1 Player 4
Player 1 Player 5
Player 2 Player 3
Player 2 Player 4
Player 2 Player 5
Player 3 Player 4
Player 3 Player 5
Player 4 Player 5
Thomas answered 29/9, 2022 at 20:51 Comment(0)
R
1
=QUERY(ARRAYFORMULA(SPLIT(
 TRANSPOSE(SPLIT(REPT(CONCATENATE(A2:A&CHAR(9)), COUNTA(A2:A)),  CHAR(9)))& " "&
 TRANSPOSE(SPLIT(CONCATENATE(REPT(A2:A&CHAR(9),  COUNTA(A2:A))), CHAR(9))), " ")),
 "where Col1<>Col2 order by Col1", 0)

for more see: https://mcmap.net/q/332275/-generate-all-possible-combinations-for-columns-cross-join-or-cartesian-product

Repine answered 23/2, 2019 at 2:43 Comment(0)
E
0

The question seems to describe a special case of a Cartesian product of two sets where the sets are identical while removing results that have the same value in both tuple values.

How about a humble three-step solution:

all combos:

=arrayformula( split( 
  tocol(A2:A5 & "→" & transpose(A2:A5)), 
  "→" 
) )

filter out rows where the two strings are the same:

=filter(B2:C, B2:B <> C2:C)

join with comma:

=arrayformula( 
  if( 
    len(D2:D), 
    D2:D & ", " & E2:E, 
    iferror(ø) 
  ) 
)
source data all combos filter join Cartesian solution
string1 string1 string1 string1 string2 string1, string2 string1, string2
string2 string1 string2 string1 string3 string1, string3 string1, string3
string3 string1 string3 string1 string4 string1, string4 string1, string4
string4 string1 string4 string2 string1 string2, string1 string2, string1
string2 string1 string2 string3 string2, string3 string2, string3
string2 string2 string2 string4 string2, string4 string2, string4
string2 string3 string3 string1 string3, string1 string3, string1
string2 string4 string3 string2 string3, string2 string3, string2
string3 string1 string3 string4 string3, string4 string3, string4
string3 string2 string4 string1 string4, string1 string4, string1
string3 string3 string4 string2 string4, string2 string4, string2
string3 string4 string4 string3 string4, string3 string4, string3
string4 string1
string4 string2
string4 string3
string4 string4

The same can be done in one step with a recursive Cartesian that has an if() in the inner reduce():

Cartesian solution:

=let( 
  data, A2:A, 
  column, unique(filter(data, len(data))), 
  table, { column, column }, 
  blank, iferror(1/0), 
  first_, lambda(array, tocol(choosecols(array, 1), true)), 
  rest_, lambda(n, choosecols(table, sequence(1, columns(table) - n, n + 1))), 
  wrap_, lambda(array, wrapCount, wraprows(tocol(array, true), wrapCount)), 

  cartesian_, lambda(a, b, wrap_( 
    byrow(a, lambda(row, 
      reduce(blank, sequence(rows(b)), lambda(acc, i, 
        if( 
          row = chooserows(b, i), 
          acc, 
          { acc, row, chooserows(b, i) } 
        ) 
      ) ) 
    ) ), 
    columns(a) + columns(b) 
  ) ), 

  iterate_, lambda( 
    self, a, b, if(iserror(b), a, 
      self(self, cartesian_(a, first_(b)), rest_(columns(a) + 1)) 
    ) 
  ), 

  result, iterate_(iterate_, first_(table), rest_(1)), 
  byrow(result, lambda(row, join(", ", row))) 
)
Evacuate answered 2/5, 2023 at 10:55 Comment(0)
M
0

With the items to combine in named range "items":

=let(x,tocol(let(i,filter(items,items<>""),t,transpose(filter(items,items<>"")),arrayformula(if(match(i,items,0)>=match(t,items,0),,i&t)))),filter(x,x<>""))
Malley answered 10/8, 2023 at 9:51 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.