I have a pandas dataframe df:
ID words
1 word1
1 word2
1 word3
2 word4
2 word5
3 word6
3 word7
3 word8
3 word9
I want to produce another dataframe containing all pairs of words within each ID group. For the example above, the result would be:
ID wordA wordB
1 word1 word2
1 word1 word3
1 word2 word3
2 word4 word5
3 word6 word7
3 word6 word8
3 word6 word9
3 word7 word8
3 word7 word9
3 word8 word9
I know that I can use df.groupby('ID')['words'] to get the words within each ID.
I also know that I can use
import itertools
iterable = ['word1', 'word2', 'word3']
list(itertools.combinations(iterable, 2))
to get all possible pairwise combinations. However, I'm a little lost as to the best way to generate a resulting dataframe as shown above.
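A minimal sketch of one way the two pieces could be combined, assuming the dataframe has the columns ID and words exactly as shown above. The names rows and pairs are illustrative only, and this is not necessarily the most efficient approach for large data:

```python
import itertools

import pandas as pd

# Example data matching the dataframe shown in the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "words": ["word1", "word2", "word3", "word4", "word5",
              "word6", "word7", "word8", "word9"],
})

# Iterating over a grouped Series yields (group key, Series of words).
# For each group, emit one (ID, wordA, wordB) row per unordered pair.
rows = [
    (group_id, word_a, word_b)
    for group_id, words in df.groupby("ID")["words"]
    for word_a, word_b in itertools.combinations(words, 2)
]

# "rows" and "pairs" are hypothetical names chosen for this sketch.
pairs = pd.DataFrame(rows, columns=["ID", "wordA", "wordB"])
print(pairs)
```

Because itertools.combinations preserves input order, each pair appears in the order the words occur within the group, and a group with a single word contributes no rows.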