How can I elegantly get a specific number (>1) of distinct, random elements from a collection?
This is something that I think looks more or less nice, but is not as efficient as it can be:
yourCollection asSet asOrderedCollection shuffled first: numberOfElements
Consider the following code snippet
sample: anInteger from: aCollection using: aGenerator
| sample |
sample := Set new: anInteger.
[sample size = anInteger]
whileFalse: [ | element |
element := aCollection atRandom: aGenerator.
sample add: element].
^sample asArray
Some remarks
Explicit Generator: It explicitly uses a given generator, i.e., an instance of
Random
, which I've calledaGenerator
. For mathematical reasons, if you are getting samples for your application, all of them should use the very same generator across your program. Also this will give you an additional advantage: save and later restore theseed
and you will be able to reproduce a previous "random" behavior of your system, which is good for testing.No check for availability: The code doesn't check that it is possible to get the desired sample, which would be the case if
aCollection
doesn't have at leastanInteger
different elements.Classless code: The method should go to some class.
For example:
Random >> sample: anInteger from: aCollection
| sample |
sample := Set new: anInteger.
[sample size = anInteger]
whileFalse: [ | element |
element := aCollection atRandom: self.
sample add: element].
^sample asArray
UPDATE
Here is another approach:
Random >> remove: anInteger from: aCollection
| sample |
sample := OrderedCollection new: anInteger.
anInteger timesRepeat: [| index element |
index := aCollection size atRandom: self.
element := aCollection removeAt: index.
sample add: element].
^sample
Comment
It usually happens that when we want to sample without repetition we also want to remove the elements from the collection as we randomly pick them. In these cases what often happens is that the collection is known to have no repetitions.
#removeIndex:
- it's meant to be private. Using the public method #removeAt:
instead would have the added benefit of answering the removed element, meaning you can get rid of the extra element
temporary variable (i.e. just do sample add: (aCollection removeAt: index)
). Both those "remove" methods are on OrderedCollection
, so this won't work with other types of collections. –
Dunstan removeIndex:
was private. I've modified the method according to your remark. And yes, the method is only meant to work with OrderCollection
(which should be fine with most practical applications.) –
Hydrogenate This is something that I think looks more or less nice, but is not as efficient as it can be:
yourCollection asSet asOrderedCollection shuffled first: numberOfElements
#atAll:
to pick those randomised indices out of a set (you'd still need #asSet
if you want them distinct). –
Dunstan © 2022 - 2024 — McMap. All rights reserved.
#atAll:
to pick those randomised indices out of a set (you'd still need#asSet
if you want them distinct). – Dunstan