I'm trying to use the takeSample() function in Spark. Its parameters are the data, the number of samples to take, and the seed, but I don't want to use the seed: I want a different result every time, and I can't figure out how to do that. I tried using System.nanoTime as the seed value, but it gave an error, I think because the data types didn't match. Is there another function similar to takeSample() that can be used without the seed? Or is there another way to call takeSample() so that I get a different output every time?
takeSample() function in Spark
System.nanoTime is of type Long, while the seed expected by takeSample is of type Int. Hence, takeSample(..., System.nanoTime.toInt) should work.
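To see what the conversion does without needing Spark, here is a minimal sketch in plain Scala. The object name NanoTimeSeed and the helper toIntSeed are made up for illustration; the only real calls are System.nanoTime and .toInt.

```scala
object NanoTimeSeed {
  // Hypothetical helper: narrows a Long to an Int by keeping its low 32 bits,
  // which is what .toInt does on a Long.
  def toIntSeed(value: Long): Int = value.toInt

  def main(args: Array[String]): Unit = {
    val nanos: Long = System.nanoTime
    val seed: Int = toIntSeed(nanos)
    // The high bits are discarded, so the seed may be negative; it still
    // varies from run to run, which is all a throwaway seed needs.
    println(s"nanoTime = $nanos, seed = $seed")
  }
}
```

The narrowing loses information, but for a seed that only has to differ between runs that should not matter.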
System.nanoTime returns a Long, whereas takeSample expects an Int. You can pass scala.util.Random.nextInt as the seed value to takeSample instead.
As of Spark version 1.0.0, the seed parameter is optional. See https://issues.apache.org/jira/browse/SPARK-1438.
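A rough sketch of what that looks like, assuming Spark 1.0.0 or later with spark-core on the classpath. The object name, the sample helper, the local master, and the app name are placeholders for illustration, not anything from the question.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TakeSampleWithoutSeed {
  // Hypothetical helper: draws `n` elements without replacement from 1 to 100.
  // No seed is passed, so Spark chooses a random one on each call.
  def sample(sc: SparkContext, n: Int): Array[Int] =
    sc.parallelize(1 to 100).takeSample(withReplacement = false, num = n)

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions made for this sketch.
    val conf = new SparkConf().setMaster("local[*]").setAppName("take-sample-sketch")
    val sc = new SparkContext(conf)
    try {
      // Two calls without a seed will usually return different samples.
      println(sample(sc, 5).mkString(", "))
      println(sample(sc, 5).mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

If you later want reproducible output, you can still pass an explicit seed as the third argument.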
.toInt should be preferred over .intValue – Goodrow