Just wondering, but does your Spark code above work? I thought that Spark won't allow a WHERE on partition keys (a and b in your case), since it uses them under the hood (see the last answer to this question): Spark Datastax Java API Select statements

In any case, with the Cassandra Spark connector you are allowed to stack your WHERE clauses, and an IN can be specified with a List&lt;String&gt;.
List<String> valuesList = new ArrayList<String>();
valuesList.add("value2");
valuesList.add("value3");

sc.cassandraTable("test", "cf")
    .where("column1 = ?", "value1")
    .where("column2 IN ?", valuesList)
    .keyBy(new Function<MyCFClass, String>() {
        public String call(MyCFClass _myCF) throws Exception {
            return _myCF.getId();
        }
    });
Note that the normal rules of using IN with Cassandra/CQL still apply here.
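For instance, the connector ultimately issues a CQL IN clause built from the list you pass in, so restrictions such as IN only being valid on the last clustering column (or a fully restricted partition key) still apply. As a rough, hypothetical illustration of what that bound clause looks like (the class and method names here are made up for the sketch, not part of the connector API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

public class InClauseSketch {
    // Hypothetical helper: renders the CQL fragment the connector
    // conceptually produces when binding a List into "column IN ?".
    static String renderIn(String column, List<String> values) {
        StringJoiner joined = new StringJoiner(", ", column + " IN (", ")");
        for (String v : values) {
            joined.add("'" + v + "'");
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        List<String> valuesList = new ArrayList<String>();
        valuesList.add("value2");
        valuesList.add("value3");
        // Prints: column2 IN ('value2', 'value3')
        System.out.println(renderIn("column2", valuesList));
    }
}
```

Whatever that rendered clause would be rejected by cqlsh, it will be rejected here as well.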
Range queries function in a similar manner:
sc.cassandraTable("test", "person")
    .where("age > ?", 15)
    .where("age < ?", 20)
    .keyBy(new Function<Person, String>() {
        public String call(Person _person) throws Exception {
            return _person.getPersonid();
        }
    });