I am selecting all records from my Cassandra nodes based on the token ranges of my partition key.
Below is the code:

    public static synchronized List<Object[]> getTokenRanges(
            final Session session) {
        if (cluster == null) {
            cluster = session.getCluster();
        }
        Metadata metadata = cluster.getMetadata();
        return unwrapTokenRanges(metadata.getTokenRanges());
    }

    private static List<Object[]> unwrapTokenRanges(Set<TokenRange> wrappedRanges) {
        final int tokensSize = 2;
        List<Object[]> tokenRanges = new ArrayList<>();
        for (TokenRange tokenRange : wrappedRanges) {
            List<TokenRange> unwrappedTokenRangeList = tokenRange.unwrap();
            for (TokenRange unwrappedTokenRange : unwrappedTokenRangeList) {
                Object[] objects = new Object[tokensSize];
                objects[0] = unwrappedTokenRange.getStart().getValue();
                objects[1] = unwrappedTokenRange.getEnd().getValue();
                tokenRanges.add(objects);
            }
        }
        return tokenRanges;
    }

getTokenRanges gives me all the token ranges of the vnodes across all nodes.
I then use these token ranges to query Cassandra: objects[0] holds the start token of a vnode and objects[1] holds its end token.
This generates the following query:

    SELECT * FROM my_key_space.tablename WHERE token(id) > <start token number> AND token(id) <= <end token number>;

Here, id is the partition key column.
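To make the query construction concrete, here is a minimal sketch of how one (start, end] pair could be turned into the CQL text above. The class and method names are hypothetical (they are not part of my real code or of the driver), and it assumes the tokens are longs, as they are with Murmur3Partitioner.

```java
class TokenRangeQuery {
    // Hypothetical helper: builds the CQL text for one range (start, end].
    // Assumes long tokens (Murmur3Partitioner).
    static String buildRangeQuery(String keyspace, String table, String partitionKey,
                                  long startToken, long endToken) {
        return "SELECT * FROM " + keyspace + "." + table
                + " WHERE token(" + partitionKey + ") > " + startToken
                + " AND token(" + partitionKey + ") <= " + endToken + ";";
    }

    public static void main(String[] args) {
        // Example values only; real tokens come from objects[0] and objects[1].
        System.out.println(buildRangeQuery("my_key_space", "tablename", "id", -100L, 250L));
    }
}
```

With real traffic it would probably be better to prepare the statement once and bind the two tokens instead of concatenating strings; the concatenation here is only to show the shape of the query.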
In Cassandra, range queries are generally not recommended. So, will this query be performant?
From what I know, this query will only hit an individual partition range/vnode and will not span multiple ones, so there should not be any performance issue. Is this correct?
Cassandra version: 3.x
About tokenRange.unwrap(): I was thinking this call would split the wrapping range into two parts: the first from the last token to MIN_TOKEN, and the second from MIN_TOKEN to the first token. Won't this solve the edge case you talked about? So in my case, if I have 64 tokens in total across the nodes, I will get a list with 65 entries, the last 2 being the unwrapped ones. Can you please confirm? – Maurizio
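
The expectation described in this comment can be checked with a small stand-alone simulation. This is only a sketch of the behaviour as I understand it, not the driver's code: the class and method names are hypothetical, and it assumes Murmur3Partitioner, where the minimum token is Long.MIN_VALUE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UnwrapSketch {
    // Assumption: Murmur3Partitioner, whose minimum token is Long.MIN_VALUE.
    static final long MIN_TOKEN = Long.MIN_VALUE;

    // Builds the ring's ranges (start, end] from distinct tokens.
    // The range from the highest token back to the lowest one wraps around.
    static List<long[]> ringRanges(long[] tokens) {
        long[] sorted = tokens.clone();
        Arrays.sort(sorted);
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < sorted.length; i++) {
            ranges.add(new long[] {sorted[i], sorted[(i + 1) % sorted.length]});
        }
        return ranges;
    }

    // Splits a wrapping range into (start, MIN_TOKEN] and (MIN_TOKEN, end];
    // a non-wrapping range is returned unchanged.
    static List<long[]> unwrap(long[] range) {
        long start = range[0];
        long end = range[1];
        List<long[]> parts = new ArrayList<>();
        if (start > end && end != MIN_TOKEN) {
            parts.add(new long[] {start, MIN_TOKEN});
            parts.add(new long[] {MIN_TOKEN, end});
        } else {
            parts.add(range);
        }
        return parts;
    }

    static List<long[]> unwrapAll(List<long[]> ranges) {
        List<long[]> all = new ArrayList<>();
        for (long[] range : ranges) {
            all.addAll(unwrap(range));
        }
        return all;
    }

    public static void main(String[] args) {
        // 64 made-up, evenly spaced tokens, purely for illustration.
        long[] tokens = new long[64];
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = (i - 32) * 1000L;
        }
        System.out.println(unwrapAll(ringRanges(tokens)).size()); // prints 65
    }
}
```

In this sketch the two unwrapped pieces come last only because the ranges are built in sorted order; getTokenRanges() in the driver returns a Set, so the position of those two entries in the real list should not be relied on.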