Iterating through Cassandra wide row with CQL3
Asked Answered
E

4

10

How can I pull in a range of Composite columns with CQL3?

Consider the following:

CREATE TABLE Stuff (
    a int,
    b text,
    c text,
    d text,
    PRIMARY KEY (a,b,c)
);

In Cassandra what this effectively does is creates a ColumnFamily with integer rows (values of a) and with CompositeColumns composed of the values of b and c and the literal string 'd'. Of course this is all covered up by CQL3 so that we will think that we're inserting into individual database rows... but I digress.

And consider the following set of inputs:

INSERT INTO Stuff (a,b,c,d) VALUES (1,'A','P','whatever0');
INSERT INTO Stuff (a,b,c,d) VALUES (1,'A','Q','whatever1');
INSERT INTO Stuff (a,b,c,d) VALUES (1,'A','R','whatever2');
INSERT INTO Stuff (a,b,c,d) VALUES (1,'A','S','whatever3');
INSERT INTO Stuff (a,b,c,d) VALUES (1,'A','T','whatever4');
INSERT INTO Stuff (a,b,c,d) VALUES (1,'B','P','whatever5');
INSERT INTO Stuff (a,b,c,d) VALUES (1,'B','Q','whatever6');
INSERT INTO Stuff (a,b,c,d) VALUES (1,'B','R','whatever7');
INSERT INTO Stuff (a,b,c,d) VALUES (1,'B','S','whatever8');
INSERT INTO Stuff (a,b,c,d) VALUES (1,'B','T','whatever9');

In my current use case, I want to read all of the values of Stuff, n values at a time. How do I do this? Here's my current take using n=4:

SELECT * FROM Stuff WHERE a=1 LIMIT 4;

And as expected I get:

 a | b | c | d
---+---+---+-----------
 1 | A | P | whatever0
 1 | A | Q | whatever1
 1 | A | R | whatever2
 1 | A | S | whatever3

The trouble that I run into is how do I get the next 4? Here is my attempt:

SELECT * FROM Stuff WHERE a=1 AND b='A' AND c>'S' LIMIT 4;

This doesn't work because we've constrained b to equal 'A' - which is a reasonable thing to do! But I've found nothing in the CQL3 syntax that allows me to keep iterating anyway. I wish I could do something like:

SELECT * FROM Stuff WHERE a=1 AND {b,c} > {'A','S'} LIMIT 4;

How do I achieve my desired result. Namely, how do I make CQL3 return:

 a | b | c | d
---+---+---+-----------
 1 | A | T | whatever0
 1 | B | P | whatever1
 1 | B | Q | whatever2
 1 | B | R | whatever3
Expropriate answered 15/7, 2013 at 21:40 Comment(1)
Note to self: someone else who has the same problem.Expropriate
I
5

Auto paging is done https://issues.apache.org/jira/browse/CASSANDRA-4415, it's release to Cassandra 2.0.1

Isaak answered 8/11, 2013 at 0:45 Comment(1)
Just for my understanding, i have installed cassnadra 2.0.6 and using cqlsh i tried to do a select query on a CF(which has about 20K records) without giving limit , the result of my query is 10K and displays a message "Default LIMIT of 10000 was used. Specify your own LIMIT clause to get more results." .. Then where is the auto pagination, or is that something wrong am doing/understanding.Pitchford
E
4

After reading through CQL3 document I have not found a way to achieve the desired effect.

You can however fake the desired effect with a series of CQL queries. Consider that I want to page though items in the above model 4 at a time. It's easy enough to get the first 4:

SELECT * FROM a = 1 LIMIT 4;

However there is no way to get the next 4 in a single query. But I can do it piecewise. The last item from the above query is

 a | b | c | d
---+---+---+-----------
 1 | A | S | whatever3

So I can issue a query to start from here and get everything until the next value of b:

SELECT * FROM a = 1 WHERE b='A' and c>'S' LIMIT 4;

And in this case I'll get a single CQL3 row:

 a | b | c | d
---+---+---+-----------
 1 | A | T | whatever4

(Now, if I'd gotten 4 rows, I would hit the limit and I would start again next time with the last element of that set. But for now I just have one row.) So, to get the rest I iterate from that point and get the remaining 3 rows:

SELECT * FROM a = 1 WHERE b > 'A' LIMIT 3;

And I continue on with this same algorithm until I've incrementally scanned as far as I please.

In the example above the PRIMARY KEY is composed of 3 elements meaning that under CQL in Cassandra the column names are CompositeColumns of 2 elements (...well basically, but the difference doesn't matter here). And because the CompositeColumns are of 2 elements you have to make 2 queries as I've demoed here. In general though, if the PRIMARY KEY is of n elements, then you will have to make n-1 queries to fake a scan of the CQL table (a.k.a Cassandra row).


Update: Indeed, CQL3 doesn't have a server side cursor, (see the "CQL3 pagination" section here), and if you wanted to fake it, you'd have to use something described above (read further on that link so see my basic idea elaborated by the post's author).

However, there is a JIRA issue regarding the server-side cursor which will be available in Cassandra 2 and which is already present in Cassandra 2 Beta.

There is also a related JIRA issue that would make it much easier to implement the client-side cursor as I've hinted at above. But it stands unresolved.


Update2: JIRA issue is now fixed.

You can now query using tuple/vector syntax WHERE (c1, c2) > (1, 0)

Expropriate answered 16/7, 2013 at 18:33 Comment(1)
I do not recommend using the LIMIT clause. Use your primary keys instead, maybe with the help of a meta table. I experience severe performance drops, which I assume is due to the nature of distributed databases - since it needs a concerted effort to determine when the LIMIT is reached, someone has to count all rows before returning the resultset..Compo
W
0

What you try to do is to get pagination stuff in Cassandra. CQL3 doesn't support this. You should create a column which is suitable for comparison, i.e. for less than, greater than operations and this column should form increasing/decreasing sequence. Indeed, as jorgebg noticed above, concatenation of b+c would fit this.

Wearisome answered 16/7, 2013 at 14:25 Comment(0)
S
-1

select * from stuff where a = 1 and (b,c) > ('A','S') limit 4;

Stringent answered 26/5, 2015 at 19:33 Comment(1)
Generally, answers are much more helpful if they include an explanation of what the code is intended to do, and why that solves the problem without introducing others. (This post was flagged by at least one user, presumably because they thought an answer without explanation should be deleted.)Lowminded

© 2022 - 2024 — McMap. All rights reserved.