cassandra Select on indexed columns and with IN clause for the PRIMARY KEY are not supported

Asked 14/9, 2015 at 6:57 Answered 14/9, 2015 at 15:58

In Cassandra, I'm using the cql:

select msg from log where id in ('A', 'B') and filter1 = 'filter'

(where id is the partition key and filter1 is a secondary index and filter1 cannot be used as a cluster column)

This gives the response:

Select on indexed columns and with IN clause for the PRIMARY KEY are not supported

How can I change CQL to prevent this?

Ploss answered 14/9, 2015 at 6:57 Comment(0)

You would need to split that up into separate queries of:

select msg from log where id = 'A' and filter1 = 'filter';

and

select msg from log where id = 'B' and filter1 = 'filter';

Due to the way data is partitioned in Cassandra, CQL has a lot of seemingly arbitrary restrictions (to discourage inefficient queries and also because they are complex to implement).

Over time I think these restrictions will slowly be removed, but for now we have to work around them. For more details on the restrictions, see A deep look at the CQL where clause.

Seaborne answered 14/9, 2015 at 13:54 Comment(0)

Another option, is that you could build a table specifically for this query (a query table) with filter1 as a partition key and id as a clustering key. That way, your query works and you avoid having a secondary index all-together.

aploetz@cqlsh:stackoverflow> CREATE TABLE log 
    (filter1 text, 
          id text, 
         msg text, 
     PRIMARY KEY (filter1, id));
aploetz@cqlsh:stackoverflow> INSERT INTO log (filter1, id, msg) 
                             VALUES ('filter','A','message A');
aploetz@cqlsh:stackoverflow> INSERT INTO log (filter1, id, msg)
                             VALUES ('filter','B','message B');
aploetz@cqlsh:stackoverflow> INSERT INTO log (filter1, id, msg) 
                             VALUES ('filter','C','message C');
aploetz@cqlsh:stackoverflow> SELECT msg FROM log 
                             WHERE filter1='filter' AND id IN ('A','B');

 msg
-----------
 message A
 message B

(2 rows)

You would still be using an "IN" which isn't known to perform well either. But you would also be specifying a partition key, so it might perform better than expected.

Repentance answered 14/9, 2015 at 15:58 Comment(0)

Recommended topics

Hot tags