Cassandra timing out when queried for a key that has over 10,000 rows, even after raising the timeout to 10 seconds
I'm using DataStax Community v2.1.2-1 (AMI v2.5) with the default preinstalled settings, and I have this table:

CREATE TABLE notificationstore.note (
  user_id text,
  real_time timestamp,
  insert_time timeuuid,
  read boolean,
  PRIMARY KEY (user_id, real_time, insert_time))
WITH CLUSTERING ORDER BY (real_time DESC, insert_time ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND default_time_to_live = 20160;

The other configurations are:

I have 2 nodes, each an m3.large with 1 x 32 GB SSD. I'm facing timeouts on this particular table even with consistency set to ONE.

  1. I increased the heap space to 3 GB [RAM size is 8 GB].
  2. I increased the read timeout to 10 seconds.
    select count(*) from note where user_id = 'xxx' limit 2; // errors={}, last_host=127.0.0.1

I am wondering if the problem could be with the time to live, or whether there is some other configuration or tuning that matters here.

The data in the database is pretty small.
Also, this problem does not occur right after an insert; it happens after some time (more than 6 hours).

Thanks.

Overcurious answered 7/12, 2014 at 7:57 Comment(4)
Refer to this question: https://mcmap.net/q/589840/-rpc-timeout-in-cassandra/…Pelham
I have already set the timeout to 10 seconds and restarted Cassandra on both nodes; no luck. And even if that worked, 10 seconds is far too long for a query, given that my table is not huge.Overcurious
@Overcurious I think it's the same issue as my answer here: #27377284. Shall I copy that answer here for completeness?Notable
@BrianC, Yes that solved the problemOvercurious
[Copying my answer from here because it's the same environment/problem: amazon ec2 - Cassandra Timing out because of TTL expiration.]

You're running into a problem where the number of tombstones (markers for deleted or expired values) is crossing a threshold, causing the query to abort and time out.

You can see this if you turn on tracing and then try your select statement, for example:

cqlsh> tracing on;
cqlsh> select count(*) from test.simple;

 activity                                                                        | timestamp    | source       | source_elapsed
---------------------------------------------------------------------------------+--------------+--------------+----------------
...snip...
 Scanned over 100000 tombstones; query aborted (see tombstone_failure_threshold) | 23:36:59,324 |  172.31.0.85 |         123932
                                                    Scanned 1 rows and matched 1 | 23:36:59,325 |  172.31.0.85 |         124575
                           Timed out; received 0 of 1 responses for range 2 of 4 | 23:37:09,200 | 172.31.13.33 |       10002216
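The threshold mentioned in the trace is configurable in cassandra.yaml. As of Cassandra 2.1 the defaults look like the snippet below; raising tombstone_failure_threshold is generally a band-aid rather than a fix, since the node still does all the tombstone-scanning work:

# cassandra.yaml (Cassandra 2.1 defaults)
tombstone_warn_threshold: 1000       # log a warning when a single query scans this many tombstones
tombstone_failure_threshold: 100000  # abort the query when this many tombstones are scanned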

You're kind of running into an anti-pattern for Cassandra, where data is stored for just a short time before being deleted. There are a few options for handling this better, including revisiting your data model if needed.

For your sample problem, I tried lowering the gc_grace_seconds setting to 300 (5 minutes). That causes the tombstones to be cleaned up much more frequently than the default of 10 days, but that may or may not be appropriate for your application. Read up on the implications of deletes and adjust as needed.
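For reference, gc_grace_seconds can be changed per table from cqlsh; for the table in the question it would be something like the following (tombstones older than the grace period are then actually purged during compaction):

cqlsh> ALTER TABLE notificationstore.note WITH gc_grace_seconds = 300;

Be aware that gc_grace_seconds also bounds how long a down node can miss deletes and still be safely repaired, so very low values are only appropriate when you understand that trade-off.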

Notable answered 11/12, 2014 at 14:31 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.