Cassandra tombstone warning and failure thresholds breached

We are running a Titan graph database server backed by Cassandra as the persistent store, and we are hitting the limits on Cassandra's tombstone thresholds, which causes our queries to periodically fail or time out as data accumulates. It seems that compaction is unable to keep up with the number of tombstones being added.

Our use case involves:

  1. High read / write throughput.
  2. High sensitivity to read latency.
  3. Frequent updates to node values in Titan, causing rows to be updated in Cassandra.

Given the above, we are already tuning Cassandra aggressively in the following ways (a CQL sketch of these settings follows the list):

  1. Aggressive compaction, using the leveled compaction strategy.
  2. Setting tombstone_compaction_interval to 60 seconds.
  3. Setting tombstone_threshold to 0.01.
  4. Setting gc_grace_seconds to 1800.
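
For reference, here is a minimal CQL sketch of those table-level settings (the keyspace and table names are hypothetical; in practice Titan creates and manages the actual tables):

    -- hypothetical keyspace/table names; the tombstone_* options are
    -- compaction subproperties, so they live inside the compaction map
    ALTER TABLE titan.graphindex
    WITH compaction = {
        'class': 'LeveledCompactionStrategy',
        'tombstone_compaction_interval': '60',
        'tombstone_threshold': '0.01'
    }
    AND gc_grace_seconds = 1800;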

Despite these optimizations, we are still seeing warnings in the Cassandra logs similar to:

    [WARN] (ReadStage:7510) org.apache.cassandra.db.filter.SliceQueryFilter: Read 0 live and 10350 tombstoned cells in .graphindex (see tombstone_warn_threshold). 8001 columns was requested, slices=[00-ff], delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}

Occasionally, as time progresses, we also see the failure threshold breached, which causes errors.

Our cassandra.yaml sets tombstone_warn_threshold to 10000 and tombstone_failure_threshold much higher than recommended, at 250000, with no real noticeable benefit.
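
For reference, the relevant cassandra.yaml excerpt (these are node-level settings; the stock defaults are 1000 and 100000):

    # cassandra.yaml -- values from our cluster
    tombstone_warn_threshold: 10000
    tombstone_failure_threshold: 250000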

If there is room for further optimization, any pointers toward the correct configuration would be greatly appreciated. Thanks in advance for your time and help.

Borecole asked 10/3, 2015 at 19:31
Are you frequently deleting data? It is my understanding that tombstones are not created unless data is explicitly deleted or expires. – Rivkarivkah
Our belief is that Titan GraphDB, which handles all our interactions with Cassandra internally, might be doing deletes and new creates for every update, which is adding to the number of deletes. – Borecole
That would be good to confirm. Could you enable probabilistic tracing (datastax.com/documentation/cassandra/2.0/cassandra/tools/…) on one of your Cassandra nodes to see what the deletes are? Another possibility is that columns are being expired (set with a TTL); do you think that could be happening here as well? – Rivkarivkah
I will try this today. Thanks again for the pointers. – Borecole
@Borecole came across this post today. Should help you understand when tombstones are created. groups.google.com/forum/#!msg/aureliusgraphs/XMG7DKkAll0/… – Swim

Sounds like the root of your problem is your data model. You've done everything you can to mitigate getting TombstoneOverwhelmingException. Since your data model requires such frequent updates, and those updates cause tombstone creation, an eventually consistent store like Cassandra may not be a good fit for your use case. When we've experienced these types of issues, we had to change our data model to fit better with Cassandra's strengths.

About deletes: http://www.slideshare.net/planetcassandra/8-axel-liljencrantz-23204252 (slides 34-39)

Swim answered 10/3, 2015 at 20:37
Thanks, Curtis. I will take a look at this and see if there are changes we can make to the data model. Part of the problem is that, because we use the Titan graph server, the data model is abstracted away from us. – Borecole
Hi @Borecole, you can get an idea of how Titan persists its data from here. Basically, what we've had to do is minimize deleted vertices and edges. – Swim

Tombstones are not compacted away until the gc_grace_seconds configured on a table has elapsed for a given tombstone. So even if you increase your compaction frequency, your tombstones will not be removed until gc_grace_seconds has elapsed, with the default being 10 days. You could try tuning gc_grace_seconds down to a lower value and running repairs more frequently (usually you want to schedule repairs to happen every gc_grace_seconds_in_days - 1 days).
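
As a sketch (the keyspace name my_keyspace is a placeholder): lowering gc_grace_seconds only stays safe if every node is repaired more often than that window, and -pr limits each run to the node's primary ranges so ranges are not repaired redundantly:

    # on each node, schedule (e.g. via cron) a primary-range repair
    # well within the gc_grace_seconds window
    nodetool repair -pr my_keyspace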

Rivkarivkah answered 10/3, 2015 at 19:54
Thanks for getting back, Andy. Good point. We are already setting gc_grace_seconds to 1800; I edited my post to reflect that attempt on our part as well. – Borecole

So everyone here is right. If you repair and compact frequently, you can reduce your gc_grace_seconds value.

However, it may also be worth considering that inserting nulls is equivalent to a delete, which will increase your number of tombstones. Instead, you'll want to bind UNSET_VALUE if you're using prepared statements. Probably too late for you, but worth noting for anyone else who comes here.
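
A minimal sketch with the DataStax Python driver (the keyspace, table, and column names are hypothetical; UNSET_VALUE requires native protocol v4, i.e. Cassandra 2.2+):

    from cassandra.cluster import Cluster
    from cassandra.query import UNSET_VALUE

    session = Cluster(['127.0.0.1']).connect('my_keyspace')  # hypothetical keyspace
    insert = session.prepare(
        "INSERT INTO users (id, name, email) VALUES (?, ?, ?)")

    # Binding None for email would write a tombstone for that column;
    # binding UNSET_VALUE skips the column entirely, so no tombstone is created.
    session.execute(insert, (42, 'alice', UNSET_VALUE))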

Syllabary answered 18/4, 2017 at 18:56
This is a CRITICALLY IMPORTANT fact, big thank you! Null fields dramatically impact performance because they cause tombstones! I have solved my problem with this. I had asked this question: #56126482 – Meniscus

The variables you've tuned are helping you expire tombstones, but it's worth noting that while tombstones cannot be purged before gc_grace_seconds has elapsed, Cassandra makes no guarantee that tombstones WILL be purged once it has. Indeed, a tombstone is not compacted away until the SSTable containing it is compacted, and even then it will not be eliminated if another SSTable contains a cell that it shadows.

This means tombstones can potentially persist for a very long time, especially if you're using SSTables that are infrequently compacted (say, very large STCS SSTables). To address this, tools exist such as the JMX operation forceUserDefinedCompaction; if you're not adept at using JMX endpoints, tools that do this for you automatically also exist, such as http://www.encql.com/purge-cassandra-tombstones/
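
As a sketch using the open-source jmxterm CLI (the JMX port and the SSTable path are placeholders, and the exact operation signature varies between Cassandra versions, so check CompactionManagerMBean for yours):

    # connect to the node's JMX port (7199 by default)
    java -jar jmxterm.jar -l localhost:7199

    # at the jmxterm prompt, compact one specific SSTable by path
    run -b org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction /var/lib/cassandra/data/ks/tbl/ks-tbl-ka-42-Data.db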

Jonas answered 25/4, 2015 at 20:57
