Cassandra LeveledCompactionStrategy and high SSTable number per read
We are using Cassandra 2.0.17 and we have a table with a workload of 50% selects, 40% updates, and 10% inserts (no deletes).

To get high read performance on such a table, we found it suggested to use LeveledCompactionStrategy (it is supposed to guarantee that 99% of reads are fulfilled from a single SSTable). But every day when I run nodetool cfhistograms I see more and more SSTables per read. On the first day we had 1, then 1, 2, 3 ...
and this morning I am seeing this:

ubuntu@ip:~$ nodetool cfhistograms prodb groups | head -n 20                                                                                                                                
prodb/groups histograms

SSTables per Read
1 sstables: 27007
2 sstables: 97694
3 sstables: 95239
4 sstables: 3928
5 sstables: 14
6 sstables: 0
7 sstables: 19
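
For a rough sense of the read amplification this histogram implies, the weighted average of SSTables touched per read can be computed directly from those counts (a minimal sketch in Python, using the numbers above):

```python
# Histogram from `nodetool cfhistograms prodb groups`:
# {sstables_per_read: number_of_reads}
histogram = {1: 27007, 2: 97694, 3: 95239, 4: 3928, 5: 14, 6: 0, 7: 19}

total_reads = sum(histogram.values())
total_sstables_touched = sum(n * count for n, count in histogram.items())
mean = total_sstables_touched / total_reads

print(f"mean SSTables per read: {mean:.2f}")  # ~2.34, well above the ideal ~1
```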

Running describe on the groups table returns this:

CREATE TABLE groups (
  ...
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=172800 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

Is this normal? In that case we lose the advantage of using LeveledCompactionStrategy, which according to the documentation should guarantee that 99% of reads come from a single SSTable.

Interrex answered 5/10, 2016 at 8:8 Comment(0)

It does depend on the use case, but as a rule of thumb I normally consider LCS for a 90% read to 10% write ratio. From your description you're looking at 50/50 at best.

The additional compaction demands placed by LCS make it pretty I/O-hungry. It's highly likely that compaction is backed up and your levels are not balanced. The easiest way to tell is to run nodetool cfstats for the table in question.

You're looking for the line:

SSTables in each level: [2042/4, 10, 119/100, 232, 0, 0, 0, 0, 0]

The numbers in the square brackets show how many SSTables are in each level: [L0, L1, L2, ...]. The number after the slash is the ideal maximum for that level. As a rule of thumb, L1 should hold 10 SSTables, L2 100, L3 1000, and so on.

New SSTables enter at L0 and then gradually move up. You can see the example above is in a really bad state: there are still over 2,000 SSTables waiting in L0, more than exist in all the other levels combined. Read performance here will be massively worse than if I'd just used STCS.
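
To turn that cfstats line into an automated check, the bracketed list can be parsed and compared against the 10^n rule of thumb. A minimal sketch in Python; the `2042/4` actual/max entry format and the nominal L0 limit of 4 are assumptions inferred from the example line above:

```python
def parse_levels(cfstats_line):
    """Split the bracketed list into (actual, ideal_max) pairs per level."""
    inside = cfstats_line.split('[', 1)[1].split(']', 1)[0]
    levels = []
    for n, entry in enumerate(e.strip() for e in inside.split(',')):
        if '/' in entry:  # cfstats prints actual/max when a level is over its limit
            actual, ideal = (int(x) for x in entry.split('/'))
        else:             # otherwise just the count; ideal is 10**n (L0: assumed 4)
            actual = int(entry)
            ideal = 4 if n == 0 else 10 ** n
        levels.append((actual, ideal))
    return levels

def backlog(levels):
    """Total SSTables sitting above their level's ideal maximum."""
    return sum(max(0, actual - ideal) for actual, ideal in levels)

line = "SSTables in each level: [2042/4, 10, 119/100, 232, 0, 0, 0, 0, 0]"
levels = parse_levels(line)
print(levels[0])        # (2042, 4): L0 is ~2,000 SSTables over its limit
print(backlog(levels))  # 2057 over-limit SSTables in total
```

A backlog that stays near zero means compaction is keeping up; a large, persistent one means reads are paying for it.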

Nodetool cfstats makes it pretty easy to measure whether LCS is keeping up with your use case. Just dump out the line above every 15 minutes throughout the day. Any time your levels are unbalanced, read performance will suffer. If compaction is constantly behind, you probably want to switch to STCS. If it spikes for, say, 10 minutes while you load data but the rest of the day is fine, then you may decide to live with it. If it never goes out of balance, stick with LCS; it's totally working for you.
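
That 15-minute dump is easy to script. A minimal sketch in Python; the `nodetool cfstats prodb.groups` invocation and the regex are assumptions, so adjust both to your keyspace/table and your cfstats output:

```python
import re
import subprocess  # used when shelling out to nodetool in production
import time

LEVELS = re.compile(r"SSTables in each level: \[([^\]]*)\]")

def level_snapshot(run_cfstats):
    """Return the raw per-level list from one cfstats run, or None.

    run_cfstats is a zero-argument callable returning cfstats text, e.g.:
    lambda: subprocess.check_output(
        ["nodetool", "cfstats", "prodb.groups"], text=True)
    """
    match = LEVELS.search(run_cfstats())
    return match.group(1) if match else None

def monitor(run_cfstats, interval_s=900, iterations=None):
    """Print a timestamped snapshot every interval_s seconds (15 min default)."""
    n = 0
    while iterations is None or n < iterations:
        print(time.strftime("%Y-%m-%d %H:%M:%S"), level_snapshot(run_cfstats))
        n += 1
        if iterations is None or n < iterations:
            time.sleep(interval_s)
```

Leave it running through a data load and compare the snapshots: persistent imbalance argues for STCS, while a brief spike may be something you can live with.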

As a side note, 2.1 allows L0 to carry out STCS-style merging, which will help where you have a temporary spike. If you're in the ten-minute scenario above, it's almost certainly worth an upgrade.

Siskind answered 6/10, 2016 at 11:19 Comment(1)
What governs the maximum level in leveled compaction? Why not just 2-3 levels? What would be the effect? – Mantra
