Do collections in CQL3 have certain limits?
S

5

12

There are two ways to support wide rows in CQL3: one is to use composite keys, and the other is to use collections such as Map, List and Set. The composite-key method can hold millions of columns (transposed to rows). This solves some of our use cases.

However, if we use collections, I want to know whether there is a limit on the number of items or the amount of data a collection can store (as earlier, with Thrift, C* supported up to 2 billion columns in a row).

Sidelong answered 2/9, 2013 at 12:38 Comment(1)
Where is the documentation on using composite keys? We need millions of columns in one of our use cases. Padus
O
15

Apart from the performance issue, there is a protocol issue which limits the number of items you can access to 65536.

http://mail-archives.apache.org/mod_mbox/cassandra-user/201305.mbox/%3CCAENxBwx6pcSA=cWn=dKW_52K5odw5F3Xigj-zn_4BwFth+4ruA@mail.gmail.com%3E

Outrun answered 3/9, 2013 at 9:56 Comment(2)
Interesting, I was not aware that the protocol limits the size to 65536. Thanks for the info. Whoredom
CASSANDRA-5428 raised this limit to 2B (2^31) in Cassandra 2.1 when a client connects using native protocol v3. Taxiplane
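
To make the protocol limit concrete, here is a minimal Python sketch of how a collection might be framed on the wire. It assumes that native protocol v1/v2 write the element count and each element length as 2-byte unsigned shorts, and that v3 and later use 4-byte ints. The function name `encode_collection` is hypothetical, and this is not driver or server code.

```python
import struct


def encode_collection(elements, protocol_version):
    """Frame a list of byte strings as: count, then (length, bytes) per element.

    Assumption: protocol v1/v2 use unsigned 16-bit fields, v3+ use 32-bit ints.
    """
    # ">H" = big-endian unsigned short (max 65535), ">i" = big-endian signed int
    fmt = ">H" if protocol_version < 3 else ">i"
    parts = [struct.pack(fmt, len(elements))]
    for element in elements:
        parts.append(struct.pack(fmt, len(element)))
        parts.append(element)
    return b"".join(parts)


if __name__ == "__main__":
    items = [b"x"] * 65536  # one more than a 16-bit count can express
    try:
        encode_collection(items, protocol_version=2)
    except struct.error as exc:
        print("v2 framing cannot express this count:", exc)
    print("v3 framing length:", len(encode_collection(items, protocol_version=3)))
```

In this sketch the 16-bit field simply refuses a count above 65535. How a real server or driver reacts when the count overflows is not shown here.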
W
20

It is strongly recommended to store only a limited amount of data in collections (lists, sets and maps).

The reasons:

  1. Collections are always fetched in their entirety. You cannot "slice" a collection, so putting a lot of data into one hurts read performance.

  2. The CQL3 implementation of List does not perform well for insertion or removal in the middle of the list. Append and prepend operations are quite fast, but inserting or removing the element at index i requires a read-before-write: part of the list has to be rewritten so that the elements end up at the correct index.

  3. Insertion and removal perform better for Set and Map, since they use the column key for storage, sorting and indexing.
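
The difference between points 2 and 3 can be illustrated with a toy model in Python. It is a deliberate simplification, not Cassandra's storage code: it assumes each collection element is one cell, that set and map cells are keyed by the element or map key itself, and that list cells are keyed by an ever-increasing id (a counter stands in for a time-based id). The class and method names are hypothetical.

```python
import itertools


class ToyCollectionStore:
    """Toy model of per-element cells; counts read-before-write lookups."""

    def __init__(self):
        self.cells = {}                # (column, cell_key) -> value
        self.reads = 0                 # how many writes needed a prior read
        self._ids = itertools.count()  # stand-in for time-based list cell ids

    def set_add(self, column, element):
        # The element is the cell key, so this is a blind write.
        self.cells[(column, element)] = None

    def map_put(self, column, key, value):
        # The map key is the cell key, so this is a blind write too.
        self.cells[(column, key)] = value

    def list_append(self, column, value):
        # A fresh, larger id sorts after existing cells: no read needed.
        self.cells[(column, next(self._ids))] = value

    def _list_keys(self, column):
        return sorted(k for (c, k) in self.cells if c == column)

    def list_set_at(self, column, index, value):
        # The index is not a cell key: the existing cells must be read
        # first to find which cell currently sits at that position.
        self.reads += 1
        keys = self._list_keys(column)
        self.cells[(column, keys[index])] = value

    def list_values(self, column):
        return [self.cells[(column, k)] for k in self._list_keys(column)]


if __name__ == "__main__":
    store = ToyCollectionStore()
    store.set_add("tags", "cql")
    store.map_put("attrs", "color", "red")
    for event in ["a", "b", "c"]:
        store.list_append("events", event)
    print(store.reads)   # blind writes only so far
    store.list_set_at("events", 1, "B")
    print(store.reads, store.list_values("events"))
```

Only the index-based list operation increments the read counter in this model, which matches the read-before-write cost described in point 2.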

Now, to answer your question: is there a hard limit on the number of elements in a collection? My original answer was no, technically there is no limit other than the classic 2 billion limit that already exists in Thrift. Correction: yes, it is limited to 65536, as mentioned by GlynD.

The related JIRA ticket is CASSANDRA-5428.

Whoredom answered 2/9, 2013 at 13:47 Comment(1)
CASSANDRA-5428 was resolved in Cassandra >= 2.1 when a client connects with native protocol v3. Under those conditions, your original statement is correct as the protocol uses the same size limit as the underlying storage layer (# cells / partition <= 2^31). Taxiplane
T
6

The revised non-frozen collection-related limits, after CASSANDRA-5428 was resolved in version 2.1 and when using version 3 or later of the native protocol, are:

======+===========+=================+=================
 TYPE | SIZE      | # KEYS          | VALUE SIZE
======+===========+=================+=================
 List | 2B (2^31) | n/a             | 65,535 (2^16-1)
 Set  | 2B (2^31) | n/a             | 65,535 (2^16-1)
 Map  | 2B (2^31) | 65,535 (2^16-1) | 65,535 (2^16-1)
======+===========+=================+=================

Clients connecting via Thrift and earlier versions of the C* native protocol are still limited by those respective transports.
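
As a rough illustration, the limits above could be turned into a client-side sanity check. This Python sketch mirrors the table as written for protocol v3 and later, and falls back to the 64K item limit for older protocol versions. The function name `check_collection` and its parameters are hypothetical, and Cassandra itself exposes no such call.

```python
MAX_16 = 2**16 - 1   # 65,535
MAX_31 = 2**31       # "2B" in the table


def check_collection(kind, n_items, largest_value_bytes, protocol_version=3):
    """Return a list of limit violations; an empty list means none were found.

    kind is "list", "set" or "map". The numbers follow the table for
    non-frozen collections (Cassandra 2.1+, native protocol v3+).
    """
    problems = []
    if protocol_version >= 3:
        if n_items > MAX_31:
            problems.append("collection size exceeds 2^31")
        if kind == "map" and n_items > MAX_16:
            # The table lists a separate "# KEYS" limit for maps.
            problems.append("map has more than 65,535 keys")
    elif n_items > MAX_16:
        # Older protocol versions are limited by the transport.
        problems.append("more than 65,535 items on a pre-v3 protocol")
    if largest_value_bytes > MAX_16:
        problems.append("a value is larger than 65,535 bytes")
    return problems


if __name__ == "__main__":
    print(check_collection("list", 100_000, 10))
    print(check_collection("list", 100_000, 10, protocol_version=2))
    print(check_collection("map", 70_000, 10))
```

A check like this only catches hard limits. It says nothing about the read cost of large collections, which are fetched as a whole.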

Taxiplane answered 11/12, 2015 at 5:47 Comment(0)
F
4

In addition to the limitation of 64k items in a collection, from http://www.datastax.com/documentation/cql/3.1/cql/cql_using/use_collections_c.html:

These are the TWO limitations:

The maximum size of an item is limited to 64K (the maximum value of an unsigned short).

The number of items in a collection is limited to 64K (the maximum value of an unsigned short).

Fourthly answered 26/8, 2014 at 19:32 Comment(0)
S
0

Also, collections are serialized, which adds overhead. See CASSANDRA-5428 as well.

Schoolbag answered 27/10, 2014 at 10:19 Comment(0)
