Does CQL3 require a schema for Cassandra now?
Asked Answered
M

4

9

I've just had a crash course of Cassandra over the last week and went from Thrift API to CQL to grokking SuperColumns to learning I shouldn't use them and user Composite Keys instead.

I'm now trying out CQL3 and it would appear that I can no longer insert into columns that are not defined in the schema, or see those columns in a select *

Am I missing some option to enable this in CQL3 or does it expect me to define every column in the schema (defeating the purpose of wide, flexible rows, imho).

Melancholy answered 9/10, 2012 at 2:24 Comment(0)
A
6

Yes, CQL3 does require columns to be declared before used.

But, you can do as many ALTERs as you want, no locking or performance hit is entailed.

That said, most of the places that you'd use "dynamic columns" in earlier C* versions are better served by a Map in C* 1.2.

Amarelle answered 9/10, 2012 at 14:26 Comment(4)
Thanks for pointing to Map. But, what about the "dynamic columns" created before 1.2? I mean, in that case, we'd have to migrate those dynamic columns to maps, right? And if that is true and if we still want to query the columns not defined in the schema before 1.2 in CQL3, how do we do that?Actinology
Yikes: "Most of the places that you'd used 'dynamic columns' in earlier C* versions are better served by a Map in C* 1.2". This is quite frankly bad advice, given that the CQL Collections documentation states this: "Never insert more than 64K items in a collection. If you insert more than 64K items into a collection, only 64K of them will be queryable, resulting in data loss." datastax.com/documentation/cql/3.0/webhelp/cql/cql_using/…Knight
I'm using "dynamic columns" in the sense Arne used it of "I don't want to define every attribute before using it." Most of these cases are indeed best served by a collection. I explore the other use cases more fully at datastax.com/dev/blog/…Amarelle
@Knight 1) link is dead, 2) actually that sounds about perfect to me. I have a use-case where I don't know the column name(s) in advance, and don't want to have to check what column names exist every time a new value is added, and can reasonably assume that there will never be an instance where more than 64K items are needed per row, and have column names that will be highly variable across each row (i.e. sparsely populated when viewed as a 'traditional' table).Abisia
E
5

I suggest you to explore composite columns with "WITH COMPACT STORAGE". A "COMPACT STORAGE" column family allows you to practically only define key columns:

Example:

CREATE TABLE entities_cargo ( entity_id ascii, item_id ascii, qt ascii, PRIMARY KEY (entity_id, item_id) ) WITH COMPACT STORAGE

Actually, when you insert different values from itemid, you dont add a row with entity_id,item_id and qt, but you add a column with name (item_id content) and value (qt content). So:

insert into entities_cargo (entity_id,item_id,qt) values(100,'oggetto 1',3);

insert into entities_cargo (entity_id,item_id,qt) values(100,'oggetto 2',3);

Now, here is how you see this rows in CQL3:

cqlsh:goh_master> select * from entities_cargo where entity_id = 100;

entity_id | item_id | qt

-----------+-----------+----

  100 | oggetto 1 |  3

  100 | oggetto 2 |  3

And how they are if you check tnem from cli:

[default@goh_master] get entities_cargo[100];

=> (column=oggetto 1, value=3, timestamp=1349853780838000)

=> (column=oggetto 2, value=3, timestamp=1349853784172000)

Returned 2 results.

You can access a single column with

select * from entities_cargo where entity_id = 100 and item_id = 'oggetto 1';

Hope it helps

Eberta answered 10/10, 2012 at 7:27 Comment(0)
C
2

Cassandra still allows using wide rows. This answer references that DataStax blog entry, written after the question was asked, which details the links between CQL and the underlying architecture.

Legacy support

A dynamic column family defined through Thrift with the following command (notice there is no column-specific metadata):

create column family clicks
  with key_validation_class = UTF8Type
  and comparator = DateType
  and default_validation_class = UTF8Type

Here is the exact equivalent in CQL:

CREATE TABLE clicks (
  key text,
  column1 timestamp,
  value text,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE

Both of these commands create a wide-row column family that stores records ordered by date.

CQL Extras

In addition, CQL provides the ability to assign labels to the row id, column and value elements to indicate what is being stored. The following, alternative way of defining this same structure in CQL, highlights this feature on DataStax's example - a column family used for storing users' clicks on a website, ordered by time:

CREATE TABLE clicks (
  user_id text,
  time timestamp,
  url text,
  PRIMARY KEY (user_id, time)
) WITH COMPACT STORAGE

Notes

  • a Table in CQL is always mapped to a Column Family in Thrift
  • the CQL driver uses the first element of the primary key definition as the row key
  • Composite Columns are used to implement the extra columns that one can define in CQL
  • using WITH COMPACT STORAGE is not recommended for new designs because it fixes the number of possible columns. In other words, ALTER TABLE ... ADD is not possible on such a table. Just leave it out unless it's absolutely necessary.
Cementite answered 15/1, 2014 at 14:35 Comment(0)
L
0

interesting, something I didn't know about CQL3. In PlayOrm, the idea is it is a "partial" schema you must define and in the WHERE clause of the select, you can only use stuff that is defined in the partial schema BUT it returns ALL the data of the rows EVEN the data it does not know about....I would expect that CQL should have been doing the same :( I need to look into this now. thanks, Dean

Landes answered 9/10, 2012 at 14:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.