I need details on both the performance and query aspects. I learned from some site that only a key can be given when using a column family. If so, what would you suggest for my keyspace? I need to use group by, order by, count, sum, ifnull, concat, joins, and sometimes nested queries.
To answer the original question you posed: a column family and a table are the same thing.
- The name "column family" was used in the older Thrift API.
- The name "table" is used in the newer CQL API.
More info on the APIs can be found here: http://wiki.apache.org/cassandra/API
If you need to use "group by, order by, count, sum, ifnull, concat, joins, and sometimes nested queries" as you state, then you probably don't want to use Cassandra, since it doesn't support most of those.
CQL supports COUNT, but only up to 10000. It supports ORDER BY, but only on clustering keys. The other things you mention are not supported at all.
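To make the ORDER BY restriction concrete, here is a minimal sketch. The table events_by_user and its columns are hypothetical and chosen only for illustration; the exact error text for the rejected query depends on the Cassandra version.

```cql
-- Hypothetical table: user_id is the partition key,
-- event_time is the clustering column.
CREATE TABLE events_by_user (
    user_id text,
    event_time timestamp,
    payload text,
    PRIMARY KEY (user_id, event_time)
);

-- Accepted: ORDER BY on the clustering column, within one partition.
SELECT * FROM events_by_user
WHERE user_id = 'alice'
ORDER BY event_time DESC;

-- Accepted: COUNT over the rows of one partition.
SELECT COUNT(*) FROM events_by_user WHERE user_id = 'alice';

-- Rejected: payload is a regular column, not a clustering column.
-- SELECT * FROM events_by_user WHERE user_id = 'alice' ORDER BY payload;
```

Because the sort order comes from how rows are stored inside a partition, the ordering you need has to be decided when the table is designed, not when the query is written.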
group by is not valid CQL. You cannot just run random SQL statements and expect them to work. – Corinthians
Refer to the document: https://cassandra.apache.org/doc/old/CQL-3.0.html
The CQL language reference specifies that the TABLE keyword is supported wherever COLUMNFAMILY is supported.
This shows that TABLE and COLUMNFAMILY are synonyms.
In Cassandra there is no difference between a table and a column family. They are one concept.
For Cassandra 3+ and cqlsh 5.0.1:
To verify, enter the following at a cqlsh prompt within a keyspace (ksp):
CREATE COLUMNFAMILY myTable (
    id text PRIMARY KEY,
    name int
);
Then type 'desc myTable'.
You'll see something like this (table options omitted; unquoted names are shown in lowercase):
CREATE TABLE ksp.mytable (
    id text PRIMARY KEY,
    name int
);
They are synonyms, and Cassandra uses table by default.
Here is a small example to illustrate the concept. A keyspace is an object that holds column families and user-defined types.
CREATE KEYSPACE University WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
CREATE TABLE University.student (roll int PRIMARY KEY, dept text, name text, semester int);
After 'Create table' runs, the table 'student' is created in the keyspace 'University' with the columns roll, dept, name, and semester. roll is the primary key. As the only key column, roll is also the partition key, so each roll value identifies its own partition.
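As a rough illustration of how the partition key is used, the sketch below writes one row and reads it back by roll. The values are made up for the example.

```cql
-- Hypothetical sample row; the values are for illustration only.
INSERT INTO University.student (roll, dept, name, semester)
VALUES (1, 'CS', 'Alice', 3);

-- Lookup by the partition key (roll) goes straight to the owning partition.
SELECT name, dept, semester
FROM University.student
WHERE roll = 1;
```

Queries that filter on the partition key are the access pattern the table is built around.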
Key aspects while altering Keyspace in Cassandra
Keyspace Name: Keyspace name cannot be altered in Cassandra.
Strategy Name: Strategy name can be altered by specifying new strategy name.
Replication Factor: Replication factor can be altered by specifying new replication factor.
DURABLE_WRITES: DURABLE_WRITES value can be altered by specifying its value true/false. By default, it is true. If set to false, no updates will be written to the commit log; if set to true, updates are written to the commit log.
Execution: For example, an "Alter Keyspace" command can change the keyspace strategy from 'SimpleStrategy' to 'NetworkTopologyStrategy' and the replication factor from 3 to 1 for DataCenter1.
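A sketch of what such a command might look like, reusing the University keyspace from the earlier example. The original snapshot is not available, so the exact statement is an assumption; 'DataCenter1' must match the data center name your cluster actually reports.

```cql
-- Switch the strategy and set the replication factor to 1 for DataCenter1.
-- durable_writes is shown with its default value for completeness.
ALTER KEYSPACE University
WITH replication = {'class': 'NetworkTopologyStrategy', 'DataCenter1': 1}
AND durable_writes = true;
```

Running 'desc keyspace University' in cqlsh afterwards should show the new replication settings.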
Column families are somewhat similar to a relational database's tables, with differences in distribution and perhaps in philosophy.
Imagine you have a user entity that might contain 15 columns. In a relational DB you might divide those columns into small groups of related columns, the structures we all know as tables. In a distributed DB such as Cassandra you can combine all of those tables' entries into a single long row, so a profiler or DB manager shows a single table with 15 columns instead of two or three tables. Another interesting point is that the rows of a column family are written to different nodes, possibly in different data centers, and are located by the row key. This means you have a single key for all the columns and don't need to maintain a PK or FK for every table, or maintain 1-1, 1-n, and n-n relationships between them. Easy!