Wide column vs column family vs columnar vs column oriented DB definition

There's a lot of confusion among these terms. I'd like to throw my understanding out and see if people agree. I have seen conflicting and wrong definitions all over the web.

In my mind, wide-column and column-family DBs are essentially the same thing. In both:

  1. the data is organized logically as a group of key-value pairs (each pair is called a column);
  2. each group is identified by a unique row key;
  3. each row can have a different number or definition of columns; and
  4. rows are stored on disk one after another.

So a column-family (wide-column) table is similar to a relational DB's table in that the data is still organized as rows.

The main differences are that they don't have a fixed schema for columns and don't support table joins.

An example of 3 rows (column families): each row has a different length and/or set of columns, but on disk rowkey1's entire content is one contiguous run, followed by the other rows, similar to a relational DB.

rowkey1 k1-v k2-v k3-v

rowkey2 k1-v k4-v

rowkey3 k2-v k4-v k5-v
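
The row-wise layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dict-of-dicts model and the `serialize_rows` helper are hypothetical, and real wide-column stores use their own binary formats.

```python
# Hypothetical in-memory model of a wide-column table:
# row key -> {column name: value}. Rows need not share the same columns.
table = {
    "rowkey1": {"k1": "v", "k2": "v", "k3": "v"},
    "rowkey2": {"k1": "v", "k4": "v"},
    "rowkey3": {"k2": "v", "k4": "v", "k5": "v"},
}


def serialize_rows(table):
    """Lay the table out row by row, keeping each row's columns together."""
    lines = []
    for row_key, columns in table.items():
        cells = " ".join(f"{name}-{value}" for name, value in columns.items())
        lines.append(f"{row_key} {cells}")
    return lines


for line in serialize_rows(table):
    print(line)
```

Each printed line corresponds to one row, so reading a single row key touches one contiguous run of data.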

On the other hand, the term columnar DB means the same as column-oriented DB. These databases store data on disk one column at a time, not one row at a time. That layout is great for time series or any multi-series analytical workload. As an added bonus, because each column holds the same type of data and is stored together, it compresses better.

An example, as laid out on disk:

a:1 b:2 c:3 d:4

10:1 9:2 8:3 7:4
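
As a rough sketch, the two on-disk lines above could be produced by pivoting a 4-row table into columns. The column names `letter` and `number` and both helper functions are made up for this illustration; actual column stores use far more compact encodings.

```python
# Hypothetical 4-row table; the column names are assumptions for this sketch.
rows = [
    {"row_id": 1, "letter": "a", "number": 10},
    {"row_id": 2, "letter": "b", "number": 9},
    {"row_id": 3, "letter": "c", "number": 8},
    {"row_id": 4, "letter": "d", "number": 7},
]


def to_columns(rows, column_names):
    """Pivot rows into one list per column, tagging each value with its row id."""
    return {
        name: [(row[name], row["row_id"]) for row in rows]
        for name in column_names
    }


def serialize_columns(columns):
    """Write each column as its own line of value:row_id pairs."""
    return [
        " ".join(f"{value}:{row_id}" for value, row_id in cells)
        for cells in columns.values()
    ]


for line in serialize_columns(to_columns(rows, ["letter", "number"])):
    print(line)
```

An analytical query that only needs `number` can read the second line and skip the first entirely, which is the main appeal of this layout.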

Hyphenated answered 30/7, 2020 at 18:52 Comment(0)

The definition from Wikipedia also helps further:

Wide-column stores such as Bigtable and Apache Cassandra are not column stores in the original sense of the term, since their two-level structures do not use a columnar data layout. In genuine column stores, a columnar data layout is adopted such that each column is stored separately on disk. Wide-column stores do often support the notion of column families that are stored separately. However, each such column family typically contains multiple columns that are used together, similar to traditional relational database tables. Within a given column family, all data is stored in a row-by-row fashion, such that the columns for a given row are stored together, rather than each column being stored separately. Wide-column stores that support column families are also known as column family databases.

Reference: https://en.wikipedia.org/wiki/Wide-column_store
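
To make the quoted two-level structure concrete, here is a minimal Python sketch. The family names, row keys, values, and the `serialize_families` helper are all invented for illustration and do not reflect how Bigtable or Cassandra actually encode data.

```python
# Hypothetical model: column family -> row key -> {column: value}.
families = {
    "profile": {
        "user1": {"name": "Ann", "email": "ann@example.com"},
        "user2": {"name": "Bob"},
    },
    "activity": {
        "user1": {"last_login": "2021-01-01"},
        "user2": {"last_login": "2021-01-02", "visits": "7"},
    },
}


def serialize_families(families):
    """Return one block per column family; inside a block, data is row by row."""
    return {
        family: [
            f"{row_key} " + " ".join(f"{col}={val}" for col, val in cols.items())
            for row_key, cols in rows.items()
        ]
        for family, rows in families.items()
    }


for family, block in serialize_families(families).items():
    print(f"[{family}]")
    for line in block:
        print(line)
```

The families are stored apart from each other, but within each family a row's columns still sit together, which is why this is not a columnar layout.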

Phillipphillipe answered 2/1, 2021 at 5:57 Comment(2)
Wikipedia has contradictory answers. en.m.wikipedia.org/wiki/Wide-column_store says Wide Column store = Column Oriented db. en.m.wikipedia.org/wiki/Column-oriented_DBMS says Column Oriented db = Columnar db - Picrate
[continued].. en.m.wikipedia.org/wiki/Wide-column_store also says Wide Column stores are not Columnar stores - Picrate
