What is meant by a node in Cassandra?
C

4

14

I am new to Cassandra and I want to install it. So far I've read a small article on it.

But there is one thing that I do not understand: the meaning of 'node'.

Can anyone tell me what a 'node' is, what it is for, and how many nodes we can have in one cluster?

Caprification answered 11/2, 2015 at 14:35 Comment(1)
Somewhat related: #28196940 - Agripinaagrippa
C
15

A node is the storage layer within a server.

Newer versions of Cassandra use virtual nodes, or vnodes. There are 256 vnodes per server by default (the num_tokens setting in cassandra.yaml).

A vnode is essentially the storage layer.

  • machine: a physical server, EC2 instance, etc.
  • server: an installation of Cassandra. Each machine has one installation of Cassandra. The Cassandra server runs core processes such as the snitch, the partitioner, etc.
  • vnode: The storage layer in a Cassandra server. There are 256 vnodes per server by default.
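
To make the server/vnode relationship concrete, here is a minimal sketch of a token ring in plain Python. It is not Cassandra's code: the names build_ring, token_for and owner_of are hypothetical, tokens are drawn at random, and MD5 stands in for Cassandra's real partitioner. It only illustrates the idea that each server holds many small slices (vnodes) of the ring rather than one large one.

```python
import bisect
import hashlib
import random

NUM_TOKENS = 256      # vnodes per server, the default quoted in this answer
RING_SIZE = 2 ** 64   # assumed token space for this toy example


def build_ring(servers, num_tokens=NUM_TOKENS, seed=0):
    """Give every server num_tokens random positions (vnodes) on the ring."""
    rng = random.Random(seed)
    ring = []
    for server in servers:
        for _ in range(num_tokens):
            ring.append((rng.randrange(RING_SIZE), server))
    ring.sort()
    return ring


def token_for(key):
    """Hash a partition key to a ring position (MD5 is an assumption here)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner_of(ring, key):
    """Return the server whose vnode is the first at or after the key's token."""
    tokens = [token for token, _ in ring]
    index = bisect.bisect_left(tokens, token_for(key)) % len(ring)  # wrap around
    return ring[index][1]


ring = build_ring(["server-a", "server-b", "server-c"])
for key in ("alice", "bob", "carol"):
    print(key, "->", owner_of(ring, key))
```

With three servers the ring holds 768 vnodes, so each server owns many scattered ranges instead of a single contiguous one.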

Helpful tip:

Where you will get confused is that Cassandra terminology has been used inconsistently in older blog posts, YouTube videos, and so on. In older versions of Cassandra, each machine had one Cassandra server installed, and each server contained one node. Because of this 1-to-1-to-1 relationship between machine, server, and node, people used the three terms interchangeably.

Clone answered 12/2, 2015 at 3:0 Comment(0)
P
8

Cassandra is a distributed database management system designed to handle large amounts of data across many commodity servers. It provides high availability with no single point of failure.

You may have gotten some idea from the paragraph above. Generally, when we talk about Cassandra, we mean a Cassandra cluster, not a single PC. A node in a cluster is just a fully functional machine that is connected to the other nodes in the cluster through a high-speed internal network. All nodes work together so that even if one of them fails due to an unexpected error, the cluster as a whole can still provide service.

All nodes in a Cassandra cluster are the same. There is no concept of a master node or slave nodes. There are multiple reasons for this design, and you can Google it for more details if you want.

Theoretically, you can have as many nodes as you want in a Cassandra cluster. For example, at Cassandra Summit 2014, Apple reported running 75,000 Cassandra nodes.

Of course, you can try Cassandra on one machine. It still works with just one node in the cluster.

Pochard answered 11/2, 2015 at 14:49 Comment(4)
Where did you hear about the 75K node cluster? Is anyone keeping a list of the biggest clusters? - Adze
I don't know whether such a list exists. I am a Ph.D. student and keep an eye on tech news; I read this in a news article last year and was so astonished by the number that I still remember it. I just googled it, and there are several articles about it, for example opensourceconnections.com/blog/2014/09/17/cassandra-summit-2014 - Pochard
Cool, thanks. They have multiple clusters, their largest being 1,000 nodes. I wonder what the biggest single cluster is so far. - Adze
Yes, they have multiple clusters. I don't really know the size of any single cluster. There are always trade-offs between the size of a cluster and its efficiency, mainly because of the way the nodes are connected. How to interconnect the nodes in a cluster is a main research interest of the distributed systems research community. - Pochard
J
2

What is meant by a node in Cassandra?

A Cassandra node is a place where data is stored.

A data center is a collection of related nodes.

A cluster is a component that contains one or more data centers. In other words, it is a collection of Cassandra nodes that communicate with each other to perform a set of operations.

  • In Cassandra, each node is independent and at the same time interconnected to other nodes.
  • All the nodes in a cluster play the same role.
  • Every node in a cluster can accept read and write requests, regardless of where the data is actually located in the cluster.
  • In the case of failure of one node, Read/Write requests can be served from other nodes in the network.
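
The points above can be sketched with a toy simulation in plain Python. This is not the Cassandra driver or Cassandra's actual algorithm: ToyCluster and all of its methods are hypothetical, replica placement is a simple modulo over the node list, and features such as consistency levels and hinted handoff are left out. It only shows that any live node can act as the entry point for a request and that a replicated value survives the loss of one node.

```python
import hashlib


class ToyCluster:
    """Hypothetical masterless cluster: every node can coordinate any request."""

    def __init__(self, node_names, replication_factor=2):
        self.node_names = list(node_names)
        self.replication_factor = replication_factor
        self.storage = {name: {} for name in self.node_names}  # per-node data
        self.down = set()                                       # failed nodes

    def replicas_for(self, key):
        """Pick replication_factor consecutive nodes, starting from a hash of the key."""
        digest = hashlib.md5(key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.node_names)
        return [self.node_names[(start + i) % len(self.node_names)]
                for i in range(self.replication_factor)]

    def _check_up(self, coordinator):
        if coordinator in self.down:
            raise ConnectionError(f"{coordinator} is down, contact another node")

    def write(self, coordinator, key, value):
        """Any live node accepts the write and forwards it to the live replicas."""
        self._check_up(coordinator)
        live = [name for name in self.replicas_for(key) if name not in self.down]
        if not live:
            raise RuntimeError("no live replica for " + key)
        for name in live:
            self.storage[name][key] = value
        return live

    def read(self, coordinator, key):
        """Any live node accepts the read and asks the first live replica holding the key."""
        self._check_up(coordinator)
        for name in self.replicas_for(key):
            if name not in self.down and key in self.storage[name]:
                return self.storage[name][key]
        raise KeyError(key)


cluster = ToyCluster(["n1", "n2", "n3"], replication_factor=2)
cluster.write("n1", "user:42", "Ada")          # write through n1
print(cluster.read("n3", "user:42"))           # read through a different node

failed = cluster.replicas_for("user:42")[0]    # simulate one replica failing
cluster.down.add(failed)
survivor = next(n for n in cluster.node_names if n not in cluster.down)
print(cluster.read(survivor, "user:42"))       # still served by the other replica
```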
Jessabell answered 7/2, 2020 at 19:42 Comment(0)
C
-1

If you're looking to understand Cassandra terminology, then the following post is a good reference:

http://exponential.io/blog/2015/01/08/cassandra-terminology/

Clone answered 12/2, 2015 at 2:52 Comment(3)
There is no reference to node in the link you shared. - Gauvin
The referenced link does not use the term node because it's ambiguous in much of the older documentation and blog posts. My answer above describes a node in more detail. - Clone
True, no reference to node, but nonetheless a very useful reference for neophytes. - Neighboring
