Why don't you start off with a "single & small" Cassandra server as you usually do it with MySQL?
Asked Answered
W

4

33

For any website just starting out, the load initially is minimal & grows with a slow pace initially. People usually start with their MySQL based sites with a single server(***that too a VPS not a dedicated server) running as both app server as well as DB server & usually get too far with this setup & only as they feel the need they separate the DB from the app server giving it a separate VPS server. This is what a start up expects the things to be while planning about resources procurement.

But so far what I have seen, it's something very different with Cassandra. People usually recommend starting out with atleast a 3 node cluster, (on dedicated servers) with lots & lots of RAM. 4GB or 8GB RAM is what they suggest to start with. So is it that Cassandra requires more hardware resources in comparison to MySQL, for a website to deliver similar performance, serve similar load/ traffic & same amount of data. I understand about higher storage requirements of Cassandra due to replication but what about other hardware resources ?

Can't we start off with Cassandra based apps just like MySQL. Starting with 1 or 2 VPS & adding more whenever there's a need ?

Edit:

I don't want to compare apples with oranges. I just want to know how much more dangerous situation I may be in when I start out with a single node VPS based cassandra installation Vs a single node VPS based MySQL installation. Difference between these two situations. Are cassandra servers more prone to be unavailable than MySQL servers ? What is bad if I put tomcat too along with Cassandra as people use LAMP stack on single server.

Wayless answered 27/8, 2013 at 10:6 Comment(0)
P
20

TL;DR;
You can even start with a single node, but you loose the highly available factor of c*.

Cassandra is built for systems that handle huge volumes of data, terabytes and in some cases petabytes. Many users typically switch from MySQL (and lots of other RDBMS) to Cassandra once they find that their current DB system can't handle the data load efficiently (querying gets slow, managing storage becomes challenging etc.)


Why 4-8GB gb of ram?

The 4-8 GB of ram is to do with the JVM and the size of ram on efficient garbage collection. The advice is stating not that you should start on 8 GB, but hat you shouldn't have more than 8GB

This doesn't mean to say that you cant use Cassandra to start up a single node on a very basic machine (some people actually have cassandra running on a raspberry pi).


Why do people recommend 3 nodes?

Availability is one of cassandra's main selling points. If you have 2 nodes with RF=2 then you cant perform writes if a single node goes down. If you have 3 nodes you can still perform both reads and writes.

Plot answered 27/8, 2013 at 11:48 Comment(2)
Thanks Lyuben, I know all these details as I have been working on an app with Cassandra for over an year. So as you say, many switch to Cassandra after they have grown big, so how should beginners start? Start with mysql & then migrate?! Is it not made to work with smaller apps?? Or if I do use cassandra from start, in the hope that 1 day I'll go big, is it not over-engineering, am I not burdening myself from the beginning due to some over-ambitious hopes??Wayless
@user01 Start with mysql & then migrate Definitely not... I explained why people give the recomendations that they do above. You should start with a small cluster, if you dont have 3 machines thats fine, when (and if) you need to grow its easy to add new nodes, you just need to re-balance your cluster once you have added nodes read here but you can start with a single node and add capacity as you need it!Plot
E
15

The short answer is you absolutely can start with a single, small node.

What I think other people are getting at with suggesting you not do that is that you learn different things depending on how you configure your system.

A single node does not have high availability but if you're just starting to experiment with Cassandra then that's probably a non-issue. You won't get much exposure to how to do backups, how to tune things and obviously how to fail over...but in your case you likely don't care.

You will be able to learn about coding with and for Cassandra and if you're coming from a traditional RDBMS that's a much bigger and more important hurdle.

See if you like the data model. See if you like schema-free design. If you get past all that you can then worry about how to scale up.

WRT your other question: a single node Cassandra cluster, even running on a small machine, even if its sharing that machine with other services should be no more "dangerous" than running MySQL in a similar configuration.

Epideictic answered 18/9, 2013 at 16:29 Comment(0)
G
12

People usually recommend starting out with atleast a 3 node cluster, (on dedicated servers) with lots & lots of RAM. 4GB or 8GB RAM is what they suggest to start with.

Cassandra hardware recommendations are usually for people who will have 100's of GB of data. You can get away with having less if you don't have a lot of data. You can tune the JVM down to only using a 512 MB or 1 GB heap in the cassandra-env.sh.

Can't we start off with Cassandra based apps just like MySQL. Starting with 1 or 2 VPS & adding more whenever there's a need?

Yes you can. But, if you want to get the most out of Cassandra you definitely want to start with at least two servers, three if you need be able to use QUORUM for consistency, and still support one node going down.

While I have never run a production system on servers that small, I have run a continuously available QA cluster on VM's with 4 GB RAM and 2 Cores. And for small data sizes I have seen others run clusters on as little as 2 GB of RAM.

The nice thing about Cassandra is that when you do need more, it is very easy to add new nodes to the cluster. And if you want to move your cluster to more powerful hardware, instead of just adding more, you can easily add the new bigger boxes, then remove the older small ones.

Update:
Here is a recent blog post about getting Cassandra to run with a 64 MB heap:

Greek answered 28/8, 2013 at 15:28 Comment(2)
that's a bit encouraging. Thanks! Though I never saw people talk about starting with these configuration in practical.Wayless
That is because most people considering Cassandra are considering it because they out grew something else. So very few people would bother starting that small.Greek
T
0

In response to the last part of your question

"Can't we start off with Cassandra based apps just like MySQL. Starting with 1 or 2 VPS & adding more whenever there's a need ?"

You could definitely start by writing apps on Cassandra. I built a banking application on top of cassandra and it worked well. I had a 6 node cluster and used Cassandra 1.1.Cassandra has tunable data consistency which varies from very strong consistency (transactions support) and eventual consistency.

You could definitely start with one VPS and scale up as you need. Cassandra is scalable and adding new nodes results in linear increase in performance.

For more you can watch this video :

http://www.youtube.com/watch?v=5qEoEAfAer8

Helpful links :

http://www.datastax.com/docs/1.1/initialize/cluster_init

http://www.datastax.com/2012/01/how-to-set-up-and-monitor-a-multi-node-cassandra-cluster-on-linux

Tartaric answered 27/8, 2013 at 14:48 Comment(6)
did you also started with single node? VPS ? What was your starting hardware setup ?Wayless
I started with a single node and eventually scaled up. It is not a VPS. It was a 6 node commodity cluster with 700 GB RAM and and 100 TB storage capacity.Tartaric
what was the hardware config of your first node ? Also, how long did it worked for you in production, until you added more servers ?Wayless
did you started with dedicated servers from the beginning itself ?Wayless
i started with dedicated servers as we wanted to experiment with cassandra and test it if it suited our purposeTartaric
could you please also respond to my previous comment ie, ur hardware config in beginning & how long did you run it before adding more servers ?Wayless

© 2022 - 2024 — McMap. All rights reserved.