I read an interesting article about Netflix and their Cassandra installation.
They mention that they used their Chaos Gorilla tool to take down 33% of their Cassandra cluster and verify that their systems kept working as expected.
They have some 2,000 Cassandra nodes and took 33% down. This means 1 out of every 3 nodes is gone (about 660 nodes in Netflix's case).
If you are really unlucky, all the contact points you specified are among those 660 downed nodes... Ouch.
Chances are, though, that if you never expect a dramatic event where more than 33% of your cluster goes down, a pretty small number of contact points is enough. With 6 of them, for example, you can expect about 4 to still be up even during such an outage...
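To see why a handful of contact points goes a long way, here is a minimal sketch that computes the chance of every contact point being down at once. The 2,000-node and 33% figures come from the article; the 6 contact points are just the example above, and I assume the contact points are picked at random from the cluster:

```cpp
#include <cstdio>

int main() {
    const int total_nodes = 2000;   // cluster size from the article
    const int down_nodes  = 660;    // ~33% taken down
    const int contacts    = 6;      // contact points given to the driver

    // Probability that all `contacts` addresses are among the downed nodes
    // (drawing without replacement, i.e. hypergeometric).
    double p_all_down = 1.0;
    for (int i = 0; i < contacts; ++i)
        p_all_down *= static_cast<double>(down_nodes - i) / (total_nodes - i);

    // Expected number of contact points still reachable.
    double expected_up =
        contacts * (1.0 - static_cast<double>(down_nodes) / total_nodes);

    std::printf("P(all %d contact points down) = %.6f\n", contacts, p_all_down);
    std::printf("Expected contact points still up = %.1f\n", expected_up);
    return 0;
}
```

With those numbers, the probability of losing all six contact points at once is roughly 0.1%, and on average about 4 of the 6 remain reachable.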
Now, the contact points should certainly be chosen strategically if possible. That is, if you choose 6 nodes all in the same rack when you have 6 different racks, you probably chose wrong. Instead, you probably want to specify 1 node per rack, as in the sketch below. (That's once you have grown that much, of course.)
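As a rough illustration of that strategy, this sketch picks one contact point per rack instead of several from the same one. The rack names and addresses are made up; you would feed it your own inventory:

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

int main() {
    // Hypothetical inventory: rack name -> node addresses in that rack.
    std::map<std::string, std::vector<std::string>> racks = {
        {"rack1", {"10.0.1.1", "10.0.1.2"}},
        {"rack2", {"10.0.2.1", "10.0.2.2"}},
        {"rack3", {"10.0.3.1", "10.0.3.2"}},
    };

    // One contact point per rack, so a single rack failure cannot
    // take out every address the driver knows about.
    std::vector<std::string> contact_points;
    for (const auto& entry : racks) {
        const std::vector<std::string>& nodes = entry.second;
        if (!nodes.empty())
            contact_points.push_back(nodes.front());
    }

    for (const auto& ip : contact_points)
        std::cout << ip << "\n";
    return 0;
}
```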
Note that if 33% of your Cassandra nodes go down, QUORUM reads and writes may be in trouble anyway: any partition that loses a majority of its replicas can no longer be accessed at QUORUM. Notice that Netflix talks about that. Their replication factor is just 3, so QUORUM needs 2 of the 3 replicas and each partition survives the loss of only 1 of them (1/3 ≈ 33%); with a replication factor of 5, QUORUM needs 3 of the 5 replicas, so a partition can lose at most 2 (2/5 = 40%).
Finally, I do not know the Java driver; I use the C++ one. When a connection attempt fails, I am told, so I can try another set of IPs until one works... My system keeps one connection up between client accesses, so this is a one-time process, and once it succeeds I can report that this server is connected to Cassandra and can therefore accept client connections. If you reconnect to Cassandra each time a client sends you a request, it may be wise not to pass many IPs at all.
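As a sketch of that "try another set of IPs until it works" idea with the DataStax C++ driver (the addresses are placeholders and error handling is kept to the minimum):

```cpp
#include <cassandra.h>
#include <cstdio>

// Try each set of contact points in turn until a session connects.
// Returns the connected session, or nullptr if every set failed.
CassSession* connect_with_fallback(const char* const* contact_sets, int count) {
    for (int i = 0; i < count; ++i) {
        CassCluster* cluster = cass_cluster_new();
        cass_cluster_set_contact_points(cluster, contact_sets[i]);

        CassSession* session = cass_session_new();
        CassFuture* future = cass_session_connect(session, cluster);
        cass_future_wait(future);
        CassError rc = cass_future_error_code(future);
        cass_future_free(future);
        cass_cluster_free(cluster);  // the cluster object only carries configuration

        if (rc == CASS_OK)
            return session;  // keep this one connection up between client accesses

        std::fprintf(stderr, "Contact set %d failed: %s\n", i, cass_error_desc(rc));
        cass_session_free(session);
    }
    return nullptr;
}

int main() {
    const char* sets[] = {
        "10.0.1.1,10.0.2.1",  // placeholder addresses, one per rack if possible
        "10.0.3.1,10.0.1.2",
    };
    CassSession* session = connect_with_fallback(sets, 2);
    if (!session) return 1;
    // ... use the session for the lifetime of the server ...
    cass_session_free(session);
    return 0;
}
```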