Apache Cassandra: Unable to gossip with any seeds
17

53

I built Cassandra server 2.0.3 and then ran it. It starts and then stops with these messages:

X:\MyProjects\cassandra\apache-cassandra-2.0.3-src\bin>cassandra.bat >log.txt
java.lang.RuntimeException: Unable to gossip with any seeds
        at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1160)
        at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:416)
        at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:608)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:576)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:475)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:346)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:461)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:504)

What can I change to make it run?

Snowdrift answered 19/12, 2013 at 20:23 Comment(0)
112

I had a similar problem with my Cassandra v2.0.4 cluster running a single node.

Check your cassandra.yaml and make sure that your "listen_address" and "seeds" values match, with the exception that the seeds value requires quotes around it.
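If you would rather not compare the two values by eye, here is a minimal Python sketch of that check. It is only an illustration: the function names are made up for this example, and it uses plain regular expressions instead of a real YAML parser, so it only understands the simple one-line forms of listen_address and seeds that appear in this thread.

```python
import re


def read_listen_and_seeds(yaml_text):
    """Return (listen_address, [seeds]) from cassandra.yaml text.

    Hypothetical helper: it scans line by line and ignores anything
    more complex than "listen_address: X" and "- seeds: "a,b,c"".
    """
    listen = None
    seeds = []
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip commented-out settings
        m = re.match(r"listen_address:\s*(\S+)", stripped)
        if m:
            listen = m.group(1).strip("\"'")
        m = re.match(r"-\s*seeds:\s*(.+)", stripped)
        if m:
            raw = m.group(1).strip().strip("\"'")
            seeds = [s.strip() for s in raw.split(",") if s.strip()]
    return listen, seeds


def listen_address_is_seed(yaml_text):
    """True if the node's listen_address appears in its own seeds list."""
    listen, seeds = read_listen_and_seeds(yaml_text)
    return listen is not None and listen in seeds
```

For a single-node setup you would expect listen_address_is_seed(open("conf/cassandra.yaml").read()) to be True; the path is an assumption about where your config lives.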

Orangeade answered 22/1, 2014 at 22:4 Comment(5)
If you are seeing this when you add a new node to the cluster, in addition to these things, you might want to check whether the cluster name on the newly added node is the same as on the rest of the machines in the cluster. - Ceporah
Apart from the listen_address, I also had to place the broadcast_address in the seeds, such as: - seeds: "1.2.3.4,1.2.3.5" - Entail
I do not think that the quotes are required "around it". The seeds work without them from what I can tell on my cluster... - Twoedged
This answer is confusing and not very helpful for the majority of people searching for this issue, who are trying to put together a cluster with more than a single node. It's just incidental that those values will be the same in that case. - Cordelia
@Cordelia this answer also helped me, when bringing up the first node after a complete cluster reboot. (Don't try this at home.) - Droop
26

You might get this problem if your private IP address is different than the public one (like on AWS). For example, the host thinks it's "172.31.0.2" when it's visible as "55.70.33.10".

The solution to this problem is:

listen_address: 172.31.0.2
broadcast_address: 55.70.33.10
Orrery answered 21/8, 2014 at 9:47 Comment(0)
9

In cassandra.yaml:

  1. Make sure your cluster_name entry matches on all the nodes in the cluster (you may need to delete your storage if you changed the cluster name)

  2. Verify that all nodes can ping each other

  3. broadcast_rpc_address and listen_address should be set to the local IP (not localhost or 127.0.0.1)

  4. seeds should point to the IP address of the seed(s)
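Point 3 of the list above can be checked mechanically. The following is a small sketch in Python using only the standard library; the function name is hypothetical, and it deliberately does not resolve hostnames, so a name that happens to map to 127.0.0.1 in your hosts file would not be caught.

```python
import ipaddress


def is_loopback_setting(value):
    """Return True if an address setting is 'localhost' or a loopback IP.

    Hostnames other than 'localhost' are not resolved here, so they
    are reported as non-loopback (an assumption of this sketch).
    """
    value = value.strip()
    if value.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(value).is_loopback
    except ValueError:
        return False  # not an IP literal, e.g. a hostname
```

You would call it on the listen_address and broadcast_rpc_address values from your own cassandra.yaml; a True result means the setting still points at the loopback interface.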

Gassaway answered 30/3, 2016 at 8:37 Comment(0)
6

If you are on AWS and use the Ec2MultiRegionSnitch you will need to set the seeds to the public IP addresses rather than the private IPs.

Varnado answered 21/8, 2014 at 6:31 Comment(0)
3

For a quick single-node setup on RHEL, I did the following. First, get info about your network interface setup:

# /sbin/ifconfig -a

It will list the interfaces and the IP addresses they are attached to. Usually it will show an "Ethernet" interface and a "Local Loopback". Note the associated IP addresses.

Then edit conf/cassandra.yaml:

rpc_address: [Local Loopback address]
broadcast_rpc_address: [Ethernet address]
listen_address: [Local Loopback address]
broadcast_address: [Ethernet address]
listen_on_broadcast_address: true
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "[Ethernet address]"

Then open the required ports on the Linux firewall: 9042, 7000 and 7001. More info about opening ports on Linux here: http://ask.xmodulo.com/open-port-firewall-centos-rhel.html

Cent answered 9/3, 2017 at 18:39 Comment(0)
3

I had the same problem on Ubuntu 16.04. I'm not sure which of these changes made it work. Below are selections from my cassandra.yaml, where XXX.XXX.XXX.XXX is your public-facing IP address:

seed_provider:
    # Addresses of hosts that are deemed contact points. 
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring.  You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "XXX.XXX.XXX.XXX"


listen_address: XXX.XXX.XXX.XXX
broadcast_address: XXX.XXX.XXX.XXX
broadcast_rpc_address: XXX.XXX.XXX.XXX
listen_on_broadcast_address: true
start_rpc: true
rpc_address: XXX.XXX.XXX.XXX

I also needed to restart my Virtual Machine for some reason. ¯_(ツ)_/¯

Illomened answered 20/2, 2018 at 19:47 Comment(0)
2

In cassandra.yaml, I changed the seed from a domain name to an IP address, and it works.

Abb answered 27/3, 2018 at 23:12 Comment(0)
1

This happened to me because the "initial_token" setting was specified in my configuration (I think because I had just copied the configuration file over from another cluster member). After clearing the data directory, commenting out the setting and restarting the node, it worked fine for me.

Homogenetic answered 3/9, 2014 at 12:41 Comment(0)
1

I experienced this error today...

I could not find any reason for the error other than timing issues.

I restarted many times and after a while it stuck. It looks like the nodes expect bi-directional communication on the gossip channel, and if it does not happen quickly enough (the allowed time looks very short to me) they drop the connection and generate that error.

In my case I had just upgraded my software and restarted the computer. So it was clearly not a connection issue between the computers (I have firewalls and SSL, to complicate matters), and the node was connected before... So the one entry I found on this from DataStax did not apply...

https://support.datastax.com/hc/en-us/articles/209691483-Bootstap-fails-with-Unable-to-gossip-with-any-seeds-yet-new-node-can-connect-to-seed-nodes

Twoedged answered 27/10, 2016 at 10:53 Comment(0)
0

I got the same error. There can be more than one solution. Hopefully my mistake is the same one you made.

I had my localhost IP pointing to a domain name. I did that so that my Spring Boot application's server context would be a domain name like www.example.com:8080 instead of localhost:8080, and I had the following entry in the hosts file on my Windows system:

127.0.0.1 www.example.com

Meanwhile, the Cassandra batch file was looking for localhost, which it could not find. So I added an entry for localhost to my hosts file as well:

127.0.0.1  localhost

127.0.0.1  www.example.com

After adding it, I opened a new command prompt, ran the Cassandra batch file from the Cassandra bin directory, and it worked.

Trickish answered 12/6, 2017 at 11:43 Comment(0)
0

Disable the firewall and SELinux and try again.

Outgroup answered 14/6, 2017 at 7:15 Comment(1)
SELinux on a Windows machine? - Finical
0

In our case SSL was enabled, and the cassandra.yaml configuration looked fine as per the comments above. We then enabled SSL debugging by adding the following JVM parameter in cassandra-env.sh: -Djavax.net.debug=ssl:handshake

After starting the node again, we noticed the following in the Cassandra log file:

MessagingService-Outgoing-geo2_host/xx.xx.xx.xx, Exception while waiting for close javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown

After investigating the SSL debug logs further, we found that the certificate was not valid. After fixing this SSL issue, the node was able to join the cluster.

Digit answered 3/5, 2018 at 6:29 Comment(2)
What was the exact issue with the certificate? I am facing the same issue. - Yokoyama
My certificate's validity start date was in the future. - Digit
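A certificate whose validity window does not include the current time, as described in the comment above, can be caught with a small check. This is only a sketch in Python: the function name is made up, and it assumes you already have the notBefore and notAfter values in the text format that Python's ssl module reports for certificates (for example "Jan  5 09:34:43 2018 GMT"). How you extract them from your keystore is not shown.

```python
import ssl
import time


def cert_validity_problem(not_before, not_after, now=None):
    """Return None if `now` is inside the validity window, else a reason.

    not_before / not_after are strings such as "Jan  5 09:34:43 2018 GMT".
    `now` is seconds since the epoch; it defaults to the current time.
    """
    now = time.time() if now is None else now
    if now < ssl.cert_time_to_seconds(not_before):
        return "not yet valid"
    if now > ssl.cert_time_to_seconds(not_after):
        return "expired"
    return None
```

A result of "not yet valid" corresponds to the start date in the future mentioned here, while "expired" is the other common case. Also worth checking is that the clocks on all nodes agree.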
0

Thanks to elvingt

His answer reminded me that I need to verify that all nodes are able to talk to each other.
https://support.datastax.com/hc/en-us/articles/209691483-Bootstap-fails-with-Unable-to-gossip-with-any-seeds-yet-new-node-can-connect-to-seed-nodes

Gossip communications must be bi-directional.

To verify, use this command. You need to test from BOTH SIDES:

nc -vz {your_node_ip} 7000

Then I remembered that I had turned on my Ubuntu firewall the night before. I opened the port with:

sudo ufw allow 7000/tcp

And it is working now.
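If nc is not available on a machine, the same reachability test can be sketched in Python with only the standard library. The function name is hypothetical, and 7000 is simply the port used in the command above; like nc -vz, it only proves that a TCP connection can be opened, not that gossip itself succeeds.

```python
import socket


def can_connect(host, port=7000, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run it on each node against the other node's IP, for example can_connect("{your_node_ip}"), since the check has to pass in both directions.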

Eisenhart answered 3/4, 2019 at 19:32 Comment(0)
0

Getting this error during startup/bootstrap:

Unable to gossip with any seeds

indicates there is some issue with broadcast_address, which is responsible for communication with other nodes, not with clients.

This address must be set on the seed node (it is mandatory there). If you are using cloud VMs you might have different IPs (public and private), so it is recommended to use your private IPs for broadcast_address. This will also reduce your network cost.

# Address to broadcast to other Cassandra nodes
# Leaving this blank will set it to the same value as listen_address
broadcast_address: 10.11.xx.xxx

In my scenario I was using IBM, and once I set broadcast_address on the seed nodes the issue was resolved.

Please make sure you start your seed node first and then the other nodes. This order is mandatory.

Timbrel answered 9/3, 2021 at 13:19 Comment(0)
0

In cassandra.yaml, changing the listen_address value from localhost to domainName solved my issue.

Functionary answered 27/12, 2021 at 19:22 Comment(0)
0

Context:

One of the 3 seed nodes suddenly stopped working. After restarting the node, it showed as DN in nodetool status. After running nodetool removenode <rackid> and restarting the node, nodetool status on that node showed it as a standalone node.

The logs showed that the outgoing connection to port 7000 was successful, but they also showed the error:

Unable to gossip with any peers

Issue:

There was a blocking rule in iptables on that particular node. Incoming requests to port 7000 were being blocked.

Resolution:

Remove the iptables entry.

Innkeeper answered 6/6 at 13:13 Comment(0)
-1

I had the same issue. I checked the port and used tcpdump and netcat to test connections, and it finally came down to expired SSL certificates used for internode_encryption. I set internode_encryption to 'none', restarted all nodes, and it worked. Before that, all neighbor nodes were showing as down, and the node repair command was failing with: "Did not get positive replies from all endpoints". P.S. Don't leave internode_encryption set to none for long; just regenerate the certs and enable it again.

Jerry answered 28/6, 2017 at 18:43 Comment(0)
