gmr/RabbitMQ-in-Depth

Secondary node name mismatch in Vagrant file?

gevgev opened this issue · 3 comments

I tried to bring the secondary node up per the instructions by running 'vagrant up secondary', and got the following error:

==> secondary: Running provisioner: shell...
    secondary: Running: inline script
    secondary: Stopping node rabbit@secondary ...
    secondary: Error: unable to connect to node rabbit@secondary: nodedown
    secondary:
    secondary: DIAGNOSTICS
    secondary: ===========
    secondary:
    secondary: attempted to contact: [rabbit@secondary]
    secondary:
    secondary: rabbit@secondary:
    secondary:   * connected to epmd (port 4369) on secondary
    secondary:   * epmd reports node 'rabbit' running on port 25672
    secondary:   * TCP connection succeeded but Erlang distribution failed
    secondary:   * suggestion: hostname mismatch?
    secondary:   * suggestion: is the cookie set correctly?
    secondary:
    secondary: current node details:
    secondary: - node name: 'rabbitmq-cli-1781@secondary'
    secondary: - home dir: /var/lib/rabbitmq
    secondary: - cookie hash: H6gXPXlo+GZy8pWfFX3Ynw==
    secondary: Resetting node rabbit@secondary ...
    secondary: Error: unable to connect to node rabbit@secondary: nodedown
    secondary:
    secondary: DIAGNOSTICS
    secondary: ===========
    secondary:
    secondary: attempted to contact: [rabbit@secondary]
    secondary:
    secondary: rabbit@secondary:
    secondary:   * connected to epmd (port 4369) on secondary
    secondary:   * epmd reports node 'rabbit' running on port 25672
    secondary:   * TCP connection succeeded but Erlang distribution failed
    secondary:   * suggestion: hostname mismatch?
    secondary:   * suggestion: is the cookie set correctly?
    secondary:
    secondary: current node details:
    secondary: - node name: 'rabbitmq-cli-1832@secondary'
    secondary: - home dir: /var/lib/rabbitmq
    secondary: - cookie hash: H6gXPXlo+GZy8pWfFX3Ynw==
    secondary: Clustering node rabbit@secondary with rabbit@primary ...
    secondary: Error: unable to connect to node rabbit@secondary: nodedown
    secondary:
    secondary: DIAGNOSTICS
    secondary: ===========
    secondary:
    secondary: attempted to contact: [rabbit@secondary]
    secondary:
    secondary: rabbit@secondary:
    secondary:   * connected to epmd (port 4369) on secondary
    secondary:   * epmd reports node 'rabbit' running on port 25672
    secondary:   * TCP connection succeeded but Erlang distribution failed
    secondary:   * suggestion: hostname mismatch?
    secondary:   * suggestion: is the cookie set correctly?
    secondary:
    secondary: current node details:
    secondary: - node name: 'rabbitmq-cli-1884@secondary'
    secondary: - home dir: /var/lib/rabbitmq
    secondary: - cookie hash: H6gXPXlo+GZy8pWfFX3Ynw==
    secondary: Starting node rabbit@secondary ...
    secondary: Error: unable to connect to node rabbit@secondary: nodedown
    secondary:
    secondary: DIAGNOSTICS
    secondary: ===========
    secondary:
    secondary: attempted to contact: [rabbit@secondary]
    secondary:
    secondary: rabbit@secondary:
    secondary:   * connected to epmd (port 4369) on secondary
    secondary:   * epmd reports node 'rabbit' running on port 25672
    secondary:   * TCP connection succeeded but Erlang distribution failed
    secondary:   * suggestion: hostname mismatch?
    secondary:   * suggestion: is the cookie set correctly?
    secondary:
    secondary: current node details:
    secondary: - node name: 'rabbitmq-cli-1935@secondary'
    secondary: - home dir: /var/lib/rabbitmq
    secondary: - cookie hash: H6gXPXlo+GZy8pWfFX3Ynw==
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.

It looks like the node name is not configured properly. Here is what I see when I SSH into the secondary node with 'vagrant ssh secondary':

root@secondary:~# rabbitmqctl stop
Stopping and halting node rabbit@secondary ...
Error: unable to connect to node rabbit@secondary: nodedown

DIAGNOSTICS
===========

attempted to contact: [rabbit@secondary]

rabbit@secondary:
  * connected to epmd (port 4369) on secondary
  * epmd reports node 'rabbit' running on port 25672
  * TCP connection succeeded but Erlang distribution failed
  * suggestion: hostname mismatch?
  * suggestion: is the cookie set correctly?

current node details:
- node name: 'rabbitmq-cli-2166@secondary'
- home dir: /var/lib/rabbitmq
- cookie hash: H6gXPXlo+GZy8pWfFX3Ynw==

Here is what the contents of /var/log/rabbitmq/ look like:

root@secondary:~# ll /var/log/rabbitmq/
total 24
drwxr-xr-x  2 rabbitmq rabbitmq 4096 Mar 24  2015 ./
drwxrwxr-x 10 root     syslog   4096 Sep 22 16:04 ../
-rw-r--r--  1 rabbitmq rabbitmq 7187 Sep 22 16:04 rabbit@vagrant-ubuntu-trusty-64.log
-rw-r--r--  1 rabbitmq rabbitmq    0 Mar 24  2015 rabbit@vagrant-ubuntu-trusty-64-sasl.log
-rw-r--r--  1 root     root        0 Mar 24  2015 shutdown_err
-rw-r--r--  1 root     root       64 Mar 24  2015 shutdown_log
-rw-r--r--  1 rabbitmq rabbitmq    0 Sep 22 16:22 startup_err
-rw-r--r--  1 rabbitmq rabbitmq   62 Sep 22 16:22 startup_log

I was expecting the log file names to be 'rabbit@secondary.log' etc.
Here are the first 10 lines of the log file, which show the node name (in bold below):

root@secondary:~# head /var/log/rabbitmq/rabbit@vagrant-ubuntu-trusty-64.log

=INFO REPORT==== 24-Mar-2015::02:08:07 ===
Starting RabbitMQ 3.5.0 on Erlang 17.4
Copyright (C) 2007-2014 GoPivotal, Inc.
Licensed under the MPL.  See http://www.rabbitmq.com/

=INFO REPORT==== 24-Mar-2015::02:08:07 ===
**node           : rabbit@vagrant-ubuntu-trusty-64**
home dir       : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.config (not found)
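
The mismatch above can also be checked mechanically. This is only a rough sketch, assuming the default node name is rabbit@ followed by the short hostname and that the log files follow the naming seen in the listing; the function names (node_from_log, node_matches_host, check_log_dir) are made up for illustration.

```shell
# Print the node name embedded in a RabbitMQ log file name, e.g.
# "rabbit@vagrant-ubuntu-trusty-64.log" -> "rabbit@vagrant-ubuntu-trusty-64".
node_from_log() {
    name=$(basename "$1")
    name=${name%.log}                 # drop the ".log" extension
    printf '%s\n' "${name%-sasl}"     # drop a trailing "-sasl", if any
}

# Succeed only if node name $1 equals rabbit@<short hostname $2>.
node_matches_host() {
    [ "$1" = "rabbit@$2" ]
}

# Hypothetical helper: compare each main log in a directory with a hostname.
# $1 = log directory (default /var/log/rabbitmq)
# $2 = short hostname (default: output of `hostname -s`)
check_log_dir() {
    dir=${1:-/var/log/rabbitmq}
    host=${2:-$(hostname -s)}
    for f in "$dir"/rabbit@*.log; do
        [ -e "$f" ] || continue
        case "$f" in *-sasl.log) continue ;; esac   # skip the SASL logs
        node=$(node_from_log "$f")
        if node_matches_host "$node" "$host"; then
            echo "ok: $node"
        else
            echo "mismatch: $node (expected rabbit@$host)"
        fi
    done
}
```

Running check_log_dir with no arguments on the secondary VM should report a mismatch if the only logs present were written under the old hostname, as in the listing above.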

This command also confirms the node name mismatch:

root@secondary:~# rabbitmqctl start_app
Starting node rabbit@secondary ...
Error: unable to connect to node rabbit@secondary: nodedown

DIAGNOSTICS
===========

attempted to contact: [rabbit@secondary]

rabbit@secondary:
  * connected to epmd (port 4369) on secondary
  * epmd reports node 'rabbit' running on port 25672
  * TCP connection succeeded but Erlang distribution failed
  * suggestion: hostname mismatch?
  * suggestion: is the cookie set correctly?

current node details:
- node name: 'rabbitmq-cli-2779@secondary'
- home dir: /var/lib/rabbitmq
- cookie hash: H6gXPXlo+GZy8pWfFX3Ynw==

I also tried restarting the service:

root@secondary:~# service rabbitmq-server stop
 * Stopping message broker rabbitmq-server                                                                                                         * message broker already stopped
                                                                                                                                           [ OK ]
root@secondary:~# service rabbitmq-server start
 * Starting message broker rabbitmq-server                                                                                                         * FAILED - check /var/log/rabbitmq/startup_\{log, _err\}
                                                                                                                                           [fail]
root@secondary:~# cat /var/log/rabbitmq/startup_log
ERROR: node with name "rabbit" already running on "secondary"
root@secondary:~# cat /var/log/rabbitmq/startup_err
root@secondary:~#
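
The startup_log message suggests that a node is still registered with epmd under the short name "rabbit", possibly one started before the hostname changed (that part is a guess). A small sketch for checking this, assuming the usual `epmd -names` output with lines of the form "name rabbit at port 25672"; epmd_has_name is a made-up helper name.

```shell
# Read `epmd -names` output on stdin and succeed if a node with the
# short name given as $1 is registered.
epmd_has_name() {
    awk -v n="$1" '$1 == "name" && $2 == n { found = 1 } END { exit !found }'
}
```

For example: epmd -names | epmd_has_name rabbit && echo "a node named rabbit is still registered"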

Thanks.

A few more directory listings and file contents:

root@secondary:~# ll /var/lib/rabbitmq/mnesia/
total 16
drwxr-x---  4 rabbitmq rabbitmq 4096 Sep 22 16:04 ./
drwxr-xr-x  3 rabbitmq rabbitmq 4096 Mar 24  2015 ../
drwxr-xr-x  4 rabbitmq rabbitmq 4096 Sep 22 16:04 rabbit@vagrant-ubuntu-trusty-64/
drwxr-xr-x 22 rabbitmq rabbitmq 4096 Sep 22 16:04 rabbit@vagrant-ubuntu-trusty-64-plugins-expand/

And the contents of the cluster_nodes.config file:

root@secondary:~# cat /var/lib/rabbitmq/mnesia/rabbit@vagrant-ubuntu-trusty-64/cluster_nodes.config
{['rabbit@vagrant-ubuntu-trusty-64'],['rabbit@vagrant-ubuntu-trusty-64']}.
root@secondary:~#

Was this issue ever resolved? I ran into it today as well.

I resolved this as follows:

  1. Connect to the secondary node via SSH
  2. Check for running RabbitMQ processes
    ps -ef | grep rabbitmq
  3. Kill those processes
    ps -ef | grep rabbitmq | grep -v grep | awk '{print $2}' | xargs kill -9
  4. Start the RabbitMQ server
    rabbitmq-server
  5. Run rabbitmqctl status, which should now succeed
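
The steps above could be wrapped up roughly like this. It is a sketch, not a tested fix: rabbit_pids and restart_stale_rabbit are names invented here, the script is assumed to run as root on the secondary VM, and it starts the server with -detached instead of in the foreground as in step 4.

```shell
# Read `ps -ef` output on stdin and print the PIDs (second column) of
# RabbitMQ-related processes, skipping grep/awk helper processes.
rabbit_pids() {
    awk '/rabbitmq/ && !/grep/ && !/awk/ { print $2 }'
}

# Hypothetical wrapper for steps 2-4.
restart_stale_rabbit() {
    pids=$(ps -ef | rabbit_pids)
    if [ -n "$pids" ]; then
        # kill -9 mirrors step 3; trying a plain `kill` first would be gentler.
        # $pids is intentionally unquoted so each PID becomes its own argument.
        kill -9 $pids
    fi
    # Start the broker in the background under the current hostname.
    rabbitmq-server -detached
}
```

After calling restart_stale_rabbit, give the node a few seconds to boot and then run rabbitmqctl status as in step 5.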