5.x cluster with mcast discovery: unable to form a cluster

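UPDATE

Multicast address

I had this wrong. Thanks to @heeejianbo for the explanation.

Alibaba Cloud ECS does not support multicast

https://help.aliyun.com/document_detail/25412.html?source=5176.11533457&userCode=r3yteowb&type=copy
A small pitfall, noted here for whoever runs into it next.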

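Environment

  • EMQX version: 5.0.3
  • OS and version: Debian 11
  • Other

Problem description

Following the official documentation, I tried to form a cluster using mcast discovery.

Configuration and logs

Leaving split-brain and similar issues aside for now, assume two nodes: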

  • a, ip: 10.2.24.1
  • b, ip: 10.2.8.4

cluster configuration

Node a

cluster {
  name = emqx
  discovery_strategy = mcast
  mcast {
    addr = "10.2.24.1"
    ports = [4369, 4370]
    iface = "0.0.0.0"
    ttl = 255
    loop = true
  }
}

Node b

cluster {
  name = emqx
  discovery_strategy = mcast
  mcast {
    addr = "10.2.8.4"
    ports = [4369, 4370]
    iface = "0.0.0.0"  
    ttl = 255
    loop = true
  }
}

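After changing the configuration on each node and restarting the service, emqx_ctl cluster status shows only the local node on each side; no cluster was formed.

Logs

With debug logging enabled: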

2022-08-25T15:53:32.570859+08:00 [info] Ekka(AutoCluster): all discovered nodes already in cluster; ignoring
2022-08-25T15:53:32.570953+08:00 [debug] Ekka(AutoCluster): join result: ignore
2022-08-25T15:53:32.571004+08:00 [info] Ekka(AutoCluster): no discovered nodes outside cluster
2022-08-25T15:53:35.571737+08:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-25T15:53:43.801819+08:00 [info] Ekka(AutoCluster): all discovered nodes already in cluster; ignoring
2022-08-25T15:53:43.801902+08:00 [debug] Ekka(AutoCluster): join result: ignore
2022-08-25T15:53:43.801999+08:00 [info] Ekka(AutoCluster): no discovered nodes outside cluster
2022-08-25T15:53:46.802704+08:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms

Other information

Probing a's two ports, 4369 and 4370, from b shows both as open:

root@tsing-int-devsrv-mqtt-edge-3:/var/log/emqx# nc -vuz 10.2.24.1 4369
10.2.24.1: inverse host lookup failed: Unknown host
(UNKNOWN) [10.2.24.1] 4369 (?) open
root@tsing-int-devsrv-mqtt-edge-3:/var/log/emqx# nc -vtz 10.2.24.1 4370
10.2.24.1: inverse host lookup failed: Unknown host
(UNKNOWN) [10.2.24.1] 4370 (?) open

This is what a multicast address should be:

quoted from wiki: https://en.wikipedia.org/wiki/Multicast_address

Try 239.192.0.1 instead of the node's own IP address.
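
As a rough sketch of that suggestion, this is the mcast block from the original post with only addr changed. It assumes the same block is used unchanged on both nodes, and it can only help on a network that actually delivers multicast traffic between them.

```hocon
cluster {
  name = emqx
  discovery_strategy = mcast
  mcast {
    # A multicast group address shared by all nodes,
    # not the unicast IP of the individual node
    addr = "239.192.0.1"
    ports = [4369, 4370]
    iface = "0.0.0.0"
    ttl = 255
    loop = true
  }
}
```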

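Ah, thanks. My networking knowledge is weak here. I'll try what you suggested.
For production in a non-k8s environment, which discovery method does the official team recommend?

Any of them is fine, no need to worry.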

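@heeejianbo Sorry to bother you again.

I tried static mode.

The situation now is a bit odd:

  • Querying the cluster status from the command line: OK
  • Logs: the same error keeps repeating

Is there anything I should watch out for?

Configuration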

cluster {
  name = emqx
  discovery_strategy = static
  static {
    seeds = ["emqx@10.2.24.1"]
  }
}

cluster status

The cluster has formed successfully:

root@tsing-int-devsrv-mqtt-edge:~# emqx_ctl cluster status
Cluster status: #{running_nodes =>
                      ['emqx@10.2.24.1','emqx@10.2.8.3','emqx@10.2.8.4'],
                  stopped_nodes => []}

logging

But the log still reports errors. Do I need to pay attention to this error?

2022-08-25T17:48:27.685596+08:00 [info] Ekka(AutoCluster): no discovered nodes outside cluster
2022-08-25T17:48:27.685696+08:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 ms
2022-08-25T17:48:33.357645+08:00 [info] Ekka(AutoCluster): joining with 'emqx@10.2.24.1'
2022-08-25T17:48:33.360559+08:00 [debug] Ekka(AutoCluster): join result: {error,{already_in_cluster,'emqx@10.2.24.1'}}
2022-08-25T17:48:33.360637+08:00 [info] Ekka(AutoCluster): no discovered nodes outside cluster
2022-08-25T17:48:33.360736+08:00 [warning] Ekka(AutoCluster): discovery did not succeed; retrying in 5000 m

Yes, this error does need attention. The seeds setting under static should list all of the nodes.
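
A minimal sketch of what that might look like, assuming the three nodes shown in the cluster status output above are the full membership and that the same block is deployed on every node:

```hocon
cluster {
  name = emqx
  discovery_strategy = static
  static {
    # List every cluster member, not just a single entry node
    seeds = ["emqx@10.2.24.1", "emqx@10.2.8.3", "emqx@10.2.8.4"]
  }
}
```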

Got it. In that case the name "seeds" is quite misleading. Thanks.

I followed the official configuration, but the cluster still did not form.