心跳间隔周期内,设备重连推送的上下线事件的时间戳问题

环境

  • EMQX 版本:5.7.2
  • 操作系统版本: centos 7

重现此问题的步骤

步骤:

  1. 设备断电
  2. 设备上电连接(心跳检测周期内,即控制台未检测已离线)

问题:
此时emqx会推送一个connect 事件、disconnect 事件,
但是一些情况下我发现,connect 事件的 connected_at 会小于 disconnect 事件的 disconnected_at

如下所示:connected_at:1756839646560,disconnected_at:1756839646561

 c.s.i.mqtt.callback.ConnectCallback - ConnectEventStr:{"ipaddress":"","expiry_interval":7200000,"clean_start":false,"sockport":1883,
"connected_at":1756839646560,"proto_ver":4,"proto_name":"MQTT","clientid":"8zlj9szy5nrmr13f56thwb","username":"8zlj9szy5nrmr13f56thwb|productId=nrbhc2ebpfvqktjx3u5n,signMethod=hmacSha256,authType=1,timestamp=1756839645244","ts":1756839646560,"protocol":"mqq
tt","keepalive":60}
c.s.i.m.callback.DisconnectCallback - DisconnectEventStr:{"ipaddress":"","disconnected_at":1756839646561,"sockport":1883,"connected_at":1756758482404,,
"proto_ver":4,"proto_name":"MQTT","clientid":"8zlj9szy5nrmr13f56thwb","username":"8zlj9szy5nrmr13f56thwb|productId=nrbhc2ebpfvqktjx3u5n,signMethod=hmacSha256,authType=1,timestamp=1756753268184","ts":1756839646561,"protocol":"mqtt",reason":"takenover"}

但是目前无法稳定复现,同时正常情况下 connected_at >= disconnected_at

预期行为

心跳检测周期内,设备重新上线,推送的上下线事件记录 connected_at >= disconnected_at

实际行为

这不是很正常么,设备总是先 connected,才会 disconnected。那 connected < disconnected的。

这种是对应错了包,一般的流程是这样的:
先 connected 再 disconnected(他俩是一对)
如果你看到的是connected_at > disconnected ,那应该是这样:
connected1 → disconnected1 → connected2 → xxxx
你拿 connected2 去和 disconnected1 比(他俩都不是一对)所以就出现了你说的 connected_at >= disconnected_at

为了能准确的识别出他们是不是一对,在 5.8.6 上的 disconnected 事件中引入了 connected_at 字段,这样你就能判断(找到) 这个 disconnected对应的 connected 事件在哪里了。

应该是我描述得不够清晰。

流程:
connected1 → disconnected1 → connected2 → xxxx

我的场景是:connected1 正在连接,然后设备断电重连(此时emqx 未检测到离线),设备重连相当于“顶号”,然后 emqx 推送disconnected1 、connected2 。

但是发现 disconnected1#disconnected_at(1756839646561) 大于 connected2#connected_at(1756839646560),这个是不正常的,应该是先离线再上线。

而且事件也是先推送disconnected1 ,再推送 connected2,预期是 disconnected1#disconnected_at <= connected2#connected_at

这里是我测试“顶号”的上下线事件的数据。

messageArrived topic:$SYS/brokers/emqx@172.17.0.3/clients/p4on0scli6nq8syk0i76sa/connected
messageArrived Qos:0
messageArrived context:{"ipaddress":"192.168.46.119","expiry_interval":0,"clean_start":true,"sockport":1883,"proto_ver":4,"proto_name":"MQTT","connected_at":1756894131290,"clientid":"p4on0scli6nq8syk0i76sa","username":"p4on0scli6nq8syk0i76sa|productId=nrbhc2ebpfvqktjx3u5n,signMethod=hmacSha256,authType=1,timestamp=11111111111111","ts":1756894131290,"protocol":"mqtt","keepalive":60}

messageArrived topic:$SYS/brokers/emqx@172.17.0.3/clients/p4on0scli6nq8syk0i76sa/disconnected
messageArrived Qos:0
messageArrived context:{"ipaddress":"192.168.46.119","disconnected_at":1756894151733,"sockport":1883,"proto_ver":4,"proto_name":"MQTT","connected_at":1756894131290,"clientid":"p4on0scli6nq8syk0i76sa","username":"p4on0scli6nq8syk0i76sa|productId=nrbhc2ebpfvqktjx3u5n,signMethod=hmacSha256,authType=1,timestamp=11111111111111","ts":1756894151733,"protocol":"mqtt","reason":"discarded"}

messageArrived topic:$SYS/brokers/emqx@172.17.0.3/clients/p4on0scli6nq8syk0i76sa/connected
messageArrived Qos:0
messageArrived context:{"ipaddress":"192.168.46.119","expiry_interval":0,"clean_start":true,"sockport":1883,"proto_ver":4,"proto_name":"MQTT","connected_at":1756894151735,"clientid":"p4on0scli6nq8syk0i76sa","username":"p4on0scli6nq8syk0i76sa|productId=nrbhc2ebpfvqktjx3u5n,signMethod=hmacSha256,authType=1,timestamp=11111111111111","ts":1756894151735,"protocol":"mqtt","keepalive":60}

messageArrived topic:$SYS/brokers/emqx@172.17.0.3/clients/p4on0scli6nq8syk0i76sa/disconnected
messageArrived Qos:0
messageArrived context:{"ipaddress":"192.168.46.119","disconnected_at":1756894404089,"sockport":1883,"proto_ver":4,"proto_name":"MQTT","connected_at":1756894151735,"clientid":"p4on0scli6nq8syk0i76sa","username":"p4on0scli6nq8syk0i76sa|productId=nrbhc2ebpfvqktjx3u5n,signMethod=hmacSha256,authType=1,timestamp=11111111111111","ts":1756894404089,"protocol":"mqtt","reason":"discarded"}

messageArrived topic:$SYS/brokers/emqx@172.17.0.3/clients/p4on0scli6nq8syk0i76sa/connected
messageArrived Qos:0
messageArrived context:{"ipaddress":"192.168.46.119","expiry_interval":0,"clean_start":true,"sockport":1883,"proto_ver":4,"proto_name":"MQTT","connected_at":1756894404090,"clientid":"p4on0scli6nq8syk0i76sa","username":"p4on0scli6nq8syk0i76sa|productId=nrbhc2ebpfvqktjx3u5n,signMethod=hmacSha256,authType=1,timestamp=11111111111111","ts":1756894404090,"protocol":"mqtt","keepalive":60}

旧版本没有字段可以判断那一对是真的一对,不能用到达的顺序来决定,这事件是不保证顺序的,特别是在使用的是HTTP webhook (里面有进程池,还有pipeline,如果有事件时间非常近,顺序乱的概率就更大)ps 其他mqtt订阅也不能保证顺序,只是说http更大概率

您好,

1.我看此版本的 diconnect 事件的 connected_at 是对得上 connect 事件的 connected_at,在5.7版本中,这2个数据是不具体对照作用?

2.我这边不关心事件推送到达的顺序。我这边关心的是, “顶号”行为产生的 disconnected1 和 connected2 在 EMQX 集群中生成的先后顺序。按照您前面的描述,是先生成了 disconnected1 ,再 产生 connected2 ,如果逻辑是这样的话,为什么我这边收到的 disconnected1的 ts > connected2 的 ts 呢? 尽管 disconnected1 和 connected2 不是一对,但是在时间线上,肯定是有先后顺序的吧。

再次麻烦解答下,感谢:pray: