Devices go offline unexpectedly, with disconnect reason internal_error

Environment

  • EMQX version: 5.6.0
  • OS version: container deployment

We have recently seen some devices drop offline. The disconnect event shows:
{"reason":"internal_error","peername":"10.217..:8924","metadata":{"rule_id":"client_action"},"clientid":"209","proto_ver":5,"proto_name":"MQTT","sockname":"10.128..:1***","disconn_props":{"User-Property":{}},"node":"@.*.","event":"client.disconnected","disconnected_at":1748415282424,"username":"*****","timestamp":1748415282424}

What does this internal_error mean? Is it a server-side problem or a client-side problem?

The server log for the same time window contains the following error:
{"message":"{\"time\":1748415282425671,\"level\":\"error\",\"msg\":\"malformed_mqtt_message\",\"username\":\"\",\"reason\":{\"remaining_bytes_length\":78,\"parsed_key_length\":103,\"cause\":\"user_property_not_enough_bytes\"},\"pid\":\"<0.20814.255>\",\"peername\":\"10.217.59.:\",\"otel_trace_id\":\"689425f578e951b49dfac171d5640b31\",\"otel_trace_flags\":\"01\",\"otel_span_id\":\"4dcf07f8fac4e888\",\"clientid\":\"209\"}"}

The log mentions user_property_not_enough_bytes. We do put some business fields in user_property. Does this message mean that the business fields in user_property are too long?

It is a client-side problem.


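Possibly. It could also be that your MQTT client declared the wrong length when it assembled the packet. For example, the client computes the properties as 1024 bytes but actually sends only 1020 bytes, so EMQX reports the packet as malformed.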
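
To make the length mismatch concrete, here is a minimal Python sketch of how a User Property is laid out in MQTT 5.0: a one-byte identifier (0x26), then a key and a value, each prefixed by a 2-byte big-endian length. The function names are made up for illustration and this is not EMQX's actual parser. I am also assuming that `parsed_key_length` in the log is the length declared in the key's prefix and `remaining_bytes_length` is what was left in the buffer; the field names suggest that, but the thread does not confirm it.

```python
import struct
from typing import Tuple

USER_PROPERTY_ID = 0x26  # MQTT 5.0 property identifier for User Property


def encode_utf8_string(text: str) -> bytes:
    """Encode an MQTT UTF-8 string: 2-byte big-endian length, then the bytes."""
    data = text.encode("utf-8")
    return struct.pack("!H", len(data)) + data


def encode_user_property(key: str, value: str) -> bytes:
    """Encode one User Property: identifier, then key and value strings."""
    return (bytes([USER_PROPERTY_ID])
            + encode_utf8_string(key)
            + encode_utf8_string(value))


def read_utf8_string(buf: bytes, offset: int) -> Tuple[str, int]:
    """Read one length-prefixed string; return (text, offset after it)."""
    if len(buf) - offset < 2:
        raise ValueError("not enough bytes for the length prefix")
    (declared,) = struct.unpack_from("!H", buf, offset)
    offset += 2
    remaining = len(buf) - offset
    # The failure case: the prefix promises more bytes than actually arrived.
    if declared > remaining:
        raise ValueError(
            f"not enough bytes: declared={declared}, remaining={remaining}")
    return buf[offset:offset + declared].decode("utf-8"), offset + declared


def parse_user_property(buf: bytes) -> Tuple[str, str]:
    """Parse one User Property (hypothetical helper, for illustration only)."""
    if not buf or buf[0] != USER_PROPERTY_ID:
        raise ValueError("not a User Property")
    key, offset = read_utf8_string(buf, 1)
    value, _ = read_utf8_string(buf, offset)
    return key, value
```

Feeding `parse_user_property` a buffer whose key prefix says 103 while only 78 bytes follow raises the "not enough bytes" error, which is the same shape as the numbers in the log. Note that in this layout the error is about a declared length disagreeing with the bytes present, not about the property being long in absolute terms.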

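**How can we tell whether the length was wrong when the packet was assembled? By capturing the MQTT packets?** Our User-Property does not contain anything particularly long, so an overly long user property should not be the cause, right?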

The event for a normal, successfully delivered message looks like this:
{
  "metadata": {
    "rule_id": "message"
  },
  "peerhost": "10...151",
  "clientid": "209",
  "from_username": "",
  "flags": {
    "retain": false,
    "dup": false
  },
  "puback_props": {
    "User-Property": {}
  },
  "node": "",
  "from_clientid": "*",
  "qos": 1,
  "payload": "629 characters",
  "pub_props": {
    "User-Property": {
      "traceparent": "00-689425f578e951b49dfac171d5640b31-43057a6c1b7d1399-01"
    },
    "User-Property-Pairs": [
      {
        "value": "00-689425f578e951b49dfac171d5640b31-43057a6c1b7d1399-01",
        "key": "traceparent"
      }
    ]
  },
  "publish_received_at": 1748415251316,
  "topic": "c2d/<unique ID>",
  "id": "0006362CA5F97E46F41211011A63000C",
  "event": "message.acked",
  "username": "<unique ID>",
  "timestamp": 1748415251386
}

Capture the traffic with tcpdump, then analyze it in Wireshark.
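
Once you have a suspect packet in Wireshark, one quick sanity check is to copy the bytes of the whole MQTT packet as hex and compare the Remaining Length declared in the fixed header with the number of bytes that actually follow it. The sketch below is only an illustration under that assumption; `check_remaining_length` and `decode_variable_byte_integer` are hypothetical helper names, not part of any tool mentioned here.

```python
from typing import Tuple


def decode_variable_byte_integer(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode an MQTT Variable Byte Integer; return (value, bytes_consumed)."""
    value = 0
    multiplier = 1
    for consumed in range(1, 5):  # the encoding uses at most 4 bytes
        if offset + consumed > len(buf):
            raise ValueError("truncated Variable Byte Integer")
        byte = buf[offset + consumed - 1]
        value += (byte & 0x7F) * multiplier
        if byte & 0x80 == 0:  # continuation bit clear: last byte
            return value, consumed
        multiplier *= 128
    raise ValueError("Variable Byte Integer longer than 4 bytes")


def check_remaining_length(packet_hex: str) -> Tuple[int, int]:
    """Return (declared, actual) remaining length for one MQTT control packet.

    packet_hex is the hex dump of a single, complete MQTT packet
    (whitespace between bytes is allowed).
    """
    packet = bytes.fromhex(packet_hex)
    if len(packet) < 2:
        raise ValueError("an MQTT packet has at least 2 bytes")
    # Byte 0 is packet type and flags; the Remaining Length starts at byte 1.
    declared, consumed = decode_variable_byte_integer(packet, 1)
    actual = len(packet) - 1 - consumed
    return declared, actual
```

If `declared` and `actual` differ for a packet the client sent, the client's length calculation is the first thing to look at. Make sure you paste the reassembled MQTT packet rather than a single TCP segment, since a large packet can span several segments and would look short otherwise.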