Errors reported by EMQX 5.4

Environment

  • EMQX version: 5.4
  • OS version:

Steps to reproduce

2024-10-15T16:51:46.742285+08:00 [warning] msg: socket_error, mfa: emqx_connection:handle_info/2(937), peername: 127.0.0.1:33414, clientid: school_local_dev_20240623001, reason: timeout
2024-10-15T16:51:46.742896+08:00 [warning] msg: alarm_is_deactivated, mfa: emqx_alarm:do_actions/3(424), name: <<“conn_congestion/school_local_dev_20240623001/admin”>>

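The first log entry (timeout) means EMQX timed out while delivering messages to the client. The second entry comes from the congestion alarm and indicates that the subscriber school_local_dev_20240623001 was blocked: either its network bandwidth is limited, or it received more messages in a short period than the client could process.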

This article describes how to handle subscriber bottlenecks:
https://segmentfault.com/a/1190000045059752

The issue above has been resolved.

2024-10-18T10:36:21.463851+08:00 [warning] msg: puback_packetId_not_found, mfa: emqx_channel:handle_in/2(429), peername: 172.31.37.134:51214, clientid: 2478436, packetId: 15
2024-10-18T10:36:27.451128+08:00 [warning] msg: puback_packetId_not_found, mfa: emqx_channel:handle_in/2(429), peername: 172.31.37.133:43266, clientid: 2478446, packetId: 17
2024-10-18T10:36:30.473704+08:00 [warning] msg: puback_packetId_not_found, mfa: emqx_channel:handle_in/2(429), peername: 172.31.37.134:51214, clientid: 2478436, packetId: 16
2024-10-18T10:36:36.459926+08:00 [warning] msg: puback_packetId_not_found, mfa: emqx_channel:handle_in/2(429), peername: 172.31.37.133:43266, clientid: 2478446, packetId: 18

These log entries are showing up now. What is the problem?

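It may be a similar issue. EMQX is delivering QoS 1 messages to the subscriber too quickly, and once the session queue fills up it has to start dropping messages. Because the packet ID was dropped, it can no longer be found when the PUBACK comes back.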
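
To illustrate the explanation above, here is a toy model of a session that tracks unacknowledged QoS 1 packet IDs. It is only a sketch under assumptions: the class `ToySession` and its methods are hypothetical and do not reflect how EMQX is actually implemented. It only shows why a PUBACK for a packet ID that the broker has already discarded cannot be matched.

```python
class ToySession:
    """Toy broker-side session; NOT EMQX's actual implementation."""

    def __init__(self):
        # packet_id -> payload still waiting for a PUBACK
        self.inflight = {}

    def deliver(self, packet_id, payload):
        # Record a QoS 1 message that has been sent to the subscriber.
        self.inflight[packet_id] = payload

    def drop(self, packet_id):
        # Simulate the broker discarding an unacknowledged message.
        self.inflight.pop(packet_id, None)

    def handle_puback(self, packet_id):
        # A PUBACK can only be matched if its packet ID is still tracked.
        if packet_id in self.inflight:
            del self.inflight[packet_id]
            return "ok"
        return "puback_packetId_not_found"


session = ToySession()
session.deliver(15, b"a")
session.deliver(16, b"b")
session.drop(15)  # message 15 is discarded before the client acknowledges it
print(session.handle_puback(15))
print(session.handle_puback(16))
```

In this model the late PUBACK for packet 15 is reported as not found, while packet 16 is acknowledged normally.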

How should this be resolved? Can the queue length be increased?

If so, where is that configured? :sweat_smile:


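The default is already 1000 per client. I don't recommend changing it; the real cause still needs to be found. Are there any other logs besides this error?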

2024-10-24T22:00:58.517188+08:00 [warning] msg: alarm_is_deactivated, mfa: emqx_alarm:do_actions/3(424), name: <<“conn_congestion/school_local_dev_20240903001/admin”>>

2024-10-24T21:59:36.713586+08:00 [warning] msg: alarm_is_activated, mfa: emqx_alarm:do_actions/3(418), message: <<“connection congested: #{buffer => 4096,clientid => <<"school_local_dev_20240903001">>,conn_state => connected,connected_at => 1729775369880,high_msgq_watermark => 8192,high_watermark => 1048576,memory => 285400,message_queue_len => 0,peername => <<"127.0.0.1:45106">>,pid => <<"<0.6023.0>">>,proto_name => <<"MQTT">>,proto_ver => 4,recbuf => 2489589,recv_cnt => 4964,recv_oct => 336282,reductions”…>>, name: <<“conn_congestion/school_local_dev_20240903001/admin”>>
2024-10-24T21:59:43.486157+08:00 [warning] msg: busy_port, mfa: emqx_sys_mon:handle_info/2(182), portinfo: [{port,#Port<0.89>},{name,“tcp_inet”},{links,[<0.6023.0>]},{id,712},{connected,<0.6023.0>},{input,0},{output,130037740},{os_pid,undefined}], procinfo: [{pid,<0.6023.0>},{memory,285344},{total_heap_size,35462},{heap_size,6772},{stack_size,35},{min_heap_size,233},{proc_lib_initial_call,{emqx_connection,init,[‘Argument__1’,‘Argument__2’,‘Argument__3’,‘Argument__4’]}},{initial_call,{proc_lib,init_p,5}},{current_stacktrace,[{erlang,port_command,3,},{esockd_transport,do_port_command,3,[{file,“esockd_transport.erl”},{line,178}]},{emqx_connection,send,2,[{file,“emqx_connection.erl”},{line,912}]},{emqx_connection,handle_outgoing,2,[{file,“emqx_connection.erl”},{line,859}]},{emqx_connection,process_msg,2,[{file,“emqx_connection.erl”},{line,493}]},{emqx_connection,process_msg,2,[{file,“emqx_connection.erl”},{line,499}]},{emqx_connection,handle_recv,3,[{file,“emqx_connection.erl”},{line,455}]},{proc_lib,wake_up,3,[{file,“proc_lib.erl”},{line,250}]}]},{registered_name,},{status,suspended},{message_queue_len,0},{group_leader,<0.2435.0>},{priority,normal},{trap_exit,false},{reductions,6997955},{last_calls,false},{catchlevel,4},{trace,0},{suspending,},{sequential_trace_token,},{error_handler,error_handler}]
These two errors were reported as well.

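This indicates that some clients are receiving messages at a rate (QPS) too high for them to keep up with. Check whether they subscribe to too many topics. If that volume is a genuine requirement, consider using shared subscriptions to spread the message load across several clients.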
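
As a rough sketch of what a shared subscription looks like: in MQTT 5.0 the topic filter takes the form `$share/<group>/<topic filter>`, and clients that subscribe with the same group name share the matching messages between them. The helper `shared_filter`, the group name `workers`, and the topic `school/+/telemetry` below are hypothetical and only illustrate how such a filter is built; the resulting string would then be passed to whatever subscribe call your MQTT client library provides.

```python
def shared_filter(group: str, topic_filter: str) -> str:
    """Build an MQTT shared-subscription filter: $share/<group>/<topic_filter>."""
    # The share (group) name must be non-empty and free of '/', '+' and '#'.
    if not group or any(ch in group for ch in "/+#"):
        raise ValueError("group must be non-empty and must not contain '/', '+' or '#'")
    if not topic_filter:
        raise ValueError("topic filter must be non-empty")
    return f"$share/{group}/{topic_filter}"


# Hypothetical group and topic names, for illustration only.
print(shared_filter("workers", "school/+/telemetry"))
```

Each consumer in the group subscribes with the same filter, so one busy subscriber can be replaced by several that split the load.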