EMQX 4.4.18 exits with errors after running for one or two days

Environment

  • EMQX version: 4.4.18
  • Operating system: Windows Server 2012 R2 Datacenter

Steps to reproduce

  1. Start the service with bin\emqx start; it runs normally.
  2. After one to two days of running, it reports errors and exits.

node.process_limit = 2097152
node.max_ports = 1048576
listener.tcp.external.acceptors = 64
listener.tcp.external.max_connections = 1024000
zone.external.session_expiry_interval = 24h

Expected behavior

Actual behavior

The service stopped and had to be started again manually.

2023-07-07T10:09:34.938000+08:00 [error] Error in process <0.28125.100> on node ‘emqx@192.168.0.89’ with exit value:, {badarg,[{ets,lookup,[emqx_hooks,‘message.dropped’],[{error_info,#{cause => id,module => erl_stdlib_errors}}]},{emqx_hooks,lookup,1,[{file,“emqx_hooks.erl”},{line,233}]},{emqx_hooks,run,2,[{file,“emqx_hooks.erl”},{line,172}]},{emqx_broker,drop_message,1,[{file,“emqx_broker.erl”},{line,240}]},{emqx_broker,dispatch,2,[{file,“emqx_broker.erl”},{line,291}]}]}

Could you provide the complete log file? Thanks.

I have extracted the key error log entries:
2023-07-09T19:03:08.575000+08:00 [error] 343ADE@x.x.x.x:44322 [MQTT] , Parse failed for function_clause, [{emqx_frame,parse_packet,[{mqtt_packet_header,4,false,0,true},<<“+QIOPEN=1,0,"TCP","192.168.0.1",1883,0,2\rAT+QIOPEN=1,0,"TCP","192.168.0.1",188”>>,#{max_size => 1048576,strict_mode => false,version => 3}],[{file,“emqx_frame.erl”},{line,222}]},{emqx_frame,parse_frame,4,[{file,“emqx_frame.erl”},{line,199}]},{emqx_connection,parse_incoming,3,[{file,“emqx_connection.erl”},{line,649}]},{emqx_connection,handle_msg,2,[{file,“emqx_connection.erl”},{line,642}]},{emqx_connection,process_msg,2,[{file,“emqx_connection.erl”},{line,388}]},{emqx_connection,handle_recv,3,[{file,“emqx_connection.erl”},{line,352}]},{proc_lib,wake_up,3,[{file,“proc_lib.erl”},{line,236}]}], Frame data:<<“AT+QIOPEN=1,0,"TCP","192.168.0.1",1883,0,2\r”>>

2023-07-09T20:33:28.050000+08:00 [warning] aaaaaa@192.168.xx.xx:34704 [ACL http] Deny publish to topic xx/xxx/xxx, username: admin, due to request http server failure, path: “/mqtt/acl”, error: timeout

2023-07-09T20:33:28.066000+08:00 [error] Generic server memsup terminating. Reason: {timeout,{gen_server,call,[os_mon_sysinfo,get_mem_info]}}. Last message: {‘EXIT’,<0.26805.161>,{timeout,{gen_server,call,[os_mon_sysinfo,get_mem_info]}}}. State: [{data,[{“Timeout”,60000}]},{items,{“Memory Usage”,[{“Allocated”,5384769536},{“Total”,8589398016}]}},{items,{“Worst Memory User”,[{“Pid”,<0.802.0>},{“Memory”,4723800}]}}].

2023-07-09T20:33:28.066000+08:00 [error] crasher: initial call: memsup:init/1, pid: <0.11378.41>, registered_name: memsup, exit: {{timeout,{gen_server,call,[os_mon_sysinfo,get_mem_info]}},[{gen_server,handle_common_reply,8,[{file,“gen_server.erl”},{line,811}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,226}]}]}, ancestors: [os_mon_sup,<0.122.0>], message_queue_len: 0, messages: [], links: [<0.123.0>], dictionary: [], trap_exit: true, status: running, heap_size: 6772, stack_size: 29, reductions: 198549318; neighbours:

2023-07-09T20:33:28.066000+08:00 [error] Supervisor: {local,os_mon_sup}. Context: child_terminated. Reason: {timeout,{gen_server,call,[os_mon_sysinfo,get_mem_info]}}. Offender: id=memsup,pid=<0.11378.41>.

2023-07-09T20:34:00.566000+08:00 [error] Error in process <0.27063.161> on node ‘emqx@112.223.23.23’ with exit value:, {{badmatch,[]},[{memsup,get_memory_usage,1,[{file,“memsup.erl”},{line,615}]},{memsup,‘-handle_call/3-fun-1-’,2,[{file,“memsup.erl”},{line,304}]}]}

What are your server's hardware specs?
From this log excerpt, all I can tell is:

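  1. Your port 1883 received non-MQTT packets (the frame data is a modem AT command, AT+QIOPEN).
  2. Your ACL HTTP server failed to respond to the ACL query (the request timed out).
  3. The memory usage query failed (memsup timed out calling os_mon_sysinfo).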
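
To illustrate the first point, here is a minimal Python sketch (not EMQX code) of how the first byte of an MQTT fixed header is split into fields, following the bit layout in the MQTT specification. Applied to the `A` (0x41) that begins the AT command in the log, it yields type 4, dup false, QoS 0, retain true, which appears to match the `{mqtt_packet_header,4,false,0,true}` tuple in the parse error. The function names `decode_fixed_header` and `looks_like_at_command` are hypothetical helpers made up for this example.

```python
def decode_fixed_header(first_byte: int):
    """Split the first byte of an MQTT fixed header into its fields."""
    packet_type = first_byte >> 4       # upper 4 bits: control packet type
    dup = bool(first_byte & 0x08)       # bit 3: DUP flag
    qos = (first_byte >> 1) & 0x03      # bits 2-1: QoS level
    retain = bool(first_byte & 0x01)    # bit 0: RETAIN flag
    return packet_type, dup, qos, retain


def looks_like_at_command(data: bytes) -> bool:
    """Heuristic (an assumption for this example): modem AT commands start with 'AT'."""
    return data[:2].upper() == b"AT"


# Frame data taken from the error log above.
frame = b'AT+QIOPEN=1,0,"TCP","192.168.0.1",1883,0,2\r'

print(decode_fixed_header(frame[0]))  # (4, False, 0, True)
print(looks_like_at_command(frame))   # True
```

In other words, a device is most likely writing its modem AT command into the TCP socket instead of sending an MQTT CONNECT packet. That produces the parse error for that connection, but by itself it does not explain why the node exited.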

For production environments, deploying directly on Windows is not recommended. If you have to run on Windows, prefer using Docker.