EMQX server intermittently fails, driving client CPU usage to 100%

Environment

  • EMQX version: windows10-v3.2.2
  • Operating system version: Windows Server 2016

Steps to reproduce

  1. EMQX starts automatically when the server reboots; the problem occurs intermittently.

Expected behavior

The service starts normally.

Actual behavior

The EMQX log reports {badmap,{error,timeout}}. Connected clients report MqttTimeoutException and their CPU usage rises to 100%, which freezes the computer.

emqx.log is as follows:

2023-05-11 05:18:47.482 [error] Test001@10.11.55.110:52553 crasher:
initial call: emqx_channel:init/1
pid: <0.26235.77>
registered_name: []
exception error: {badmap,{error,timeout}}
in function maps:get/3
called as maps:get(<<"password">>,{error,timeout},undefined)
in call from emqx_auth_mongo:'-check/2-lc$^0/1-0-'/2 (c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx_auth_mongo/src/emqx_auth_mongo.erl, line 40)
in call from emqx_auth_mongo:check/2 (c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx_auth_mongo/src/emqx_auth_mongo.erl, line 40)
in call from emqx_hooks:do_run_fold/3 (c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_hooks.erl, line 131)
in call from emqx_access_control:authenticate/1 (c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_access_control.erl, line 31)
in call from emqx_protocol:process/2 (c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_protocol.erl, line 401)
in call from emqx_channel:handle_incoming/3 (c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_channel.erl, line 433)
in call from gen_statem:call_state_function/5 (gen_statem.erl, line 1660)
ancestors: [<0.479.0>,<0.478.0>,esockd_sup,<0.135.0>]
message_queue_len: 2
messages: [{cancel_timer,#Ref<0.1701519806.1154744321.102281>,12281},
{timeout,#Ref<0.1701519806.1154744321.102326>,emit_stats}]
links: [<0.479.0>]
dictionary: [{incoming_bytes,60},
{'$logger_metadata$',
#{client_id => <<"Test001">>,
peername => "10.11.55.110:52553"}},
{force_shutdown_policy,
#{max_heap_size => 838860800,message_queue_len => 8000}},
{rand_seed,
{#{bits => 58,jump => #Fun<rand.8.10897371>,
next => #Fun<rand.5.10897371>,type => exrop,
uniform => #Fun<rand.6.10897371>,
uniform_n => #Fun<rand.7.10897371>,
weak_low_bits => 1},
[146480983190289784|68255294189285891]}}]
trap_exit: true
status: running
heap_size: 17731
stack_size: 27
reductions: 309520
neighbours:
2023-05-11 05:18:47.484 [error] supervisor: 'esockd_connection_sup - <0.479.0>'
errorContext: connection_crashed
reason: {{badmap,{error,timeout}},
[{maps,get,
[<<"password">>,{error,timeout},undefined],
[{file,"maps.erl"},{line,207}]},
{emqx_auth_mongo,'-check/2-lc$^0/1-0-',2,
[{file,
"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx_auth_mongo/src/emqx_auth_mongo.erl"},
{line,40}]},
{emqx_auth_mongo,check,2,
[{file,
"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx_auth_mongo/src/emqx_auth_mongo.erl"},
{line,40}]},
{emqx_hooks,do_run_fold,3,
[{file,
"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_hooks.erl"},
{line,131}]},
{emqx_access_control,authenticate,1,
[{file,
"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_access_control.erl"},
{line,31}]},
{emqx_protocol,process,2,
[{file,
"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_protocol.erl"},
{line,401}]},
{emqx_channel,handle_incoming,3,
[{file,
"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_channel.erl"},
{line,433}]},
{gen_statem,call_state_function,5,
[{file,"gen_statem.erl"},{line,1660}]}]}
offender: [{pid,<0.26235.77>},
{name,connection},
{mfargs,{emqx_channel,start_link,
[[{deflate_options,[]},
{max_conn_rate,500},
{active_n,100},
{zone,external}]]}}]
2023-05-11 05:20:21.341 [error] Test001@10.11.55.110:52555 ** State machine <0.27826.77> terminating
** Last event = {cast,
{incoming,
{mqtt_packet,
{mqtt_packet_header,1,false,0,false},
{mqtt_packet_connect,<<"MQTT">>,4,false,true,
false,0,false,60,undefined,<<"Test001">>,
undefined,undefined,undefined,
<<"HawkeyeClient">>,<<"HawkeyeClient@Only">>},
undefined}}}
** When server state = {idle,
{state,esockd_transport,
{ssl_socket,#Port<0.1245753>,
{sslsocket,
{gen_tcp,#Port<0.1245753>,tls_connection,
undefined},
[<0.27840.77>,<0.27841.77>]}},
{{10.11.55.110},52555},
undefined,running,100,
{pstate,external,#Fun<emqx_channel.0.134051180>,
{{192,168,10,13},9883},
{{10.11.55.110},52555},
undefined,4,<<"MQTT">>,<<>>,false,<0.27826.77>,
undefined,undefined,undefined,undefined,false,#{},
undefined,undefined,undefined,false,
#{msg => 0,pkt => 0},
#{msg => 0,pkt => 0},
false,undefined,
#{from_client => 0,to_client => 0},
emqx_channel,#{},undefined},
{none,#{max_size => 1048576,version => 4}},
{emqx_gc,
#{cnt => {1000,999},oct => {1048576,1048516}}},
undefined,undefined,undefined,undefined,true,
#Ref<0.1701519806.1154744321.105364>,15000}}
** Reason for termination = error:{badmap,{error,timeout}}
** Callback mode = [state_functions,state_enter]
** Stacktrace =
** [{maps,get,
[<<"password">>,{error,timeout},undefined],
[{file,"maps.erl"},{line,207}]},
{emqx_auth_mongo,'-check/2-lc$^0/1-0-',2,
[{file,"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx_auth_mongo/src/emqx_auth_mongo.erl"},
{line,40}]},
{emqx_auth_mongo,check,2,
[{file,"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx_auth_mongo/src/emqx_auth_mongo.erl"},
{line,40}]},
{emqx_hooks,do_run_fold,3,
[{file,"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_hooks.erl"},
{line,131}]},
{emqx_access_control,authenticate,1,
[{file,"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_access_control.erl"},
{line,31}]},
{emqx_protocol,process,2,
[{file,"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_protocol.erl"},
{line,401}]},
{emqx_channel,handle_incoming,3,
[{file,"c:/emqx/ce/emqx-rel/_build/emqx/lib/emqx/src/emqx_channel.erl"},
{line,433}]},
{gen_statem,call_state_function,5,[{file,"gen_statem.erl"},{line,1660}]}]
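
Reading the trace: maps:get/3 was called with {error,timeout} where a map was expected, so the lookup behind emqx_auth_mongo:check/2 returned an error tuple (presumably a MongoDB query that timed out, though the log does not say so) and the code used it as if it were a user document. The Python sketch below only models that failure shape and a guarded alternative. It is not EMQX code; the function names, the result shapes, and the plain-text password comparison are all hypothetical simplifications.

```python
from typing import Union

# Hypothetical stand-ins: a lookup result is either a user document (dict)
# or an error tuple such as ("error", "timeout"), mirroring {error,timeout}.
LookupResult = Union[dict, tuple]


def check_unguarded(result: LookupResult, password: str) -> bool:
    # Assumes the result is always a document. On an error tuple this raises,
    # much as maps:get/3 raises badmap when its argument is not a map.
    return result.get("password") == password


def check_guarded(result: LookupResult, password: str) -> str:
    # Handle the error shape first, so a slow database produces a clean
    # authentication outcome instead of crashing the connection handler.
    if not isinstance(result, dict):
        return "auth_unavailable"
    return "ok" if result.get("password") == password else "bad_credentials"
```

With the guarded form, a timed-out lookup becomes an ordinary "cannot authenticate right now" result that the caller can log and reject, instead of an exception that takes down the connection process.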

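Hi, this may be a bug in an early version. However, v3.x is past its maintenance period. Please try a maintained release, EMQX 4.x or EMQX 5.0.x, and run the EMQX service on Linux where possible.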
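
The report does not say why the client reaches 100% CPU. One plausible cause, assumed here and not confirmed by the log, is a reconnect loop that retries immediately after every MqttTimeoutException. If that is the case, spacing the retries out on the client side limits the damage whatever the broker version. The following is a minimal Python sketch of capped exponential backoff with jitter. `connect` is a hypothetical callable standing in for the real MQTT client's connect call, and `TimeoutError` stands in for its timeout exception.

```python
import random
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 1.0, cap: float = 60.0,
                   jitter: float = 0.2) -> Iterator[float]:
    """Yield reconnect delays that double up to `cap`, with +/- jitter."""
    delay = base
    while True:
        spread = delay * jitter
        yield max(0.0, delay + random.uniform(-spread, spread))
        delay = min(cap, delay * 2)


def connect_with_backoff(connect: Callable[[], None], max_attempts: int = 10,
                         sleep: Callable[[float], None] = time.sleep) -> int:
    """Call `connect` until it succeeds; return the number of attempts used."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except TimeoutError:
            if attempt == max_attempts:
                raise  # give up and surface the last timeout to the caller
            sleep(next(delays))  # wait before retrying instead of spinning
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the waiting can be replaced in tests. In production the default `time.sleep` keeps the retry loop idle between attempts.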