ExHook 在grpc server重启后一直没有重新链接,一直都是失败

环境信息

  • EMQX 版本:5.0.7
  • 操作系统及版本:helm install emqx ./emqx -nnew-dev
  • 其他

问题描述

配置文件及日志

 2022-10-11T07:34:56.622875+00:00 [error] clientid: master01_bench_pub_25308_2954865537, function: on_session_terminated, line: 409, mfa: emqx_exhook_server:do_call/5, module: emqx_exhook_v_2_hook_provider_client, msg: exhook_call_error, options: #{channel => <<"locahost">>,failed_action => deny,key_dispatch => <<"master01_bench_pub_25308_2954865537">>,timeout => 5000}, peername: 10.233.106.0:33098, reason: {deadline_exceeded,<<"Waiting for response timeout">>}, req: #{clientinfo => #{anonymous => true,clientid => <<"master01_bench_pub_25308_2954865537">>,cn => <<>>,dn => <<>>,is_superuser => false,mountpoint => <<>>,node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,password => <<>>,peerhost => <<"10.233.106.0">>,protocol => <<"mqtt">>,sockport => 1883,username => <<>>},meta => #{cluster_name => "emqxcl",node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,sysdescr => "EMQX",version => "5.0.7"},reason => <<"{shutdown,tcp_closed}">>}

 2022-10-11T07:34:56.623017+00:00 [error] clientid: master01_bench_pub_39229_3416105642, function: on_message_publish, line: 409, mfa: emqx_exhook_server:do_call/5, module: emqx_exhook_v_2_hook_provider_client, msg: exhook_call_error, options: #{channel => <<"locahost">>,failed_action => deny,key_dispatch => <<"master01_bench_pub_39229_3416105642">>,timeout => 5000}, peername: 10.233.106.0:34422, reason: {deadline_exceeded,<<"Waiting for response timeout">>}, req: #{message => #{from => <<"master01_bench_pub_39229_3416105642">>,headers => #{<<"peerhost">> => <<"10.233.106.0">>,<<"protocol">> => <<"mqtt">>},id => <<"0005EABD4FC595154238210054583DEB">>,node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,payload => <<"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa">>,qos => 0,timestamp => 1665473691620,topic => <<"bench/39229">>},meta => #{cluster_name => "emqxcl",node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,sysdescr => "EMQX",version => "5.0.7"}}

 2022-10-11T07:34:56.623248+00:00 [error] clientid: master01_bench_pub_39445_3378578092, function: on_client_disconnected, line: 409, mfa: emqx_exhook_server:do_call/5, module: emqx_exhook_v_2_hook_provider_client, msg: exhook_call_error, options: #{channel => <<"locahost">>,failed_action => deny,key_dispatch => <<"master01_bench_pub_39445_3378578092">>,timeout => 5000}, peername: 10.233.106.0:60392, reason: {deadline_exceeded,<<"Waiting for response timeout">>}, req: #{clientinfo => #{anonymous => true,clientid => <<"master01_bench_pub_39445_3378578092">>,cn => <<>>,dn => <<>>,is_superuser => false,mountpoint => <<>>,node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,password => <<>>,peerhost => <<"10.233.106.0">>,protocol => <<"mqtt">>,sockport => 1883,username => <<>>},meta => #{cluster_name => "emqxcl",node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,sysdescr => "EMQX",version => "5.0.7"},reason => <<"tcp_closed">>}

 2022-10-11T07:34:56.623735+00:00 [error] clientid: master01_bench_pub_22496_1627500407, function: on_client_disconnected, line: 409, mfa: emqx_exhook_server:do_call/5, module: emqx_exhook_v_2_hook_provider_client, msg: exhook_call_error, options: #{channel => <<"locahost">>,failed_action => deny,key_dispatch => <<"master01_bench_pub_22496_1627500407">>,timeout => 5000}, peername: 10.233.106.0:20768, reason: {deadline_exceeded,<<"Waiting for response timeout">>}, req: #{clientinfo => #{anonymous => true,clientid => <<"master01_bench_pub_22496_1627500407">>,cn => <<>>,dn => <<>>,is_superuser => false,mountpoint => <<>>,node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,password => <<>>,peerhost => <<"10.233.106.0">>,protocol => <<"mqtt">>,sockport => 1883,username => <<>>},meta => #{cluster_name => "emqxcl",node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,sysdescr => "EMQX",version => "5.0.7"},reason => <<"tcp_closed">>}

 2022-10-11T07:34:56.624120+00:00 [error] clientid: worker03_bench_pub_11647_3890671373, function: on_message_publish, line: 409, mfa: emqx_exhook_server:do_call/5, module: emqx_exhook_v_2_hook_provider_client, msg: exhook_call_error, options: #{channel => <<"locahost">>,failed_action => deny,key_dispatch => <<"worker03_bench_pub_11647_3890671373">>,timeout => 5000}, peername: 10.233.106.0:58212, reason: {deadline_exceeded,<<"Waiting for response timeout">>}, req: #{message => #{from => <<"worker03_bench_pub_11647_3890671373">>,headers => #{<<"peerhost">> => <<"10.233.106.0">>,<<"protocol">> => <<"mqtt">>},id => <<"0005EABD4FC59F6742381A00648C3E0E">>,node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,payload => <<"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa">>,qos => 0,timestamp => 1665473691623,topic => <<"bench/11647">>},meta => #{cluster_name => "emqxcl",node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,sysdescr => "EMQX",version => "5.0.7"}}



grpc服务重启后, 插件未自动恢复, 手动停用Exhook后,再手动开启, 消息就正常推送了

deadline_exceeded 一般是由于请求超时导致的。

重启 grpc 服务后,是一直都没法恢复么?必须重启 exhook 么

是的

手动把grpc服务关闭后,报错信息如下:

 2022-10-11T07:43:18.346703+00:00 [error] [gRPC Client] Stream shutdown reason: {closed,"The connection was lost."}, stream: #stream{st={closed,idle}, buff_size=0, mqueue=[]}

 2022-10-11T07:43:18.346731+00:00 [error] [gRPC Client] Stream shutdown reason: {closed,"The connection was lost."}, stream: #stream{st={closed,idle}, buff_size=0, mqueue=[]}

 2022-10-11T07:43:18.346693+00:00 [error] [gRPC Client] Stream shutdown reason: {closed,"The connection was lost."}, stream: #stream{st={closed,idle}, buff_size=0, mqueue=[]}

 2022-10-11T07:43:18.346745+00:00 [error] [gRPC Client] Stream shutdown reason: {closed,"The connection was lost."}, stream: #stream{st={closed,idle}, buff_size=0, mqueue=[]}

 2022-10-11T07:43:18.346379+00:00 [error] [gRPC Client] Stream shutdown reason: {closed,"The connection was lost."}, stream: #stream{st={closed,idle}, buff_size=0, mqueue=[]}

 2022-10-11T07:43:18.346761+00:00 [error] [gRPC Client] Stream shutdown reason: {closed,"The connection was lost."}, stream: #stream{st={closed,idle}, buff_size=0, mqueue=[]}

 2022-10-11T07:43:18.346786+00:00 [error] [gRPC Client] Stream shutdown reason: {closed,"The connection was lost."}, stream: #stream{st={closed,idle}, buff_size=0, mqueue=[]}

 2022-10-11T07:43:18.346833+00:00 [error] [gRPC Client] Stream shutdown reason: {closed,"The connection was lost."}, stream: #stream{st={closed,idle}, buff_size=0, mqueue=[]}

 2022-10-11T07:43:18.346821+00:00 [error] [gRPC Client] Stream shutdown reason: {closed,"The connection was lost."}, stream: #stream{st={closed,idle}, buff_size=0, mqueue=[]}

 2022-10-11T07:43:18.346416+00:00 [error] [gRPC Client] Stream shutdown reason: {closed,"The connection was lost."}, stream: #stream{st={closed,idle}, buff_size=0, mqueue=[]}

手动启动grpc服务后,报错信息如下

2022-10-11T07:34:56.622875+00:00 [error] clientid: master01_bench_pub_25308_2954865537, function: on_session_terminated, line: 409, mfa: emqx_exhook_server:do_call/5, module: emqx_exhook_v_2_hook_provider_client, msg: exhook_call_error, options: #{channel => <<"locahost">>,failed_action => deny,key_dispatch => <<"master01_bench_pub_25308_2954865537">>,timeout => 5000}, peername: 10.233.106.0:33098, reason: {deadline_exceeded,<<"Waiting for response timeout">>}, req: #{clientinfo => #{anonymous => true,clientid => <<"master01_bench_pub_25308_2954865537">>,cn => <<>>,dn => <<>>,is_superuser => false,mountpoint => <<>>,node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,password => <<>>,peerhost => <<"10.233.106.0">>,protocol => <<"mqtt">>,sockport => 1883,username => <<>>},meta => #{cluster_name => "emqxcl",node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,sysdescr => "EMQX",version => "5.0.7"},reason => <<"{shutdown,tcp_closed}">>}

 2022-10-11T07:34:56.623017+00:00 [error] clientid: master01_bench_pub_39229_3416105642, function: on_message_publish, line: 409, mfa: emqx_exhook_server:do_call/5, module: emqx_exhook_v_2_hook_provider_client, msg: exhook_call_error, options: #{channel => <<"locahost">>,failed_action => deny,key_dispatch => <<"master01_bench_pub_39229_3416105642">>,timeout => 5000}, peername: 10.233.106.0:34422, reason: {deadline_exceeded,<<"Waiting for response timeout">>}, req: #{message => #{from => <<"master01_bench_pub_39229_3416105642">>,headers => #{<<"peerhost">> => <<"10.233.106.0">>,<<"protocol">> => <<"mqtt">>},id => <<"0005EABD4FC595154238210054583DEB">>,node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,payload => <<"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa">>,qos => 0,timestamp => 1665473691620,topic => <<"bench/39229">>},meta => #{cluster_name => "emqxcl",node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,sysdescr => "EMQX",version => "5.0.7"}}

 2022-10-11T07:34:56.623248+00:00 [error] clientid: master01_bench_pub_39445_3378578092, function: on_client_disconnected, line: 409, mfa: emqx_exhook_server:do_call/5, module: emqx_exhook_v_2_hook_provider_client, msg: exhook_call_error, options: #{channel => <<"locahost">>,failed_action => deny,key_dispatch => <<"master01_bench_pub_39445_3378578092">>,timeout => 5000}, peername: 10.233.106.0:60392, reason: {deadline_exceeded,<<"Waiting for response timeout">>}, req: #{clientinfo => #{anonymous => true,clientid => <<"master01_bench_pub_39445_3378578092">>,cn => <<>>,dn => <<>>,is_superuser => false,mountpoint => <<>>,node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,password => <<>>,peerhost => <<"10.233.106.0">>,protocol => <<"mqtt">>,sockport => 1883,username => <<>>},meta => #{cluster_name => "emqxcl",node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,sysdescr => "EMQX",version => "5.0.7"},reason => <<"tcp_closed">>}

 2022-10-11T07:34:56.623735+00:00 [error] clientid: master01_bench_pub_22496_1627500407, function: on_client_disconnected, line: 409, mfa: emqx_exhook_server:do_call/5, module: emqx_exhook_v_2_hook_provider_client, msg: exhook_call_error, options: #{channel => <<"locahost">>,failed_action => deny,key_dispatch => <<"master01_bench_pub_22496_1627500407">>,timeout => 5000}, peername: 10.233.106.0:20768, reason: {deadline_exceeded,<<"Waiting for response timeout">>}, req: #{clientinfo => #{anonymous => true,clientid => <<"master01_bench_pub_22496_1627500407">>,cn => <<>>,dn => <<>>,is_superuser => false,mountpoint => <<>>,node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,password => <<>>,peerhost => <<"10.233.106.0">>,protocol => <<"mqtt">>,sockport => 1883,username => <<>>},meta => #{cluster_name => "emqxcl",node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,sysdescr => "EMQX",version => "5.0.7"},reason => <<"tcp_closed">>}

 2022-10-11T07:34:56.624120+00:00 [error] clientid: worker03_bench_pub_11647_3890671373, function: on_message_publish, line: 409, mfa: emqx_exhook_server:do_call/5, module: emqx_exhook_v_2_hook_provider_client, msg: exhook_call_error, options: #{channel => <<"locahost">>,failed_action => deny,key_dispatch => <<"worker03_bench_pub_11647_3890671373">>,timeout => 5000}, peername: 10.233.106.0:58212, reason: {deadline_exceeded,<<"Waiting for response timeout">>}, req: #{message => #{from => <<"worker03_bench_pub_11647_3890671373">>,headers => #{<<"peerhost">> => <<"10.233.106.0">>,<<"protocol">> => <<"mqtt">>},id => <<"0005EABD4FC59F6742381A00648C3E0E">>,node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,payload => <<"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa">>,qos => 0,timestamp => 1665473691623,topic => <<"bench/11647">>},meta => #{cluster_name => "emqxcl",node => <<"emqx@emqx-2.emqx-headless.new-dev.svc.cluster.local">>,sysdescr => "EMQX",version => "5.0.7"}}

手动停用ExHook,然后再手动开启, 消息转发恢复正常

我也出现过

+1,出现过好几次,grpc断了之后,再重连就会大量timeout

还有就是认证/鉴权采用http server方式,http断了之后也会这样