webhook 打开以后客户连接不上

环境信息

  • EMQ X 版本:4.3.5
  • 操作系统及版本:liunx
  • 其他

问题描述

配置了emq的webhook以后,当有客户端连接的时候能看到emq回调了配置的地址,服务端的webhook接口清晰的看到emq的回调信息如下:

正如上图,客户端总是连接不上,paho客户端的报错信息如下:

Exception in thread "main" 已断开连接 (32109) - java.io.IOException: Connection is lost.
	at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:197)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Connection is lost.
	at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:172)
	... 1 more

但只要关闭webhook 插件能正常连接。

配置文件及日志

webhook的配置文件如下:


emqx 的日志有如下error:

[error] client-91885@113.218.174.44:12096 file: emqx_cm.erl, line: 304, mfa: {emqx_cm,do_discard_session,2}, msg: failed_to_discard_session, pid: <0.4447.8>, reason: {'EXIT',{{shutdown,tcp_closed},{gen_server,call,[<45356.17562.3>,discard,15000]}}}, stacktrace: [{emqx_cm,rpc_call,3,[{file,"emqx_cm.erl"},{line,380}]},{emqx_cm,do_discard_session,2,[{file,"emqx_cm.erl"},{line,289}]},{lists,foreach,2,[{file,"lists.erl"},{line,1342}]},{emqx_cm,'-open_session/3-fun-0-',5,[{file,"emqx_cm.erl"},{line,217}]},{emqx_cm_locker,trans,3,[{file,"emqx_cm_locker.erl"},{line,46}]},{emqx_channel,process_connect,2,[{file,"emqx_channel.erl"},{line,492}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,659}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,364}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,370}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,328}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}]
2021-09-29T01:09:52.085385+08:00 [error] client-77039@113.218.174.44:11200 file: emqx_cm.erl, line: 304, mfa: {emqx_cm,do_discard_session,2}, msg: failed_to_discard_session, pid: <0.27620.8>, reason: {'EXIT',{{shutdown,tcp_closed},{gen_server,call,[<45357.7750.4>,discard,15000]}}}, stacktrace: [{emqx_cm,rpc_call,3,[{file,"emqx_cm.erl"},{line,380}]},{emqx_cm,do_discard_session,2,[{file,"emqx_cm.erl"},{line,289}]},{lists,foreach,2,[{file,"lists.erl"},{line,1342}]},{emqx_cm,'-open_session/3-fun-0-',5,[{file,"emqx_cm.erl"},{line,217}]},{emqx_cm_locker,trans,3,[{file,"emqx_cm_locker.erl"},{line,46}]},{emqx_channel,process_connect,2,[{file,"emqx_channel.erl"},{line,492}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,659}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,364}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,370}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,328}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}]

这部分的日志,连接已经被关闭,等级是 error,需要开启 debug 级别的日志查看原因,开启方式参考文档

下面是debug的日志,我看不出来什么问题

下面的日志是我发起一个客户端连接的dubug日志,麻烦您看看。

2021-09-29T10:50:14.484558+08:00 [debug] consumer_207517@8.208.78.41:43466 [MQTT] RECV <<192,0>>
2021-09-29T10:50:14.484692+08:00 [debug] consumer_207517@8.208.78.41:43466 [MQTT] RECV PINGREQ(Q0, R0, D0)
2021-09-29T10:50:14.484783+08:00 [debug] consumer_207517@8.208.78.41:43466 [MQTT] SEND PINGRESP(Q0, R0, D0)
2021-09-29T10:50:20.962560+08:00 [debug] 175.10.75.127:63734 [MQTT] RECV <<16,56,0,4,77,81,84,84,4,194,0,20,0,15,99,108,105,101,110,...>>
2021-09-29T10:50:20.962702+08:00 [debug] 175.10.75.127:63734 [MQTT] RECV CONNECT(Q0, R0, D0, ClientId=client-abc21547, ProtoName=MQTT, ProtoVsn=4, CleanStart=true, KeepAlive=20, Username=wisecloud, Password=******)
2021-09-29T10:50:20.962885+08:00 [debug] 175.10.75.127:63734 [WebHook] Send to: "/wisecloud-message-webhook-service/msg/webHook", params: {"username":"wisecloud","proto_ver":4,"node":"emqx2@10.100.10.216","keepalive":20,"ipaddress":"175.10.75.127","clientid":"client-abc21547","action":"client_connect"}
2021-09-29T10:50:22.117378+08:00 [debug] client-abc21547@175.10.75.127:63734 [WebHook] Send to: "/wisecloud-message-webhook-service/msg/webHook", params: {"username":"wisecloud","proto_ver":4,"node":"emqx2@10.100.10.216","keepalive":20,"ipaddress":"175.10.75.127","connected_at":1632883822117,"clientid":"client-abc21547","action":"client_connected"}
2021-09-29T10:50:22.234627+08:00 [debug] 175.10.75.127:63736 [MQTT] RECV <<16,58,0,6,77,81,73,115,100,112,3,194,0,20,0,15,99,108,105,...>>
2021-09-29T10:50:22.234746+08:00 [debug] 175.10.75.127:63736 [MQTT] RECV CONNECT(Q0, R0, D0, ClientId=client-abc21547, ProtoName=MQIsdp, ProtoVsn=3, CleanStart=true, KeepAlive=20, Username=wisecloud, Password=******)
2021-09-29T10:50:22.234897+08:00 [debug] 175.10.75.127:63736 [WebHook] Send to: "/wisecloud-message-webhook-service/msg/webHook", params: {"username":"wisecloud","proto_ver":3,"node":"emqx2@10.100.10.216","keepalive":20,"ipaddress":"175.10.75.127","clientid":"client-abc21547","action":"client_connect"}
2021-09-29T10:50:22.538652+08:00 [debug] consumer_374962@8.208.78.41:47027 [MQTT] RECV <<192,0>>
2021-09-29T10:50:22.538752+08:00 [debug] consumer_374962@8.208.78.41:47027 [MQTT] RECV PINGREQ(Q0, R0, D0)
2021-09-29T10:50:22.538843+08:00 [debug] consumer_374962@8.208.78.41:47027 [MQTT] SEND PINGRESP(Q0, R0, D0)
2021-09-29T10:50:22.539653+08:00 [debug] consumer_430934@8.208.78.41:42929 [MQTT] RECV <<192,0>>
2021-09-29T10:50:22.539727+08:00 [debug] consumer_430934@8.208.78.41:42929 [MQTT] RECV PINGREQ(Q0, R0, D0)
2021-09-29T10:50:22.539802+08:00 [debug] consumer_430934@8.208.78.41:42929 [MQTT] SEND PINGRESP(Q0, R0, D0)
2021-09-29T10:50:22.995907+08:00 [debug] client-abc21547@175.10.75.127:63734 [WebHook] Send to: "/wisecloud-message-webhook-service/msg/webHook", params: {"username":"wisecloud","proto_ver":4,"node":"emqx2@10.100.10.216","keepalive":20,"ipaddress":"175.10.75.127","conn_ack":"success","clientid":"client-abc21547","action":"client_connack"}
2021-09-29T10:50:23.295391+08:00 [debug] client-abc21547@175.10.75.127:63736 [WebHook] Send to: "/wisecloud-message-webhook-service/msg/webHook", params: {"username":"wisecloud","proto_ver":3,"node":"emqx2@10.100.10.216","keepalive":20,"ipaddress":"175.10.75.127","connected_at":1632883823295,"clientid":"client-abc21547","action":"client_connected"}
2021-09-29T10:50:23.598088+08:00 [debug] client-abc21547@175.10.75.127:63734 client_id: <<"client-abc21547">>, file: emqx_cm.erl, line: 114, mfa: {emqx_cm,insert_channel_info,3}, msg: insert_channel_info, pid: <0.4927.1>
2021-09-29T10:50:23.598263+08:00 [debug] client-abc21547@175.10.75.127:63734 [MQTT] SEND CONNACK(Q0, R0, D0, AckFlags=0, ReasonCode=0)
2021-09-29T10:50:23.598415+08:00 [debug] client-abc21547@175.10.75.127:63734 [WebHook] Send to: "/wisecloud-message-webhook-service/msg/webHook", params: {"username":"wisecloud","reason":"tcp_closed","node":"emqx2@10.100.10.216","disconnected_at":1632883823598,"clientid":"client-abc21547","action":"client_disconnected"}
2021-09-29T10:50:24.206800+08:00 [debug] client-abc21547@175.10.75.127:63734 [WebHook] Send to: "/wisecloud-message-webhook-service/msg/webHook", params: {"username":"wisecloud","reason":"tcp_closed","node":"emqx2@10.100.10.216","clientid":"client-abc21547","action":"session_terminated"}
2021-09-29T10:50:24.803057+08:00 [info] client-abc21547@175.10.75.127:63734 file: emqx_connection.erl, line: 520, mfa: {emqx_connection,terminate,2}, msg: terminate, pid: <0.4927.1>, reason: {shutdown,tcp_closed}
2021-09-29T10:50:24.803285+08:00 [debug] client-abc21547@175.10.75.127:63736 file: emqx_cm.erl, line: 301, mfa: {emqx_cm,do_discard_session,2}, msg: session_already_shutdown,pid: <0.4948.1>
2021-09-29T10:50:24.803539+08:00 [debug] client-abc21547@175.10.75.127:63736 [WebHook] Send to: "/wisecloud-message-webhook-service/msg/webHook", params: {"username":"wisecloud","proto_ver":3,"node":"emqx2@10.100.10.216","keepalive":20,"ipaddress":"175.10.75.127","conn_ack":"success","clientid":"client-abc21547","action":"client_connack"}
2021-09-29T10:50:25.217059+08:00 [debug] wisecloud-client-node-1000025061@8.208.78.41:30204 [MQTT] RECV <<192,0>>
2021-09-29T10:50:25.217192+08:00 [debug] wisecloud-client-node-1000025061@8.208.78.41:30204 [MQTT] RECV PINGREQ(Q0, R0, D0)
2021-09-29T10:50:25.217180+08:00 [debug] consumer_101991@8.208.78.41:24421 [MQTT] RECV <<192,0>>
2021-09-29T10:50:25.217290+08:00 [debug] wisecloud-client-node-10000960520@8.208.78.41:13574 [MQTT] RECV <<192,0>>
2021-09-29T10:50:25.217348+08:00 [debug] consumer_101991@8.208.78.41:24421 [MQTT] RECV PINGREQ(Q0, R0, D0)
2021-09-29T10:50:25.217405+08:00 [debug] wisecloud-client-node-10000960520@8.208.78.41:13574 [MQTT] RECV PINGREQ(Q0, R0, D0)
2021-09-29T10:50:25.217458+08:00 [debug] consumer_101991@8.208.78.41:24421 [MQTT] SEND PINGRESP(Q0, R0, D0)
2021-09-29T10:50:25.217327+08:00 [debug] wisecloud-client-node-1000025061@8.208.78.41:30204 [MQTT] SEND PINGRESP(Q0, R0, D0)
2021-09-29T10:50:25.217405+08:00 [debug] consumer_974038@8.208.78.41:42570 [MQTT] RECV <<192,0>>
2021-09-29T10:50:25.217493+08:00 [debug] wisecloud-client-node-10000960520@8.208.78.41:13574 [MQTT] SEND PINGRESP(Q0, R0, D0)
2021-09-29T10:50:25.217537+08:00 [debug] consumer_974038@8.208.78.41:42570 [MQTT] RECV PINGREQ(Q0, R0, D0)
2021-09-29T10:50:25.217589+08:00 [debug] consumer_974038@8.208.78.41:42570 [MQTT] SEND PINGRESP(Q0, R0, D0)
2021-09-29T10:50:25.218796+08:00 [debug] consumer_401947@8.208.78.41:2647 [MQTT] RECV <<192,0>>
2021-09-29T10:50:25.218861+08:00 [debug] consumer_401947@8.208.78.41:2647 [MQTT] RECV PINGREQ(Q0, R0, D0)
2021-09-29T10:50:25.218930+08:00 [debug] consumer_401947@8.208.78.41:2647 [MQTT] SEND PINGRESP(Q0, R0, D0)
2021-09-29T10:50:25.381736+08:00 [debug] client-abc21547@175.10.75.127:63736 client_id: <<"client-abc21547">>, file: emqx_cm.erl, line: 114, mfa: {emqx_cm,insert_channel_info,3}, msg: insert_channel_info, pid: <0.4948.1>
2021-09-29T10:50:25.381912+08:00 [debug] client-abc21547@175.10.75.127:63736 [MQTT] SEND CONNACK(Q0, R0, D0, AckFlags=0, ReasonCode=0)
2021-09-29T10:50:25.382079+08:00 [debug] client-abc21547@175.10.75.127:63736 [WebHook] Send to: "/wisecloud-message-webhook-service/msg/webHook", params: {"username":"wisecloud","reason":"tcp_closed","node":"emqx2@10.100.10.216","disconnected_at":1632883825382,"clientid":"client-abc21547","action":"client_disconnected"}
2021-09-29T10:50:25.960419+08:00 [debug] client-abc21547@175.10.75.127:63736 [WebHook] Send to: "/wisecloud-message-webhook-service/msg/webHook", params: {"username":"wisecloud","reason":"tcp_closed","node":"emqx2@10.100.10.216","clientid":"client-abc21547","action":"session_terminated"}
2021-09-29T10:50:26.537258+08:00 [info] client-abc21547@175.10.75.127:63736 file: emqx_connection.erl, line: 520, mfa: {emqx_connection,terminate,2}, msg: terminate, pid: <0.4948.1>, reason: {shutdown,tcp_closed}
2021-09-29T10:50:33.024622+08:00 [debug] consumer_911618@8.208.78.41:33221 [MQTT] RECV <<192,0>>
2021-09-29T10:50:33.024736+08:00 [debug] consumer_911618@8.208.78.41:33221 [MQTT] RECV PINGREQ(Q0, R0, D0)
2021-09-29T10:50:33.024860+08:00 [debug] consumer_911618@8.208.78.41:33221 [MQTT] SEND PINGRESP(Q0, R0, D0)
2021-09-29T10:50:33.963230+08:00 [debug] consumer_124034@8.208.78.41:45448 [MQTT] RECV <<192,0>>
2021-09-29T10:50:33.963353+08:00 [debug] consumer_124034@8.208.78.41:45448 [MQTT] RECV PINGREQ(Q0, R0, D0)
2021-09-29T10:50:33.963433+08:00 [debug] consumer_124034@8.208.78.41:45448 [MQTT] SEND PINGRESP(Q0, R0, D0)

日志看起来没有报错。但是终端的行为有点不对,看起来是登录完成之后直接断开了TCP连接,需要debug一下你的java代码,是不是login之后直接关闭了连接?
reason: {shutdown,tcp_closed},设备并不是通过logout离线的,直接断开了TCP

您好,确定java代码没有主动关闭,只有webhook打开的时候才有这个问题,只要关闭webhook插件,登陆连接正常。

你的 webhook 的响应码是 200 吗?因为我没看到日志中有报错,现在只有两种情况:
1 设备主动切断了TCP
2 webhook 返回设备错误,结束终端

主要是我没有看到日志中出现 webhook 的错误码情况,如果返回的数据是 非 2XX 的响应,都会有日志打出来,非常难定位原因。

如果是非 2XX 的返回值,设备应该不是 reason: {shutdown,tcp_closed}

webhook 配置的地址是:https://wisecloud3-dev.lianwuzizai.com/wisecloud-message-webhook-service/msg/webHook 返回是200的,您可以用curl -d ‘{“login”: “emma”, “pass”: “123”}’ -H ‘Content-Type: application/json’ https://wisecloud3-dev.lianwuzizai.com/wisecloud-message-webhook-service/msg/webHook 试试就能看到结果。

我可以调用你的服务吗?我想在本地用你的配置文件尝试复现一下

可以的。我现在配置了一个内网地址就可以了,但我还是不知道原因为什么内网就可以,因为内网和外网都是通的,当配置的外网地址,我是看到emq已经有消息过来了的,如我上面给您发的我服务端的日志,并且我也在emq服务器上面ping上面的外网地址,也是通的。但就是不知道为什么配置了外网,会导致设备 reason: {shutdown,tcp_closed}

我明白了,你的网络其实是没有通过去的,我注意到一个问题,你的配置文件是 http:// 开头的,是不是你的服务器不支持 http 调用,只支持 https ?所以并没有触发到webserver响应错误的日志,登陆流程触发了webhook,但是业务完成不正常被打断,TCP连接就被切断了

其实我一开始就http和https 都试过的,都是一样的错误。并且我内网配置的就是http 。多谢您耐心的为我分析和解答

我的服务器确定是http 和 HTTPS 都支持的

webhook对应的http服务如果产生过载,或者异常宕机,就会导致客户端无法连接上broker,这样设计是不是让webhook和broker强耦合到一起了?

因为emqx无法保证后面的webhook服务是一直能正常工作的,而外部系统的异常不应该影响到正常的消息流程。

1赞

需要强耦合。
举几个例子:
场景1:开启设备登陆认证,认证服务器宕机,此时非法设备登陆,无法鉴权,如果放行了,设备发布大量敏感信息。此时所有设备只能按照登陆失败处理。
场景2: 设备数据上报,webhook 接受端没有正常工作,数据流失败,但是终端无感知,业务数据全丢。此时正常的行为就是服务全挂,终端感知到业务失败。

加个配置来决定是否耦合是不是好一点,有些场景只需要通过webhook获取数据,即使丢失了也不影响业务流程。