Clients are disconnected when packets are large

Environment

  • EMQ X version: 4.2
  • OS and version: Linux CentOS 7.9
  • Other

Problem description

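With 40 TCP connections each sending 1 MB of data per second (about 40 MB per second in total), the server side that receives this data from EMQ and processes it gets disconnected.
Log from the server side: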

eg|2021-09-14 15:29:24.111|[MQTT Ping: eg-test-1]|ERROR|o.eclipse.paho.mqttv5.client.internal.ClientState.logToJsr47(210)|eg-test-1: Timed out as no activity, keepAlive=60,000,000,000 lastOutboundActivity=36,289,061,046,282,022 lastInboundActivity=36,289,001,249,567,043 time=36,289,121,046,338,588 lastPing=36,289,061,046,297,389

The connection is made with Eclipse Paho. Judging from the log, the heartbeat was apparently not received, so the client disconnected.
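The timestamps in that Paho log line can be checked with a little arithmetic. The sketch below assumes all five values are in nanoseconds (which would make `keepAlive=60,000,000,000` a 60 s keep-alive); the variable and function names are only illustrative.

```python
# Values copied from the Paho log line above.
# Assumption: all of them are nanoseconds from a monotonic clock.
NS_PER_S = 1_000_000_000

keep_alive = 60_000_000_000
last_outbound = 36_289_061_046_282_022
last_inbound = 36_289_001_249_567_043
now = 36_289_121_046_338_588
last_ping = 36_289_061_046_297_389


def seconds_since(event_ns: int) -> float:
    """Elapsed time between an event and the moment the error was logged."""
    return (now - event_ns) / NS_PER_S


since_inbound = seconds_since(last_inbound)
since_outbound = seconds_since(last_outbound)
since_ping = seconds_since(last_ping)

print(f"keep-alive:            {keep_alive / NS_PER_S:.1f} s")
print(f"since last inbound:    {since_inbound:.1f} s")
print(f"since last outbound:   {since_outbound:.1f} s")
print(f"since last ping sent:  {since_ping:.1f} s")
```

Under that assumption, the last ping went out about 60 s before the error and nothing had arrived from the broker for roughly 120 s, i.e. about two keep-alive intervals, which fits the reading that the ping response never reached the client.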

Could you please provide the logs from the emqx side?

[error] <<"eg-test-2">>@192.168.10.11:30882   crasher:
    initial call: emqx_connection:init/4
    pid: <0.8817.6>
    registered_name: []
    exception exit: {timeout,
                        {gen_server,call,
                            [emqx_shared_sub,
                             {subscribe,<<"eg">>,
                                 <<"eap/L8WSQT7F/connect">>,<0.8817.6>}]}}
      in function  emqx_connection:terminate/2 (emqx_connection.erl, line 430)
    ancestors: [<0.1812.0>,<0.1811.0>,esockd_sup,<0.1398.0>]
    message_queue_len: 0
    messages: []
    links: [<0.1812.0>]
    dictionary: [{acl_cache_size,1},
                  {acl_keys_q,{[{subscribe,<<"eap/L8WSQT7F/connect">>}],[]}},
                  {recv_pkt,2},
                  {guid,{1631981893871120,37976201503345,0}},
                  {'$logger_metadata$',
                      #{clientid => <<"eg-test-2">>,
                        peername => "192.168.10.11:30882"}},
                  {send_pkt,1},
                  {{subscribe,<<"eap/L8WSQT7F/connect">>},
                   {allow,1631981895130}},
                  {incoming_bytes,90},
                  {outgoing_bytes,21},
                  {rand_seed,
                      {#{bits => 58,jump => #Fun<rand.13.8986388>,
                         next => #Fun<rand.10.8986388>,type => exsss,
                         uniform => #Fun<rand.11.8986388>,
                         uniform_n => #Fun<rand.12.8986388>},
                       [115385567854006191|27886652920878634]}}]
    trap_exit: false
    status: running
    heap_size: 2586
    stack_size: 27
    reductions: 3820
  neighbours:

2021-09-19 00:18:20.138 [error]     supervisor: 'esockd_connection_sup - <0.1812.0>'
    errorContext: connection_crashed
    reason: {timeout,
                {gen_server,call,
                    [emqx_shared_sub,
                     {subscribe,<<"eg">>,<<"eap/L8WSQT7F/connect">>,
                         <0.8817.6>}]}}
    offender: [{pid,<0.8817.6>},
               {name,connection},
               {mfargs,
                   {emqx_connection,start_link,
                       [[{deflate_options,[]},
                         {max_conn_rate,1000},
                         {active_n,100},
                         {zone,external}]]}}]
2021-09-19 00:18:20.237 [warning] Received gun_down with closed
2021-09-19 00:18:20.362 [warning] Received gun_down with closed
2021-09-19 00:18:20.843 [warning] Received gun_down with closed
2021-09-19 00:18:21.057 [warning] Received gun_down with closed

2021-09-19 00:19:06.194 [warning] Mnesia('emqx@192.168.10.23'): ** WARNING ** Mnesia is overloaded: {mnesia_tm,message_queue_len,[120,265]}

2021-09-19 00:19:06.194 [error] Ekka(Monitor): Mnesia overload: {mnesia_tm,message_queue_len,[120,265]}
2021-09-19 00:19:06.729 [warning] Received gun_down with closed

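Is this because the consumers are consuming too slowly? Two consumers are already running. One hit the error above and disconnected from EMQ 6 hours after startup; the other disconnected after 12 hours.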

The log shows a subscribe timeout. After your consumers connect, do they keep creating new subscriptions?
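For context on the question above: the `{subscribe,<<"eg">>,<<"eap/L8WSQT7F/connect">>,<0.8817.6>}` call in the crash report looks like a shared-subscription request with group `eg`. In MQTT 5 a shared subscription is written as `$share/<group>/<filter>`, so the consumer presumably subscribed to `$share/eg/eap/L8WSQT7F/connect`; that exact filter string is inferred from the log, not stated in the thread. A minimal sketch of how such a filter splits into the two values seen in the log (`parse_shared_filter` is a hypothetical helper, not part of Paho or EMQ X):

```python
from typing import Optional, Tuple


def parse_shared_filter(topic_filter: str) -> Tuple[Optional[str], str]:
    """Split '$share/<group>/<filter>' into (group, filter).

    Returns (None, topic_filter) for an ordinary, non-shared filter.
    """
    prefix = "$share/"
    if not topic_filter.startswith(prefix):
        return None, topic_filter
    group, sep, real_filter = topic_filter[len(prefix):].partition("/")
    if not group or not sep or not real_filter:
        raise ValueError(f"malformed shared subscription: {topic_filter!r}")
    return group, real_filter


# Inferred from the crash report: group "eg", topic "eap/L8WSQT7F/connect".
print(parse_shared_filter("$share/eg/eap/L8WSQT7F/connect"))
print(parse_shared_filter("eap/L8WSQT7F/connect"))
```

If the consumers re-issue this subscription on every reconnect, each attempt goes through the same `emqx_shared_sub` call that timed out in the log.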

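Only the two consumers are consuming. The logs above are from a stress test in which 1600 clients each publish at a rate of 10k per second. One consumer disconnected after about 5 hours, the other after about 10 hours. After the disconnect, reconnecting seems to succeed, but subscribing no longer works; a restart is required.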