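I deployed a two-node EMQX cluster with Docker. Version 5.4.1, image ID: 562e462575aa.
One of the nodes crashes frequently; after a crash it can be restarted normally.
The logs seem to show Mnesia-related errors, but the exact cause is unclear. Please help me figure out what causes the crash and how to fix it.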
2024-03-22T11:07:32.521472+08:00 [error] Mnesia('emqx1@192.168.3.221'): ** ERROR ** (core dumped to file: "/opt/emqx/MnesiaCore.emqx1@192.168.3.221_1711_76852_521307"), ** FATAL ** Failed to merge schema: {aborted,function_clause}
2024-03-22T11:07:42.522577+08:00 [error] crasher: initial call: application_master:init/4, pid: <0.2167.0>, registered_name: [], exit: {{normal,{mnesia_app,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,142}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}, ancestors: [<0.2166.0>], message_queue_len: 1, messages: [{'EXIT',<0.2168.0>,normal}], links: [<0.2166.0>,<0.1999.0>], dictionary: [], trap_exit: true, status: running, heap_size: 376, stack_size: 28, reductions: 170; neighbours:
2024-03-22T11:07:42.522728+08:00 [error] Generic server mnesia_monitor terminating. Reason: killed. Last message: {'EXIT',<0.2172.0>,killed}. State: {state,<0.2172.0>,[],[],true,[],undefined,[],[]}.
2024-03-22T11:07:42.522758+08:00 [error] Generic server mnesia_recover terminating. Reason: killed. Last message: {'EXIT',<0.2172.0>,killed}. State: {state,<0.2172.0>,undefined,undefined,undefined,0,false,true,[]}.
2024-03-22T11:07:42.522782+08:00 [error] crasher: initial call: gen_event:init_it/6, pid: <0.2170.0>, registered_name: mnesia_event, exit: {killed,[{gen_event,terminate_server,4,[{file,"gen_event.erl"},{line,580}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}, ancestors: [mnesia_sup,<0.2168.0>], message_queue_len: 1, messages: [{notify,{mnesia_system_event,{mnesia_down,'emqx1@192.168.3.221'}}}], links: [], dictionary: [], trap_exit: true, status: running, heap_size: 1598, stack_size: 28, reductions: 3612; neighbours:
2024-03-22T11:07:42.522834+08:00 [error] crasher: initial call: mnesia_recover:init/1, pid: <0.2176.0>, registered_name: mnesia_recover, exit: {killed,[{gen_server,decode_msg,9,[{file,"gen_server.erl"},{line,909}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}, ancestors: [mnesia_kernel_sup,mnesia_sup,<0.2168.0>], message_queue_len: 0, messages: [], links: [], dictionary: [], trap_exit: true, status: running, heap_size: 233, stack_size: 28, reductions: 5012; neighbours:
2024-03-22T11:07:42.522913+08:00 [error] crasher: initial call: application_master:init/4, pid: <0.2159.0>, registered_name: [], exit: {{bad_return,{{mria_app,start,[normal,[]]},{'EXIT',{{badmatch,{error,{normal,{mnesia_app,start,[normal,[]]}}}},[{mria_mnesia,ensure_started,0,[{file,"mria_mnesia.erl"},{line,112}]},{mria_app,start,2,[{file,"mria_app.erl"},{line,36}]},{application_master,start_it_old,4,[{file,"application_master.erl"},{line,293}]}]}}}},[{application_master,init,4,[{file,"application_master.erl"},{line,142}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}, ancestors: [<0.2158.0>], message_queue_len: 1, messages: [{'EXIT',<0.2160.0>,normal}], links: [<0.2158.0>,<0.1999.0>], dictionary: [], trap_exit: true, status: running, heap_size: 376, stack_size: 28, reductions: 167; neighbours:
2024-03-22T11:07:42.522729+08:00 [error] Generic server mnesia_subscr terminating. Reason: killed. Last message: {'EXIT',<0.2172.0>,killed}. State: {state,<0.2172.0>,#Ref<0.2728886987.996278276.230299>}.
2024-03-22T11:07:42.523091+08:00 [error] crasher: initial call: mnesia_subscr:init/1, pid: <0.2174.0>, registered_name: mnesia_subscr, exit: {killed,[{gen_server,decode_msg,9,[{file,"gen_server.erl"},{line,909}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}, ancestors: [mnesia_kernel_sup,mnesia_sup,<0.2168.0>], message_queue_len: 0, messages: [], links: [], dictionary: [], trap_exit: true, status: running, heap_size: 2586, stack_size: 28, reductions: 2384; neighbours:
2024-03-22T11:07:42.523194+08:00 [error] crasher: initial call: application_master:init/4, pid: <0.2156.0>, registered_name: [], exit: {{bad_return,{{emqx_machine_app,start,[normal,[]]},{'EXIT',{{badmatch,{error,{mria,{bad_return,{{mria_app,start,[normal,[]]},{'EXIT',{{badmatch,{error,{normal,{mnesia_app,start,[normal,[]]}}}},[{mria_mnesia,ensure_started,0,[{file,"mria_mnesia.erl"},{line,112}]},{mria_app,start,2,[{file,"mria_app.erl"},{line,36}]},{application_master,start_it_old,4,[{file,"application_master.erl"},{line,293}]}]}}}}}}},[{mria,start,0,[{file,"mria.erl"},{line,125}]},{ekka,start,0,[{file,"ekka.erl"},{line,94}]},{emqx_machine,start,0,[{file,"emqx_machine.erl"},{line,54}]},{emqx_machine_app,start,2,[{file,"emqx_machine_app.erl"},{line,29}]},{application_master,start_it_old,4,[{file,"application_master.erl"},{line,293}]}]}}}},[{application_master,init,4,[{file,"application_master.erl"},{line,142}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}, ancestors: [<0.2155.0>], message_queue_len: 1, messages: [{'EXIT',<0.2157.0>,normal}], links: [<0.2155.0>,<0.1999.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 188; neighbours:
2024-03-22T11:07:42.524335+08:00 [error] crasher: initial call: mnesia_monitor:init/1, pid: <0.2173.0>, registered_name: mnesia_monitor, exit: {killed,[{gen_server,decode_msg,9,[{file,"gen_server.erl"},{line,909}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}, ancestors: [mnesia_kernel_sup,mnesia_sup,<0.2168.0>], message_queue_len: 0, messages: [], links: [<0.2206.0>,<60217.2173.0>], dictionary: [], trap_exit: true, status: running, heap_size: 6772, stack_size: 28, reductions: 7224; neighbours:
{"Kernel pid terminated",application_controller,"{application_start_failure,emqx_machine,{bad_return,{{emqx_machine_app,start,[normal,[]]},{'EXIT',{{badmatch,{error,{mria,{bad_return,{{mria_app,start,[normal,[]]},{'EXIT',{{badmatch,{error,{normal,{mnesia_app,start,[normal,[]]}}}},[{mria_mnesia,ensure_started,0,[{file,\"mria_mnesia.erl\"},{line,112}]},{mria_app,start,2,[{file,\"mria_app.erl\"},{line,36}]},{application_master,start_it_old,4,[{file,\"application_master.erl\"},{line,293}]}]}}}}}}},[{mria,start,0,[{file,\"mria.erl\"},{line,125}]},{ekka,start,0,[{file,\"ekka.erl\"},{line,94}]},{emqx_machine,start,0,[{file,\"emqx_machine.erl\"},{line,54}]},{emqx_machine_app,start,2,[{file,\"emqx_machine_app.erl\"},{line,29}]},{application_master,start_it_old,4,[{file,\"application_master.erl\"},{line,293}]}]}}}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,emqx_machine,{bad_return,{{emqx_machine_app,start,[normal,[]]},{'EXIT',{{badmatch,{error,{mria,{bad_return,{{mria_app,start,[normal,[]]},{'EXIT',{{badmatch,{error,{normal,{mnesia_app,start,[normal,[]]}}}},[{mria_mnesia,ensure_started,0,[{file,"mria_mnesia.erl"},{line,112}]},{mria_app,start,2,[{file,"mria_app.erl"},{line,36}]},{application_master,start_it_old,4,[{file,"application_master.erl"},{line,293}]}]}}}}}}},[{mria,start,0,[{file,"mria.erl"},{line,125}]},{ekka,start,0,[{file,"ekka.erl"},{line,94}]},{emqx_machine,start,0,[{file,"emqx_machine.erl"},{line,54}]},{emqx_machine_app,start,2,[{file,"emqx_machine_app.erl"},{line,29}]},{application_master,start_it_old,4,[{file,"application_master.erl"},{line,293}]}]}}}}})
Crash dump is being written to: /opt/emqx/log/erl_crash.dump...done
hostname: Name or service not known
EMQX_RPC__PORT_DISCOVERY [rpc.port_discovery]: manual
EMQX_NODE__COOKIE [node.cookie]: ******
EMQX_NODE__NAME [node.name]: emqx1@192.168.3.221
Listener ssl:default on 0.0.0.0:8883 started.
Listener tcp:default on 0.0.0.0:1883 started.
Listener ws:default on 0.0.0.0:8083 started.
Listener wss:default on 0.0.0.0:8084 started.
Listener http:dashboard on :18083 started.
EMQX 5.4.1 is running now!
2024-03-22T11:09:40.460784+08:00 [warning] msg: Stopping mria, mfa: mria:stop/1(134), reason: join
Stop listener http:dashboard on :18083 successfully.
Listener ssl:default on 0.0.0.0:8883 stopped.
Listener tcp:default on 0.0.0.0:1883 stopped.
Listener ws:default on 0.0.0.0:8083 stopped.
Listener wss:default on 0.0.0.0:8084 stopped.