EMQX fails to start after a restart with an inconsistent_database error

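An EMQX service deployed on k8s reports the error below after a restart and fails to start. There are two instances in total, and only the first one hits this problem.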

Error report

2022-12-14T06:41:09.022178+00:00 [error] Mnesia(‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-0.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’): ** ERROR ** mnesia_event got {inconsistent_database, starting_partitioned_network, ‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-1.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’}
2022-12-14T06:41:09.040638+00:00 [error] Mnesia(‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-0.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’): ** ERROR ** (core dumped to file: “/opt/emqx/MnesiaCore.emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-0.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local_1671_69_40113”), ** FATAL ** Failed to merge schema: Bad cookie in table definition emqx_shared_subscription: ‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-0.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’ = {cstruct,emqx_shared_subscription,bag,[‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-0.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’],[],[],[],0,read_write,false,[],[],false,emqx_shared_subscription,[group,topic,subpid],[],[],[],{{1669275388470114444,-576460752303421951,1},‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-0.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’},{{18,1},{‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-0.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’,{1670,988330,523848}}}}, ‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-1.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’ = {cstruct,emqx_shared_subscription,bag,[‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-1.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’],[],[],[],0,read_write,false,[],[],false,emqx_shared_subscription,[group,topic,subpid],[],[],[],{{1670988720079328548,-576460752303422079,1},‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-1.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’},{{2,0},[]}}
2022-12-14T06:41:19.040681+00:00 [error] Generic server mnesia_recover terminating. Reason: killed. Last message: {‘EXIT’,<0.1835.0>,killed}. State: {state,<0.1835.0>,undefined,undefined,undefined,0,false,true,[]}.
2022-12-14T06:41:19.040861+00:00 [error] Generic server mnesia_subscr terminating. Reason: killed. Last message: {‘EXIT’,<0.1835.0>,killed}. State: {state,<0.1835.0>,#Ref<0.2424242544.556662785.198007>}.
2022-12-14T06:41:19.041036+00:00 [error] Generic server mnesia_monitor terminating. Reason: killed. Last message: {‘EXIT’,<0.1835.0>,killed}. State: {state,<0.1835.0>,[],[],true,[],undefined,[],[]}.
2022-12-14T06:41:19.040799+00:00 [error] crasher: initial call: gen_event:init_it/6, pid: <0.1833.0>, registered_name: mnesia_event, exit: {killed,[{gen_event,terminate_server,4,[{file,“gen_event.erl”},{line,405}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,226}]}]}, ancestors: [mnesia_sup,<0.1831.0>], message_queue_len: 1, messages: [{notify,{mnesia_system_event,{mnesia_down,‘emqx-6375de4204eb7ea575ba7481@emqx-6375de4204eb7ea575ba7481-0.emqx-6375de4204eb7ea575ba7481-headless.iot.svc.cluster.local’}}}], links: [], dictionary: [], trap_exit: true, status: running, heap_size: 17731, stack_size: 29, reductions: 20641; neighbours:
2022-12-14T06:41:19.040919+00:00 [error] crasher: initial call: application_master:init/4, pid: <0.1830.0>, registered_name: [], exit: {{normal,{mnesia_app,start,[normal,[]]}},[{application_master,init,4,[{file,“application_master.erl”},{line,142}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,226}]}]}, ancestors: [<0.1829.0>], message_queue_len: 1, messages: [{‘EXIT’,<0.1831.0>,normal}], links: [<0.1829.0>,<0.1685.0>], dictionary: [], trap_exit: true, status: running, heap_size: 376, stack_size: 29, reductions: 167; neighbours:
2022-12-14T06:41:19.041375+00:00 [error] crasher: initial call: mnesia_recover:init/1, pid: <0.1839.0>, registered_name: mnesia_recover, exit: {killed,[{gen_server,decode_msg,9,[{file,“gen_server.erl”},{line,481}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,226}]}]}, ancestors: [mnesia_kernel_sup,mnesia_sup,<0.1831.0>], message_queue_len: 0, messages: [], links: [], dictionary: [], trap_exit: true, status: running, heap_size: 2586, stack_size: 29, reductions: 11684; neighbours:
2022-12-14T06:41:19.041475+00:00 [error] crasher: initial call: mnesia_subscr:init/1, pid: <0.1837.0>, registered_name: mnesia_subscr, exit: {killed,[{gen_server,decode_msg,9,[{file,“gen_server.erl”},{line,481}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,226}]}]}, ancestors: [mnesia_kernel_sup,mnesia_sup,<0.1831.0>], message_queue_len: 0, messages: [], links: [], dictionary: [], trap_exit: true, status: running, heap_size: 2586, stack_size: 29, reductions: 6295; neighbours:
2022-12-14T06:41:19.041969+00:00 [error] crasher: initial call: application_master:init/4, pid: <0.1821.0>, registered_name: [], exit: {{bad_return,{{mria_app,start,[normal,[]]},{‘EXIT’,{{badmatch,{error,{normal,{mnesia_app,start,[normal,[]]}}}},[{mria_mnesia,ensure_started,0,[{file,“mria_mnesia.erl”},{line,108}]},{mria_app,start,2,[{file,“mria_app.erl”},{line,35}]},{application_master,start_it_old,4,[{file,“application_master.erl”},{line,293}]}]}}}},[{application_master,init,4,[{file,“application_master.erl”},{line,142}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,226}]}]}, ancestors: [<0.1820.0>], message_queue_len: 1, messages: [{‘EXIT’,<0.1822.0>,normal}], links: [<0.1820.0>,<0.1685.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 29, reductions: 187; neighbours:
2022-12-14T06:41:19.046655+00:00 [error] crasher: initial call: mnesia_monitor:init/1, pid: <0.1836.0>, registered_name: mnesia_monitor, exit: {killed,[{gen_server,decode_msg,9,[{file,“gen_server.erl”},{line,481}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,226}]}]}, ancestors: [mnesia_kernel_sup,mnesia_sup,<0.1831.0>], message_queue_len: 0, messages: [], links: [<0.1862.0>,<53670.1831.0>], dictionary: [], trap_exit: true, status: running, heap_size: 4185, stack_size: 29, reductions: 11067; neighbours:

Environment

  • EMQX version: 5.0.7
  • OS version: deployed on k8s

How did you deploy EMQX: with the EMQX Helm Chart or the EMQX Operator?

helm chart

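From what I can see, this happens when a node is taken offline by shutting the machine down directly, without evicting the pods first.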
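
If the trigger really is a hard shutdown without eviction, draining the Kubernetes node before powering it off should give the EMQX pod a chance to stop cleanly. A minimal sketch, assuming `kubectl` access to the cluster; the function name `drain_node` and the `KUBECTL` override are only illustrative and do not come from EMQX or the Helm chart:

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to do a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Cordon a node and evict its pods before the machine is shut down.
# Usage: drain_node <node-name>
drain_node() {
  local node="$1"
  if [ -z "$node" ]; then
    echo "usage: drain_node <node-name>" >&2
    return 1
  fi
  # Stop new pods from being scheduled onto the node.
  "$KUBECTL" cordon "$node" || return 1
  # Evict the running pods so each one is terminated gracefully.
  "$KUBECTL" drain "$node" --ignore-daemonsets --delete-emptydir-data
}
```

Power the machine off only after `drain_node <node-name>` has returned successfully, and run `kubectl uncordon <node-name>` once the machine is back.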

Is there any way to fix this? I am running into the same problem. Cluster configuration and startup log:
emqx@emqx-clsuter-0:/opt/emqx$ ./bin/emqx ctl conf show cluster
cluster {
  autoclean = 24h
  autoheal = true
  discovery_strategy = dns
  dns {name = emqx-clsuter-headless.emqx-cluster.svc.cluster.local, record_type = srv}
  name = emqxcl
  proto_dist = inet_tcp
}

WARNING: Default (insecure) Erlang cookie is in use.
WARNING: Configure node.cookie in /opt/emqx/etc/emqx.conf or override from environment variable EMQX_NODE__COOKIE
WARNING: NOTE: Use the same cookie for all nodes in the cluster.
EMQX_DASHBOARD__DEFAULT_PASSWORD [dashboard.default_password]: ******
EMQX_DASHBOARD__DEFAULT_USERNAME [dashboard.default_username]: admin
EMQX_RPC__PORT_DISCOVERY [rpc.port_discovery]: manual
EMQX_CLUSTER__DNS__RECORD_TYPE [cluster.dns.record_type]: srv
EMQX_CLUSTER__DNS__NAME [cluster.dns.name]: emqx-clsuter-headless.emqx-cluster.svc.cluster.local
EMQX_CLUSTER__DISCOVERY_STRATEGY [cluster.discovery_strategy]: dns
EMQX_NODE__NAME [node.name]: emqx-clsuter@emqx-clsuter-1.emqx-clsuter-headless.emqx-cluster.svc.cluster.local
2024-01-26T01:31:34.127595+00:00 [error] Mnesia(‘emqx-clsuter@emqx-clsuter-1.emqx-clsuter-headless.emqx-cluster.svc.cluster.local’): ** ERROR ** (core dumped to file: “/opt/emqx/MnesiaCore.emqx-clsuter@emqx-clsuter-1.emqx-clsuter-headless.emqx-cluster.svc.cluster.local_1706_232694_127173”), ** FATAL ** Failed to merge schema: {aborted,function_clause}
2024-01-26T01:31:44.128183+00:00 [error] crasher: initial call: application_master:init/4, pid: <0.1963.0>, registered_name: , exit: {{normal,{mnesia_app,start,[normal,]}},[{application_master,init,4,[{file,“application_master.erl”},{line,142}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,240}]}]}, ancestors: [<0.1962.0>], message_queue_len: 1, messages: [{‘EXIT’,<0.1964.0>,normal}], links: [<0.1962.0>,<0.1801.0>], dictionary: , trap_exit: true, status: running, heap_size: 376, stack_size: 28, reductions: 170; neighbours:
2024-01-26T01:31:44.128519+00:00 [error] Generic server mnesia_subscr terminating. Reason: killed. Last message: {‘EXIT’,<0.1968.0>,killed}. State: {state,<0.1968.0>,#Ref<0.3521175414.1904869377.219840>}.
2024-01-26T01:31:44.128657+00:00 [error] crasher: initial call: mnesia_subscr:init/1, pid: <0.1970.0>, registered_name: mnesia_subscr, exit: {killed,[{gen_server,decode_msg,9,[{file,“gen_server.erl”},{line,909}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,240}]}]}, ancestors: [mnesia_kernel_sup,mnesia_sup,<0.1964.0>], message_queue_len: 0, messages: , links: , dictionary: , trap_exit: true, status: running, heap_size: 1598, stack_size: 28, reductions: 2398; neighbours:
2024-01-26T01:31:44.128446+00:00 [error] Generic server mnesia_recover terminating. Reason: killed. Last message: {‘EXIT’,<0.1968.0>,killed}. State: {state,<0.1968.0>,undefined,undefined,undefined,0,false,true,}.
2024-01-26T01:31:44.128537+00:00 [error] Generic server mnesia_monitor terminating. Reason: killed. Last message: {‘EXIT’,<0.1968.0>,killed}. State: {state,<0.1968.0>,,,true,,undefined,,}.
2024-01-26T01:31:44.128821+00:00 [error] crasher: initial call: mnesia_recover:init/1, pid: <0.1972.0>, registered_name: mnesia_recover, exit: {killed,[{gen_server,decode_msg,9,[{file,“gen_server.erl”},{line,909}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,240}]}]}, ancestors: [mnesia_kernel_sup,mnesia_sup,<0.1964.0>], message_queue_len: 0, messages: , links: , dictionary: , trap_exit: true, status: running, heap_size: 1598, stack_size: 28, reductions: 6718; neighbours:
2024-01-26T01:31:44.128627+00:00 [error] crasher: initial call: gen_event:init_it/6, pid: <0.1966.0>, registered_name: mnesia_event, exit: {killed,[{gen_event,terminate_server,4,[{file,“gen_event.erl”},{line,580}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,240}]}]}, ancestors: [mnesia_sup,<0.1964.0>], message_queue_len: 1, messages: [{notify,{mnesia_system_event,{mnesia_down,‘emqx-clsuter@emqx-clsuter-1.emqx-clsuter-headless.emqx-cluster.svc.cluster.local’}}}], links: , dictionary: , trap_exit: true, status: running, heap_size: 4185, stack_size: 28, reductions: 4574; neighbours:
2024-01-26T01:31:44.128652+00:00 [error] crasher: initial call: application_master:init/4, pid: <0.1955.0>, registered_name: , exit: {{bad_return,{{mria_app,start,[normal,]},{‘EXIT’,{{badmatch,{error,{normal,{mnesia_app,start,[normal,]}}}},[{mria_mnesia,ensure_started,0,[{file,“mria_mnesia.erl”},{line,112}]},{mria_app,start,2,[{file,“mria_app.erl”},{line,36}]},{application_master,start_it_old,4,[{file,“application_master.erl”},{line,293}]}]}}}},[{application_master,init,4,[{file,“application_master.erl”},{line,142}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,240}]}]}, ancestors: [<0.1954.0>], message_queue_len: 1, messages: [{‘EXIT’,<0.1956.0>,normal}], links: [<0.1954.0>,<0.1801.0>], dictionary: , trap_exit: true, status: running, heap_size: 376, stack_size: 28, reductions: 167; neighbours:
2024-01-26T01:31:44.129103+00:00 [error] crasher: initial call: application_master:init/4, pid: <0.1952.0>, registered_name: , exit: {{bad_return,{{emqx_machine_app,start,[normal,]},{‘EXIT’,{{badmatch,{error,{mria,{bad_return,{{mria_app,start,[normal,]},{‘EXIT’,{{badmatch,{error,{normal,{mnesia_app,start,[normal,]}}}},[{mria_mnesia,ensure_started,0,[{file,“mria_mnesia.erl”},{line,112}]},{mria_app,start,2,[{file,“mria_app.erl”},{line,36}]},{application_master,start_it_old,4,[{file,“application_master.erl”},{line,293}]}]}}}}}}},[{mria,start,0,[{file,“mria.erl”},{line,124}]},{ekka,start,0,[{file,“ekka.erl”},{line,94}]},{emqx_machine,start,0,[{file,“emqx_machine.erl”},{line,45}]},{emqx_machine_app,start,2,[{file,“emqx_machine_app.erl”},{line,27}]},{application_master,start_it_old,4,[{file,“application_master.erl”},{line,293}]}]}}}},[{application_master,init,4,[{file,“application_master.erl”},{line,142}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,240}]}]}, ancestors: [<0.1951.0>], message_queue_len: 1, messages: [{‘EXIT’,<0.1953.0>,normal}], links: [<0.1951.0>,<0.1801.0>], dictionary: , trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 188; neighbours:
2024-01-26T01:31:44.129611+00:00 [error] crasher: initial call: mnesia_monitor:init/1, pid: <0.1969.0>, registered_name: mnesia_monitor, exit: {killed,[{gen_server,decode_msg,9,[{file,“gen_server.erl”},{line,909}]},{proc_lib,init_p_do_apply,3,[{file,“proc_lib.erl”},{line,240}]}]}, ancestors: [mnesia_kernel_sup,mnesia_sup,<0.1964.0>], message_queue_len: 1, messages: [{‘$gen_call’,{<0.1973.0>,#Ref<0.3521175414.1904738306.219462>},{close_log,latest_log}}], links: [<56870.1969.0>,<56871.1974.0>,<0.2009.0>], dictionary: , trap_exit: true, status: running, heap_size: 4185, stack_size: 28, reductions: 10473; neighbours:
{“Kernel pid terminated”,application_controller,“{application_start_failure,emqx_machine,{bad_return,{{emqx_machine_app,start,[normal,]},{‘EXIT’,{{badmatch,{error,{mria,{bad_return,{{mria_app,start,[normal,]},{‘EXIT’,{{badmatch,{error,{normal,{mnesia_app,start,[normal,]}}}},[{mria_mnesia,ensure_started,0,[{file,"mria_mnesia.erl"},{line,112}]},{mria_app,start,2,[{file,"mria_app.erl"},{line,36}]},{application_master,start_it_old,4,[{file,"application_master.erl"},{line,293}]}]}}}}}}},[{mria,start,0,[{file,"mria.erl"},{line,124}]},{ekka,start,0,[{file,"ekka.erl"},{line,94}]},{emqx_machine,start,0,[{file,"emqx_machine.erl"},{line,45}]},{emqx_machine_app,start,2,[{file,"emqx_machine_app.erl"},{line,27}]},{application_master,start_it_old,4,[{file,"application_master.erl"},{line,293}]}]}}}}}”}
Kernel pid terminated (application_controller) ({application_start_failure,emqx_machine,{bad_return,{{emqx_machine_app,start,[normal,]},{‘EXIT’,{{badmatch,{error,{mria,{bad_return,{{mria_app,start,[normal,]},{‘EXIT’,{{badmatch,{error,{normal,{mnesia_app,start,[normal,]}}}},[{mria_mnesia,ensure_started,0,[{file,“mria_mnesia.erl”},{line,112}]},{mria_app,start,2,[{file,“mria_app.erl”},{line,36}]},{application_master,start_it_old,4,[{file,“application_master.erl”},{line,293}]}]}}}}}}},[{mria,start,0,[{file,“mria.erl”},{line,124}]},{ekka,start,0,[{file,“ekka.erl”},{line,94}]},{emqx_machine,start,0,[{file,“emqx_machine.erl”},{line,45}]},{emqx_machine_app,start,2,[{file,“emqx_machine_app.erl”},{line,27}]},{application_master,start_it_old,4,[{file,“application_master.erl”},{line,293}]}]}}}}})

Crash dump is being written to: /opt/emqx/log/erl_crash.dump…done

Clearing the data directory on the problem node and bringing it back up fixed it.
[root@1 emqx]# ls
cluster.uuid configs mnesia node.uuid trace
[root@1 emqx]# rm -rf ./*
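
Since `rm -rf ./*` cannot be undone, it may be worth archiving the directory first. A minimal sketch, assuming the backup location is outside the data directory and that another healthy node in the cluster still holds the data the cleared node will copy again after it rejoins; `backup_and_clear` is a hypothetical helper, not an EMQX command:

```shell
# Archive a data directory, then empty it.
# Usage: backup_and_clear <data-dir> <backup-dir>
# <backup-dir> must be outside <data-dir>, otherwise the archive is deleted too.
backup_and_clear() {
  local data_dir="$1" backup_dir="$2"
  if [ ! -d "$data_dir" ]; then
    echo "not a directory: $data_dir" >&2
    return 1
  fi
  mkdir -p "$backup_dir" || return 1
  local archive="$backup_dir/emqx-data-$(date +%Y%m%d%H%M%S).tar.gz"
  # Keep a copy first; stop without deleting anything if archiving fails.
  tar -czf "$archive" -C "$data_dir" . || return 1
  # Remove everything inside the directory (dotfiles included) but keep the
  # directory itself, since it is usually a volume mount point.
  find "$data_dir" -mindepth 1 -delete
  # Print the archive path so the caller knows where the copy went.
  echo "$archive"
}
```

Run it on the affected node only, with the data directory shown above as the first argument, then restart the pod.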