您好,登錄后才能下訂單哦!
這篇文章主要介紹“分析MySQL的slave_skip_errors參數(shù)對MGR可用性的影響”,在日常操作中,相信很多人在分析MySQL的slave_skip_errors參數(shù)對MGR可用性的影響問題上存在疑惑,小編查閱了各式資料,整理出簡單好用的操作方法,希望對大家解答”分析MySQL的slave_skip_errors參數(shù)對MGR可用性的影響”的疑惑有所幫助!接下來,請跟著小編一起來學(xué)習(xí)吧!
MGR在遇到表不存在的情況下,節(jié)點(diǎn)沒有退出節(jié)點(diǎn)而是爆出一個警告,并且節(jié)點(diǎn)狀態(tài)也正常,警告如下:
2019-10-17T21:16:11.564211+08:00 10 [Warning] Slave SQL for channel group_replication_applier': Worker 1 failed executing transaction 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:8' at master log , end_log_pos 220; Error executing row event: 'Table 'test.a_1' doesn't exist', Error_code: 1146
集群狀態(tài)如下:
[root@mysql.sock][test]>select * from performance_schema.replication_group_members; +---------------------------+--------------------------------------+-------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+-------------+-------------+--------------+ | group_replication_applier | 9fd479bb-f0d8-11e9-9381-000c29105312 | mysql_1 | 3306 | ONLINE | | group_replication_applier | a8833a96-f0d8-11e9-a9f4-000c291fd9a5 | mysql_2 | 3306 | ONLINE | | group_replication_applier | b2968fe2-f0d8-11e9-a8ff-000c29c89e42 | mysql_3 | 3306 | ONLINE | +---------------------------+--------------------------------------+-------------+-------------+--------------+ 3 rows in set (0.00 sec)
當(dāng)時覺得很奇怪,我們知道這種錯誤即便是在主從情況下也是報(bào)錯的SQL線程退出的,MGR居然還能在線,這種情況數(shù)據(jù)已經(jīng)不同步了,應(yīng)該報(bào)錯并且剔除節(jié)點(diǎn)才對。
隨即一些感興趣的同學(xué)馬上進(jìn)行了測試,測試結(jié)果和上面不一致,測試結(jié)果是報(bào)錯而不是出警告如下:
2019-10-17T09:16:34.317542Z 84 [ERROR] Slave SQL for channel 'group_replication_applier': Error executing row event: 'Table 'test.emp1' doesn't exist', Error_code: 1146
并且這種情況表不存在的節(jié)點(diǎn)已經(jīng)被剔除掉了。下面是正常情況的節(jié)點(diǎn)狀態(tài):
secondary 1節(jié)點(diǎn): [root@mysql.sock][test]>select * from performance_schema.replication_group_members; +---------------------------+--------------------------------------+-------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+-------------+-------------+--------------+ | group_replication_applier | a8833a96-f0d8-11e9-a9f4-000c291fd9a5 | mysql_2 | 3306 | ERROR | +---------------------------+--------------------------------------+-------------+-------------+--------------+ 1 row in set (0.00 sec) secondary 2節(jié)點(diǎn): [root@mysql.sock][test]>select * from performance_schema.replication_group_members; +---------------------------+--------------------------------------+-------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+-------------+-------------+--------------+ | group_replication_applier | b2968fe2-f0d8-11e9-a8ff-000c29c89e42 | mysql_3 | 3306 | ERROR | +---------------------------+--------------------------------------+-------------+-------------+--------------+ 1 row in set (0.00 sec)
那么疑問就是為什么同樣是MGR一個是警告一個是錯誤呢,并且前者還能處于正常同步狀態(tài)。不錯看到題目就知道這里和slave_skip_errors參數(shù)有關(guān)。
我們知道再M(fèi)aster-Slave中如果遇到從庫表不存在肯定是報(bào)錯的,除非設(shè)置slave_skip_errors參數(shù),當(dāng)然我在線上重來沒有設(shè)置過這個參數(shù),并且通過這個案例我們發(fā)現(xiàn)本參數(shù)對MGR也有影響,如下測試方法:
我們在3個節(jié)點(diǎn)都開啟slave-skip-errors= ddl_exist_errors
如下圖:
然后搭建3節(jié)點(diǎn)single-primary模式的MGR集群。
集群搭建正常。
然后執(zhí)行如下操作:
[root@mysql.sock][(none)]>set sql_log_bin=0; Query OK, 0 rows affected (0.00 sec) [root@mysql.sock][(none)]>create table test.a_1(id bigint auto_increment primary key,name varchar(20)); Query OK, 0 rows affected (0.01 sec) [root@mysql.sock][(none)]>set sql_log_bin=1; Query OK, 0 rows affected (0.00 sec)
此時primary節(jié)點(diǎn)是有a_1表的,但是因?yàn)閎inlog關(guān)閉的原因,兩個secondary節(jié)點(diǎn)是不存在a_1表的。
然后我們插入數(shù)據(jù):
[root@mysql.sock][test]>insert into test.a_1 values(null,'tom'); Query OK, 1 row affected (0.02 sec)
此時,primary節(jié)點(diǎn)因?yàn)榇嬖赼_1表,所以能夠插入,但是兩個secondary節(jié)點(diǎn)不存在a_1表,所以插入是失敗的。數(shù)據(jù)產(chǎn)生不一致。正常情況下這種數(shù)據(jù)不一致會導(dǎo)致2個secondary節(jié)點(diǎn)被提出集群才對。但是實(shí)際上3個節(jié)點(diǎn)都是正常的,集群并沒有失效。
[root@mysql.sock][test]>select * from test.a_1; +----+------+ | id | name | +----+------+ | 1 | tom | +----+------+ 1 row in set (0.00 sec) [root@mysql.sock][test]>select * from performance_schema.replication_group_members; +---------------------------+--------------------------------------+-------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+-------------+-------------+--------------+ | group_replication_applier | 9fd479bb-f0d8-11e9-9381-000c29105312 | mysql_1 | 3306 | ONLINE | | group_replication_applier | a8833a96-f0d8-11e9-a9f4-000c291fd9a5 | mysql_2 | 3306 | ONLINE | | group_replication_applier | b2968fe2-f0d8-11e9-a8ff-000c29c89e42 | mysql_3 | 3306 | ONLINE | +---------------------------+--------------------------------------+-------------+-------------+--------------+ 3 rows in set (0.00 sec)
此時去2個secondary節(jié)點(diǎn)讀取test.a_1表,表是不存在的。
secondary 1: [root@mysql.sock][test]>select * from test.a_1; ERROR 1146 (42S02): Table 'test.a_1' doesn't exist [root@mysql.sock][test]> secondary 2: [root@mysql.sock][test]>select * from test.a_1; ERROR 1146 (42S02): Table 'test.a_1' doesn't exist
error log輸出信息:(set global log_error_verbosity = 3;)
2019-10-17T21:16:11.564211+08:00 10 [Warning] Slave SQL for channel 'group_replication_applier': Worker 1 failed executing transaction 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:8' at master log , end_log_pos 220; Error executing row event: 'Table 'test.a_1' doesn't exist', Error_code: 1146
這個設(shè)置在Rows_log_event::do_apply_event 函數(shù)中生效,也就是DML Event開始應(yīng)用的時候生效,這是常規(guī)的SQL線程(或者Worker線程)調(diào)用的。
#ifdef HAVE_REPLICATION if (opt_slave_skip_errors) add_slave_skip_errors(opt_slave_skip_errors); #endif if (open_and_lock_tables(thd, rli->tables_to_lock, 0))//打開表 { uint actual_error= thd->get_stmt_da()->mysql_errno(); if (thd->is_slave_error || thd->is_fatal_error) { if (ignored_error_code(actual_error)) //這里受到 slave_skip_errors 參數(shù)控制 ignored_error_code會將slave_skip_errors的參數(shù)設(shè)置讀取出來 { if (log_warnings > 1) rli->report(WARNING_LEVEL, actual_error, "Error executing row event: '%s'", (actual_error ? thd->get_stmt_da()->message_text() : "unexpected success or fatal error")); thd->get_stmt_da()->reset_condition_info(thd); clear_all_errors(thd, const_cast<Relay_log_info*>(rli)); error= 0; goto end; } else { rli->report(ERROR_LEVEL, actual_error, "Error executing row event: '%s'", (actual_error ? thd->get_stmt_da()->message_text() : "unexpected success or fatal error")); thd->is_slave_error= 1; const_cast<Relay_log_info*>(rli)->slave_close_thread_tables(thd); DBUG_RETURN(actual_error); } }
可以看到MGR的執(zhí)行邏輯受到了該參數(shù)的影響。
到此,關(guān)于“分析MySQL的slave_skip_errors參數(shù)對MGR可用性的影響”的學(xué)習(xí)就結(jié)束了,希望能夠解決大家的疑惑。理論與實(shí)踐的搭配能更好的幫助大家學(xué)習(xí),快去試試吧!若想繼續(xù)學(xué)習(xí)更多相關(guān)知識,請繼續(xù)關(guān)注億速云網(wǎng)站,小編會繼續(xù)努力為大家?guī)砀鄬?shí)用的文章!
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場,如果涉及侵權(quán)請聯(lián)系站長郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。