溫馨提示×

溫馨提示×

您好,登錄后才能下訂單哦!

密碼登錄×
登錄注冊×
其他方式登錄
點擊 登錄注冊 即表示同意《億速云用戶服務條款》

PostgreSQL DBA(32) - HA#1(pg_rewind切換)

發(fā)布時間:2020-08-11 05:36:49 來源:ITPUB博客 閱讀:206 作者:husthxd 欄目:關系型數(shù)據(jù)庫

基于streaming replication搭建的PostgreSQL HA環(huán)境,如出現(xiàn)網(wǎng)絡訪問/硬件故障等原因導致Standby節(jié)點升級為Master節(jié)點,但舊Master節(jié)點數(shù)據(jù)庫并未損壞,在排除故障后舊Master節(jié)點是否不需要通過備份方式重新搭建成為Standby節(jié)點呢?答案是肯定的,PG提供了pg_rewind這個工具實現(xiàn).

原理
在PostgreSQL HA環(huán)境中,Standby節(jié)點升級為Master節(jié)點后,時間線會切換為新的時間線,比如從1變?yōu)?.而舊Master節(jié)點的時間線仍然為原來的時間線,比如仍為1,那么使用pg_rewind工具,舊Master節(jié)點如何從新Master節(jié)點讀取相關的數(shù)據(jù)成為新的Standby節(jié)點?
簡單來說,有以下幾步:
1.確定新Master和舊Master數(shù)據(jù)一致性的Checkpoint位置.在該位置上,新Master和舊Master數(shù)據(jù)完全一致.這可以通過讀取新舊Master節(jié)點時間線歷史文件可以獲得,該文件位于$PGDATA/pg_wal/目錄下,文件名稱為XX.history
2.舊Master節(jié)點根據(jù)上一步獲取的Checkpoint讀取本機日志文件WAL Record,獲取在此Checkpoint之后出現(xiàn)變化的Block,并以鏈表的方式存儲Block編號等信息
3.根據(jù)第2步獲取的Block信息從新Master節(jié)點拷貝相應的Block,替換舊Master節(jié)點相應的Block
4.拷貝新Master節(jié)點上除數(shù)據(jù)文件外的所有其他文件,包括配置文件等(如果拷貝數(shù)據(jù)文件,與備份方式搭建區(qū)別不大)
5.舊Master啟動數(shù)據(jù)庫,應用從Checkpoint開始后的WAL Record.

示例
考慮兩節(jié)點的HA環(huán)境,舊Master節(jié)點IP為192.168.26.25,Standby節(jié)點(新Master)IP為192.168.26.26,模擬主庫宕機,備庫升級為主庫,然后通過pg_rewind把舊Master節(jié)點切換為Standby節(jié)點.
切換前,26.25節(jié)點相關信息


testdb=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 1537
usesysid         | 90113
usename          | replicator
application_name | standby_26
client_addr      | 192.168.26.26
client_hostname  | 
client_port      | 53164
backend_start    | 2019-03-27 15:19:09.254987+08
backend_xmin     | 654
state            | streaming
sent_lsn         | 0/6B000060
write_lsn        | 0/6B000060
flush_lsn        | 0/6B000060
replay_lsn       | 0/6B000060
write_lag        | 00:00:00.156457
flush_lag        | 00:00:00.158792
replay_lag       | 00:00:00.158815
sync_priority    | 0
sync_state       | async
testdb=# 
testdb=# select pg_is_in_recovery();
-[ RECORD 1 ]-----+--
pg_is_in_recovery | f

插入測試數(shù)據(jù)


testdb=# create table t_oldmaster(id int);
CREATE TABLE
testdb=# insert into t_oldmaster select generate_series(1,1000000);
INSERT 0 1000000
testdb=#

切換前26.26節(jié)點信息,數(shù)據(jù)已復制,角色為Standby


testdb=# select count(*) from t_oldmaster;
  count  
---------
 1000000
(1 row)
testdb=# select pg_is_in_recovery();
 pg_is_in_recovery 
-------------------
 t
(1 row)

模擬Master節(jié)點宕機


[xdb@localhost testdb]$ ps -ef|grep postgres
xdb       1353     1  0 14:46 pts/1    00:00:00 /appdb/xdb/pg11.2/bin/postgres
xdb       1354  1353  0 14:46 ?        00:00:00 postgres: logger   
xdb       1356  1353  0 14:46 ?        00:00:00 postgres: checkpointer   
xdb       1357  1353  0 14:46 ?        00:00:00 postgres: background writer   
xdb       1358  1353  0 14:46 ?        00:00:00 postgres: walwriter   
xdb       1359  1353  0 14:46 ?        00:00:00 postgres: autovacuum launcher  
xdb       1360  1353  0 14:46 ?        00:00:00 postgres: archiver   last was 00000010000000000000006D
xdb       1361  1353  0 14:46 ?        00:00:00 postgres: stats collector   
xdb       1362  1353  0 14:46 ?        00:00:00 postgres: logical replication launcher  
xdb       1418  1353  0 15:11 ?        00:00:02 postgres: xdb testdb [local] idle
xdb       1537  1353  0 15:19 ?        00:00:00 postgres: walsender replicator 192.168.26.26(53164) streaming 0/6ED4FDA8
xdb       1555  1317  0 15:25 pts/1    00:00:00 grep --color=auto postgres
[xdb@localhost testdb]$ kill -9 1353
[xdb@localhost testdb]$ ps -ef|grep postgres
xdb       1557  1317  0 15:25 pts/1    00:00:00 grep --color=auto postgres
[xdb@localhost testdb]$

Standby節(jié)點執(zhí)行切換
編輯recovery.conf文件,添加trigger_file和recovery_target_timeline


[xdb@localhost testdb]$ cat recovery.conf 
standby_mode = 'on'
primary_conninfo = 'host=192.168.26.25 port=5432 user=replicator password=replicator application_name=standby_26'
restore_command = 'cp /data/archivelog/%f %p'
archive_cleanup_command = 'pg_archivecleanup /data/archivelog %r'
trigger_file = '/tmp/trigger_file'
recovery_target_timeline = 'latest'
#生成trigger文件
[xdb@localhost testdb]$ touch /tmp/trigger_file

重啟Standby節(jié)點


[xdb@localhost testdb]$ pg_ctl restart
pg_ctl: PID file "/data/pgsql/testdb/postmaster.pid" does not exist
Is server running?
trying to start server anyway
waiting for server to start....2019-03-27 15:33:28.218 CST [1901] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2019-03-27 15:33:28.219 CST [1901] LOG:  listening on IPv6 address "::", port 5432
2019-03-27 15:33:28.223 CST [1901] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-03-27 15:33:28.253 CST [1901] LOG:  redirecting log output to logging collector process
2019-03-27 15:33:28.253 CST [1901] HINT:  Future log output will appear in directory "pg_log".
. done
server started
[xdb@localhost testdb]$

日志輸出


2019-03-27 15:33:28.259 CST,,,1903,,5c9b2748.76f,2,,2019-03-27 15:33:28 CST,,0,LOG,00000,"entering standby mode",,,,,,,,,""
2019-03-27 15:33:28.264 CST,,,1903,,5c9b2748.76f,3,,2019-03-27 15:33:28 CST,1/0,0,LOG,00000,"redo starts at 0/6B0164C0",,,,,,,,,""
2019-03-27 15:33:29.416 CST,,,1903,,5c9b2748.76f,4,,2019-03-27 15:33:28 CST,1/0,0,LOG,00000,"consistent recovery state reached at 0/6ED4FDA8",,,,,,,,,""
2019-03-27 15:33:29.416 CST,,,1903,,5c9b2748.76f,5,,2019-03-27 15:33:28 CST,1/0,0,LOG,00000,"invalid record length at 0/6ED4FDA8: wanted 24, got 0",,,,,,,,,""
2019-03-27 15:33:29.416 CST,,,1903,,5c9b2748.76f,6,,2019-03-27 15:33:28 CST,1/0,0,LOG,00000,"trigger file found: /tmp/trigger_file",,,,,,,,,""
2019-03-27 15:33:29.416 CST,,,1903,,5c9b2748.76f,7,,2019-03-27 15:33:28 CST,1/0,0,LOG,00000,"redo done at 0/6ED4FD70",,,,,,,,,""
2019-03-27 15:33:29.416 CST,,,1903,,5c9b2748.76f,8,,2019-03-27 15:33:28 CST,1/0,0,LOG,00000,"last completed transaction was at log time 2019-03-27 15:21:57.519456+08",,,,,,,,,""
2019-03-27 15:33:29.416 CST,,,1901,,5c9b2748.76d,2,,2019-03-27 15:33:28 CST,,0,LOG,00000,"database system is ready to accept read only connections",,,,,,,,,""
2019-03-27 15:33:29.421 CST,,,1903,,5c9b2748.76f,9,,2019-03-27 15:33:28 CST,1/0,0,LOG,00000,"selected new timeline ID: 17",,,,,,,,,""

Standby節(jié)點已升級為新Master節(jié)點
recovery.conf 文件重命名為 recovery.done
/tmp/trigger_file文件被刪除
pg_is_in_recovery返回false


[xdb@localhost testdb]$ cat recovery.done 
standby_mode = 'on'
primary_conninfo = 'host=192.168.26.25 port=5432 user=replicator password=replicator application_name=standby_26'
restore_command = 'cp /data/archivelog/%f %p'
archive_cleanup_command = 'pg_archivecleanup /data/archivelog %r'
trigger_file = '/tmp/trigger_file'
recovery_target_timeline = 'latest'
[root@localhost testdb]# ls -l /tmp/trigger_file
ls: cannot access /tmp/trigger_file: No such file or directory
[xdb@localhost testdb]$ psql -d testdb
psql (11.2)
Type "help" for help.
testdb=# select pg_is_in_recovery();
 pg_is_in_recovery 
-------------------
 f
(1 row)

在pg_wal目錄下生成了時間線history文件


[xdb@localhost testdb]$ ls ./pg_wal/
00000010000000000000006E.partial  00000011000000000000006F  000000110000000000000072
00000010.history                  000000110000000000000070  00000011.history
00000011000000000000006E          000000110000000000000071  archive_status
[xdb@localhost testdb]$ cat ./pg_wal/00000011.history 
7    0/27000000    no recovery target specified
8    0/2A0012E8    no recovery target specified
12    0/32000000    no recovery target specified
14    0/39A63BD0    no recovery target specified
15    0/40A63B08    no recovery target specified
16    0/6ED4FDA8    no recovery target specified
[xdb@localhost testdb]$ cat ./pg_wal/00000010.history 
7    0/27000000    no recovery target specified
8    0/2A0012E8    no recovery target specified
12    0/32000000    no recovery target specified
14    0/39A63BD0    no recovery target specified
15    0/40A63B08    no recovery target specified

生成測試數(shù)據(jù)


testdb=# create table t_new_master(id int);
CREATE TABLE
testdb=# insert into t_new_master select generate_series(1,1000000);
INSERT 0 1000000
testdb=#

舊Master節(jié)點降級為Standby節(jié)點
重新啟動舊master節(jié)點,插入測試數(shù)據(jù),用于模擬時間線分叉后的數(shù)據(jù)變動


[xdb@localhost testdb]$ pg_ctl start
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2019-03-27 15:39:07.989 CST [1561] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2019-03-27 15:39:07.989 CST [1561] LOG:  listening on IPv6 address "::", port 5432
2019-03-27 15:39:07.991 CST [1561] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-03-27 15:39:08.015 CST [1561] LOG:  redirecting log output to logging collector process
2019-03-27 15:39:08.015 CST [1561] HINT:  Future log output will appear in directory "pg_log".
. done
server started
[xdb@localhost testdb]$ psql -d testdb
psql (11.2)
Type "help" for help.
testdb=# 
testdb=# create table t_fork(id int);
CREATE TABLE
testdb=# insert into t_fork select generate_series(1,100000);
INSERT 0 100000
testdb=# 
testdb=# \q
[xdb@localhost testdb]$

關閉舊主庫,執(zhí)行pg_rewind


[xdb@localhost testdb]$ pg_ctl stop
waiting for server to shut down.... done
server stopped
[xdb@localhost testdb]$ pg_rewind --target-pgdata=$PGDATA --source-server="host=192.168.26.26 port=5432 dbname=testdb" --progress --debug -n
connected to server
fetched file "global/pg_control", length 8192
target server needs to use either data checksums or "wal_log_hints = on"
Failure, exiting

主從設置postgres.conf中的wal_log_hints參數(shù)為on,重啟數(shù)據(jù)庫


[xdb@localhost testdb]$ pg_ctl start
waiting for server to start....2019-03-27 15:54:56.038 CST [1603] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2019-03-27 15:54:56.038 CST [1603] LOG:  listening on IPv6 address "::", port 5432
2019-03-27 15:54:56.042 CST [1603] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-03-27 15:54:56.079 CST [1603] LOG:  redirecting log output to logging collector process
2019-03-27 15:54:56.079 CST [1603] HINT:  Future log output will appear in directory "pg_log".
 done
server started
[xdb@localhost testdb]$ pg_ctl stop
waiting for server to shut down.... done
server stopped

再次執(zhí)行pg_rewind


[xdb@localhost testdb]$ pg_rewind --target-pgdata=$PGDATA --source-server="host=192.168.26.26 port=5432 dbname=testdb" --progress --debug -n
connected to server
fetched file "global/pg_control", length 8192
fetched file "pg_wal/00000011.history", length 261
Source timeline history:
Target timeline history:
7: 0/0 - 0/27000000
8: 0/27000000 - 0/2A0012E8
12: 0/2A0012E8 - 0/32000000
14: 0/32000000 - 0/39A63BD0
15: 0/39A63BD0 - 0/40A63B08
16: 0/40A63B08 - 0/0
servers diverged at WAL location 0/6ED4FDA8 on timeline 16
could not open file "/data/pgsql/testdb/pg_wal/00000010000000000000006E": No such file or directory
could not find previous WAL record at 0/6ED4FDA8
Failure, exiting
[xdb@localhost testdb]$

拷貝本機和新Master節(jié)點的歸檔wal文件


[xdb@localhost testdb]$ scp -P 22 root@192.168.26.26:/data/archivelog/* $PGDATA/pg_wal/
The authenticity of host '192.168.26.26 (192.168.26.26)' can't be established.
ECDSA key fingerprint is SHA256:wT+B26a2wJvmsOz8QH2UK6QUqodwBYvDu/WMrmqZ5fU.
ECDSA key fingerprint is MD5:79:63:4d:55:83:d9:d6:78:21:d1:b6:a1:e8:74:0f:b0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.26.26' (ECDSA) to the list of known hosts.
root@192.168.26.26's password: 
00000010000000000000006E.partial                                                           100%   16MB  44.1MB/s   00:00    
00000011000000000000006E                                                                   100%   16MB  71.5MB/s   00:00    
00000011000000000000006F                                                                   100%   16MB  47.0MB/s   00:00    
000000110000000000000070                                                                   100%   16MB  26.7MB/s   00:00    
000000110000000000000071                                                                   100%   16MB  26.7MB/s   00:00    
000000110000000000000072                                                                   100%   16MB  31.6MB/s   00:00    
000000110000000000000073                                                                   100%   16MB  37.1MB/s   00:00    
00000011.history                                                                           100%  261   323.1KB/s   00:00    
[xdb@localhost testdb]$ ls ./pg_wal/
00000010000000000000006A.00000028.backup  000000100000000000000074  00000010.history          000000110000000000000073
00000010000000000000006E.partial          000000100000000000000075  00000011000000000000006E  00000011.history
000000100000000000000070                  000000100000000000000076  00000011000000000000006F  archive_status
000000100000000000000071                  000000100000000000000077  000000110000000000000070  RECOVERYHISTORY
000000100000000000000072                  000000100000000000000078  000000110000000000000071
000000100000000000000073                  000000100000000000000079  000000110000000000000072
[xdb@localhost testdb]$ 
[xdb@localhost testdb]$ cp /data/archivelog/20190327/00000010000000000000006* ./pg_wal

重新執(zhí)行pg_rewind


[xdb@localhost testdb]$ pg_rewind --target-pgdata=$PGDATA --source-server="host=192.168.26.26 port=5432 dbname=testdb" --progress -n
connected to server
servers diverged at WAL location 0/6ED4FDA8 on timeline 16
rewinding from last common checkpoint at 0/6B0164F8 on timeline 16
reading source file list
reading target file list
reading WAL in target
need to copy 296 MB (total source directory size is 437 MB)
303659/303659 kB (100%) copied
creating backup label and updating control file
syncing target data directory
Done!

修改配置文件recovery.conf


[xdb@localhost testdb]$ scp -P 22 root@192.168.26.26:$PGDATA/recovery.done $PGDATA
root@192.168.26.26's password: 
recovery.done                                                                              100%  317   312.7KB/s   00:00    
[xdb@localhost testdb]$ mv recovery.done recovery.conf
[xdb@localhost testdb]$ vim recovery.conf 
[xdb@localhost testdb]$ cat recovery.conf 
standby_mode = 'on'
primary_conninfo = 'host=192.168.26.26 port=5432 user=replicator password=replicator application_name=standby_25'
restore_command = 'cp /data/archivelog/%f %p'
archive_cleanup_command = 'pg_archivecleanup /data/archivelog %r'
[xdb@localhost testdb]$

重啟原主庫

新Master節(jié)點
日志提示有誤,搭建出現(xiàn)問題


2019-03-27 16:22:19.800 CST,"replicator","",2360,"192.168.26.25:32974",5c9b32bb.938,1,"idle",2019-03-27 16:22:19 CST,4/0,0,ERROR,XX000,"requested starting point 0/71000000 on timeline 16 is not in this server's history","This server's history forked from timeline 16 at 0/6ED4FDA8.",,,,,,,,"standby_25"
2019-03-27 16:22:24.813 CST,"replicator","",2361,"192.168.26.25:32976",5c9b32c0.939,1,"idle",2019-03-27 16:22:24 CST,4/0,0,ERROR,XX000,"requested starting point 0/71000000 on timeline 16 is not in this server's history","This server's history forked from timeline 16 at 0/6ED4FDA8.",,,,,,,,"standby_25"

作為失敗的案例,供參考.

參考資料
神奇的pg_rewind

向AI問一下細節(jié)

免責聲明:本站發(fā)布的內容(圖片、視頻和文字)以原創(chuàng)、轉載和分享為主,文章觀點不代表本網(wǎng)站立場,如果涉及侵權請聯(lián)系站長郵箱:is@yisu.com進行舉報,并提供相關證據(jù),一經查實,將立刻刪除涉嫌侵權內容。

AI