挽救DG中主庫(kù)的nologging操作的塊
眾所周知我們的Data Guard數(shù)據(jù)同步是基于日志流的。所以在主庫(kù)執(zhí)行nologging操作是不被允許的。這也就是為什么我們需要在配置Data Guard階段需要使用Force Logging。但是這也會(huì)帶來(lái)很多問(wèn)題(SQL執(zhí)行效率),例如:當(dāng)我們使用數(shù)據(jù)泵進(jìn)行遷移時(shí)我們希望最少停機(jī)時(shí)間完成,這時(shí)候我們就可能會(huì)考慮到以最小日志導(dǎo)入的方式以加快導(dǎo)入速度,然后重新同步備庫(kù)。
在一些場(chǎng)景中,我們會(huì)去使用nologging操作去節(jié)省大量數(shù)據(jù)插入的時(shí)間,而這種操作所帶來(lái)的問(wèn)題就是,如果該庫(kù)在有備庫(kù)的情況下,因?yàn)橹鲙?kù)的nologging插入操作不會(huì)生成redo,所以不會(huì)在備庫(kù)上傳輸和應(yīng)用,這會(huì)導(dǎo)致備庫(kù)的數(shù)據(jù)出現(xiàn)問(wèn)題。
在Oracle 11g,如果遇到這樣的問(wèn)題,可以通過(guò)在備庫(kù)恢復(fù)有問(wèn)題的數(shù)據(jù)文件來(lái)解決問(wèn)題,示例如下:
在一個(gè)具有主備關(guān)系的主庫(kù)上將force_logging設(shè)置為nologging模式,隨后創(chuàng)建一張表,設(shè)置為nologging模式
-
SQL> alter database no force logging;
-
SQL> create table DEMO tablespace users pctfree 99 as select rownum n from xmltable('1 to 1000');
-
SQL> alter table DEMO nologging;
之后使用/* +append*/插入數(shù)據(jù)并提交
-
SQL> insert /*+ append */ into DEMO select rownum n from xmltable('1 to 100000');
-
SQL> commit
這時(shí)候在備庫(kù)對(duì)該表進(jìn)行查詢會(huì)看到如下報(bào)錯(cuò)信息
-
SQL>select count(1) from demo;
-
select count(1) from demo
-
*
-
ERROR at line 1:
-
ORA-01578: ORACLE data block corrupted (file # 4, block # 819)
-
ORA-01110: data file 4: '/data/data1/ORCL2/datafile/o1_mf_users_3ft1e9qb_.dbf'
-
ORA-26040: Data block was loaded using the NOLOGGING option
而要修復(fù)這個(gè)問(wèn)題,需要將包含缺少的數(shù)據(jù)的數(shù)據(jù)文件從主庫(kù)復(fù)制到物理備庫(kù)。
步驟一
1、查詢主庫(kù)
-
SQL> SELECT NAME, UNRECOVERABLE_CHANGE# FROM V$DATAFILE;
-
NAME UNRECOVERABLE_CHANGE#
-
--------------------------------------------------------------------------- ---------------------
-
+DATADG/orcl/datafile/system.270.972381717 0
-
+DATADG/orcl/datafile/sysaux.265.972381717 0
-
+DATADG/orcl/datafile/undotbs1.261.972381717 0
-
+DATADG/orcl/datafile/users.259.972381717 6252054
-
+DATADG/orcl/datafile/example.264.972381807 0
-
+DATADG/orcl/datafile/undotbs2.258.972381927 0
-
+DATADG/orcl/datafile/example.266.972400297 0
-
+DATADG/orcl/datafile/ax.268.973612569 0
2、查詢備庫(kù)
-
sys@ORCL>SELECT NAME, UNRECOVERABLE_CHANGE# FROM V$DATAFILE;
-
NAME UNRECOVERABLE_CHANGE#
-
--------------------------------------------------------------------------- ---------------------
-
/data/data1/ORCL2/datafile/o1_mf_system_3dt1e9op_.dbf 0
-
/data/data1/ORCL2/datafile/o1_mf_sysaux_3ct1e9nb_.dbf 0
-
/data/data1/ORCL2/datafile/o1_mf_undotbs1_3gt1e9qq_.dbf 0
-
/data/data1/ORCL2/datafile/o1_mf_users_3ft1e9qb_.dbf 5383754
-
/data/data1/ORCL2/datafile/o1_mf_example_3et1e9ps_.dbf 0
-
/data/data1/ORCL2/datafile/o1_mf_undotbs2_3ht1e9r1_.dbf 0
-
/data/data1/ORCL2/datafile/o1_mf_example_3at1e9nb_.dbf 0
-
/data/data1/ORCL2/datafile/o1_mf_ax_3bt1e9nb_.dbf 0
3、比較主數(shù)據(jù)庫(kù)和備用數(shù)據(jù)庫(kù)的查詢結(jié)果
在兩個(gè)查詢結(jié)果中比較UNRECOVERABLE_CHANGE#列的值。如果主庫(kù)中UNRECOVERABLE_CHANGE#列的值大于備庫(kù)中的同一列,則需要將這些數(shù)據(jù)文件在備庫(kù)恢復(fù)。
步驟二
將主庫(kù)對(duì)應(yīng)的數(shù)據(jù)文件拷貝至備庫(kù)
-
SQL> alter tablespace users begin backup
-
SQL> exit
-
ASMCMD>cp +DATADG/orcl/datafile/users.259.972381717 /tmp
-
$ scp /tmp/users.259.972381717 10.10.60.123:/data/data1/ORCL2/datafile/
-
SQL> alter tablespace users end backup
步驟三
備庫(kù)將舊的數(shù)據(jù)文件rename至新的數(shù)據(jù)文件
-
SQL> alter database recover managed standby database cancel;
-
SQL> alter system set standby_file_management=manual; #在備庫(kù)執(zhí)行rename操作時(shí),需要此參數(shù)為manual
-
SQL> alter database rename file '/data/data1/ORCL2/datafile/o1_mf_users_3ft1e9qb_.dbf' to '/data/data1/ORCL2/datafile/users.259.972381717';
-
SQL> alter system set standby_file_management=auto;
-
SQL> alter database recover managed standby database using current logfile disconnect from session;
之后就可以在備庫(kù)查詢到實(shí)例表DEMO
-
SQL> select count(1) from demo;
-
COUNT(1)
-
----------
-
101000
對(duì)于這種情況,在12.1版本中,RMAN提供了一種便捷的方式讓我們不需要在主庫(kù)上進(jìn)行數(shù)據(jù)文件的備份傳輸而可以在備庫(kù)使用 restore database (or datafile ) from service去從主庫(kù)進(jìn)行恢復(fù)。
當(dāng)然,Oracle的RMAN是足夠聰明的:如果數(shù)據(jù)文件是正常的狀態(tài),RMAN可以根據(jù)它們的數(shù)據(jù)文件頭進(jìn)行跳躍恢復(fù)。如果,由于nologging操作導(dǎo)致某些塊被標(biāo)記為損壞的,那么這部分?jǐn)?shù)據(jù)文件就是需要恢復(fù)的,然后怎么辦?在恢復(fù)命令中有FORCE選項(xiàng)。但我們可能并不需要它。因?yàn)橛行r(shí)候數(shù)據(jù)文件是同步的,實(shí)時(shí)日志應(yīng)用進(jìn)程還是在運(yùn)行的。這個(gè)時(shí)候,為了恢復(fù),我們需要停止應(yīng)用。
一旦我們停止了應(yīng)用,那么我們就不需要執(zhí)行RESOTORE DATABASE FORCE操作,因?yàn)楝F(xiàn)在數(shù)據(jù)文件的狀態(tài)是過(guò)舊的,就算你不加FORCE選項(xiàng)RMAN也是不會(huì)跳過(guò)這些數(shù)據(jù)文件的。
步驟一
備庫(kù)關(guān)掉實(shí)時(shí)日志應(yīng)用,并開(kāi)啟至mount狀態(tài)。
-
SQL> alter database recover managed standby database cancel;
-
SQL> shutdown immediate
-
Database closed.
-
Database dismounted.
-
ORACLE instance shut down.
-
SQL> startup mount
-
ORACLE instance started
步驟二
備庫(kù)登陸RMAN,使用restore database (or datafile ) from service進(jìn)行恢復(fù)
-
RMAN> restore database from service 'primary_db'; #這里的primary_db,為備庫(kù)至主庫(kù)的tns連接串的別名
-
Starting restore at 2018-05-03 17:00:35
-
using target database control file instead of recovery catalog
-
allocated channel: ORA_DISK_1
-
channel ORA_DISK_1: SID=29 device type=DISK
-
channel ORA_DISK_1: starting datafile backup set restore
-
channel ORA_DISK_1: using network backup set from service primary_db
-
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
-
channel ORA_DISK_1: restoring datafile 00001 to /data/data1/ORCL2/datafile/o1_mf_system_02t1t9ck_.dbf
-
channel ORA_DISK_1: restore complete, elapsed time: 00:00:25
-
channel ORA_DISK_1: starting datafile backup set restore
-
channel ORA_DISK_1: using network backup set from service primary_db
-
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
-
channel ORA_DISK_1: restoring datafile 00003 to /data/data1/ORCL2/datafile/o1_mf_sysaux_03t1t9d3_.dbf
-
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15
-
channel ORA_DISK_1: starting datafile backup set restore
-
channel ORA_DISK_1: using network backup set from service primary_db
-
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
-
channel ORA_DISK_1: restoring datafile 00004 to /data/data1/ORCL2/datafile/o1_mf_undotbs1_04t1t9di_.dbf
-
channel ORA_DISK_1: restore complete, elapsed time: 00:00:02
-
channel ORA_DISK_1: starting datafile backup set restore
-
channel ORA_DISK_1: using network backup set from service primary_db
-
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
-
channel ORA_DISK_1: restoring datafile 00006 to /data/data1/ORCL2/datafile/o1_mf_users_05t1t9dm_.dbf
-
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15
-
Finished restore at 2018-05-03 17:01:34
當(dāng)然要記得去起庫(kù)并開(kāi)啟實(shí)時(shí)日志應(yīng)用進(jìn)程!
而在12.2中,Oracle提供了一種更方便的方式去進(jìn)行恢復(fù)主庫(kù)會(huì)將未記錄的塊的列表發(fā)送至備庫(kù),并記錄在備庫(kù)控制文件中,我們可以從備庫(kù)的v$nonlogged_block這個(gè)視圖查看到相關(guān)信息。不需要發(fā)送主庫(kù)的整個(gè)數(shù)據(jù)文件,而是在RMAN執(zhí)行一個(gè)簡(jiǎn)單的命令來(lái)恢復(fù)它們:
RECOVER DATABASE NONLOGGED BLOCK
步驟一
停止備庫(kù)實(shí)時(shí)日志應(yīng)用
-
SQL> alter database recover managed standby database cancel;
步驟二
備庫(kù)登陸RMAN執(zhí)行
RECOVER DATABASE NONLOGGED BLOCK
注意:執(zhí)行此步驟前請(qǐng)確認(rèn)主備庫(kù)的log_archive_config參數(shù)已經(jīng)設(shè)置
-
RMAN> recover database nonlogged block;
-
Starting recover at 2018-05-03 14:54:22
-
using target database control file instead of recovery catalog
-
allocated channel: ORA_DISK_1
-
channel ORA_DISK_1: SID=56 device type=DISK
-
starting recovery of nonlogged blocks
-
List of Datafiles
-
=================
-
File Status Nonlogged Blocks Blocks Examined Blocks Skipped
-
---- ------ ---------------- --------------- --------------
-
1 OK 0 0 107519
-
3 OK 0 0 262399
-
4 OK 0 0 149759
-
5 OK 0 0 31999
-
6 OK 0 0 42239
-
7 OK 0 16707 21532
-
8 OK 0 0 12799
-
9 OK 0 0 76799
-
18 OK 0 0 33279
-
19 OK 0 0 57599
-
20 OK 0 0 24959
-
21 OK 0 0 33279
-
22 OK 0 0 51199
-
23 OK 0 0 12799
-
29 OK 0 0 1310719
-
30 OK 0 0 12799
-
31 OK 0 0 33279
-
32 OK 0 0 52479
-
33 OK 0 0 923519
-
34 OK 0 16822 8777
-
35 OK 0 0 12799
-
37 OK 0 0 24959
-
Details of nonlogged blocks can be queried from v$nonlogged_block view
-
recovery of nonlogged blocks complete, elapsed time: 00:00:08
-
Finished recover at 2018-05-03 14:54:32
最后別忘了開(kāi)啟實(shí)時(shí)日志應(yīng)用進(jìn)程。
綜上來(lái)看,12.2中這個(gè)特性在數(shù)據(jù)倉(cāng)庫(kù)等一些場(chǎng)景是可以嘗試的。以往我們開(kāi)啟force logging造成大量的redo日志并且影響一部分dml語(yǔ)句的執(zhí)行效率。在12.2我們可以嘗試使用nonlogging操作去節(jié)省大量數(shù)據(jù)插入的時(shí)間,然后在系統(tǒng)空閑時(shí)間進(jìn)行備庫(kù)恢復(fù)操作。但是注意這種操作也存在弊端,這樣你的備庫(kù)的可用性就大大降低了。凡事總有取舍!
| 作者簡(jiǎn)介
陳康,沃趣科技數(shù)據(jù)庫(kù)技術(shù)專(zhuān)家
主要參與公司產(chǎn)品實(shí)施、測(cè)試、維護(hù)以及優(yōu)化。