您好,登錄后才能下訂單哦!
一業(yè)務(wù)系統(tǒng)在白天業(yè)務(wù)時(shí)間出現(xiàn)了嚴(yán)重了IO性能問題,下面是下午業(yè)務(wù)高峰時(shí)間(3-5)的awr報(bào)告
從等待事件來看主要都是與IO相關(guān)
從上面可以看到除了幾個(gè)語句的邏輯讀很高,其實(shí)物理不是很高,每秒產(chǎn)生的重做日志以及物理讀也不高.
檢查磁盤IO
rx6600-1:[/]#sar -d 1 10 HP-UX rx6600-1 B.11.23 U ia64 07/15/14 16:18:45 device %busy avque r+w/s blks/s avwait avserv 16:18:46 c39t0d3 100.00 0.50 18 1130 0.00 132.11 c41t0d3 83.50 0.50 6 450 0.00 290.78 16:18:47 c3t0d0 0.99 0.50 2 63 0.00 7.12 c39t0d3 91.09 0.50 10 982 0.00 115.53 c41t0d3 100.00 0.50 12 586 0.00 291.67 16:18:48 c3t0d0 3.03 0.50 2 32 0.00 15.93 c39t0d3 100.00 0.50 9 1034 0.00 139.76 c41t0d3 92.93 0.50 7 388 0.00 310.07 16:18:49 c3t0d0 2.00 0.50 4 64 0.00 19.59 c39t0d3 100.00 0.50 12 1088 0.00 127.33 c41t0d3 86.00 0.50 8 416 0.00 251.32 16:18:50 c3t0d0 1.01 0.50 1 2 0.00 8.99 c39t0d3 100.00 0.50 16 954 0.00 117.10 c41t0d3 100.00 0.50 9 614 0.00 295.52 16:18:51 c3t0d0 0.99 0.50 1 8 0.00 10.60 c39t0d3 93.07 0.50 17 913 0.00 110.59 c41t0d3 100.00 0.50 9 350 0.00 326.92 16:18:52 c39t0d3 100.00 0.50 21 1168 0.00 127.22 c41t0d3 88.00 0.50 11 544 0.00 252.08 16:18:53 c3t0d0 2.02 0.50 3 48 0.00 18.51 c39t0d3 88.89 0.50 19 1164 0.00 98.25 c41t0d3 100.00 0.50 11 630 0.00 324.39 16:18:54 c3t0d0 3.00 0.50 3 20 0.00 12.39 c39t0d3 95.00 0.50 20 954 0.00 131.90 c41t0d3 81.00 0.50 9 610 0.00 289.05 16:18:55 c3t0d0 9.00 0.50 11 134 0.00 8.62 c39t0d3 100.00 0.50 19 1090 0.00 137.20 c41t0d3 100.00 0.50 11 512 0.00 327.16 Average c39t0d3 99.50 0.50 16 1048 0.00 123.38 Average c41t0d3 100.00 0.50 9 510 0.00 296.44 Average c3t0d0 2.20 0.50 3 37 0.00 12.28 rx6600-1:[/]#sar -d 1 10 HP-UX rx6600-1 B.11.23 U ia64 07/15/14 16:20:04 device %busy avque r+w/s blks/s avwait avserv 16:20:05 c3t0d0 1.00 0.50 1 16 0.00 8.33 c39t0d3 98.00 0.50 16 928 0.00 114.86 c41t0d3 98.00 0.50 10 684 0.00 266.43 16:20:06 c3t0d0 1.98 0.50 4 81 0.00 8.57 c39t0d3 93.07 0.50 19 1251 0.00 128.81 c41t0d3 91.09 0.50 6 475 0.00 365.83 16:20:07 c3t0d0 2.00 0.50 3 48 0.00 5.87 c39t0d3 98.00 0.50 23 1216 0.00 113.66 c41t0d3 98.00 0.50 8 576 0.00 307.92 16:20:08 c3t0d0 1.00 0.50 2 32 0.00 5.36 c39t0d3 100.00 0.50 21 1132 0.00 118.47 c41t0d3 100.00 0.50 7 592 0.00 300.71 16:20:09 c3t0d0 6.00 0.58 13 194 2.22 26.05 c39t0d3 89.00 0.50 17 1152 0.00 123.54 c41t0d3 87.00 0.50 8 512 0.00 298.26 16:20:10 c3t0d0 3.00 0.50 6 96 0.00 22.78 c39t0d3 85.00 0.50 17 1136 0.00 114.79 c41t0d3 98.00 0.50 9 592 0.00 252.52 16:20:11 c3t0d0 1.00 0.50 1 2 0.00 8.04 c39t0d3 100.00 0.50 17 1216 0.00 138.04 c41t0d3 100.00 0.50 12 672 0.00 291.69 16:20:12 c3t0d0 2.00 0.50 3 34 0.00 9.24 c39t0d3 99.00 0.50 16 1024 0.00 122.11 c41t0d3 88.00 0.50 9 476 0.00 299.79 16:20:13 c39t0d3 91.00 0.50 18 1024 0.00 111.77 c41t0d3 92.00 0.50 3 384 0.00 396.25 16:20:14 c39t0d3 99.00 0.50 17 892 0.00 132.15 c41t0d3 100.00 0.50 10 608 0.00 233.54 Average c3t0d0 1.80 0.53 3 50 0.87 17.64 Average c39t0d3 96.00 0.50 18 1097 0.00 121.54 Average c41t0d3 100.00 0.50 8 557 0.00 290.35
在業(yè)務(wù)人員下班后重啟的雙機(jī)軟件,但在啟動(dòng)數(shù)據(jù)庫(kù)時(shí)停在了Completed redo application這一步
SQL> startup ORACLE instance started. Total System Global Area 1.0318E+10 bytes Fixed Size 2073176 bytes Variable Size 3238006184 bytes Database Buffers 7063207936 bytes Redo Buffers 14700544 bytes Database mounted.
從alert.log文件中可以看到如下信息:
Tue Jul 15 22:23:29 2014 Starting ORACLE instance (normal) LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 Picked latch-free SCN scheme 3 Autotune of undo retention is turned on. IMODE=BR ILAT =61 LICENSE_MAX_USERS = 0 SYS auditing is disabled ksdpec: called for event 13740 prior to event group initialization Starting up ORACLE RDBMS Version: 10.2.0.4.0. System parameters with non-default values: processes = 500 sessions = 555 __shared_pool_size = 3154116608 __large_pool_size = 16777216 __java_pool_size = 33554432 __streams_pool_size = 33554432 sga_target = 10317987840 control_files = /sx_data/ORCL/control01.ctl, /sx_data/ORCL/control02.ctl, /sx_data/ORCL/control03.ctl db_block_size = 8192 __db_cache_size = 7063207936 compatible = 10.2.0.3.0 log_archive_dest_1 = LOCATION=/sx_data/arch_ORCL/ log_archive_format = %t_%s_%r.dbf db_file_multiblock_read_count= 16 db_recovery_file_dest = /oracle/flash_recovery_area db_recovery_file_dest_size= 2147483648 undo_management = AUTO undo_tablespace = UNDOTBS1 undo_retention = 39600 fast_start_parallel_rollback= FALSE remote_login_passwordfile= EXCLUSIVE db_domain = dispatchers = (PROTOCOL=TCP) (SERVICE=ORCLXDB) local_listener = ORCL job_queue_processes = 10 background_dump_dest = /oracle/admin/ORCL/bdump user_dump_dest = /oracle/admin/ORCL/udump core_dump_dest = /oracle/admin/ORCL/cdump audit_file_dest = /oracle/admin/ORCL/adump db_name = ORCL open_cursors = 2000 optimizer_index_cost_adj = 20 optimizer_index_caching = 90 pga_aggregate_target = 2576351232 PMON started with pid=2, OS id=13613 PSP0 started with pid=3, OS id=13615 MMAN started with pid=4, OS id=13617 DBW0 started with pid=5, OS id=13619 LGWR started with pid=6, OS id=13621 CKPT started with pid=7, OS id=13623 SMON started with pid=8, OS id=13625 RECO started with pid=9, OS id=13627 CJQ0 started with pid=10, OS id=13629 MMON started with pid=11, OS id=13631 Tue Jul 15 22:23:30 2014 starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'... MMNL started with pid=12, OS id=13635 Tue Jul 15 22:23:30 2014 starting up 1 shared server(s) ... Tue Jul 15 22:23:31 2014 ALTER DATABASE MOUNT Tue Jul 15 22:23:39 2014 Setting recovery target incarnation to 2 Tue Jul 15 22:23:42 2014 Successful mount of redo thread 1, with mount id 1380841571 Tue Jul 15 22:23:42 2014 Database mounted in Exclusive Mode Completed: ALTER DATABASE MOUNT Tue Jul 15 22:23:42 2014 ALTER DATABASE OPEN Tue Jul 15 22:23:47 2014 Beginning crash recovery of 1 threads parallel recovery started with 7 processes Tue Jul 15 22:23:50 2014 Started redo scan Tue Jul 15 22:23:52 2014 Completed redo scan 336597 redo blocks read, 78835 data blocks need recovery Tue Jul 15 22:23:52 2014 Started redo application at Thread 1: logseq 2270, block 29 Tue Jul 15 22:23:53 2014 Recovery of Online Redo Log: Thread 1 Group 4 Seq 2270 Reading mem 0 Mem# 0: /sx_data/ORCL/redo04.log Tue Jul 15 22:23:58 2014 Completed redo application
一直停在Completed redo application這,而這時(shí)的等待事件是checkpoint complete開始以為是并行恢復(fù)慢造成的,就查詢了v$transaction,v$fast_start_transactions但視圖中并沒有進(jìn)行恢復(fù)操作的事務(wù)存在.后來咨詢了老熊,老熊說檢查一下IO情況看是不是存儲(chǔ)出問題了,如是再次檢查存儲(chǔ)IO性能:
rx6600-1:[/]#sar 1 10 HP-UX rx6600-1 B.11.23 U ia64 07/15/14 22:36:41 %usr %sys %wio %idle 22:36:42 2 2 12 84 22:36:43 1 0 12 87 22:36:44 0 0 17 83 22:36:45 0 0 13 87 22:36:46 0 1 12 87 22:36:47 2 1 13 84 22:36:48 1 1 16 82 22:36:49 0 0 12 88 22:36:50 0 0 12 88 22:36:51 0 0 22 78 Average 1 0 14 85
從上面可以看到現(xiàn)在實(shí)際上并沒有業(yè)務(wù)在跑居然還存在IO等待這是不正常的
rx6600-2:[/]#bdf Filesystem kbytes used avail %used Mounted on /dev/vg00/lvol3 983040 422504 556176 43% / /dev/vg00/lvol1 1835008 135048 1686776 7% /stand /dev/vg00/lvol8 8912896 8535352 374824 96% /var /dev/vg00/lvol7 7962624 2762312 5159704 35% /usr /dev/vg00/lvol4 524288 83192 437784 16% /tmp /dev/vg00/tmplv 2064384 93512 1847942 5% /oratmp /dev/vg00/orasoft 10256384 3144652 6668390 32% /orasoft /dev/vg00/oracle 20480000 5480497 14062042 28% /oracle /dev/vg00/lvol6 9076736 5206384 3840128 58% /opt /dev/vg00/lvol5 131072 25472 104824 20% /home /dev/cwjcvg/cwjc_datalv 414973952 134188114 263239842 34% /cwjc_data /dev/sxvg/sx_datalv 624689152 298665485 305654147 49% /sx_data rx6600-2:[/]#time dd if=/dev/zero of=/var/test bs=8k count=100000
下面對(duì)小機(jī)自身的磁盤進(jìn)行IO測(cè)試寫800M的數(shù)據(jù)只要12秒左右
msgcnt 2 vxfs: mesg 001: vx_nospace - /dev/vg00/lvol8 file system full (1 block extent) I/O error 47185+0 records in 47184+1 records out real 11.7 user 0.0 sys 0.8
但是對(duì)EMC存儲(chǔ)進(jìn)行IO測(cè)試寫800M的數(shù)據(jù)只要30多分還沒有完成
rx6600-2:[/]#time dd if=/dev/zero of=/sx_data/test bs=8k count=100000 711856+0 records in 711855+0 records out real 30:58.4 user 0.5 sys 13.0
這明顯的是存儲(chǔ)出了問題,后面得知管理人員早上10點(diǎn)多發(fā)現(xiàn)了存儲(chǔ)有一個(gè)磁盤損壞了,存儲(chǔ)做的raid 5,有熱備盤.而且還有上百G的數(shù)據(jù)進(jìn)行存儲(chǔ)級(jí)的同步.與出現(xiàn)性能問題的時(shí)間一至。
到此問題原因找到了解決起來也就簡(jiǎn)單了.幸虧問題解決了,第二天有大領(lǐng)導(dǎo)來檢查要不就.......哈哈
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如果涉及侵權(quán)請(qǐng)聯(lián)系站長(zhǎng)郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。