While checking SMIDB today, I found a large number of errors in the CRS alert log:
2015-08-19 17:12:21.745:
[/oracle/app/11.2.0/grid_1/bin/oraagent.bin(6227)]CRS-5013:Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"
2015-08-19 17:13:09.986:
[/oracle/app/11.2.0/grid_1/bin/oraagent.bin(6227)]CRS-5013:Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"
2015-08-19 17:13:21.758:
[/oracle/app/11.2.0/grid_1/bin/oraagent.bin(6227)]CRS-5013:Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"
Following the pointer into the agent log (oraagent_grid.log) showed:
2015-08-19 17:14:09.993: [ora.LISTENER.lsnr][1342174976]{1:63186:26462} [check] clsn_agent::check: Exception SclsProcessSpawnException
2015-08-19 17:14:21.744: [ora.asm][1342174976]{0:21:2} [check] CrsCmd::ClscrsCmdData::stat entity 1 statflag 33 useFilter 0
2015-08-19 17:14:21.759: [ora.asm][1342174976]{0:21:2} [check] AsmProxyAgent::check clsagfw_res_status 0
2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] Utils:execCmd action = 3 flags = 38 ohome = (null) cmdname = lsnrctl.
2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] (:CLSN00008:)Utils:execCmd scls_process_spawn() failed 1
2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] (:CLSN00008:) category: -2, operation: fork, loc: spawnproc28, OS error: 11, other: forked failed [-1]
2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] clsnUtils::error Exception type=2 string=
CRS-5013: Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"
The ONS log:
[grid@smidb11 logs]$ tail ons.out
pthread_create() Resource temporarily unavailable
pthread_create() Resource temporarily unavailable
pthread_create() Resource temporarily unavailable
pthread_create() Resource temporarily unavailable
pthread_create() Resource temporarily unavailable
pthread_create() Resource temporarily unavailable
pthread_create() Resource temporarily unavailable
pthread_create() Resource temporarily unavailable
pthread_create() Resource temporarily unavailable
[2015-05-07T03:09:22+08:00] [ons] [TRACE:2] [] [internal] ONS worker process stopped (0)
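This error ("Resource temporarily unavailable", which is the OS error 11 / EAGAIN reported by the failed fork in the agent log) means that new processes and threads cannot be started because a system resource limit has been reached. Check the ulimit setting: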
[grid@smidb11 logs]$ ulimit -u
10240
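On Linux the nproc limit counts threads as well as processes for a given real user ID, so it can be useful to compare the limit with the number of threads the user currently owns. The following is only a minimal sketch, assuming a Linux /proc filesystem; the function name count_threads_for_uid and its optional proc_root argument are hypothetical and introduced here purely for illustration.

```shell
# Sum the "Threads:" counts of every process whose real UID matches.
# Usage: count_threads_for_uid UID [PROC_ROOT]   (PROC_ROOT defaults to /proc)
count_threads_for_uid() {
    uid="$1"
    proc_root="${2:-/proc}"
    total=0
    for status in "$proc_root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        # Field 2 of the "Uid:" line is the real UID. A process may exit
        # while it is being read, so any awk failure counts as zero threads.
        n=$(awk -v uid="$uid" '
            /^Uid:/     { owned = ($2 == uid) }
            /^Threads:/ { threads = $2 }
            END         { print (owned ? threads + 0 : 0) }
        ' "$status" 2>/dev/null) || n=0
        total=$((total + ${n:-0}))
    done
    echo "$total"
}
```

For example, count_threads_for_uid "$(id -u grid)" prints the current thread total for the grid user, which can then be compared with the limit that is actually in effect for the running processes.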
/etc/security/limits.conf:
# End of file
grid soft nproc 10240
grid hard nofile 65536
oracle soft nproc 10240
oracle hard nofile 65536
The limits.conf configuration is incomplete: neither hard nproc nor soft nofile is set. This will be corrected before the restart next Monday.
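A gap like this can be spotted mechanically. The sketch below is one possible way to do it, not an official tool: missing_limits is a hypothetical helper that reports which of the four soft/hard nproc/nofile combinations have no entry for a user. It treats a "-" in the type column as covering both soft and hard, and it only looks at exact user names, not group or wildcard entries.

```shell
# Print each "<type> <item>" pair (soft/hard x nproc/nofile) that has no
# entry for the given user in a limits.conf-style file.
# Usage: missing_limits FILE USER
missing_limits() {
    file="$1"
    user="$2"
    for item in nproc nofile; do
        for kind in soft hard; do
            # Columns are: <domain> <type> <item> <value>. Comment lines
            # start with "#", so their first field never equals the user.
            if ! awk -v u="$user" -v t="$kind" -v i="$item" '
                    $1 == u && ($2 == t || $2 == "-") && $3 == i { found = 1 }
                    END { exit !found }
                 ' "$file"; then
                echo "$kind $item"
            fi
        done
    done
}
```

Run against the file shown above with missing_limits /etc/security/limits.conf grid, it would print "hard nproc" and "soft nofile" on separate lines.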
[grid@smidb11 pam.d]$ cat login
#%PAM-1.0
auth [user_unknown=ignore success=ok ignore=ignore default=bad] pam_securetty.so
auth include system-auth
account required pam_nologin.so
account include system-auth
password include system-auth
# pam_selinux.so close should be the first session rule
session required pam_selinux.so close
session required pam_loginuid.so
session optional pam_console.so
# pam_selinux.so open should only be followed by sessions to be executed in the user context
session required pam_selinux.so open
session required pam_namespace.so
session optional pam_keyinit.so force revoke
session include system-auth
-session optional pam_ck_connector.so
[grid@smidb11 pam.d]$
The /etc/pam.d/login file does not load the resource limits module. The following line should be added to it:
session required /lib64/security/pam_limits.so
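Since the login file also pulls in system-auth through "session include system-auth", the module may already be loaded from there on some systems, so it is probably worth checking both files before editing. A small sketch, with pam_limits_files as a hypothetical helper name:

```shell
# List the PAM files (from those given) that contain an active session
# line loading pam_limits.so. Lines starting with "#" are ignored.
# Usage: pam_limits_files FILE...
pam_limits_files() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        if grep -Eq '^[[:space:]]*-?session[[:space:]]+.*pam_limits\.so' "$f"; then
            echo "$f"
        fi
    done
}
```

A call such as pam_limits_files /etc/pam.d/login /etc/pam.d/system-auth prints only the files that actually reference the module; empty output would suggest it is not loaded for login sessions at all.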
After searching online, I found an Oracle MOS document that matches our situation exactly:
The processes and resources started by CRS (Grid Infrastructure) do not inherit the ulimit setting for "max user processes" from /etc/security/limits.conf setting (Doc ID 1594606.1)
Verification confirmed this: although ulimit -u for our grid user is set to 10240, the limit actually in effect for the running processes is still 1024.
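One way to perform this kind of check is to read the limits file that Linux exposes for each running process. The sketch below assumes the usual /proc/<pid>/limits layout (limit name, soft limit, hard limit, units); max_processes_soft is a hypothetical helper name.

```shell
# Print the soft "Max processes" limit from a /proc/<pid>/limits-style file.
# Usage: max_processes_soft LIMITS_FILE
max_processes_soft() {
    # The line looks like: "Max processes  <soft>  <hard>  processes",
    # so the soft limit is the third whitespace-separated field.
    awk '/^Max processes/ { print $3 }' "$1"
}

# Example (not run here): report the limit of every oraagent.bin process.
# for pid in $(pgrep -f oraagent.bin); do
#     echo "$pid: $(max_processes_soft "/proc/$pid/limits")"
# done
```

If the value printed for the agent processes differs from what ulimit -u reports in a login shell, the processes are not inheriting the limits.conf setting.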
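This is Oracle Bug 17301761. Our database version is 11.2.0.4, which falls within the range of versions affected by the bug.
There are two ways to resolve it:
1. Apply the patch.
2. Work around it using the method given in the MOS note, as follows: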
The ohasd script needs to be modified to set the ulimit explicitly for all grid and database resources that are started by the Grid Infrastructure (GI).
1) go to GI_HOME/bin
2) make a backup of ohasd script file
3) in the ohasd script file, locate the following code:
Linux)
# MEMLOCK limit is for Bug 9136459
ulimit -l unlimited
if [ "$?" != "0" ]
then
$CLSECHO -phas -f crs -l -m 6021 "l" "unlimited"
fi
ulimit -c unlimited
if [ "$?" != "0" ]
then
$CLSECHO -phas -f crs -l -m 6021 "c" "unlimited"
fi
ulimit -n 65536
In the above code, insert the following line just before the line with "ulimit -n 65536"
ulimit -u 16384
4) Recycle CRS manually so that ohasd picks up the new ulimit setting for max user processes.
After the database is started, please issue "ps -ef | grep pmon" and get the pid of it.
Then, issue "cat /proc/<pid of the pmon process>/limits | grep process" and find out if the Max processes value is set to 16384.
Setting the number of processes to 16384 should be enough for most servers, since having 16384 processes normally means the server is very heavily loaded. A smaller number such as 4096 or 8192 should also suffice for most users.
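The backup and insert steps for the ohasd script could be scripted roughly as follows. This is only a sketch and not part of Oracle's tooling: insert_before is a hypothetical helper, GI_HOME in the usage line is assumed to be an environment variable pointing at the Grid home, and the whole thing should be tried on a copy of the file first. It keeps a one-time backup as FILE.bak, skips the insert when the new line already appears before the anchor, and writes through the existing file so its ownership and mode are preserved.

```shell
# Insert NEW_LINE immediately before the first line equal to ANCHOR
# (leading and trailing blanks ignored). Returns 1 if ANCHOR is not found
# and NEW_LINE is not already present.
# Usage: insert_before FILE ANCHOR NEW_LINE
insert_before() {
    file="$1"; anchor="$2"; new_line="$3"
    tmp=$(mktemp) || return 1
    awk -v anchor="$anchor" -v new_line="$new_line" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        trim($0) == new_line { present = 1 }
        !inserted && !present && trim($0) == anchor { print new_line; inserted = 1 }
        { print }
        END { exit (inserted || present) ? 0 : 1 }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Create the backup only once so a second run cannot overwrite it.
    { [ -e "$file.bak" ] || cp -p "$file" "$file.bak"; } && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Example (not run here):
# insert_before "$GI_HOME/bin/ohasd" 'ulimit -n 65536' 'ulimit -u 16384'
```

The same helper would fit the crswrap.sh.sbs template change, with the "# MEMLOCK limit is for Bug 9136459" comment line as the anchor.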
In addition to the above, the ohasd template needs to be modified to ensure that the new ulimit setting persists even after a patch is applied.
1) go to GI_HOME/crs/sbs
2) make a backup of crswrap.sh.sbs
3) in crswrap.sh.sbs, insert the following line just before the line "# MEMLOCK limit is for Bug 9136459"
ulimit -u 16384
Finally, although the above setting has been used successfully to increase the limit on the number of processes, please test it on a test server first before setting the ulimit in production.
Reference: http://blog.csdn.net/weiwangsisoftstone/article/details/42460585