Problem description
Several machines recently logged the following warning at different times on the same day:
Mar 26 20:55:03 host1 kernel: WARNING: at fs/xfs/xfs_aops.c:1045 xfs_vm_releasepage+0xcb/0x100 [xfs]()
Mar 26 20:55:03 host1 kernel: Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables ebtable_filter ebtables ip6table_filter ip6_tables devlink bridge stp llc xt_multiport sunrpc dm_mirror dm_region_hash dm_log dm_mod intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support dcdbas ipmi_devintf ipmi_si sg pcspkr ipmi_msghandler shpchp i2c_i801 lpc_ich nfit libnvdimm acpi_power_meter kgwttm(OE) xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel mgag200 drm_kms_helper igb syscopyarea sysfillrect sysimgblt ptp fb_sys_fops ttm pps_core dca ahci drm i2c_algo_bit libahci megaraid_sas i2c_core libata
Mar 26 20:55:03 host1 kernel: fjes [last unloaded: nf_defrag_ipv4]
Mar 26 20:55:03 host1 kernel: CPU: 10 PID: 224 Comm: kswapd0 Tainted: G OE ------------ 3.10.0-514.21.2.el7.x86_64 #1
Mar 26 20:55:03 host1 kernel: Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 1.3.7 02/08/2018
Mar 26 20:55:03 host1 kernel: 0000000000000000 00000000e02a0d05 ffff88103c7ebaa0 ffffffff81687073
Mar 26 20:55:03 host1 kernel: ffff88103c7ebad8 ffffffff81085cb0 ffffea0000687620 ffffea0000687600
Mar 26 20:55:03 host1 kernel: ffff88004a71daf8 ffff88103c7ebda0 ffffea0000687600 ffff88103c7ebae8
Mar 26 20:55:03 host1 kernel: Call Trace:
Mar 26 20:55:03 host1 kernel: [<ffffffff81687073>] dump_stack+0x19/0x1b
Mar 26 20:55:03 host1 kernel: [<ffffffff81085cb0>] warn_slowpath_common+0x70/0xb0
Mar 26 20:55:03 host1 kernel: [<ffffffff81085dfa>] warn_slowpath_null+0x1a/0x20
Mar 26 20:55:03 host1 kernel: [<ffffffffa038bfdb>] xfs_vm_releasepage+0xcb/0x100 [xfs]
Mar 26 20:55:03 host1 kernel: [<ffffffff81180b22>] try_to_release_page+0x32/0x50
Mar 26 20:55:03 host1 kernel: [<ffffffff81196ad6>] shrink_active_list+0x3d6/0x3e0
Mar 26 20:55:03 host1 kernel: [<ffffffff81196ed1>] shrink_lruvec+0x3f1/0x770
Mar 26 20:55:03 host1 kernel: [<ffffffff811972c6>] shrink_zone+0x76/0x1a0
Mar 26 20:55:03 host1 kernel: [<ffffffff8119857c>] balance_pgdat+0x48c/0x5e0
Mar 26 20:55:03 host1 kernel: [<ffffffff81198843>] kswapd+0x173/0x450
Mar 26 20:55:03 host1 kernel: [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30
Mar 26 20:55:03 host1 kernel: [<ffffffff811986d0>] ? balance_pgdat+0x5e0/0x5e0
Mar 26 20:55:03 host1 kernel: [<ffffffff810b0a4f>] kthread+0xcf/0xe0
Mar 26 20:55:03 host1 kernel: [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
Mar 26 20:55:03 host1 kernel: [<ffffffff81697698>] ret_from_fork+0x58/0x90
Mar 26 20:55:03 host1 kernel: [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
Mar 26 20:55:03 host1 kernel: ---[ end trace 24823c5c7a1ea2be ]---
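The first line of the warning already names the source file, the line number, and the function that raised it. As a minimal sketch (the struct warn_site type and the parse_warn_site helper are hypothetical names introduced here, not part of any kernel or abrt tooling), those fields could be pulled out of a syslog line like this:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields carried by a "WARNING: at <file>:<line> <symbol>+<off>/<size>" line. */
struct warn_site {
    char file[128];
    int line;
    char symbol[64];
    unsigned long offset; /* offset of the return address inside the symbol */
    unsigned long size;   /* total size of the symbol */
};

/*
 * Hypothetical helper: find the "WARNING: at " marker in one syslog line
 * and fill *out. Returns 1 on success, 0 if the line does not match.
 */
int parse_warn_site(const char *log_line, struct warn_site *out)
{
    static const char marker[] = "WARNING: at ";
    const char *p;
    int n;

    assert(log_line != NULL && out != NULL);

    p = strstr(log_line, marker);
    if (p == NULL)
        return 0;
    p += strlen(marker);

    /* file up to ':', decimal line, symbol up to '+', then hex offset/size */
    n = sscanf(p, "%127[^:]:%d %63[^+]+%lx/%lx",
               out->file, &out->line, out->symbol,
               &out->offset, &out->size);
    return n == 5;
}
```

Calling parse_warn_site on the first log line above fills in fs/xfs/xfs_aops.c, 1045 and xfs_vm_releasepage, which are the file, line and function examined in the code analysis further down.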
On these machines, crash information for the kernel and applications is collected by the abrtd service, so a summary can be viewed with abrt-cli:
# abrt-cli list --since 1547518209
id 2181dce8f72761585cb6a904dbff1806c1315c27
reason:         WARNING: at fs/xfs/xfs_aops.c:1045 xfs_vm_releasepage+0xcb/0x100 [xfs]()
time:           Sat 23 Mar 2019 08:30:45 PM CST
cmdline:        BOOT_IMAGE=/boot/vmlinuz-3.10.0-514.16.1.el7.x86_64 root=/dev/sda1 ro crashkernel=auto net.ifnames=0 biosdevname=0
package:        kernel
uid:            0 (root)
count:          1
Directory:      /var/spool/abrt/oops-2019-03-23-20:30:45-163925-0
The kernel version is as follows:
CentOS 7
Linux host1 3.10.0-514.21.2.el7.x86_64
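Analysis and handling
Red Hat Knowledge Base
According to the Red Hat Knowledge Base article, XFS prints this kind of warning when the xfs module traverses this code path, and it does not affect use of the host. Upgrading the kernel to kernel-3.10.0-693.el7 avoids the warning. For details, see: redhat-access-2893711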
Root Cause:
The messages were informational and they do not affect the system in a negative manner. They are seen because the XFS module is traversing through XFS code path.
Code analysis
The Red Hat Knowledge Base article does not mention memory reclaim, but the stack trace suggests the warning was triggered while the kernel was reclaiming memory. Memory usage around that time was as follows:
04:30:01 PM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
......
08:40:01 PM    513940 130976220     99.61       876 104616380  28610584     21.76  92439660  34840920       524
08:50:01 PM    479896 131010264     99.64       876 104666496  28557292     21.72  92513872  34804240       400
09:00:01 PM    455948 131034212     99.65       876 104675712  28588852     21.74  92418724  34926132       572
09:10:01 PM    556980 130933180     99.58       876 104610352  28552656     21.71  94287212  32983892       900

# sysctl vm.min_free_kbytes
vm.min_free_kbytes = 90112
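The percentages in this report can be checked by hand. The sketch below assumes, as the numbers here bear out, that this sar version reports %memused as kbmemused / (kbmemfree + kbmemused). The struct mem_sample type and the three helpers are illustrative names, not part of sysstat:

```c
#include <assert.h>

/* One row of the sar memory report; all values are in KiB. */
struct mem_sample {
    unsigned long kbmemfree;
    unsigned long kbmemused;
    unsigned long kbcached;
};

/* Total memory as implied by the report: free + used. */
static double total_kb(const struct mem_sample *s)
{
    double total = (double)s->kbmemfree + (double)s->kbmemused;

    assert(total > 0.0);
    return total;
}

/* Percentage of total memory in use. */
double pct_memused(const struct mem_sample *s)
{
    return 100.0 * (double)s->kbmemused / total_kb(s);
}

/* Percentage of total memory held by the page cache. */
double pct_cached(const struct mem_sample *s)
{
    return 100.0 * (double)s->kbcached / total_kb(s);
}

/* Signed change in free memory between two samples, in KiB. */
long free_delta_kb(const struct mem_sample *before,
                   const struct mem_sample *after)
{
    return (long)after->kbmemfree - (long)before->kbmemfree;
}
```

For the 08:40 row, pct_memused gives roughly 99.61, matching the report, and pct_cached shows that close to 80% of memory is page cache. Between the 08:50 and 09:00 rows, free_delta_kb is -23948, so free memory shrank slightly over the interval in which the warning was logged.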
Free memory did not increase between 20:50 and 21:00, which suggests the system may not have actually reclaimed memory. Following the stack trace in the kernel log, the call chain is:
shrink_active_list -> try_to_release_page -> xfs_vm_releasepage

//source/mm/filemap.c
3225 int try_to_release_page(struct page *page, gfp_t gfp_mask)
3226 {
3227         struct address_space * const mapping = page->mapping;
......
3233         if (mapping && mapping->a_ops->releasepage)
3234                 return mapping->a_ops->releasepage(page, gfp_mask);   /* -> xfs_vm_releasepage */
3235         return try_to_free_buffers(page);
3236 }

//source/fs/xfs/xfs_aops.c
1034 STATIC int
1035 xfs_vm_releasepage(
1036         struct page             *page,
1037         gfp_t                   gfp_mask)
1038 {
1039         int                     delalloc, unwritten;
1040
1041         trace_xfs_releasepage(page->mapping->host, page, 0, 0);
1042
1043         xfs_count_page_state(page, &delalloc, &unwritten);
1044
1045         if (WARN_ON_ONCE(delalloc))
1046                 return 0;
1047         if (WARN_ON_ONCE(unwritten))
1048                 return 0;
1049
1050         return try_to_free_buffers(page);
1051 }
......
1827 const struct address_space_operations xfs_address_space_operations = {
1833         .releasepage            = xfs_vm_releasepage,
The kernel log line kernel: WARNING: at fs/xfs/xfs_aops.c:1045 shows that line 1045 of source/fs/xfs/xfs_aops.c produced the stack trace. At that point the function returns without ever calling try_to_free_buffers:
1045         if (WARN_ON_ONCE(delalloc))
1046                 return 0;
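The dispatch through a_ops->releasepage and the early return can be reproduced in a small userspace model. This is only a sketch of the control flow: every type and function name ending in _model is a stand-in invented for illustration, the real kernel reaches a_ops through page->mapping, and the warning itself is left out here:

```c
#include <assert.h>
#include <stddef.h>

struct page;

/* Simplified stand-in for struct address_space_operations. */
struct aops_model {
    int (*releasepage)(struct page *page);
};

/* Simplified stand-in for struct page: only what this sketch needs. */
struct page {
    const struct aops_model *a_ops; /* NULL means no filesystem hook */
    int delalloc;                   /* buffers with delayed allocation */
    int unwritten;                  /* buffers mapped to unwritten extents */
    int buffers_freed;              /* set once the buffers are dropped */
};

/* Generic path: drop the buffers and report success. */
int try_to_free_buffers_model(struct page *page)
{
    page->buffers_freed = 1;
    return 1;
}

/*
 * Same control flow as xfs_vm_releasepage: return 0 and free nothing
 * while the page still has delalloc or unwritten buffers.
 */
int xfs_releasepage_model(struct page *page)
{
    if (page->delalloc)
        return 0;
    if (page->unwritten)
        return 0;
    return try_to_free_buffers_model(page);
}

const struct aops_model xfs_aops_model = {
    .releasepage = xfs_releasepage_model,
};

/* Same shape as try_to_release_page: prefer the filesystem hook. */
int try_to_release_page_model(struct page *page)
{
    assert(page != NULL);

    if (page->a_ops && page->a_ops->releasepage)
        return page->a_ops->releasepage(page);
    return try_to_free_buffers_model(page);
}
```

With a page whose a_ops points at xfs_aops_model and whose delalloc is non-zero, try_to_release_page_model returns 0 and buffers_freed stays 0. The same page with delalloc and unwritten both 0 is released through the generic path.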
WARN_ON_ONCE itself is fairly simple and can be found in source/include/asm-generic/bug.h:
#define __WARN()                warn_slowpath_null(__FILE__, __LINE__)

#define WARN_ON(condition) ({                                           \
...
                __WARN();                                               \

#define WARN_ON_ONCE(condition) ({                              \
....
        if (unlikely(__ret_warn_once))                          \
                if (WARN_ON(!__warned))                         \
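The "once" behaviour is easier to see in a userspace approximation. The sketch below is not the kernel macro: WARN_ON_ONCE_MODEL, warn_model, warn_count and release_check_model are names made up for illustration, and it relies on the GNU C statement-expression extension (as the kernel does), so it assumes GCC or Clang:

```c
#include <assert.h>
#include <stdio.h>

int warn_count; /* number of times the slow path actually ran */

/* Stand-in for __WARN(): report the call site and count the call. */
static void warn_model(const char *file, int line)
{
    assert(file != NULL);
    warn_count++;
    fprintf(stderr, "WARNING: at %s:%d\n", file, line);
}

/*
 * Approximation of WARN_ON_ONCE. The static flag belongs to the
 * expansion site, so each use of the macro warns at most once, while
 * the value of the expression is still the condition on every call.
 */
#define WARN_ON_ONCE_MODEL(condition) ({            \
    static int warned_once;                         \
    int ret_warn_once = !!(condition);              \
    if (ret_warn_once && !warned_once) {            \
        warned_once = 1;                            \
        warn_model(__FILE__, __LINE__);             \
    }                                               \
    ret_warn_once;                                  \
})

/* Returns 0 ("not released") whenever delalloc is non-zero. */
int release_check_model(int delalloc)
{
    if (WARN_ON_ONCE_MODEL(delalloc))
        return 0;
    return 1;
}
```

Calling release_check_model(1) three times prints a single warning but returns 0 every time. A single stack trace in the log therefore does not mean the early return was taken only once.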
The __WARN macro calls warn_slowpath_null, which appears in the stack trace, and that in turn calls warn_slowpath_common to print the stack trace:
//source/kernel/panic.c
void warn_slowpath_null(const char *file, int line)
{
        warn_slowpath_common(file, line, __builtin_return_address(0),
                             TAINT_WARN, NULL);
}

static void warn_slowpath_common(const char *file, int line, void *caller,
                                 unsigned taint, struct slowpath_args *args)
{
        disable_trace_on_warning();

        printk(KERN_WARNING "------------[ cut here ]------------\n");
        printk(KERN_WARNING "WARNING: at %s:%d %pS()\n", file, line, caller);

        if (args)
                vprintk(args->fmt, args->args);
......
        print_modules();
        dump_stack();
        print_oops_end_marker();
......
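The second printk above is what produces the first line of the log excerpt. A userspace sketch of that formatting follows. The function name format_warn_header is hypothetical, and because %pS is a kernel-only conversion that resolves an address to symbol+offset/size [module], the already-resolved text is passed in here as a plain string:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build the "WARNING: at <file>:<line> <caller>()" header in buf.
 * Returns what snprintf returns: the length the full header needs,
 * not counting the terminating NUL.
 */
int format_warn_header(char *buf, size_t len,
                       const char *file, int line, const char *caller)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "WARNING: at %s:%d %s()\n",
                    file, line, caller);
}
```

Passing "fs/xfs/xfs_aops.c", 1045 and "xfs_vm_releasepage+0xcb/0x100 [xfs]" reproduces the header seen in the kernel log, without the syslog timestamp and host prefix.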
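From this we can see that the stack trace is only a warning, which matches the Red Hat Knowledge Base description, and it does not affect use of the host.
Summary
Judging from the source above, the stack trace can be printed whenever kswapd calls xfs_vm_releasepage during memory reclaim on a page that still has delalloc buffers. When that happens, try_to_free_buffers is not executed for that page, which is consistent with free memory not increasing in the sar output. The disable_trace_on_warning call is controlled by the kernel.traceoff_on_warning kernel parameter, but enabling it only stops ftrace tracing when a warning fires; it does not suppress the warning or its stack trace. Upgrading the kernel is therefore the only way to avoid this message.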