Ansible triggers the oom-killer

Arch Linux
Output of uname -a:

Linux localhost 4.7.2-1-ARCH #1 SMP PREEMPT Sat Aug 20 23:02:56 CEST 2016 x86_64 GNU/Linux

16 GB of RAM, 14 GB of swap

When I run a large Ansible job, the oom-killer is invoked. I would expect 16 GB to be enough for this kind of task, but I am no expert on OOM logs (or on Linux memory management). Here is the log:

Feb 14 11:35:36 localhost kernel: Out of memory: Kill process 22698 (systemd-coredum) score 503 or sacrifice child
Feb 14 11:35:36 localhost kernel: Killed process 22698 (systemd-coredum) total-vm:880316kB, anon-rss:37604kB, file-rss:67380kB, shmem-rss:0kB
Feb 14 11:42:52 localhost kernel: ansible invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, oom_score_adj=0
Feb 14 11:42:52 localhost kernel: ansible cpuset=/ mems_allowed=0
Feb 14 11:42:52 localhost kernel: CPU: 0 PID: 27123 Comm: ansible Not tainted 4.7.2-1-ARCH #1
Feb 14 11:42:52 localhost kernel: Hardware name: Dell Inc. OptiPlex 7020/08WKV3, BIOS A02 11/20/2014
Feb 14 11:42:52 localhost kernel:  0000000000000286 00000000a544d0e1 ffff8803b3147b48 ffffffff812eb132
Feb 14 11:42:52 localhost kernel:  ffff8803b3147d28 ffff88024193f000 ffff8803b3147bb8 ffffffff811f6e5c
Feb 14 11:42:52 localhost kernel:  ffff8803b3148000 0000000000000000 ffffffff81b28920 ffffffff811789c0
Feb 14 11:42:52 localhost kernel: Call Trace:
Feb 14 11:42:52 localhost kernel:  [<ffffffff812eb132>] dump_stack+0x63/0x81
Feb 14 11:42:52 localhost kernel:  [<ffffffff811f6e5c>] dump_header+0x60/0x1e8
Feb 14 11:42:52 localhost kernel:  [<ffffffff811789c0>] ? page_alloc_cpu_notify+0x50/0x50
Feb 14 11:42:52 localhost kernel:  [<ffffffff811762fa>] oom_kill_process+0x22a/0x440
Feb 14 11:42:52 localhost kernel:  [<ffffffff8117696a>] out_of_memory+0x40a/0x4b0
Feb 14 11:42:52 localhost kernel:  [<ffffffff812ffe08>] ? find_next_bit+0x18/0x20
Feb 14 11:42:52 localhost kernel:  [<ffffffff8117c05b>] __alloc_pages_nodemask+0xf0b/0xf30
Feb 14 11:42:52 localhost kernel:  [<ffffffff8117c3d4>] alloc_kmem_pages_node+0x54/0xd0
Feb 14 11:42:52 localhost kernel:  [<ffffffff81077c06>] copy_process.part.8+0x136/0x19a0
Feb 14 11:42:52 localhost kernel:  [<ffffffff811a974a>] ? handle_mm_fault+0xa7a/0x1f60
Feb 14 11:42:52 localhost kernel:  [<ffffffff81079647>] _do_fork+0xd7/0x3d0
Feb 14 11:42:52 localhost kernel:  [<ffffffff810655f5>] ? __do_page_fault+0x1f5/0x510
Feb 14 11:42:52 localhost kernel:  [<ffffffff810799e9>] SyS_clone+0x19/0x20
Feb 14 11:42:52 localhost kernel:  [<ffffffff81003c07>] do_syscall_64+0x57/0xb0
Feb 14 11:42:52 localhost kernel:  [<ffffffff815de861>] entry_SYSCALL64_slow_path+0x25/0x25
Feb 14 11:42:52 localhost kernel: Mem-Info:
Feb 14 11:42:52 localhost kernel: active_anon:548787 inactive_anon:232682 isolated_anon:0
                                   active_file:28394 inactive_file:24931 isolated_file:8
                                   unevictable:0 dirty:1 writeback:0 unstable:0
                                   slab_reclaimable:1897009 slab_unreclaimable:19547
                                   mapped:51240 shmem:28342 pagetables:20339 bounce:0
                                   free:1284106 free_pcp:446 free_cma:0
Feb 14 11:42:52 localhost kernel: Node 0 DMA free:15628kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15900k
Feb 14 11:42:52 localhost kernel: lowmem_reserve[]: 0 3468 15978 15978
Feb 14 11:42:52 localhost kernel: Node 0 DMA32 free:1221320kB min:14632kB low:18288kB high:21944kB active_anon:274224kB inactive_anon:273556kB active_file:40556kB inactive_file:36556kB unevictable:0kB isolated(anon):0kB isolated(file):32k
Feb 14 11:42:52 localhost kernel: lowmem_reserve[]: 0 0 12510 12510
Feb 14 11:42:52 localhost kernel: Node 0 Normal free:3899476kB min:52884kB low:66104kB high:79324kB active_anon:1920924kB inactive_anon:657172kB active_file:73020kB inactive_file:63168kB unevictable:0kB isolated(anon):0kB isolated(file):0
Feb 14 11:42:52 localhost kernel: lowmem_reserve[]: 0 0 0 0
Feb 14 11:42:52 localhost kernel: Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (ME) = 15628kB
Feb 14 11:42:52 localhost kernel: Node 0 DMA32: 166992*4kB (UME) 68889*8kB (UE) 7*16kB (H) 11*32kB (H) 11*64kB (H) 2*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1220760kB
Feb 14 11:42:52 localhost kernel: Node 0 Normal: 721354*4kB (UME) 126667*8kB (UEH) 16*16kB (H) 2*32kB (H) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3899072kB
Feb 14 11:42:52 localhost kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Feb 14 11:42:52 localhost kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Feb 14 11:42:52 localhost kernel: 125644 total pagecache pages
Feb 14 11:42:52 localhost kernel: 43931 pages in swap cache
Feb 14 11:42:52 localhost kernel: Swap cache stats: add 2753281, delete 2709350, find 730647/1154037
Feb 14 11:42:52 localhost kernel: Free swap  = 12677364kB
Feb 14 11:42:52 localhost kernel: Total swap = 14124028kB
Feb 14 11:42:52 localhost kernel: 4179504 pages RAM
Feb 14 11:42:52 localhost kernel: 0 pages HighMem/MovableOnly
Feb 14 11:42:52 localhost kernel: 84923 pages reserved
Feb 14 11:42:52 localhost kernel: 0 pages hwpoisoned
(...)
Feb 14 11:42:52 localhost kernel: Out of memory: Kill process 27876 (firefox) score 41 or sacrifice child
Feb 14 11:42:52 localhost kernel: Killed process 27876 (firefox) total-vm:4003016kB, anon-rss:1091960kB, file-rss:41516kB, shmem-rss:80216kB
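The counters in the Mem-Info section are page counts, so they are easier to judge once converted. The following is a minimal sketch, not a general log parser: the values are copied by hand from the log above, the function names are made up for illustration, and the 4 kB page size is an assumption taken from the "721354*4kB" entries in the per-zone free lists. It also sums the free memory in the Normal zone that sits in blocks of at least 16 kB, the size an order=2 request needs.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, as the "*4kB" free-list entries suggest

# Page counts copied from the Mem-Info section of the log above.
MEM_INFO_PAGES = {
    "free": 1284106,
    "slab_reclaimable": 1897009,
    "active_anon": 548787,
    "inactive_anon": 232682,
}
FREE_SWAP_KB = 12677364
TOTAL_SWAP_KB = 14124028

# Free list of the Normal zone, copied from the "Node 0 Normal:" line.
NORMAL_FREE_LIST = (
    "721354*4kB (UME) 126667*8kB (UEH) 16*16kB (H) 2*32kB (H) 0*64kB "
    "0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB"
)


def pages_to_gib(pages, page_kb=PAGE_KB):
    """Convert a page count to GiB."""
    return pages * page_kb / (1024 * 1024)


def free_kb_by_block(free_list, min_block_kb):
    """Sum the free memory (kB) held in blocks of at least min_block_kb."""
    total = 0
    for count, size in re.findall(r"(\d+)\*(\d+)kB", free_list):
        if int(size) >= min_block_kb:
            total += int(count) * int(size)
    return total


if __name__ == "__main__":
    for name, pages in MEM_INFO_PAGES.items():
        print(f"{name:18s} {pages_to_gib(pages):6.2f} GiB")
    swap_used_gib = (TOTAL_SWAP_KB - FREE_SWAP_KB) / (1024 * 1024)
    print(f"{'swap in use':18s} {swap_used_gib:6.2f} GiB")
    print("Normal zone, all free blocks:", free_kb_by_block(NORMAL_FREE_LIST, 4), "kB")
    print("Normal zone, blocks >= 16 kB:", free_kb_by_block(NORMAL_FREE_LIST, 16), "kB")
```

With these inputs, about 4.9 GiB was free, about 7.2 GiB was reclaimable slab, and only about 1.4 GiB of swap was in use when the oom-killer ran. Of the roughly 3.9 GB free in the Normal zone, only 320 kB sat in blocks of 16 kB or larger. That suggests, without proving it, that the failing order=2 request ran short of contiguous memory rather than of memory overall.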

The following sysctl values help a little, but it still happens with larger operations:

vm.overcommit_memory = 2
vm.overcommit_ratio = 100
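With vm.overcommit_memory = 2 the kernel refuses new userspace allocations once the committed address space would exceed the commit limit. Both figures are reported in /proc/meminfo as CommitLimit and Committed_AS. A minimal sketch for checking how much headroom is left while a job runs is shown here; the function names are hypothetical, and only the two field names and the file path come from the kernel.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())


def commit_headroom_kb(fields):
    """kB that can still be committed before strict overcommit refuses requests."""
    return fields["CommitLimit"] - fields["Committed_AS"]
```

On the affected machine, calling commit_headroom_kb(read_meminfo()) during the Ansible run would show whether the commit limit is actually being approached. Strict overcommit accounting governs what processes may reserve; as far as I can tell it does not directly cover kernel-internal allocations such as the one made during fork in the trace above.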

Is part of my Ansible operation really using up both the system's memory and its swap?

Best answer 1

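Ansible certainly should not use that much memory. Can you describe what you are doing in more detail (how large the job is, what it does, which modules are used, examples, and so on)? I also see Firefox being killed there. Did you have a lot open in Firefox?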
