Determine VM size of process killed by oom-killer

Is there any way to determine the virtual memory size of a process at the time it is killed by the Linux oom-killer?

I can't find any parameter in /var/log/messages that tells me the total VM size of the process being killed. A lot of other information is available in /var/log/messages, but not the total VM size of the process.

This is a CentOS 5.7 x64 machine.

The contents of /var/log/messages are as follows:

Mar  1 18:51:45 c42 kernel: NameService invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Mar  1 18:51:45 c42 kernel: 
Mar  1 18:51:46 c42 kernel: Call Trace:
Mar  1 18:51:46 c42 kernel:  [<ffffffff800c9d3a>] out_of_memory+0x8e/0x2f3
Mar  1 18:51:46 c42 kernel:  [<ffffffff8002dfd7>] __wake_up+0x38/0x4f
Mar  1 18:51:46 c42 kernel:  [<ffffffff8000f677>] __alloc_pages+0x27f/0x308
Mar  1 18:51:46 c42 kernel:  [<ffffffff80013034>] __do_page_cache_readahead+0x96/0x17b
Mar  1 18:51:46 c42 kernel:  [<ffffffff80013971>] filemap_nopage+0x14c/0x360
Mar  1 18:51:46 c42 kernel:  [<ffffffff8000896c>] __handle_mm_fault+0x1fd/0x103b
Mar  1 18:51:46 c42 kernel:  [<ffffffff800671f2>] do_page_fault+0x499/0x842
Mar  1 18:51:46 c42 kernel:  [<ffffffff80031143>] do_fork+0x148/0x1c1
Mar  1 18:51:46 c42 kernel:  [<ffffffff8005dde9>] error_exit+0x0/0x84
Mar  1 18:51:46 c42 kernel: 
Mar  1 18:51:46 c42 kernel: Mem-info:
Mar  1 18:51:47 c42 kernel: Node 0 DMA per-cpu:
Mar  1 18:51:48 c42 kernel: cpu 0 hot: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 0 cold: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 1 hot: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 1 cold: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 2 hot: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 2 cold: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 3 hot: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 3 cold: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 4 hot: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 4 cold: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 5 hot: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 5 cold: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 6 hot: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 6 cold: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 7 hot: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 7 cold: high 0, batch 1 used:0
Mar  1 18:51:48 c42 kernel: cpu 8 hot: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 8 cold: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 9 hot: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 9 cold: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 10 hot: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 10 cold: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 11 hot: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 11 cold: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 12 hot: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 12 cold: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 13 hot: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 13 cold: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 14 hot: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 14 cold: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 15 hot: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: cpu 15 cold: high 0, batch 1 used:0
Mar  1 18:51:49 c42 kernel: Node 0 DMA32 per-cpu:
Mar  1 18:51:49 c42 kernel: cpu 0 hot: high 186, batch 31 used:31
Mar  1 18:51:49 c42 kernel: cpu 0 cold: high 62, batch 15 used:35

............

Mar  1 18:51:58 c42 kernel: cpu 14 cold: high 62, batch 15 used:18
Mar  1 18:51:58 c42 kernel: cpu 15 hot: high 186, batch 31 used:6
Mar  1 18:51:59 c42 kernel: cpu 15 cold: high 62, batch 15 used:14
Mar  1 18:51:59 c42 kernel: Node 1 HighMem per-cpu: empty
Mar  1 18:51:59 c42 kernel: Free pages:       50396kB (0kB HighMem)
Mar  1 18:51:59 c42 kernel: Active:1559270 inactive:2490421 dirty:0 writeback:0 unstable:0 free:12599 slab:8740 mapped-file:1186 mapped-anon:4051463 pagetables:16277
Mar  1 18:51:59 c42 kernel: Node 0 DMA free:10068kB min:8kB low:8kB high:12kB active:0kB inactive:0kB present:9660kB pages_scanned:0 all_unreclaimable? yes
Mar  1 18:51:59 c42 kernel: lowmem_reserve[]: 0 1965 8025 8025
Mar  1 18:51:59 c42 kernel: Node 0 DMA32 free:26176kB min:1980kB low:2472kB high:2968kB active:1020328kB inactive:922224kB present:2012496kB pages_scanned:4075359 all_unreclaimable? yes
Mar  1 18:51:59 c42 kernel: lowmem_reserve[]: 0 0 6060 6060
Mar  1 18:51:59 c42 kernel: Node 0 Normal free:6060kB min:6108kB low:7632kB high:9160kB active:490800kB inactive:5569172kB present:6205440kB pages_scanned:21679912 all_unreclaimable? yes
Mar  1 18:51:59 c42 kernel: lowmem_reserve[]: 0 0 0 0
Mar  1 18:51:59 c42 kernel: Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar  1 18:52:00 c42 kernel: lowmem_reserve[]: 0 0 0 0
Mar  1 18:52:00 c42 kernel: Node 1 DMA free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar  1 18:52:00 c42 kernel: lowmem_reserve[]: 0 0 8080 8080
Mar  1 18:52:00 c42 kernel: Node 1 DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar  1 18:52:00 c42 kernel: lowmem_reserve[]: 0 0 8080 8080
Mar  1 18:52:00 c42 kernel: Node 1 Normal free:8092kB min:8144kB low:10180kB high:12216kB active:4725952kB inactive:3470288kB present:8273920kB pages_scanned:15611005 all_unreclaimable? yes
Mar  1 18:52:00 c42 kernel: lowmem_reserve[]: 0 0 0 0
Mar  1 18:52:00 c42 kernel: Node 1 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar  1 18:52:01 c42 kernel: lowmem_reserve[]: 0 0 0 0
Mar  1 18:52:02 c42 kernel: Node 0 DMA: 5*4kB 2*8kB 5*16kB 5*32kB 5*64kB 2*128kB 0*256kB 0*512kB 1*1024kB 0*2048kB 2*4096kB = 10068kB
Mar  1 18:52:02 c42 kernel: Node 0 DMA32: 30*4kB 1*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 6*4096kB = 26176kB
Mar  1 18:52:02 c42 kernel: Node 0 Normal: 9*4kB 7*8kB 3*16kB 1*32kB 0*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 6060kB
Mar  1 18:52:02 c42 kernel: Node 0 HighMem: empty
Mar  1 18:52:03 c42 kernel: Node 1 DMA: empty
Mar  1 18:52:03 c42 kernel: Node 1 DMA32: empty
Mar  1 18:52:03 c42 kernel: Node 1 Normal: 49*4kB 3*8kB 0*16kB 0*32kB 1*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 1*4096kB = 8092kB
Mar  1 18:52:03 c42 kernel: Node 1 HighMem: empty
Mar  1 18:52:03 c42 kernel: 1624 pagecache pages
Mar  1 18:52:04 c42 kernel: Swap cache: add 2581210, delete 2580953, find 6957/9192, race 0+16
Mar  1 18:52:04 c42 kernel: Free swap  = 0kB
Mar  1 18:52:04 c42 kernel: Total swap = 10241428kB
Mar  1 18:52:04 c42 kernel: Free swap:            0kB
Mar  1 18:52:06 c42 kernel: 4718592 pages of RAM
Mar  1 18:52:06 c42 kernel: 616057 reserved pages
Mar  1 18:52:07 c42 kernel: 17381 pages shared
Mar  1 18:52:08 c42 kernel: 260 pages swap cached
Mar  1 18:52:09 c42 kernel: Out of memory: Killed process 16727, UID 501, (ApplicationMoni).
Shively answered 4/3, 2016 at 6:54 Comment(5)
This may be related. - Seale
@SamProtsenko The above link has no mention of VM size. - Shively
@ArpitAggarwal, do you mean how much memory the process was using when it got killed? - Triune
I mean the total virtual memory size of the process at the time it was killed by the OOM killer. My process is getting killed by the OOM killer, and I want to see its total virtual memory size and also its total resident memory usage. - Shively
As per Linux, total memory is physical memory plus swap, i.e. RAM + SWAP. - Triune

As per Linux, total memory is the sum of physical memory and swap, i.e. RAM + SWAP.

Whenever a process is killed, the score of the killed process is written to the kernel log.

By observing the top command and the oom_score of the process, I figured out that:

oom_score / 10 <= percentage of total memory used by the process

For example: my system has 16 GB of RAM and 1 GB of swap, so total memory is 17 GB. The Tomcat process got killed with an oom score of '602', so Tomcat's usage was greater than or equal to 60.2% of total memory, i.e. at least 10.23 GB was occupied by Tomcat.
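The arithmetic in this example can be sketched as follows. This is only a rough sketch: it assumes the 0 to 1000 oom_score scale that the example above implies, the function name `estimate_usage_gb` is hypothetical, and adjustments such as oom_score_adj can shift the score, so treat the result as an estimate rather than a measurement.

```python
def estimate_usage_gb(oom_score, ram_gb, swap_gb):
    """Rough memory estimate (GB) from an oom_score on a 0-1000 scale.

    Assumption: the score is approximately the per-mille share of
    RAM + swap used by the process.
    """
    total_gb = ram_gb + swap_gb
    return total_gb * oom_score / 1000.0


if __name__ == "__main__":
    # Figures from the example: 16 GB RAM, 1 GB swap, score 602
    print(round(estimate_usage_gb(602, 16, 1), 2))  # 10.23
```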

Here is another example (the two screenshots are not preserved):

The score is 249, i.e. memory usage is at least 24.9%.
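For a process that is still running, the virtual and resident sizes that top displays can also be read directly from /proc, together with the current oom_score. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names `parse_status_kb` and `snapshot` are hypothetical. Logging such a snapshot periodically is one way to know the sizes shortly before a kill, since they cannot be read once the process is gone.

```python
import os


def parse_status_kb(text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status text."""
    wanted = {"VmSize", "VmRSS"}
    result = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Lines look like "VmSize:    123456 kB"
            result[key] = int(rest.split()[0])
    return result


def snapshot(pid):
    """Read VmSize, VmRSS and oom_score for a live process."""
    with open("/proc/%d/status" % pid) as f:
        info = parse_status_kb(f.read())
    with open("/proc/%d/oom_score" % pid) as f:
        info["oom_score"] = int(f.read().strip())
    return info


if __name__ == "__main__":
    # Only meaningful on Linux; inspect the current process as a demo.
    if os.path.exists("/proc/self/status"):
        print(snapshot(os.getpid()))
```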

Triune answered 13/10, 2017 at 12:11 Comment(1)
But the OOM killer doesn't print the oom_score of the process at the time of killing it. How am I supposed to know that? - Shively

This is reported in dmesg, after the stack trace of the request that triggered the OOM killer (usually a memory allocation request).
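As a hedged illustration: newer kernels append `total-vm`, `anon-rss` and `file-rss` fields (in kB) to the "Killed process" line, whereas the kill line in the question's log from the older CentOS 5 kernel carries only the PID, UID and name. The sketch below pulls out whatever is present; the function name `parse_kill_line` is hypothetical, and the second sample line uses made-up values purely to show the newer format.

```python
import re

PID_RE = re.compile(r"Killed process (\d+)")
FIELD_RE = re.compile(r"(total-vm|anon-rss|file-rss):(\d+)kB")


def parse_kill_line(line):
    """Return the PID plus any size fields (kB) found in an OOM kill line.

    Returns None if the line is not a 'Killed process' line.
    """
    match = PID_RE.search(line)
    if match is None:
        return None
    info = {"pid": int(match.group(1))}
    for key, value in FIELD_RE.findall(line):
        info[key] = int(value)
    return info


if __name__ == "__main__":
    # Kill line from the question's log (old format, no size fields)
    old = "Out of memory: Killed process 16727, UID 501, (ApplicationMoni)."
    # Hypothetical line in the newer format; the values are invented
    new = "Killed process 1234 (java) total-vm:2048000kB, anon-rss:1024000kB, file-rss:512kB"
    print(parse_kill_line(old))
    print(parse_kill_line(new))
```

On a kernel that prints these fields, `total-vm` is the virtual size and `anon-rss` plus `file-rss` approximates the resident size at kill time.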

Dutch answered 10/12, 2017 at 8:8 Comment(1)
Is there a way to get the output that dmesg logs at will (to monitor stuff)? - Previse