Process is in uninterruptible sleep - how to find out what it is waiting for

I have a daemon running on Debian on ARM. Sometimes this daemon hangs and stops responding. When I look at the process with "ps ax", the STAT column shows "Dl", where "D" means "uninterruptible sleep (usually IO)". Is it possible to find out more about what the process is waiting for, for example which I/O is hanging?

Thanks!

Consecution answered 14/3, 2016 at 12:48 Comment(2)
Did you ever track this issue down? I'm seeing the same "Dl" process issue on a Linux ARM box. Mine is just from running ImageMagick's convert. Thanks! (Barneybarnhart)
No, unfortunately not. (Consecution)

I had the same question with a Jetson Nano (ARMv8, kernel architecture: aarch64), which was hanging on the ptxas command.

First we need to understand what a process in the uninterruptible sleep state is. Read these: What is an uninterruptible process? and Linux process states.

In short: a process in uninterruptible sleep can only be woken by the event it is waiting for. It cannot be woken by a signal.

To investigate what is going on, you can check the kernel stack of the process:
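Before looking at one PID, it can help to see which tasks are actually stuck. A minimal sketch, assuming the procps version of ps; the function name filter_dstate is made up for this example. It keeps the header line and every row whose STAT starts with "D". The WCHAN column shows the kernel function the task is sleeping in, when the kernel exposes it.

```shell
# Keep the header plus every row whose STAT column (3rd field) starts with "D".
# Expects input shaped like the output of: ps -eLo pid,lwp,stat,wchan:32,comm
filter_dstate() {
    awk 'NR == 1 || $3 ~ /^D/'
}

# Typical use (-L lists every thread, so a stuck worker thread shows up too):
#   ps -eLo pid,lwp,stat,wchan:32,comm | filter_dstate
```

Listing threads matters here because the "l" in "Dl" means the process is multi-threaded, so the thread that is blocked is not necessarily the main one.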

cat /proc/<PROCESS_PID>/stack

In my case it was

[nano]<$:~$ sudo cat /proc/6816/stack
[<ffffff80080863bc>] __switch_to+0x9c/0xc0
[<ffffff80081c6fdc>] wait_on_page_bit_killable+0x8c/0x98
[<ffffff80081c7948>] __lock_page_or_retry+0xc0/0xe8
[<ffffff80082029b8>] do_swap_page+0x5d0/0x840
[<ffffff8008204be4>] handle_mm_fault+0x60c/0xa68
[<ffffff80080a36b0>] do_page_fault+0x308/0x518
[<ffffff80080a392c>] do_translation_fault+0x6c/0x80
[<ffffff8008080954>] do_mem_abort+0x54/0xb0
[<ffffff80080833c8>] el0_da+0x20/0x24
[<ffffffffffffffff>] 0xffffffffffffffff
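For a multi-threaded process, /proc/<PROCESS_PID>/stack only shows the main thread. Each thread has its own stack file under /proc/<PROCESS_PID>/task/. A small sketch that prints all of them; the function name and the optional second argument (an alternative proc root, only there so it can be tried on a fake directory tree) are my own additions, not a standard tool.

```shell
# Print the kernel stack of every thread of a process.
# $1: PID
# $2: optional proc root (hypothetical parameter, defaults to /proc)
dump_thread_stacks() {
    pid=$1
    root=${2:-/proc}
    for task in "$root/$pid"/task/*; do
        # The stack file is normally readable by root only; skip if not readable.
        [ -r "$task/stack" ] || continue
        echo "== TID ${task##*/} =="
        cat "$task/stack"
    done
}

# Typical use, as root:
#   dump_thread_stacks 6816
```

If nothing is printed, the stack files were probably not readable, so run it as root.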

How do I stop that from happening?

It depends on the case. In my case, I increased the swap size.
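The stack above goes through do_swap_page, so the process was waiting for a page to come back from swap, which is why swap was the thing to look at in my case. A rough sketch for checking how much swap is in use; swap_usage is a name invented for this example, and the optional file argument is only there so the function can be tried on a copy of /proc/meminfo.

```shell
# Print "used_kB total_kB" for swap, read from a meminfo-style file.
# $1: optional path (defaults to /proc/meminfo)
swap_usage() {
    awk '/^SwapTotal:/ {t = $2}
         /^SwapFree:/  {f = $2}
         END {print t - f, t}' "${1:-/proc/meminfo}"
}

# Typical use:
#   swap_usage
```

If the used value is close to the total while the process hangs, swap pressure is a likely suspect. That is only a hint, not proof.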

This is probably a dumb question, but is there any way to interrupt it without restarting my computer?

A process in the uninterruptible sleep state may occasionally switch to the interruptible state and then go back to the uninterruptible state again. So you can try:

while kill -9 <PROCESS_PID> 2>/dev/null; do sleep 0.1; done
Southing answered 4/4, 2020 at 11:10 Comment(0)
