Why are interrupts disabled between spin_lock and spin_unlock in Linux?

3

8

I was reading the implementation of Linux semaphores. To keep them atomic, signal and wait (up and down in the source code) take a spin lock. I then noticed that Linux disables interrupts in spin_lock_irqsave and restores them in the matching unlock (spin_unlock_irqrestore). This confused me. In my opinion, there is no point in disabling interrupts within a critical section.

For example, proc A (currently active) acquired the lock, proc B (blocked) is waiting for the lock and proc C is doing some unrelated stuff. It makes perfect sense to context switch to C within the critical section between A and B. Even if C also tries to acquire the lock, since the lock is already locked by A, the result would be C being blocked and A resuming execution.

Therefore, I don't know why Linux decided to disable interrupts within critical sections guarded by spin locks. It probably won't cause any problems, but it seems like a redundant operation to me.

Anselme answered 10/5, 2016 at 18:49 Comment(2)
This may help you understand why: makelinux.net/ldd3/chp-5-sect-5 - Marshmallow
Thank you! Section 5.5.2 of the suggested article addressed my concerns. - Anselme
V
12

Allow me to start off with a disclaimer that I am not a Linux expert, so my answer may not be the most accurate. Please point out any flaws and problems that you may find.

Imagine if some shared data is used by various parts of the kernel, including operations such as interrupt handlers that need to be fast and cannot block. Let's say system call foo is currently active and has acquired a lock to use/access shared data bar, and interrupts are not disabled when/before acquiring said lock.

Now a (hardware) interrupt handler, e.g. for the keyboard, kicks in on the same CPU and also needs access to bar (hardware interrupts have higher priority than system calls). Since bar is currently locked by syscall foo, the interrupt handler cannot proceed. Interrupt handlers need to be fast and cannot block, so the handler just keeps spinning while trying to acquire the lock. That causes a deadlock (i.e. a system freeze): syscall foo cannot resume until the handler returns, so it never gets a chance to finish and release its lock.

If you disable interrupts before trying to acquire the lock in foo, though, then foo will be able to finish whatever it's doing and ultimately release the lock (and restore interrupts). Any interrupts that arrive while foo holds the spinlock stay pending and are handled once the lock is released and interrupts are restored. This way, you won't run into the problem described above. However, care must also be taken to hold the lock for bar for as short a time as possible, so that other higher priority operations can take over whenever required.
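The two cases above can be sketched as a toy, single-CPU model in userspace C. This is only an illustration under stated assumptions: every name in it (toy_cpu, toy_syscall_foo, toy_irq_handler, toy_deliver) is hypothetical and is not a kernel API, and the "deadlock" is recorded in a flag instead of actually spinning forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model; all names are hypothetical, not kernel APIs. */
struct toy_cpu {
    bool irq_enabled;  /* local interrupt flag */
    bool irq_pending;  /* an interrupt was raised but has not run yet */
    bool lock_held;    /* the spinlock protecting "bar" */
    bool deadlocked;   /* set where the handler would spin forever */
    int  bar;          /* the shared data */
};

/* Interrupt handler that needs the same lock as the syscall path. */
static void toy_irq_handler(struct toy_cpu *c)
{
    if (c->lock_held) {
        /* Real code would spin here. The lock holder cannot run on this
         * CPU until the handler returns, so the spin would never end. */
        c->deadlocked = true;
        return;
    }
    c->lock_held = true;
    c->bar += 100;
    c->lock_held = false;
}

/* Run the handler only if an interrupt is pending and interrupts are on. */
static void toy_deliver(struct toy_cpu *c)
{
    if (c->irq_pending && c->irq_enabled) {
        c->irq_pending = false;
        toy_irq_handler(c);
    }
}

/* Syscall "foo": updates bar under the lock while an interrupt arrives
 * in the middle of the critical section. Returns false on deadlock. */
static bool toy_syscall_foo(struct toy_cpu *c, bool disable_irqs)
{
    bool saved = c->irq_enabled;   /* plays the role of "flags" */

    if (disable_irqs)
        c->irq_enabled = false;    /* irqsave-like step */
    assert(!c->lock_held);
    c->lock_held = true;

    c->irq_pending = true;         /* interrupt arrives mid-section */
    toy_deliver(c);                /* handler runs only if still enabled */
    if (c->deadlocked)
        return false;
    c->bar += 1;

    c->lock_held = false;
    c->irq_enabled = saved;        /* irqrestore-like step */
    toy_deliver(c);                /* the deferred interrupt runs now */
    return true;
}
```

Calling toy_syscall_foo with disable_irqs set to false reports the deadlock, while calling it with true lets foo finish first and runs the handler only after the lock has been released.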

Vingtetun answered 10/5, 2016 at 19:20 Comment(9)
Thank you for your answer. I had indeed not considered this scenario. What about a multicore system, then? Is it sufficient to disable interrupts on the local core only, since in that case there would be no deadlock? - Anselme
Yes, it's sufficient: your data is protected from access by other cores by the global spinlock. If the data can be accessed concurrently by at most x processes, and if it is not used by interrupt handlers, then you could protect it with a semaphore instead. - Vingtetun
It's also worth mentioning that if the data is used by multiple interrupt handlers, you should disable interrupts before taking the spinlock (using spin_lock_irqsave) in all handlers and functions that can be interrupted by a higher priority interrupt that also uses the same data. Only the interrupt handler with the highest priority does not need to disable interrupts. - Vingtetun
That makes perfect sense. Thanks! - Anselme
In short, this is called priority inversion, and interrupts are disabled to avoid priority inversion on the same core. On a multiprocessor system, a process also disables interrupts on its own core to ensure it gets enough CPU time to run. - Jasisa
@AshishAggarwal this is not priority inversion, it is simply deadlock. - Siward
@Siward I was thinking of the scenario where a process A (low priority) holds the lock and a new process B (high priority) arrives. B will then be scheduled, and it may try to take the same lock that A holds. This is priority inversion, which results in deadlock. To prevent this, process A may disable interrupts. Let me know if I am wrong here. - Jasisa
@AshishAggarwal priority inversion is when a higher priority process waits on a resource (e.g. a spinlock) held by a lower priority process. This is problematic when there is a third process with "in the middle" priority, because it will take precedence over the lower priority process and will be scheduled in favour of it, even though a higher priority process is waiting. It has little to do with deadlock. - Siward
@AshishAggarwal (actually spinlocks are not a good example for priority inversion, because the high priority process doesn't really go into a waiting state when trying to acquire the resource. The point is that a lower priority process is able to proceed in favour of a higher priority process, or of a process that a higher priority process is waiting on. That's why it's "inversion". See the Wikipedia page: en.wikipedia.org/wiki/Priority_inversion) - Siward
E
1

The answer is very simple: the thread that tries to acquire a lock has no way of knowing whether the ISR that interrupts it will try to acquire the same lock. If that happens, the ISR will spin forever on that lock and the system will deadlock.

Encaenia answered 10/12, 2020 at 23:19 Comment(0)
C
0

But what if an interrupt wants to signal a waiting thread, or wants to test the semaphore value? Interrupts are not disabled here to prevent a context switch between two processes, but to protect against interrupt handlers. It's all in the comment at the beginning of the file:

  /*
   * Some notes on the implementation:
   *
   * The spinlock controls access to the other members of the semaphore.
   * down_trylock() and up() can be called from interrupt context, so we
   * have to disable interrupts when taking the lock.  It turns out various
   * parts of the kernel expect to be able to use down() on a semaphore in
   * interrupt context when they know it will succeed, so we have to use
   * irqsave variants for down(), down_interruptible() and down_killable()
   * too.
   *
   * The ->count variable represents how many more tasks can acquire this
   * semaphore.  If it's zero, there may be tasks waiting on the wait_list.
   */
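
To make the point about the irqsave variants concrete, here is a small userspace model. It is a sketch under loud assumptions: toy_sem, toy_irqs_on and the toy_* functions are hypothetical stand-ins, not the kernel's struct semaphore or its locking calls, and the waiter list is left out entirely (this toy up() only increments the count). What it shows is that saving and restoring the interrupt state, instead of unconditionally re-enabling interrupts, keeps interrupts off when the caller already had them off, as in interrupt context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct semaphore. */
struct toy_sem {
    bool locked;         /* stands in for the internal spinlock */
    unsigned int count;  /* how many more tasks can acquire */
};

static bool toy_irqs_on = true;  /* models the local CPU interrupt flag */

/* Remember the current interrupt state, then disable and lock. */
static bool toy_lock_irqsave(struct toy_sem *s)
{
    bool flags = toy_irqs_on;
    toy_irqs_on = false;
    assert(!s->locked);  /* single CPU with interrupts off: no contention */
    s->locked = true;
    return flags;
}

/* Unlock and put the interrupt flag back to what the caller had. */
static void toy_unlock_irqrestore(struct toy_sem *s, bool flags)
{
    s->locked = false;
    toy_irqs_on = flags;
}

/* Returns 0 if acquired, 1 if not, mirroring down_trylock()'s convention. */
static int toy_down_trylock(struct toy_sem *s)
{
    bool flags = toy_lock_irqsave(s);
    int failed = 1;

    if (s->count > 0) {
        s->count--;
        failed = 0;
    }
    toy_unlock_irqrestore(s, flags);
    return failed;
}

/* Simplified up(): no waiters in this model, so just bump the count. */
static void toy_up(struct toy_sem *s)
{
    bool flags = toy_lock_irqsave(s);

    s->count++;
    toy_unlock_irqrestore(s, flags);
}
```

If toy_up() is called while toy_irqs_on is already false, the flag is still false afterwards; an unlock that blindly re-enabled interrupts would have switched them back on inside the caller's interrupt-disabled region.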
Cannae answered 10/5, 2016 at 19:2 Comment(1)
Thanks for your reply. If I understand correctly, if an interrupt handler tries to acquire a lock that was already held by the context it interrupted, there would be a deadlock, since interrupt handlers cannot be preempted? Next time I'll make sure I read the comments. - Anselme
