Friday, February 26, 2016

Introduction to Memory Barriers


Memory barrier instructions make the CPU (and the compiler) complete, or appear to complete, the memory accesses issued before the barrier before any access issued after it. They are used to ensure that the memory operations of one section of code are visible to other observers before execution continues past the barrier.

A CPU does not necessarily perform memory accesses in program order, which becomes a problem when multiple CPUs or I/O devices access the same memory. One way to cope is to create a critical section using a spinlock, mutex or semaphore, but this approach carries more overhead.

Linux provides memory barriers to guarantee ordered access to RAM where ordering matters. Reordering of memory accesses by the CPU becomes more problematic on a multi-core system. Memory barriers are only required where there's a possibility of interaction between two CPUs or between a CPU and a device (refer to the Abstract Memory Access Model below). If it can be guaranteed that there won't be any such interaction in a particular piece of code, then memory barriers are not required.

Memory barriers impose a perceived partial ordering over the memory operations on either side of the barrier. Such enforcement is important because the CPUs and other devices in a system can use a variety of tricks to improve performance, including reordering, deferral and combination of memory operations; speculative loads; branch prediction; and various types of caching. Memory barriers are used to override or suppress these tricks, allowing the code to sanely control the interaction of multiple CPUs and/or devices.
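The ordering described above can be sketched in userspace with C11 fences. This is only an illustration, not kernel code: `payload`, `ready` and `run_barrier_demo()` are hypothetical names, and `atomic_thread_fence()` stands in for what kernel code would express with primitives such as `smp_wmb()`/`smp_rmb()`. The producer writes data and then sets a flag; the fences make sure a consumer that sees the flag also sees the data.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data, published through the flag below */
static atomic_int ready;   /* 0 = not published yet, 1 = published */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;                                    /* 1: write the data */
    atomic_thread_fence(memory_order_release);       /* 1 is ordered before 2 */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);  /* 2: set the flag */
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;

    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        ;                                            /* 3: wait for the flag */
    atomic_thread_fence(memory_order_acquire);       /* 3 is ordered before 4 */
    *out = payload;                                  /* 4: read the data */
    return NULL;
}

/* Runs one producer and one consumer; returns the value the consumer saw. */
int run_barrier_demo(void)
{
    pthread_t p, c;
    int seen = 0;

    atomic_store(&ready, 0);
    pthread_create(&c, NULL, consumer, &seen);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Without the two fences, the compiler or CPU would be free to reorder steps 1 and 2 (or 3 and 4), and the consumer could read `payload` before it was written. Build with `-pthread`.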


                                                      Abstract Memory Access Model 

It is important to note here that memory barriers are not suitable for bit fields, because compilers often turn bit field updates into non-atomic read-modify-write code. Hence accesses to bit fields cannot be synchronized this way.

Even in cases where bit fields are protected by locks, all fields in a given bit field must be protected by one lock. If two fields in a given bit field are protected by different locks, the compiler's non-atomic read-modify-write sequences can cause an update to one field to corrupt the value of an adjacent field.
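The reason adjacent fields can corrupt each other is that they live in the same storage unit. The sketch below makes this visible; `struct packed_flags`, `struct split_flags` and the helper functions are hypothetical names, and the exact bit field layout is implementation-defined (the comments assume a typical GCC/Clang layout where two 4-bit fields share the first byte).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status word: both fields are packed into one storage unit. */
struct packed_flags {
    unsigned int ready : 4;
    unsigned int error : 4;
};

/* The same two fields as separate bytes: each is its own memory location. */
struct split_flags {
    unsigned char ready;
    unsigned char error;
};

/* Returns 1 if both bit fields ended up in the same byte. */
int bitfields_share_byte(void)
{
    struct packed_flags f;
    unsigned char first_byte;

    memset(&f, 0, sizeof f);
    f.ready = 0xF;   /* conceptually: load the unit, set 4 bits, store it back */
    f.error = 0xF;   /* touches the same unit that holds 'ready' */
    memcpy(&first_byte, &f, 1);
    return first_byte == 0xFF;
}

/* Returns 1 if the non-bit-field members occupy different bytes. */
int split_fields_are_separate(void)
{
    return offsetof(struct split_flags, ready) !=
           offsetof(struct split_flags, error);
}
```

If `ready` and `error` must be guarded by different locks, a layout like `struct split_flags` is the safer choice, since a store to one member does not rewrite the other.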





Wednesday, February 24, 2016

Usage of Mutex, Semaphore and Spinlocks

This article is about the usage of semaphores, mutexes and spinlocks, with reference to the Linux kernel.

Semaphore: Use a semaphore when you (thread) want to sleep till some other thread tells you to wake up. Semaphore 'down' happens in one thread (producer) and semaphore 'up' (for the same semaphore) happens in another thread (consumer). e.g.: In the producer-consumer problem, the producer wants to sleep till at least one buffer slot is empty - only the consumer thread can tell when a buffer slot is empty.


Hence a semaphore is mainly used for thread synchronization where two threads depend on each other.

Mutex: Use a mutex when you (thread) want to execute code that should not be executed by any other thread at the same time. Mutex 'down' happens in one thread and mutex 'up' must happen in the same thread later on. This is called the 'ownership property'. e.g.: If you are deleting a node from a global linked list, you do not want another thread to muck around with pointers while you are deleting the node. When you acquire a mutex and are busy deleting a node, if another thread tries to acquire the same mutex, it will be put to sleep till you release the mutex.
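The difference between the two can be sketched with a small bounded buffer. This is a userspace illustration using POSIX semaphores and a pthread mutex, not the kernel API; `SLOTS`, `ITEMS`, `buf_lock` and `run_producer_consumer()` are hypothetical names. Note how each semaphore is taken ('down') in one thread and released ('up') in the other, while the mutex is always unlocked by the thread that locked it.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define SLOTS 4
#define ITEMS 100

static int buffer[SLOTS];
static int head, tail;
static sem_t empty_slots;   /* counts free slots; the producer waits on it */
static sem_t full_slots;    /* counts filled slots; the consumer waits on it */
static pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER;

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= ITEMS; i++) {
        sem_wait(&empty_slots);            /* 'down': sleep until a slot is free */
        pthread_mutex_lock(&buf_lock);
        buffer[head] = i;
        head = (head + 1) % SLOTS;
        pthread_mutex_unlock(&buf_lock);   /* unlocked by the thread that locked it */
        sem_post(&full_slots);             /* 'up': tell the consumer an item is ready */
    }
    return NULL;
}

static void *consumer(void *arg)
{
    long *sum = arg;

    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&full_slots);             /* 'down': sleep until an item exists */
        pthread_mutex_lock(&buf_lock);
        *sum += buffer[tail];
        tail = (tail + 1) % SLOTS;
        pthread_mutex_unlock(&buf_lock);
        sem_post(&empty_slots);            /* 'up': tell the producer a slot is free */
    }
    return NULL;
}

/* Passes 1..ITEMS through the buffer and returns the sum the consumer computed. */
long run_producer_consumer(void)
{
    pthread_t p, c;
    long sum = 0;

    head = tail = 0;
    sem_init(&empty_slots, 0, SLOTS);
    sem_init(&full_slots, 0, 0);
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, &sum);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    sem_destroy(&empty_slots);
    sem_destroy(&full_slots);
    return sum;
}
```

With a single producer and a single consumer the mutex is not strictly needed, since each index is touched by one thread only; it is kept here to show where it would go once several producers or consumers share the buffer. Build with `-pthread` on Linux.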

It is also important to note that semaphores/mutexes make the thread sleep when blocked, hence they are never acquired in IRQ handlers. However, a semaphore can be released ('up') from an IRQ handler; a mutex cannot, because it must be unlocked by the thread that locked it.

Spinlock: Use a spinlock when you really want to use a mutex but your thread is not allowed to sleep. e.g.: An interrupt handler within the OS kernel must never sleep. If it does, the system will freeze / crash. If you need to insert a node into a globally shared linked list from the interrupt handler, acquire a spinlock - insert node - release spinlock.

In other words, a spinlock is effectively a special type of semaphore which doesn't sleep but instead waits in a busy-wait loop. Linus himself agrees with this view.

[Refer: http://yarchive.net/comp/linux/semaphores.html]

It is important to note that while you hold a spinlock, interrupts might be disabled, hence the lock should be released as soon as possible. Since the usual practice is to make critical sections as short as possible, the result is that the kernel uses a lot more spinlocks than semaphores.
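To make the busy-wait behaviour concrete, here is a toy userspace spinlock built on a C11 `atomic_flag`. It is a sketch only: `toy_spinlock`, `toy_spin_lock()`, `toy_spin_unlock()` and `run_spin_demo()` are hypothetical names, and the kernel's own `spin_lock()`/`spin_unlock()` are considerably more elaborate (for example, they also deal with preemption, and variants such as `spin_lock_irqsave()` disable local interrupts).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

typedef struct {
    atomic_flag locked;
} toy_spinlock;

static toy_spinlock lock = { ATOMIC_FLAG_INIT };
static long counter;   /* shared data protected by 'lock' */

static void toy_spin_lock(toy_spinlock *l)
{
    /* test_and_set returns the previous value: loop while someone else holds it */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;   /* busy-wait: the thread keeps running, it never sleeps */
}

static void toy_spin_unlock(toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        toy_spin_lock(&lock);
        counter++;                 /* keep the critical section short */
        toy_spin_unlock(&lock);
    }
    return NULL;
}

/* Two threads each add 100000 to the counter under the spinlock. */
long run_spin_demo(void)
{
    pthread_t t[2];

    counter = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

Because a waiter burns CPU time for as long as the lock is held, this only pays off when the critical section is a handful of instructions, which matches the advice above. Build with `-pthread`.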

Why is locking a semaphore/mutex in an IRQ handler illegal, a strict no-no?

These locks put the caller to sleep when they are contended, and sleeping in an IRQ handler is not allowed for the following reasons:
  • Interrupts generally need to stay disabled inside the IRQ handler.
  • If a mutex/semaphore is acquired while interrupts are disabled, the handler might sleep with interrupts still off. Other threads that are waiting for other interrupts might then get blocked, and a system freeze or watchdog reset might happen.
  • A spinlock is the usual choice in IRQ handlers, because the local CPU never sleeps while acquiring a spinlock.