How three engineers made Linux's process synchronization dramatically faster โ by keeping the kernel out of the fast path.
When multiple processes or threads access the same data simultaneously, chaos ensues. Locks are the traffic lights of the computing world.
Imagine two bank tellers reading the same account balance of $100, each processing a $60 withdrawal. Without a lock, both approve โ and the bank loses $20.
A lock ensures only one process can access shared data at a time. The first teller locks the account, processes the withdrawal, then unlocks โ the second teller sees the updated balance.
Before futexes, every lock/unlock required a system call โ a costly trip into the kernel. Even when nobody else was competing for the lock!
The key insight: most of the time, there's no contention. So why bother the kernel at all?
Every single lock operation requires a system call โ even when no one else wants the lock.
Use atomic CPU instructions in user space. Only call the kernel when there's actual contention.
Click the buttons to simulate processes trying to acquire a shared lock. Watch how futexes handle contention.
From early prototypes to kernel inclusion โ the journey of fast userlevel locking in Linux.
Different wakeup policies trade off fairness for performance. The paper explores three strategies.
Benchmarks on a real database (TDB) show futexes dramatically outperforming traditional locks.
6 processes doing 200,000 random database operations each
With 1000 tasks competing for a single lock, convoy-avoidance futexes showed over 100ร the throughput improvement over System V semaphores.
Regular (convoy-avoidance) futexes, (9,1) configuration โ lower contention
The elegance is in the simplicity. A futex is just an integer in shared memory.
Futexes identify locks by their physical page + offset in memory, not by a kernel handle. This means different processes can map the same lock at different virtual addresses, and the kernel still knows it's the same lock. Kernel objects are created on-demand on the stack โ no pre-allocation needed!
"I expect this value at the futex address. If it's still that value, put me to sleep. If it changed, return immediately." This prevents race conditions.
"Wake up N processes waiting on this futex." Doesn't change the futex value โ that's left to user space. Returns how many were actually woken.
With FIFO locking, all processes move at the speed of the slowest one โ like a highway convoy stuck behind a truck. Convoy-avoidance locking lets the fastest process grab the lock immediately.
Waking ALL waiters when a lock is released wastes CPU โ only one can succeed. The solution: wake only one waiter (FUTEX_WAKE with val=1).
If a process dies while holding a lock, the lock is "dead" โ no one can acquire it. This remains an unsolved challenge for futexes since lock acquisition is in user space.
Futexes became the foundation of virtually all synchronization in modern Linux.
Every pthread_mutex_lock(), pthread_cond_wait(), and semaphore in modern Linux is built on top of futexes.
Systems like PostgreSQL, MySQL, and SQLite benefit from futex-based locking for internal buffer management and coordination.
Any multi-threaded program on Linux โ from web servers to desktop apps โ uses futexes under the hood whenever threads synchronize.