
Multithreading Cheatsheet

A quick reference guide to multithreading concepts, techniques, and common pitfalls, written as a concise overview for developers.

Fundamentals of Multithreading

Core Concepts

Thread: A lightweight unit of execution within a process.

Process: An instance of a program that has its own memory space and resources.

Concurrency: Multiple tasks making progress seemingly simultaneously, but not necessarily at the exact same time. Achieved via interleaving.

Parallelism: Multiple tasks executing simultaneously on different cores or processors. Requires multiple processing units.

Multithreading: A technique that allows multiple threads to exist within the context of a single process, sharing its resources but executing independently.
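
A minimal sketch using Python's threading module (the worker function and task ids are illustrative): two threads run independently inside one process while sharing its memory.

import threading

def worker(task_id):
    # Each thread executes this function independently, but all threads
    # share the process's memory (e.g. module-level variables).
    print(f'Thread {threading.current_thread().name} handling task {task_id}')

threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # wait for both threads to finish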

Benefits of Multithreading

Improved Responsiveness

Applications can remain responsive to user input even while performing lengthy operations in the background.

Increased Throughput

By utilizing multiple cores, multithreading can significantly increase the amount of work completed in a given time.

Resource Sharing

Threads within the same process share memory and resources, reducing the overhead compared to multiple processes.

Drawbacks of Multithreading

Complexity

Multithreaded code can be significantly more complex to design, implement, and debug than single-threaded code.

Synchronization Overhead

Managing access to shared resources requires synchronization mechanisms (locks, semaphores), which can introduce overhead and contention.

Deadlocks and Race Conditions

Improper synchronization can lead to deadlocks (threads blocking each other indefinitely) and race conditions (unpredictable behavior due to unsynchronized access to shared data).

Synchronization Primitives

Locks (Mutexes)

A lock (or mutex) provides exclusive access to a shared resource. Only one thread can hold the lock at a time.

acquire(): Acquires the lock. Blocks if the lock is already held by another thread.

release(): Releases the lock, allowing another waiting thread to acquire it.
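
A minimal sketch of this acquire/release pattern with Python's threading.Lock (shared_data and append_item are illustrative names); the with statement used later on this page wraps the same pattern:

import threading

lock = threading.Lock()
shared_data = []

def append_item(item):
    lock.acquire()                 # blocks if another thread holds the lock
    try:
        shared_data.append(item)   # critical section: exclusive access
    finally:
        lock.release()             # always release, even if an exception occurs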

Semaphores

A semaphore is a signaling mechanism that controls access to a shared resource using a counter. It can allow multiple threads to access the resource concurrently, up to a certain limit.

acquire(): Decrements the counter. Blocks if the counter is zero.

release(): Increments the counter, potentially waking up a waiting thread.
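
A sketch using Python's threading.Semaphore, assuming a hypothetical rate-limited fetch task: at most three threads enter the guarded section at once.

import threading
import time

sem = threading.Semaphore(3)       # counter starts at 3

def fetch(resource):
    with sem:                      # acquire(): decrements the counter, blocks at zero
        print(f'Fetching {resource}')
        time.sleep(1)              # simulate slow I/O
    # release() runs automatically when the with-block exits

threads = [threading.Thread(target=fetch, args=(f'resource-{i}',)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()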

Condition Variables

Condition variables allow threads to wait for a specific condition to become true. They are typically used in conjunction with a lock.

wait(lock): Releases the lock and waits for a signal. Reacquires the lock before returning.

signal(): Wakes up one waiting thread.

broadcast(): Wakes up all waiting threads.
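
A minimal producer/consumer sketch with Python's threading.Condition; note that Python names these operations notify() and notify_all() rather than signal() and broadcast():

import threading

items = []
cond = threading.Condition()       # pairs a lock with a wait queue

def consumer():
    with cond:                     # acquire the underlying lock
        while not items:           # re-check the condition after every wakeup
            cond.wait()            # releases the lock while waiting, reacquires on return
        item = items.pop(0)
    print(f'Consumed {item}')

def producer():
    with cond:
        items.append('work')
        cond.notify()              # signal(): wakes one waiting thread

t = threading.Thread(target=consumer)
t.start()
producer()
t.join()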

Avoiding Common Pitfalls

Race Conditions

A race condition occurs when multiple threads access shared data concurrently, and the final result depends on the unpredictable order of execution.

Prevention: Use locks or other synchronization mechanisms to protect shared data.

Example (Incorrect):

counter = 0

def increment():
    global counter
    counter += 1  # Not thread-safe: the read-modify-write can interleave across threads

Example (Correct):

import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    with lock:
        counter += 1  # Thread-safe: the lock serializes updates

Deadlocks

A deadlock occurs when two or more threads are blocked indefinitely, waiting for each other to release resources.

Prevention: Avoid circular dependencies in resource acquisition. Use lock ordering or timeouts.

Example: Thread A holds lock L1 and waits for L2. Thread B holds lock L2 and waits for L1.
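
A sketch of the lock-ordering fix (lock and function names are illustrative): if every thread acquires L1 before L2, the circular wait above cannot form.

import threading

lock_1 = threading.Lock()   # "L1" from the example above
lock_2 = threading.Lock()   # "L2"

def thread_a_work():
    with lock_1:            # both threads acquire in the same global order:
        with lock_2:        # L1 first, then L2
            ...             # work that needs both resources

def thread_b_work():
    with lock_1:            # same order as thread_a_work, never L2 before L1
        with lock_2:
            ...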

Livelocks

A livelock resembles a deadlock in that no progress is made, but the threads are not blocked: they continuously react to each other’s state.

Prevention: Introduce randomness or backoff mechanisms to break the cycle.

Example: Two threads repeatedly attempt to acquire the same locks, detect the conflict, back off, and retry at the same moment, so neither ever succeeds.
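
A sketch of randomized backoff (try_work is an illustrative name): each thread retries after a random delay, so the threads stop colliding in lockstep and one eventually wins.

import random
import threading
import time

lock = threading.Lock()

def try_work(name):
    while True:
        if lock.acquire(blocking=False):   # try once without blocking
            try:
                print(f'{name} made progress')
                return
            finally:
                lock.release()
        # Random backoff breaks the symmetry between competing threads.
        time.sleep(random.uniform(0.01, 0.1))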

Thread Pools

Thread Pool Concept

A thread pool is a collection of pre-initialized threads that are ready to execute tasks. It reduces the overhead of creating and destroying threads for each task.

Benefits: Improved performance, resource management, and simplified task scheduling.

Common Use Cases

Web servers (handling incoming requests)

Batch processing (executing multiple tasks in parallel)

Image processing (applying transformations to multiple images)

Example

import concurrent.futures
import time

def task(n):
    print(f'Processing {n}')
    time.sleep(1) # Simulate work
    return n*n

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(task, i) for i in range(5)]

    for future in concurrent.futures.as_completed(futures):
        print(f'Result: {future.result()}')