Parallel Programming Models
- A programming model is made up of the languages and libraries that create an abstract view of the machine. The programming model enables us to identify:
- Control:
- How is parallelism created?
- What orderings exist between operations?
- Data:
- What data is private vs. shared?
- How is logically shared data accessed or communicated?
Programming Model: Shared Memory
- A program is a collection of threads of control; threads can be created mid-execution
- Each thread has a set of private variables, e.g., local stack variables
- Also a set of shared variables, e.g., static variables
- Threads communicate implicitly by writing and reading shared variables
Overview of POSIX Threads
- POSIX: Portable Operating System Interface
- Interface to Operating System utilities
- PThreads: The POSIX threading interface
- System calls to create and synchronize threads
- Should be relatively uniform across UNIX-like OS platforms
- PThreads contains support for:
- Creating parallelism
- Synchronizing
- No explicit support for communication, because communication through shared memory is implicit; a pointer to shared data is passed to a thread (see the sketch below)
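A minimal Pthreads sketch of these ideas, assuming illustrative names (`worker`, `shared_data` are not from the source): parallelism is created with `pthread_create`, threads are synchronized with `pthread_join`, and a pointer to shared data is passed to each thread while stack variables stay private. Compile with `-pthread`.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

int shared_data[NTHREADS];          /* logically shared: visible to all threads */

void *worker(void *arg) {
    int *slot = (int *)arg;         /* pointer to shared data, passed at creation */
    int value = 42;                 /* private: lives on this thread's stack */
    *slot = value;                  /* communicate implicitly by writing shared memory */
    return NULL;
}

int main(void) {
    pthread_t threads[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)                        /* create parallelism */
        pthread_create(&threads[i], NULL, worker, &shared_data[i]);
    for (int i = 0; i < NTHREADS; i++)                        /* synchronize: wait for all */
        pthread_join(threads[i], NULL);
    for (int i = 0; i < NTHREADS; i++)
        printf("shared_data[%d] = %d\n", i, shared_data[i]);
    return 0;
}
```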
Synchronization
- Threads interact in a multiprogrammed system
- To share resources
- To coordinate their execution
- Arbitrary interleaving of thread executions can have unexpected consequences
- We need a way to restrict the possible interleaving of executions
- Scheduling is invisible to the application => we cannot know when we lose control of the CPU and another thread/process runs
- Synchronization is the mechanism that gives us this control
Race conditions and synchronization
- What happens when 2 or more concurrent threads manipulate a shared resource without any synchronization?
- The outcome depends on the order in which accesses take place
- This is called a race condition
- We need to ensure that only one thread at a time can manipulate the shared resource
- So that we can reason about correct program behaviour
- We need synchronization
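A sketch of a race condition, assuming a hypothetical shared counter (names are illustrative): `counter++` is a read-modify-write, so interleavings of the two threads can lose updates and the final value depends on scheduling.

```c
#include <pthread.h>
#include <stdio.h>

#define INCREMENTS 1000000

long counter = 0;                     /* shared resource, no synchronization */

void *incrementer(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        counter++;                    /* read-modify-write: interleavings can lose updates */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, incrementer, NULL);
    pthread_create(&t2, NULL, incrementer, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* Expected 2000000, but the race typically yields a smaller value. */
    printf("counter = %ld (expected %d)\n", counter, 2 * INCREMENTS);
    return 0;
}
```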
Mutual Exclusion
- Given:
- A set of n threads
- A set of resources shared between threads
- A segment of code which accesses the shared resources, called the critical section
- We want to ensure that:
- Only one thread at a time can execute in the critical section
- All other threads are forced to wait on entry
- When a thread leaves the CS, another can enter
- Mutex locks
- Typically associated with a resource, to ensure one access at a time to that resource
- Ensure mutual exclusion to a critical section
- With mutexes, a thread goes to sleep when it sees that the lock is busy
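A sketch of the same counting workload protected by a Pthreads mutex (names reused from the race-condition sketch above, so they remain illustrative): the increment becomes the critical section, and only one thread at a time can execute it.

```c
#include <pthread.h>
#include <stdio.h>

#define INCREMENTS 1000000

long counter = 0;
pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

void *incrementer(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);    /* enter the critical section */
        counter++;                            /* only one thread at a time here */
        pthread_mutex_unlock(&counter_lock);  /* leave the CS; another thread may enter */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, incrementer, NULL);
    pthread_create(&t2, NULL, incrementer, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);       /* now reliably 2000000 */
    return 0;
}
```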
Overheads of locking
- When threads access the same locks => lock contention
- If lots of threads are contending for the same lock, performance suffers
- Solution: compute as much as possible locally and use synchronization sparingly
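One way to apply this, sketched against the mutex example above (it reuses the illustrative `counter`, `counter_lock`, and `INCREMENTS` names): each thread accumulates into a private variable and takes the shared lock only once, so contention is minimal.

```c
/* Sketch: same counting workload, but each thread accumulates locally
 * and acquires the shared lock only once per thread. */
void *incrementer_low_contention(void *arg) {
    (void)arg;
    long local = 0;                        /* private: no synchronization needed */
    for (int i = 0; i < INCREMENTS; i++)
        local++;
    pthread_mutex_lock(&counter_lock);     /* one lock acquisition per thread */
    counter += local;
    pthread_mutex_unlock(&counter_lock);
    return NULL;
}
```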
Barriers
- Threads that reach the barrier stop until all threads have reached it as well
- If execution has stages, barrier ensures that data needed from a previous stage is ready
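A barrier sketch using the POSIX barrier API (widely available on Linux, though not on every platform; the names and the two-stage structure are illustrative): no thread reads stage-1 data until every thread has finished producing it.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

pthread_barrier_t stage_barrier;
double stage1_result[NTHREADS];    /* produced in stage 1, consumed in stage 2 */

void *worker(void *arg) {
    long id = (long)arg;

    /* Stage 1: each thread produces its piece of data. */
    stage1_result[id] = id * 2.0;

    /* Barrier: threads stop here until all have finished stage 1. */
    pthread_barrier_wait(&stage_barrier);

    /* Stage 2: safe to read data produced by other threads. */
    long next = (id + 1) % NTHREADS;
    printf("thread %ld sees neighbour's result %.1f\n", id, stage1_result[next]);
    return NULL;
}

int main(void) {
    pthread_t threads[NTHREADS];
    pthread_barrier_init(&stage_barrier, NULL, NTHREADS);
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);
    pthread_barrier_destroy(&stage_barrier);
    return 0;
}
```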
OpenMP
- OpenMP = Open specification for Multi-Processing
- Motivation: capture common usage and simplify programming
- OpenMP will:
- Allow a programmer to separate a program into serial regions and parallel regions, rather than reason about P concurrently-executing threads
- Hide stack management
- Provide synchronization constructs
- OpenMP will not:
- Parallelize automatically
- Guarantee speedup
- Provide freedom from data races
Execution model: Fork-and-Join
- Start with a single (master) thread
- Fork: Create a group of worker threads
- Everything in a parallel block is executed by every thread
- Join: All threads synchronize at the end of the parallel block
- Execution continues with only the initial thread
- Threads can all do exactly the same work, divide a common set of tasks among themselves (work sharing), or perform distinct tasks in parallel
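A minimal fork-join sketch in OpenMP (the printed messages are illustrative): the initial thread runs the serial regions, a team of threads executes the parallel block, and all threads join at its closing brace. Compile with `-fopenmp` (gcc).

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    printf("Serial region: only the initial (master) thread runs here\n");

    /* Fork: a team of worker threads is created for the parallel block. */
    #pragma omp parallel
    {
        int id = omp_get_thread_num();        /* private to each thread */
        int nthreads = omp_get_num_threads();
        printf("Parallel region: thread %d of %d\n", id, nthreads);
    }   /* Join: implicit barrier; all threads synchronize here */

    printf("Serial region again: execution continues with one thread\n");
    return 0;
}
```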