Classes of Computers, Trends in Technology, Power, Energy and Cost – Dependability – Measuring, Reporting and Summarizing Performance. Single-core to Multi-core Architectures: Limitations of Single-core Processors – The Multi-core Era – Case Studies of Multi-core Architectures. System Overview of Threading: Defining Threads, System View of Threads, Threading above the Operating System, Threads inside the OS, Threads inside the Hardware, What Happens When a Thread Is Created, Application Programming Models and Threading,
Fundamental Concepts of Parallel Programming:
Designing for Threads, Task Decomposition, Data Decomposition, Data Flow Decomposition, Implications of Different Decompositions, Parallel Programming Patterns, A Motivating Problem: Error Diffusion, Analysis of the Error Diffusion Algorithm, An Alternate Approach: Parallel Error Diffusion. Threading and Parallel Programming Constructs: Performance – Scalability – Synchronization and data sharing – Data races – Synchronization primitives (mutexes, locks, semaphores, barriers) – deadlocks and livelocks – communication between threads (condition variables, signals, message queues and pipes).
TLP and Multiprocessors:
Symmetric and Distributed Shared Memory Architectures – Cache Coherence Issues – Performance Issues – Synchronization Issues – Models of Memory Consistency – Interconnection Networks – Buses, Crossbar and Multi-stage Interconnection Networks.
A Portable Solution for Threading:
Challenges in Threading a Loop, Loop-carried Dependence, Data-race Conditions, Managing Shared and Private Data, Loop Scheduling and Partitioning, Effective Use of Reductions, Minimizing Threading Overhead, Work-sharing Sections, Performance-oriented Programming, Using Barrier and Nowait, Interleaving Single-thread and Multi-thread Execution. OpenMP: OpenMP Execution Model – Memory Model – OpenMP Directives – Work-sharing Constructs – Library Functions – Handling Data and Functional Parallelism – Handling Loops – Performance Considerations.
Solutions to Common Parallel Programming Problems:
Too Many Threads, Data Races, Deadlocks, and Live Locks, Deadlock, Heavily Contended Locks, Priority Inversion, Solutions for Heavily Contended Locks, Non-blocking Algorithms, ABA Problem, Cache Line Ping-ponging, Memory Reclamation Problem, Recommendations, Thread-safe Functions and Libraries, Memory Issues, Bandwidth, Working in the Cache, Memory Contention, Cache-related Issues, False Sharing, Memory Consistency, Current IA-32 Architecture, Itanium Architecture.
Course outcomes:
At the end of the course the student will be able to:
Question paper pattern:
The SEE question paper will be set for 100 marks and the marks scored will be proportionately reduced to 60.
Textbooks
1 Multicore Programming: Increased Performance through Software Multi-threading, Shameem Akhter and Jason Roberts, Intel Press, 2006
2 An Introduction to Parallel Programming, Peter S. Pacheco, Morgan Kaufmann, Elsevier, 2011
3 Multicore Application Programming: For Windows, Linux, and Oracle Solaris, Darryl Gove, Pearson, 2011
Reference Books
1 Parallel Programming in C with MPI and OpenMP, Michael J. Quinn, Tata McGraw Hill, 2003