To understand the foundations of distributed systems.
To learn issues related to clock synchronization and the need for global state in distributed systems.
To learn distributed mutual exclusion and deadlock detection algorithms.
To understand the significance of agreement, fault tolerance and recovery protocols in distributed systems.
To learn the characteristics of peer-to-peer and distributed shared memory systems.
UNIT I INTRODUCTION 9
Introduction: Definition – Relation to computer system components – Motivation – Relation to parallel systems – Message-passing systems versus shared memory systems – Primitives for distributed communication – Synchronous versus asynchronous executions – Design issues and challenges. A model of distributed computations: A distributed program – A model of distributed executions – Models of communication networks – Global state – Cuts – Past and future cones of an event – Models of process communications. Logical time: A framework for a system of logical clocks – Scalar time – Vector time – Physical clock synchronization: NTP.
UNIT II MESSAGE ORDERING & SNAPSHOTS 9
Message ordering and group communication: Message ordering paradigms – Asynchronous execution with synchronous communication – Synchronous program order on an asynchronous system – Group communication – Causal order (CO) – Total order. Global state and snapshot recording algorithms: Introduction – System model and definitions – Snapshot algorithms for FIFO channels.
UNIT III DISTRIBUTED MUTEX & DEADLOCK 9
Distributed mutual exclusion algorithms: Introduction – Preliminaries – Lamport's algorithm – Ricart–Agrawala algorithm – Maekawa's algorithm – Suzuki–Kasami's broadcast algorithm. Deadlock detection in distributed systems: Introduction – System model – Preliminaries – Models of deadlocks – Knapp's classification – Algorithms for the single resource model, the AND model and the OR model.
UNIT IV RECOVERY & CONSENSUS 9
Checkpointing and rollback recovery: Introduction – Background and definitions – Issues in failure recovery – Checkpoint-based recovery – Log-based rollback recovery – Coordinated checkpointing algorithm – Algorithm for asynchronous checkpointing and recovery. Consensus and agreement algorithms: Problem definition – Overview of results – Agreement in a failure-free system – Agreement in synchronous systems with failures.
UNIT V P2P & DISTRIBUTED SHARED MEMORY 9
Peer-to-peer computing and overlay graphs: Introduction – Data indexing and overlays – Chord – Content addressable networks – Tapestry. Distributed shared memory: Abstraction and advantages – Memory consistency models – Shared memory mutual exclusion.
TOTAL: 45 PERIODS
OUTCOMES: At the end of this course, the students will be able to:
Elucidate the foundations and issues of distributed systems.
Understand the synchronization issues and the need for global state in distributed systems.
Understand the mutual exclusion and deadlock detection algorithms in distributed systems.
Describe the agreement protocols and fault tolerance mechanisms in distributed systems.
Describe the features of peer-to-peer and distributed shared memory systems.
- Ajay D. Kshemkalyani and Mukesh Singhal, “Distributed Computing: Principles, Algorithms, and Systems”, Cambridge University Press, 2011.
- George Coulouris, Jean Dollimore and Tim Kindberg, “Distributed Systems: Concepts and Design”, Fifth Edition, Pearson Education, 2012.
- Pradeep K Sinha, “Distributed Operating Systems: Concepts and Design”, Prentice Hall of India, 2007.
- Mukesh Singhal and Niranjan G. Shivaratri, “Advanced Concepts in Operating Systems”, McGraw-Hill, Inc., 1994.
- Tanenbaum A.S., Van Steen M., “Distributed Systems: Principles and Paradigms”, Pearson Education, 2007.
- Liu M.L., “Distributed Computing, Principles and Applications”, Pearson Education, 2004.
- Nancy A. Lynch, “Distributed Algorithms”, Morgan Kaufmann Publishers, USA, 2003.