Introduction to Distributed Database Systems PDF

Summary

This document provides an introduction to distributed database systems. It covers topics such as distributed DBMS architecture, file systems, database management, motivations, distributed computing concepts, and various advantages and issues related to distributed DBMS.

Full Transcript

Distributed Database Systems Introduction to Distributed Database Systems 1 Course Outline Introduction What is a distributed DBMS Problems Current state-of-affairs Background Distributed DBMS Architecture Distributed Database De...

Distributed Database Systems

Introduction to Distributed Database Systems

Course Outline

- Introduction
  - What is a distributed DBMS
  - Problems
  - Current state-of-affairs
- Background
- Distributed DBMS Architecture
- Distributed Database Design
- Data Access Control
- Distributed Query Processing
- Distributed Transaction Management
- Consistency of Replicated Databases
- Current Issues

File Systems

[Figure: programs 1 to 3, each with its own data description, work on Files 1 to 3.]

Database Management

[Figure: application programs 1 to 3 (with data semantics) access the database through the DBMS, which provides description, manipulation, and control.]

Motivation

- Database technology: integration
- Computer networks: distribution
- Distributed database systems: integration
- integration ≠ centralization

Distributed Computing

A concept in search of a definition and a name.

A number of autonomous processing elements (not necessarily homogeneous) that are interconnected by a computer network and that cooperate in performing their assigned tasks.

Synonymous terms:

- distributed function
- distributed data processing
- multiprocessors/multicomputers
- satellite processing
- backend processing
- dedicated/special purpose computers
- timeshared systems
- functionally modular systems

What is distributed …

- Processing logic
- Functions
- Data
- Control

What is a Distributed Database System?

A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network.
A distributed database management system (D-DBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users.

Distributed database system (DDBS) = DDB + D-DBMS

What is not a DDBS?

- A timesharing computer system
- A loosely or tightly coupled multiprocessor system
- A database system which resides at one of the nodes of a network of computers; this is a centralized database on a network node

Centralized DBMS on a Network

[Figure: Sites 1 to 5 connected by a communication network.]

Distributed DBMS Environment

[Figure: Sites 1 to 5 connected by a communication network.]

Implicit Assumptions

- Data stored at a number of sites
  → each site logically consists of a single processor
- Processors at different sites are interconnected by a computer network
  → no multiprocessors (parallel database systems)
- Distributed database is a database, not a collection of files
  → data logically related as exhibited in the users' access patterns (relational data model)
- D-DBMS is a full-fledged DBMS
  → not remote file system, not a TP system

Shared-Memory Architecture

[Figure: processors P1 to Pn sharing memory M and disk D.]

Examples: symmetric multiprocessors (Sequent, Encore) and some mainframes (IBM 3090, Bull's DPS8)

Shared-Disk Architecture

[Figure: processors P1 to Pn, each with its own memory M1 to Mn, sharing disk D.]

Examples: DEC's VAXcluster, IBM's IMS/VS Data Sharing

Shared-Nothing Architecture

[Figure: processors P1 to Pn, each with its own memory M1 to Mn and disk D1 to Dn.]

Examples: Teradata's DBC, Tandem, Intel's Paragon, NCR's 3600 and 3700

Applications

- Manufacturing, especially multi-plant manufacturing
- Military command and control
- EFT
- Corporate MIS
- Airlines
- Hotel chains
- Any organization which has a decentralized organization structure

Advantages of Distributed DBMS

[Four slides; their content was not captured in the transcript.]

Distributed DBMS Promises

- Transparent management of distributed, fragmented, and replicated data
- Improved reliability/availability through distributed transactions
- Improved performance
- Easier and more economical system expansion

Transparency

Transparency is the separation of the higher level semantics of a system from the lower level implementation issues.

The fundamental issue is to provide data independence in the distributed environment:

- Network (distribution) transparency
- Replication transparency
- Fragmentation transparency
  - horizontal fragmentation: selection
  - vertical fragmentation: projection
  - hybrid

[Slides titled Data Independence, Network Transparency, Replication Transparency, Fragmentation Transparency, and Why Provide Transparency?; their content was not captured in the transcript.]

Example

EMP

ENO  ENAME      TITLE
E1   J. Doe     Elect. Eng.
E2   M. Smith   Syst. Anal.
E3   A. Lee     Mech. Eng.
E4   J. Miller  Programmer
E5   B. Casey   Syst. Anal.
E6   L. Chu     Elect. Eng.
E7   R. Davis   Mech. Eng.
E8   J. Jones   Syst. Anal.

ASG

ENO  PNO  RESP        DUR
E1   P1   Manager     12
E2   P1   Analyst     24
E2   P2   Analyst     6
E3   P3   Consultant  10
E3   P4   Engineer    48
E4   P2   Programmer  18
E5   P2   Manager     24
E6   P4   Manager     48
E7   P3   Engineer    36
E7   P5   Engineer    23
E8   P3   Manager     40

PROJ

PNO  PNAME              BUDGET
P1   Instrumentation    150000
P2   Database Develop.  135000
P3   CAD/CAM            250000
P4   Maintenance        310000

PAY

TITLE        SAL
Elect. Eng.  40000
Syst. Anal.  34000
Mech. Eng.   27000
Programmer   24000

Transparent Access

SELECT ENAME, SAL
FROM EMP, ASG, PAY
WHERE DUR > 12
AND EMP.ENO = ASG.ENO
AND PAY.TITLE = EMP.TITLE

[Figure: sites in Tokyo, Boston, Paris, Montreal, and New York connected by a communication network, with the query shown at Tokyo. Data held at each site:]

- Paris: Paris projects, Paris employees, Paris assignments, Boston employees
- Boston: Boston projects, Boston employees, Boston assignments
- Montreal: Montreal projects, Paris projects, New York projects with budget > 200000, Montreal employees, Montreal assignments
- New York: Boston projects, New York employees, New York projects, New York assignments

Distributed Database - User View

[Figure: users see a single distributed database.]

Distributed DBMS - Reality

[Figure: user queries and user applications at several sites, each running DBMS software, connected through a communication subsystem.]

Potentially Improved Performance

- Proximity of data to its points of use
  - Requires some support for fragmentation and replication
- Parallelism in execution
  - Inter-query parallelism
  - Intra-query parallelism

Parallelism Requirements

- Have as much of the data required by each application at the site where the application executes
  - Full replication
- How about updates?
- Updates to replicated data require the implementation of distributed concurrency control and commit protocols

System Expansion

- Issue is database scaling
- Emergence of microprocessor and workstation technologies
  - Demise of Grosch's law
  - Client-server model of computing
- Data communication cost vs telecommunication cost

Distributed DBMS Issues

- Distributed Database Design
  - how to distribute the database
  - replicated & non-replicated database distribution
  - a related problem in directory management
- Query Processing
  - convert user transactions to data manipulation instructions
  - optimization problem: min{cost = data transmission + local processing}
  - general formulation is NP-hard
- Concurrency Control
  - synchronization of concurrent accesses
  - consistency and isolation of transactions' effects
  - deadlock management
- Reliability
  - how to make the system resilient to failures
  - atomicity and durability

Relationship Between Issues

[Figure; its content was not captured in the transcript.]

Related Issues

- Operating System Support
  - operating system with proper support for database operations
  - dichotomy between general purpose processing requirements and database processing requirements
- Open Systems and Interoperability
- Distributed Multidatabase Systems
  - More probable scenario
  - Parallel issues
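
The Transparency slide pairs horizontal fragmentation with selection and vertical fragmentation with projection. A minimal sketch of that pairing on the EMP relation from the Example slide follows. The helper names and the choice to split on engineering titles are assumptions made for illustration, not part of the slides.

```python
# EMP relation from the Example slide, one dict per tuple.
EMP = [
    {"ENO": "E1", "ENAME": "J. Doe", "TITLE": "Elect. Eng."},
    {"ENO": "E2", "ENAME": "M. Smith", "TITLE": "Syst. Anal."},
    {"ENO": "E3", "ENAME": "A. Lee", "TITLE": "Mech. Eng."},
    {"ENO": "E4", "ENAME": "J. Miller", "TITLE": "Programmer"},
    {"ENO": "E5", "ENAME": "B. Casey", "TITLE": "Syst. Anal."},
    {"ENO": "E6", "ENAME": "L. Chu", "TITLE": "Elect. Eng."},
    {"ENO": "E7", "ENAME": "R. Davis", "TITLE": "Mech. Eng."},
    {"ENO": "E8", "ENAME": "J. Jones", "TITLE": "Syst. Anal."},
]


def horizontal_fragment(relation, predicate):
    """Selection: keep whole tuples that satisfy the predicate."""
    return [dict(row) for row in relation if predicate(row)]


def vertical_fragment(relation, attributes, key="ENO"):
    """Projection: keep some attributes, plus the key so that the
    fragments can later be joined back together."""
    columns = [key] + [a for a in attributes if a != key]
    return [{c: row[c] for c in columns} for row in relation]


def reconstruct_horizontal(*fragments, key="ENO"):
    """Union of horizontal fragments, ordered by key."""
    return sorted((row for f in fragments for row in f), key=lambda r: r[key])


def reconstruct_vertical(left, right, key="ENO"):
    """Join of two vertical fragments on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Hypothetical fragmentation: engineers at one site, everyone else at another.
engineers = horizontal_fragment(EMP, lambda r: r["TITLE"].endswith("Eng."))
others = horizontal_fragment(EMP, lambda r: not r["TITLE"].endswith("Eng."))

# Hypothetical vertical split: names at one site, titles at another.
names = vertical_fragment(EMP, ["ENAME"])
titles = vertical_fragment(EMP, ["TITLE"])

print(len(engineers), len(others))
print(reconstruct_horizontal(engineers, others) == EMP)
print(reconstruct_vertical(names, titles) == EMP)
```

Both reconstructions return the original relation, which is what lets a D-DBMS hide the fragments from users.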
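
The query on the Transparent Access slide names only relations and attributes, never sites. The sketch below runs that query text against the Example relations using Python's built-in sqlite3 module. SQLite is a single-site engine, so this is only an assumption-laden stand-in for what a user of a D-DBMS would type. It does not model fragmentation or replication.

```python
import sqlite3

# In-memory, single-site database standing in for the distributed one.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EMP (ENO TEXT, ENAME TEXT, TITLE TEXT);
    CREATE TABLE ASG (ENO TEXT, PNO TEXT, RESP TEXT, DUR INTEGER);
    CREATE TABLE PAY (TITLE TEXT, SAL INTEGER);
    """
)

conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [
        ("E1", "J. Doe", "Elect. Eng."),
        ("E2", "M. Smith", "Syst. Anal."),
        ("E3", "A. Lee", "Mech. Eng."),
        ("E4", "J. Miller", "Programmer"),
        ("E5", "B. Casey", "Syst. Anal."),
        ("E6", "L. Chu", "Elect. Eng."),
        ("E7", "R. Davis", "Mech. Eng."),
        ("E8", "J. Jones", "Syst. Anal."),
    ],
)
conn.executemany(
    "INSERT INTO ASG VALUES (?, ?, ?, ?)",
    [
        ("E1", "P1", "Manager", 12),
        ("E2", "P1", "Analyst", 24),
        ("E2", "P2", "Analyst", 6),
        ("E3", "P3", "Consultant", 10),
        ("E3", "P4", "Engineer", 48),
        ("E4", "P2", "Programmer", 18),
        ("E5", "P2", "Manager", 24),
        ("E6", "P4", "Manager", 48),
        ("E7", "P3", "Engineer", 36),
        ("E7", "P5", "Engineer", 23),
        ("E8", "P3", "Manager", 40),
    ],
)
conn.executemany(
    "INSERT INTO PAY VALUES (?, ?)",
    [
        ("Elect. Eng.", 40000),
        ("Syst. Anal.", 34000),
        ("Mech. Eng.", 27000),
        ("Programmer", 24000),
    ],
)

# The query from the slide, unchanged: no site names appear in it.
QUERY = """
SELECT ENAME, SAL
FROM EMP, ASG, PAY
WHERE DUR > 12
  AND EMP.ENO = ASG.ENO
  AND PAY.TITLE = EMP.TITLE
"""

rows = conn.execute(QUERY).fetchall()
for ename, sal in sorted(rows):
    print(ename, sal)
```

An employee with several qualifying assignments appears once per assignment, because the query has no DISTINCT.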
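
The Query Processing item states the optimization goal as min{cost = data transmission + local processing}. A toy sketch of that objective follows. The three candidate strategies, their sizes, and the cost weights are all invented for illustration. A real optimizer would estimate them from catalog statistics and, since the general problem is NP-hard, would not enumerate every strategy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    bytes_shipped: int     # data sent over the network (hypothetical)
    tuples_processed: int  # work done locally at the sites (hypothetical)


def cost(strategy, per_byte=10, per_tuple=1):
    """cost = data transmission + local processing, in arbitrary integer
    units. The weights are assumptions; here the network is taken to be
    ten times as expensive per unit as local work."""
    transmission = strategy.bytes_shipped * per_byte
    local = strategy.tuples_processed * per_tuple
    return transmission + local


# Hypothetical ways to evaluate a join of EMP and ASG held at different sites.
strategies = [
    Strategy("ship EMP to the ASG site", 8_000, 1_900),
    Strategy("ship ASG to the EMP site", 11_000, 1_900),
    Strategy("filter ASG on DUR > 12, then ship it", 5_000, 2_400),
]

best = min(strategies, key=cost)
for s in strategies:
    print(f"{s.name}: {cost(s)}")
print("chosen:", best.name)
```

With these made-up numbers the third strategy wins: it does more local work but ships less data, which is the trade-off the formula captures.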
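
The Parallelism Requirements slide notes that updates to replicated data need commit protocols, without naming one. The sketch below uses two-phase commit, a standard protocol chosen here as an assumption. The `Replica` class and `replicated_update` function are hypothetical, and the sketch leaves out logging, timeouts, recovery, and concurrency control.

```python
class Replica:
    """One copy of a data item at one site (hypothetical model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # False simulates a site voting "no"
        self.value = None             # last committed value
        self.pending = None           # value staged during phase 1

    def prepare(self, value):
        """Phase 1: stage the update and vote."""
        if not self.can_commit:
            return False
        self.pending = value
        return True

    def commit(self):
        """Phase 2, commit decision: make the staged value visible."""
        self.value, self.pending = self.pending, None

    def abort(self):
        """Phase 2, abort decision: discard the staged value."""
        self.pending = None


def replicated_update(replicas, value):
    """Apply the update at every replica or at none of them."""
    votes = [replica.prepare(value) for replica in replicas]
    if all(votes):
        for replica in replicas:
            replica.commit()
        return True
    for replica in replicas:
        replica.abort()
    return False


healthy = [Replica("Paris"), Replica("Boston"), Replica("Montreal")]
print(replicated_update(healthy, 42), [r.value for r in healthy])

degraded = [Replica("Paris"), Replica("Boston", can_commit=False)]
print(replicated_update(degraded, 42), [r.value for r in degraded])
```

The second call leaves every copy unchanged because one site voted no. This is the atomicity property listed under Reliability.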
