2025 Distributed Database Systems Practical & Oral Bank PDF

Document Details

Uploaded by ExpansiveEnlightenment3161

2025

AB & AM & MR & AE

Tags

distributed databases, database systems, practical questions, oral questions

Summary

This is a solved question bank for Distributed Database Systems. It contains practical and oral questions from the 2025 exam and is suitable for undergraduate students.

Full Transcript

DISTRIBUTED DATABASE SYSTEMS 2025 PRACTICAL & ORAL BANK (Solved)
SENIOR 25
BY AB & AM & MR & AE

ORAL BANK

Choose the correct answer:

1. A collection of databases scattered across multiple sites over a network is called a ___________ database.
   a) distributed  b) centralized  c) parallel  d) cloud
2. A _____________ is basically a database that is not limited to one system; it is spread over different sites, such as over a network of computers.
   a) Distributed database  b) Centralized database  c) Local database  d) none
3. By using …………….. transparency, the user can access any table or any fragment of a table as if it were stored locally at the user’s site.
   a) fragmentation  b) location  c) replication  d) concurrency
4. In data integration in a distributed database, the transformation between related objects is called schema ……………
   a) mapping  b) matching  c) modelling  d) integration
5. In data integration in a distributed database, the process of identifying that two objects are semantically related is called schema ……………
   a) mapping  b) matching  c) modelling  d) integration
6. A ____________ database system is located on various sites that don’t share physical components.
   a) distributed  b) centralized  c) parallel  d) cloud
7. In a ___________, all sites store the database identically.
   a) Homogeneous Database  b) Heterogeneous Database  c) Both of them  d) none
8. In a ___________, different sites can use different schemas and software.
   a) Homogeneous Database  b) Heterogeneous Database  c) Both of them  d) none
9. In ___________, the relation is fragmented into groups of tuples so that each tuple is assigned to at least one fragment.
   a) Horizontal fragmentation  b) Hybrid fragmentation  c) Vertical fragmentation  d) none
10. In ___________, the schema of the relation is divided into smaller schemas. Each fragment must contain a common candidate key so as to ensure a lossless join.
   a) Horizontal fragmentation  b) Hybrid fragmentation  c) Vertical fragmentation  d) none
11.
___________ of a relation can be specified by the $\sigma_{C_i}(R)$ operation in the relational algebra.
   a) Horizontal fragmentation  b) Hybrid fragmentation  c) Vertical fragmentation  d) none
12. ___________ of a relation can be specified by the $\pi_{L_i}(R)$ operation in the relational algebra.
   a) Horizontal fragmentation  b) Hybrid fragmentation  c) Vertical fragmentation  d) none
13. In some situations, horizontal and vertical fragmentation alone are not enough to distribute data for some applications, and in those conditions mixed fragmentation can be done in two different ways.
   a) Horizontal fragmentation  b) Hybrid fragmentation  c) Vertical fragmentation  d) none
14. The process of vertical fragmentation of a table followed by further horizontal fragmentation is called __________ fragmentation.
   a) Horizontal  b) Vertical  c) Hybrid  d) Duplicated
15. The starting point of bottom-up design in a distributed database system is the individual _______________ schema.
   a) Local conceptual  b) mediated  c) Global conceptual  d) relational
16. The root node of the query tree represents ________________.
   a) A data table  b) a relational operation  c) the result of the query  d) None of them
17. The internal nodes in the query tree represent ________.
   a) Data table  b) Query result  c) relational operation  d) query processing
18. Data is represented in a query tree by __________.
   a) Root node  b) Internal nodes  c) Leaf nodes  d) All of the above
19. In distributed query processing, the __________________ operator has been used as an effective operator to reduce the total amount of data transmission.
   a) semi-join  b) Join  c) select  d) project
20. We can rebuild a table from its vertical fragments by the __________ operation.
   a) select  b) Join  c) project  d) Union
21. In horizontal fragmentation, the _________ operation can be performed on the fragments to reconstruct a table.
   a) Union  b) Join  c) project  d) Union
22. In logical integration, the global conceptual schema is entirely virtual and ___________.
   a) materialized  b) not materialized  c) not structured  d) structured
23. The process of assigning each data fragment to a particular site in the distributed system is called ___________.
   a) fragmentation  b) allocation  c) replication  d) transparency
24. An intelligent distribution of your data fragments to improve database performance and data availability for end users is called …………… (data distribution = data allocation)
   a) fragmentation  b) allocation  c) replication  d) concurrency
25. A description of the replication of fragments in a DDBMS is called the ___________ schema.
   a) replication  b) fragmentation  c) data  d) table
26. _________ means that when the data is updated by the user, the update is reflected in all the tables at multiple sites.
   a) replication  b) fragmentation  c) concurrency  d) table
27. Replicated copies let the user continue with queries in case of a site failure, without knowledge of the failure; this is known as __________.
   a) replication  b) fragmentation  c) concurrency  d) failure
28. ____________ is the process of creating and maintaining multiple copies of data at different sites.
   a) Transparency  b) fragmentation  c) allocation  d) replication
29. ________________ means the division of a database into various sub-tables and sub-relations so that data can be distributed and stored efficiently.
   a) concurrency  b) fragmentation  c) allocation  d) replication
30. In ________, it must be ensured that the fragments can be used to reconstruct the original relation.
   a) fragmentation  b) replication  c) allocation  d) None
31. ………… is the process of dividing the whole database into various sub-tables or sub-relations so that data can be stored in different systems.
   a) fragmentation  b) replication  c) allocation  d) None
32. Query processing in a DDBMS is different from query processing in a centralized DBMS due to _____________ cost.
   a) design  b) communication  c) allocation  d) maintenance
33.
In physical integration, the source databases are integrated and the integrated database is …………
   a) materialized  b) not materialized  c) not structured  d) structured
34. Integration in a distributed database system is aided by ___________ tools.
   a) Extract-Transform-Load (ETL)  b) Enterprise Application Integration (EAI)  c) Enterprise Information Integration (EII)  d) Load-Extract-Transform (LET)
35. ________ allows data exchange between applications and performs similar transformation functions.
   a) Extract-Transform-Load (ETL)  b) Enterprise Application Integration (EAI)  c) Enterprise Information Integration (EII)  d) Load-Extract-Transform (LET)
36. In the query trading algorithm for distributed database systems, the controlling/client site for a distributed query is called the ____________.
   a) customer  b) requester  c) seller  d) buyer
37. In the query trading algorithm for distributed database systems, the sites where the local queries execute are called the ____________.
   a) customer  b) requester  c) seller  d) buyer
38. In the case of ___________ operations on fragments located at multiple sites, we should transfer the fragmented data to the site where most of the data is present and perform the operation there.
   a) join  b) union  c) A&B  d) none
39. __________ database systems are systems with a single or multiple logical databases located at more than one site under the control of a single DDBMS.
   a) Centralized  b) Distributed  c) Both  d) None
40. A ______________ is a logically interrelated collection of shared data (and a description of this data) physically distributed over a computer network.
   a) Distributed database  b) Distributed DBMS  c) centralized database  d) Centralized DBMS
41. A ______________ is the software system that permits the management of the distributed database and makes the distribution transparent to users.
   a) Distributed database  b) Distributed DBMS  c) centralized database  d) Centralized DBMS
42.
Users access the distributed database via applications. Those that do not require data from other sites are known as ______________.
   a) local applications  b) Global application  c) Both  d) None
43. Users access the distributed database via applications. Those that require data from other sites are known as ______________.
   a) local applications  b) Global application  c) Both  d) None
44. Each DBMS participates in at least ____________ global application.
   a) one  b) two  c) three  d) four
45. Distributed database types may be _______
   a) Homogeneous only  b) Heterogeneous only  c) Two are true  d) None
46. A DDBMS is a collection of logically _________ shared data.
   a) Related  b) unrelated  c) far  d) none
47. The data is split into a number of __________.
   a) Tracks  b) Parts  c) Patches  d) Fragments
48. With ______________ in a distributed database, the system keeps functioning even when failures occur, only delivering reduced performance until the issue is resolved.
   a) Single Development  b) Reliability  c) Modular Development  d) Lower Communication Cost
49. The small pieces of sub-relations or sub-tables are called …………...
   a) fragments  b) replication  c) allocation  d) None
50. ………………: in this architecture, clients connect to a central server, which manages the distributed database system. The server is responsible for coordinating transactions, managing data storage, and providing access control.
   a) Client-server architecture  b) Peer-to-peer architecture  c) Federated architecture  d) none
51. ………………….: in this architecture, each site in the distributed database system is connected to all other sites. Each site is responsible for managing its own data and coordinating transactions with other sites.
   a) Client-server architecture  b) Peer-to-peer architecture  c) Federated architecture  d) none
52. ……………: in this architecture, each site in the distributed database system maintains its own independent database,
but the databases are integrated through a middleware layer that provides a common interface for accessing and querying the data.
   a) Client-server architecture  b) Peer-to-peer architecture  c) Federated architecture  d) none
53. _____________: each fragment is stored at the site with optimal distribution.
   a) Data fragmentation  b) Data allocation  c) Data replication  d) Location transparency
54. ___________ increases the availability and improves the performance of the system.
   a) Data fragmentation  b) Data allocation  c) Data replication  d) Location transparency
55. ___________ enables a user to access data without knowing, or being concerned with, the site at which the data resides.
   a) Data fragmentation  b) Data allocation  c) Data replication  d) Location transparency
56. In query optimization, the first type of optimization is done at the _________
   a) Local level  b) Global level  c) internal level  d) External level
57. In query optimization, the second type of optimization is done at the ________
   a) Local level  b) Global level  c) internal level  d) External level
58. In optimization at the ……., each local DBE works on the fragments that are stored at the local site, where the local CPU and disk input/output (I/O) times are the main drivers.
   a) Local level  b) Global level  c) internal level  d) External level
59. Almost all global optimization alternatives ignore the ____________
   a) local processing time  b) local processing cost  c) commit time  d) none
60. It was believed that the communication cost was a more dominant factor than the ___________.
   a) local processing time  b) local processing cost  c) commit time  d) none
61. It is believed that ___________ is/are important to query optimization.
   a) Local query cost  b) Global communication cost  c) Both  d) none
62. A _____________ is a tree data structure representing a relational algebra expression.
   a) query tree  b) query processing  c) query structure  d) query data
63. A query tree is a tree data structure representing a ____________ expression.
   a) Relational algebra  b) Relational calculus  c) Mathematical  d) SQL
64. _______________ requires evaluating a large number of query trees, each of which produces the required results of a query.
   a) Distributed query optimization  b) Distributed Database  c) Both A&B  d) none
65. In _______________ the target is to find an optimal solution instead of the best solution.
   a) Distributed query optimization  b) Distributed Database  c) Both A&B  d) none
66. In query optimization the target is to find an ___________ solution.
   a) Best  b) Optimal  c) Shortest  d) Easiest
67. _______________ means the operation is run at the site where the data is stored and not at the client site.
   a) Operation shipping  b) Data shipping  c) hybrid shipping  d) none
68. In ……………., the data fragments are transferred to the database server, where the operations are executed.
   a) operation shipping  b) data shipping  c) Hybrid shipping  d) none
69. …………………. is a combination of data and operation shipping: data fragments are transferred to the high-speed processors, where the operation runs.
   a) operation shipping  b) data shipping  c) Hybrid shipping  d) none
70. In the ___________ algorithm for distributed database systems, the controlling/client site for a distributed query is called the buyer and the sites where the local queries execute are called sellers.
   a) query tree  b) Query trading  c) query structure  d) none
71. ………….. in a distributed database management system requires the transmission of data between the computers in a network.
   a) Query processing  b) distribution strategy  c) transmission  d) none
72. …………… is the ordering of data transmissions and local data processing in a database system.
   a) Query processing  b) distribution strategy  c) transmission  d) none
73. The transmission cost is low when sites are connected through ______-speed networks and is quite significant in other networks.
   a) Low  b) High  c) Mid  d) none
74.
The transmission cost is quite significant when sites are connected through ______-speed networks.
   a) Low  b) High  c) Mid  d) none
75. Data transfer cost = ________, where C is the cost per byte of data transferred and Size is the number of bytes transmitted.
   a) C * Size  b) C / Size  c) C + Size  d) C - Size
76. Commonly, the data transfer cost is calculated in terms of the ______ of the messages.
   a) No. of bytes  b) No. of words  c) No. of phrases  d) None
77. __________ is the transfer cost for the DEPARTMENT table with: DID-10 bytes, DName-20 bytes, and Total records-50 (Cost per byte = total records = 50, Size = 10 + 20 = 30, transfer cost = C * Size = 50 * 30)
   a) 500 bytes  b) 1000 bytes  c) 1500 bytes  d) 20000 bytes
78. The process of decomposing a table by attributes is called ____________ fragmentation.
   a) Horizontal  b) Vertical  c) Hybrid  d) Diagonal
79. A query in a distributed DBMS requires data from multiple sites, and this is called data __________.
   a) Transformation  b) Transmission  c) Matching  d) Mapping
80. __________ operations must be performed as early as possible to reduce the data flow over the communication network.
   a) SELECT σ and PROJECT π  b) SELECT σ and JOIN ⨝  c) PROJECT π and JOIN ⨝  d) None
81. Database integration can be either physical or _______
   a) Practical  b) Logical  c) Static  d) dynamic
82. A _____ replicated database is a good choice if most transactions are retrieval only.
   a) Fully  b) Partially  c) Limited  d) None
83. A _________ is a collection of actions that make consistent transformations of system states while preserving system consistency.
   a) Transaction  b) Transparency  c) Isolation  d) Recovery
84. Properties of transactions include ________
   a) Isolation  b) Atomicity  c) Durability  d) All of the above
85. ______ refers to the case where the effects of some transactions are not reflected in the database.
   a) Phantom  b) Fuzzy Read  c) Consistent retrieval  d) Lost update
86.
________ refers to the requirement that a transaction, if it reads the same data item more than once, should always read the same value.
   a) Phantom  b) Fuzzy Read  c) Inconsistent retrieval  d) Lost update
87. _________ requires that the global execution history be serializable.
   a) Weak consistency  b) Mutual consistency  c) DB consistency  d) Transaction consistency
88. ________ refers to the replicas converging to the same value.
   a) Weak consistency  b) Mutual consistency  c) DB consistency  d) Transaction consistency
89. __________ requires the availability of lock managers at each site.
   a) Centralized 2PL  b) Primary copy  c) Hierarchical 2PL  d) Distributed 2PL
90. In __________, lock requests are issued to the central scheduler.
   a) Centralized 2PL  b) Primary copy  c) Hierarchical 2PL  d) Distributed 2PL
91. In a DDB, we refer to the TM (transaction manager) at the originating site as the _______
   a) Centralized  b) Participant  c) Coordinator  d) Mutator
92. ________ methods guarantee that deadlocks cannot occur in the first place. The transaction manager checks a transaction when it is first initiated and does not permit it to proceed if it may cause a deadlock.
   a) Deadlock prevention  b) Deadlock detection  c) Deadlock avoidance  d) Deadlock Detection and resolution
93. ___________ makes use of transaction timestamps to prioritize transactions and resolves deadlocks by aborting transactions with higher (or lower) priorities.
   a) Deadlock prevention  b) Deadlock detection  c) Deadlock avoidance  d) Deadlock Detection and resolution
94. ________ is the most popular approach to managing deadlocks in the distributed setting.
   a) Deadlock prevention  b) Deadlock detection  c) Deadlock avoidance  d) Deadlock Detection and resolution
95. ___________ is the probability that the system under consideration does not experience any failures in a given time interval.
   a) Availability  b) Reliability  c) Consistency  d) Durability
96. ____________ is the probability that the system is operational according to its specification at a given point in time.
   a) Availability  b) Reliability  c) Consistency  d) Durability
97. When the source site sends a message but does not get a response within an expected time, this is called a __________. At that point, reliability algorithms need to take action.
   a) Rollback  b) Abort  c) Network partitioning  d) Timeout
98. A _________ occurs at a destination site when it cannot get an expected message from a source site within the expected time period.
   a) Rollback  b) Abort  c) Network partitioning  d) Timeout
99. _________ first executes the updating transaction on one copy, and after the transaction commits the changes are propagated to all other copies (refer transaction).
   a) Eager replication  b) Lazy replication  c) Limited replication transparency  d) Full replication transparency
100. The starting point of bottom-up design is the set of ________
   a) Local Conceptual Schemas (LCSs)  b) Global Conceptual Schema (GCS)  c) Normalized Schema  d) Replicated Schema
101. The process consists of integrating local databases with their ________
   a) Local Conceptual Schemas (LCSs)  b) Global Conceptual Schema (GCS)  c) Normalized Schema  d) Replicated Schema
102. Database integration focuses on distributed databases that are designed in a _________ fashion.
   a) Normalized  b) Hierarchical  c) Top-down  d) Bottom-up
103. In reliability protocols, the _________ command(s) are executed differently in a distributed DBMS than in a centralized DBMS.
   a) Begin, Read  b) Commit, Recover  c) Replicas, Correct  d) Write
104. In the distributed setting, in addition to centralized failure types, the system needs to cope with __________ failures.
   a) Communication  b) Site  c) Transaction  d) Media
105. A protocol is ________ if it permits a transaction to terminate at the operational sites without waiting for recovery of the failed site.
   a) Recoverable  b) Committed  c) Nonblocking  d) Independent
106. A replicated database is said to be in a __________
state if all the replicas of each of its data items have identical values.
   a) Mutual consistency  b) Fully replicated  c) Replication  d) Partially replicated
107. Concurrency control algorithms enforce the ___________ property so that concurrent transactions see a consistent database state and leave the database in a consistent state.
   a) Atomicity  b) Isolation  c) Durability  d) None
108. In recovery protocols for 2PC, if the participant fails in the initial state, the decision will be ____________
   a) Abort  b) No action  c) Remain Blocked  d) Commit
109. In 2PC, if the participant times out in the ready state, the decision will be ____________
   a) Abort  b) No action  c) Remain Blocked  d) Commit
110. Which of the following is NOT a property of transactions?
   a) Atomicity  b) Isolation  c) Reliability  d) Redundancy
111. Transactions provide _______ and __________ execution in the presence of failures.
   a) Atomic, Reliable  b) Atomic, replicas  c) Replicas, Correct  d) None
112. Two histories have to be considered: _________
   a) local histories  b) global history  c) external history  d) Both A & B
113. Transactions indicate their intentions by requesting locks from the scheduler, which is called the _______
   a) exclusive lock  b) shared lock  c) read lock  d) lock manager
114. In locking-based algorithms, a read lock (rl) is also called a __________
   a) exclusive lock  b) shared lock  c) read lock  d) write lock
115. In locking-based algorithms, a write lock (wl) is also called a __________
   a) exclusive lock  b) shared lock  c) read lock  d) write lock
116. In ____________, a transaction one of whose operations is rejected by a scheduler is restarted by the transaction manager with a new timestamp.
   a) TimeStamp Ordering (TO)  b) Deadlock prevention  c) Multi-version  d) None
117. In ____________, each lock manager periodically transmits its LWFG (local wait-for graph) to the deadlock detector, which then forms the GWFG (global wait-for graph) and looks for cycles in it.
   a) Centralized Deadlock Detection  b) Hierarchical Deadlock Detection  c) Distributed Deadlock Detection  d) Hierarchical Deadlock prevention
118. In _______________, deadlocks that are local to a single site are detected at that site using the LWFG. Each site also sends its LWFG to the deadlock detector at the next level.
   a) Centralized Deadlock Detection  b) Hierarchical Deadlock Detection  c) Distributed Deadlock Detection  d) Hierarchical Deadlock prevention
119. What are the failures in a DDBMS?
   a) Communication  b) Media  c) Site  d) All of them
120. Among the purposes of replication, __________ means removing single points of failure by replicating data, so that data items are accessible from multiple sites.
   a) System availability  b) Performance  c) Scalability  d) None
121. Among the purposes of replication, __________ means locating the data closer to their access points, thereby localizing most of the access, which contributes to a reduction in response time.
   a) System availability  b) Performance  c) Scalability  d) None
122. Among the purposes of replication, __________ provides a way to support growth with acceptable response times.
   a) System availability  b) Performance  c) Scalability  d) None
123. _________: all copies of a data item have the same value at the end of the execution of an update transaction.
   a) Strong mutual consistency criteria  b) Weak mutual consistency criteria  c) DB consistency criteria  d) None
124. _____________: if the update activity ceases for some time, the values eventually become identical.
   a) Strong mutual consistency criteria  b) Weak mutual consistency criteria  c) DB consistency criteria  d) None
125. __________ allows a query to see inconsistent data while replicas are being updated, but requires that the replicas converge to a one-copy serializable state once the updates are propagated to all of the copies.
   a) Epsilon serializability (ESR)  b) Transaction consistency  c) Mutual consistency  d) None
126. _____________ means applying updates on all the replicas at the same time (when the Write is issued).
   a) Synchronous propagation  b) Deferred propagation  c) Mutual consistency  d) None
127. In _____________, the updates are applied to one replica when they are issued, but their application on the other replicas is batched and deferred to the end of the transaction.
   a) Synchronous propagation  b) Deferred propagation  c) Lazy propagation  d) None
128. _____________ is used in those applications for which strong mutual consistency may be unnecessary and too restrictive.
   a) Synchronous propagation  b) Deferred propagation  c) Lazy propagation  d) None
129. _____________ is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
   a) Big data  b) Database  c) Data Mining  d) None
130. Characteristics of big data: ___________
   a) Volume (size of data)  b) Velocity (speed at which data is generated)  c) Variety (different types of data)  d) All of the above
131. In ____________, the data set must have been obtained in large quantities.
   a) Volume  b) Velocity  c) Variety  d) Value
132. ____________: this big data characteristic describes the speed at which data is created, reprocessed and generalized in big data. The speed here is very influential on big data traffic.
   a) Volume  b) Velocity  c) Variety  d) Value
133. The _____ characteristic of big data refers to the different and changing formats and structures that the data may have.
   a) Volume  b) Velocity  c) Variety  d) Value
134. The _______ characteristic means that big data has a very high value if data processing is done correctly and accurately by data practitioners.
   a) Volume  b) Velocity  c) Variety  d) Value
135. The _____________ characteristic of big data describes how accurate the data is.
   a) Volume  b) Velocity  c) Variety  d) Veracity
136. In the _______ characteristic of big data, after the data is analyzed, it is visualized, which is essential for a data practitioner.
   a) Volume  b) Velocity  c) Visualization  d) Value
137. The _____________ characteristic of big data is more or less similar to veracity, both characteristics being based on the principle that data must be valid and accurate so that decision-making is right on target.
   a) Volume  b) Validity  c) Variety  d) Veracity
138. The _____________ characteristic of big data describes how data changes daily with various tendencies.
   a) Volatility  b) Validity  c) Visualization  d) Veracity
139. The _____________ characteristic of big data emphasizes aspects of data security and protection.
   a) Vulnerability  b) Validity  c) Visualization  d) Veracity
140. The _____________ characteristic of big data describes that data changes all the time.
   a) Volume  b) Variability  c) Visualization  d) Veracity
141. __________ can be displayed in rows, columns, and relational databases.
   a) Structured Data  b) Unstructured Data  c) semi-structured Data  d) None
142. If the data is _________, the data obtained is uniform so that processing can be done directly.
   a) Structured Data  b) Unstructured Data  c) semi-structured Data  d) None
143. __________ cannot be displayed in rows, columns, and relational databases.
   a) Structured Data  b) Unstructured Data  c) semi-structured Data  d) None
144. If the data is ________, it will require a different algorithm, and the analysis will also take a long time.
   a) Structured Data  b) Unstructured Data  c) semi-structured Data  d) None
145. One of the reasons big data is important is ___________: big data enables organizations to optimize processes, reduce expenses, and make informed decisions, ultimately saving costs.
   a) Cost Saving  b) Time Saving  c) Social Media Listening  d) Customer Acquisition
146. One of the reasons big data is important is ___________: by analyzing large datasets efficiently, big data streamlines operations, automates tasks, and accelerates decision-making.
   a) Cost Saving  b) Time Saving  c) Social Media Listening  d) Customer Acquisition
147. One of the reasons big data is important is __________: big data helps monitor social media trends, customer sentiments, and brand reputation, aiding in targeted marketing and customer engagement.
   a) Cost Saving  b) Time Saving  c) Social Media Listening  d) Customer Acquisition
148. One of the reasons big data is important is that it provides ___________: analyzing customer behavior and preferences allows businesses to tailor marketing strategies and attract new customers.
   a) Cost Saving  b) Time Saving  c) Social Media Listening  d) Customer Acquisition
149. One of the reasons big data is important is that it provides ___________: big data provides valuable insights into consumer behavior, enabling personalized marketing campaigns and better customer experiences.
   a) Cost Saving  b) Time Saving  c) Social Media Listening  d) Marketing Insights
150. One of the reasons big data is important is that it is characterized by ___________: leveraging big data fosters innovation by identifying patterns, predicting trends, and driving product development.
   a) Cost Saving  b) Time Saving  c) Social Media Listening  d) Innovation
151. In the ________ step of analyzing massive amounts of data, defining your aim entails developing a theory and determining how to evaluate it.
   a) Setting an Objective  b) Gathering Information  c) Sorting of Data  d) Cleaning the information
152. In the ________ step of analyzing massive amounts of data, data collection methods can vary considerably between businesses. Collect all the data from any source and place it in data warehouses in its raw form.
   a) Setting an Objective  b) Gathering Information  c) Sorting of Data  d) Cleaning the information
153.
In the ________ step of analyzing massive amounts of data, data professionals step in to divide the data and set it up for analytical queries following the data’s acquisition and storage.
   a) Setting an Objective  b) Gathering Information  c) Sorting of Data  d) Cleaning the information
154. During __________, large data sets are handled in chunks and processed at different times. This method is helpful to businesses when there is enough time between data collection and analysis.
   a) batch processing  b) Stream processing  c) high processing  d) None
155. _________ analyzes small data batches all at once, shortening the time from when the data is collected until it is analyzed.
   a) batch processing  b) Stream processing  c) high processing  d) None
156. In the ________ step of analyzing massive amounts of data, there is no such thing as a little data load, and data scrubbing and filtering are always necessary.
   a) Setting an Objective  b) Gathering Information  c) Sorting of Data  d) Cleaning the information
157. In the ________ step of analyzing massive amounts of data, it will take time to transform massive amounts of data into a usable format. Once completed, advanced analytics can transform massive datasets into useful and actionable insights.
   a) Setting an Objective  b) Gathering Information  c) Sorting of Data  d) Analyzing the Data
158. _________ analysis determines what has previously occurred.
   a) Descriptive  b) Diagnostic  c) Predictive  d) Prescriptive
159. _________ analysis concentrates on figuring out why something occurred.
   a) Descriptive  b) Diagnostic  c) Predictive  d) Prescriptive
160. _________ analysis uses previous data to identify future patterns.
   a) Descriptive  b) Diagnostic  c) Predictive  d) Prescriptive
161. _________ analysis helps you to create future suggestions.
   a) Descriptive  b) Diagnostic  c) Predictive  d) Prescriptive
162. In the ________ step of analyzing massive amounts of data, sharing these insights is the final stage in the data analytics process.
This is more complicated than simply sharing the raw findings of your work; it entails analyzing the data and presenting them in a way that various sorts of audiences can understand.
   a) Setting an Objective  b) Gathering Information  c) Sorting of Data  d) Result Sharing
163. One of the uses of big data is ________, which entails establishing order and exerting authority over the flow of goods.
   a) Supply chain management  b) Academic Sector  c) Health-Related Sector  d) To Improve Health Care
164. One of the uses of big data is in the ________, where there is a deluge of information in the education sector, including data on students, teachers, classes, grades, and other outcomes.
   a) Supply chain management  b) Academic Sector  c) Health-Related Sector  d) To Improve Health Care
165. One of the uses of big data is in the ________, where it is useful for forecasting the spread of infectious diseases and figuring out how best to head them off before they cause widespread destruction.
   a) Supply chain management  b) Academic Sector  c) Health-Related Sector  d) To Improve Health Care
166. One of the uses of big data is ________, where it has introduced wearable devices and sensors that can transmit real-time information into a patient’s electronic health record.
   a) Supply chain management  b) Academic Sector  c) Health-Related Sector  d) To Improve Health Care
167. One of the uses of big data is in ________, which all must deal with a deluge of information because they have abundant records about their citizens, economic development, energy resources, etc.
   a) Supply chain management  b) Meteorology Department  c) Government organizations  d) Entertainment
168. One of the uses of big data is in the ________, where the widespread availability of digital tools has resulted in a dramatic increase in data production, largely responsible for the recent boom of big data in the entertainment and media sectors.
   a) Supply chain management  b) Supply chain management  c) Health-Related Sector  d) Entertainment Sector
169.
One of the uses of big data is ________, where satellites and weather sensors can be found in every corner of the Earth.
a) Supply chain management b) Academic Sector c) Health-Related Sector d) Meteorology Department
170. One of the uses of big data is ________, where users’ needs on various routes and with various means of transportation can be estimated with big data, allowing route planning to cut down on waiting times.
a) Supply chain management b) Academic Sector c) Transportation Sector d) In Government Organizations
171. One of the uses of big data is ________, where data collection and analysis can aid in the detection of criminal acts like impermissible use of credit/debit cards.
a) Supply chain management b) Academic Sector c) Health-Related Sector d) Financial Institutions or Banking Sectors
172. In ___________, customer surveys and face-to-face contact were the backbones of conventional marketing strategies.
a) Supply chain management b) Academic Sector c) Health-Related Sector d) Marketing or Retail
173. In the ___________, vast volumes of information are gathered daily by observing space and receiving data from earth-orbiting satellites, space probes, and planetary rovers.
a) Supply chain management b) Academic Sector c) Space Industry d) In Government Organizations
174. ______________ is an open-source framework that supports the processing of large data sets in a distributed computing environment.
a) Hadoop b) Integrate.io c) Academic Sector d) MongoDB
175. _____________ is an open-source, free and Java-based software framework that offers a powerful distributed platform to store and manage Big Data.
a) Hadoop b) Integrate.io c) Academic Sector d) MongoDB
176. When we move a file onto ___________, it is automatically split into many small pieces.
a) Hadoop Distributed File System (HDFS) b) Hadoop MapReduce c) Name Node d) Data Node
177.
Tool for analyzing large data sets: ___________ is a cloud-based service that can be used to gather all of your data sources into one place, analyze them, and get them ready for analytics.
a) Hadoop b) Integrate.io c) Academic Sector d) MongoDB
178. Tool for analyzing large data sets: ___________ is an adaptable end-to-end marketing analytics platform that provides real-time insights and consolidated views of marketing performance for business leaders.
a) Adverity b) Integrate.io c) Academic Sector d) MongoDB
179. Tool for analyzing large data sets: ___________ is a cloud-based, no-code ETL platform that prioritizes customization, allowing users to pick their metrics and attributes and connect to various data sources.
a) Dataddo b) Integrate.io c) Academic Sector d) MongoDB
180. Tool for analyzing large data sets: ___________. Developed in C, C++, and JavaScript, MongoDB is a NoSQL document-oriented database.
a) Dataddo b) Integrate.io c) Academic Sector d) MongoDB
181. Tool for analyzing large data sets: ___________ is a business intelligence and analytics software solution that provides a suite of interconnected tools for making sense of information for some of the top companies in the world.
a) Dataddo b) Integrate.io c) Academic Sector d) Tableau
182. Tool for analyzing large data sets: ___________. Visual Analytics simplifies the analysis and dissemination of the decisive insights into data that businesses require.
a) SAS Data Mining b) Integrate.io c) Academic Sector d) Tableau
183. Tool for analyzing large data sets: ___________. The built-in NoSQL DBMS allows open-source software to handle enormous data volumes.
a) Dataddo b) Integrate.io c) Academic Sector d) Dextrus
184. __________ focuses on data-intensive jobs.
a) Big Data b) HPC c) Chunk d) None
185. __________ focuses on computation-intensive jobs.
a) Big Data b) HPC c) Chunk d) None
186. In _________, a transaction is all or nothing.
a) Atomicity b) Consistency c) Isolation d) Durability
187. In _________, there is no violation of integrity constraints.
a) Atomicity b) Consistency c) Isolation d) Durability
188. In __________, concurrent changes are invisible (serializable).
a) Atomicity b) Consistency c) Isolation d) Durability
189. In __________, committed updates persist.
a) Atomicity b) Consistency c) Isolation d) Durability

PRACTICAL (WRITTEN)

The types of fragmentation we have are horizontal, vertical, and hybrid.

Horizontal Fragmentation: take specific rows/tuples from the table using WHERE. In this case we select all the columns, i.e. SELECT *.
Vertical Fragmentation: take specific columns/attributes using SELECT.
Hybrid Fragmentation: the two above combined; take specific columns and specific rows from the table (the order is vertical, then horizontal).

The example shows how each operation is written in SQL.

EXAMPLE 1

Emp_Table
Eno  Ename  Gender  Salary  Dep
101  Andre  M       3000    1
102  Bob    M       4000    1
103  Casey  F       5500    2
104  Drew   M       5000    2
105  Elena  F       2500    2

Write SQL queries to:
1. Create a table that only has information about employees in Department 1.
2. Create a table that only has the employee numbers (Eno) and names (Ename) of all employees.
3. Create a table that has the employee numbers and salaries of employees in Department 2 who earn more than 3000.
Example 1 Answer:

In the exam, write only the code.

The first task asks for a table holding the data of the employees in Department 1.

    SELECT * INTO Emp_Dep1
    FROM Emp_Table
    WHERE Dep = 1;

Here we select all the columns with SELECT *, create a new table with INTO, and read the data from the table named in the question with FROM Emp_Table. Horizontal fragmentation relies on WHERE to pick specific rows from the table. The condition is to take the employees of Department 1, so the clause is WHERE Dep = 1.

Eno  Ename  Gender  Salary  Dep
101  Andre  M       3000    1
102  Bob    M       4000    1

The second task asks for a table holding the numbers and names of all employees.

    SELECT Eno, Ename INTO Emp_EnoEname
    FROM Emp_Table;

Here we select only the required columns with SELECT, create a new table with INTO, and read the data from the table named in the question with FROM Emp_Table. Vertical fragmentation simply means choosing specific columns.

Eno  Ename
101  Andre
102  Bob
103  Casey
104  Drew
105  Elena

The third task asks for a table holding the numbers and salaries of the employees in Department 2 who earn more than 3000.

    SELECT Eno, Salary INTO Emp_Dep2_3000
    FROM Emp_Table
    WHERE Dep = 2 AND Salary > 3000;

Here we select specific columns with SELECT and create a new table with INTO, and at the same time we restrict the rows with WHERE, so this is hybrid fragmentation.

Eno  Salary
103  5500
104  5000

EXAMPLE 2

Emp_Table
Eno  Ename  Gender  Salary  Dep
101  Andre  M       3000    1
102  Bob    M       4000    1
103  Casey  F       5500    2
104  Drew   M       5000    2
105  Elena  F       2500    2

Each department has its own site. Write a SQL query for each site that creates a table of the employees working in that department.
Example 2 Answer:

In the exam, write only the code.

The question says that each department has its own server (site) and asks for the code, per server, that creates the table of that department's employees. The Department 1 data goes to site 1 and the Department 2 data goes to site 2, so the type of fragmentation is horizontal fragmentation.

Site 1 code:

    SELECT * INTO Emp_Dep1
    FROM Emp_Table
    WHERE Dep = 1;

Eno  Ename  Gender  Salary  Dep
101  Andre  M       3000    1
102  Bob    M       4000    1

Site 2 code:

    SELECT * INTO Emp_Dep2
    FROM Emp_Table
    WHERE Dep = 2;

Eno  Ename  Gender  Salary  Dep
103  Casey  F       5500    2
104  Drew   M       5000    2
105  Elena  F       2500    2

PRACTICAL BANK (MCQ)

Choose the correct answer:
1. A ___________ is a database in which data is stored across different sites either on the same network or on entirely different networks. Portions of the database are stored in multiple physical locations and processing is distributed among multiple database nodes.
a) Distributed database b) Centralized database c) Local database d) None
2. Which of the following is a general method for distributing data on multiple database servers?
a) Distributed transactions b) Data replication c) Data structure d) A & B
3. __________ is the process of making multiple copies of data and storing them at different locations for backup purposes, fault tolerance and to improve their overall accessibility across a network.
a) Data fragmentation b) Data replication c) Data segmentation d) Data analysis
4. Which of the following is a type of data replication?
a) Transactional b) Snapshot c) Merge d) All of them
5. Which of the following is a data replication approach in DBMS?
a) Full replication b) Partial replication c) No replication d) All of them
6. ___________ means that the complete database is replicated at every site of the distributed system. This scheme maximizes data availability and redundancy across a wide area network.
a) Full replication b) Partial replication c) No replication d) None
7.
______________ occurs when only certain fragments of the database are replicated based on the importance of data at each location. Here, the number of copies can range from one to the total number of nodes in the distributed system.
a) Full replication b) Partial replication c) No replication d) None
8. ________ means each fragment is stored exactly at one site.
a) Full replication b) Partial replication c) No replication d) None
9. _________ is a database server or servers that make data available for replication. It is the source database that publishes data and changes to one or more subscribers. It can have one or more publications.
a) Publisher b) Distributor c) Subscriber d) None
10. ___________ is the database on the Publisher that is the source of data and database objects to be replicated, so a Publication DB is a logical collection of articles from a database.
a) Publication database b) Distribution database c) Subscription database d) Replication database
11. A/an _____ is the basic unit of SQL Server Replication. It can consist of tables, stored procedures, and views.
a) Article b) Publication c) Subscription d) Distribution
12. A database instance that consumes SQL Server replication data from a publication is called ____________. _______ can receive data from one or more publishers and publications. It can also pass data changes back to the publisher or republish the data to other subscribers depending on the type of the replication design and model.
a) Publisher b) Distributor c) Subscriber d) None
13. A copy of data from the publication database or a target database of a replication model is called _______.
a) Publication database b) Distribution database c) Subscription database d) Replication database
14. ____________ is a component responsible for distributing data and schema changes from the publisher to the subscribers.
________ acts as an intermediary between the publisher, where the original data is stored, and the subscribers, which receive the replicated data.
a) Publisher b) Distributor c) Subscriber d) None
15. ___________ is a system database used by the distributor to store replication-related data from the publisher. It contains information about replicated transactions, commands, and other metadata required for data synchronization between the publisher and subscribers.
a) Publication database b) Distribution database c) Subscription database d) Replication database
16. In ________, the distributor or the publisher actively pushes the replicated data changes to the subscribers.
a) Pull subscription b) Push subscription c) Add subscription d) Commit subscription
17. In __________, the subscriber actively pulls the replicated data changes from the publisher. Subscribers periodically connect to the publisher and request data changes.
a) Pull subscription b) Push subscription c) Add subscription d) Commit subscription
18. ___________ are processes responsible for carrying out the tasks involved in replicating data between the publisher, distributor, and subscribers.
a) Replication agents b) Replication publishers c) Replication subscribers d) Replication distributors
19. Which of the following is a type of replication agent?
a) Replication snapshot agent b) Log reader agent c) Distribution agent d) All of them
20. A publication contains ________.
a) Articles b) Agents c) Snapshots d) Merge
21. __________ is a replication technique that allows you to create a point-in-time snapshot of data and schema from a source database (publisher) and replicate it to one or more destination servers (subscribers). This type of replication is useful when you need to distribute a consistent set of data to multiple subscribers without continuous updates.
a) Transactional b) Snapshot c) Merge d) None
22. This agent creates a snapshot of the publication and stores it in the snapshot folder.
Subscribers use this snapshot to initialize or reinitialize their data.
a) Replication snapshot agent b) Log reader agent c) Distribution agent d) Merge agent
23. ___________ is a SQL Server technology that is used to replicate changes between two databases. These changes can include database objects like tables (primary key is required), stored procedures, views, and so on, as well as data.
a) Transactional b) Snapshot c) Merge d) None
24. This agent reads the transaction log of the published database and copies the transactions marked for replication from the transaction log into the distribution database.
a) Replication snapshot agent b) Log reader agent c) Distribution agent d) Merge agent
25. In snapshot and transactional replication, the _________ moves the replicated data from the distribution database to the subscribers.
a) Replication snapshot agent b) Log reader agent c) Distribution agent d) Merge agent
26. Which of the following are types of distribution agents in merge replication?
a) Merge Distribution agent b) Merge agent c) Distribution agent d) A & B
27. In merge replication, _______ moves metadata and schema changes.
a) Merge Distribution agent b) Merge agent c) Distribution agent d) A & B
28. In merge replication, _______ moves data changes.
a) Merge Distribution agent b) Merge agent c) Distribution agent d) A & B
29. This type of replication is commonly found in server-to-client environments and allows both the publisher and subscriber to make changes to data dynamically. In ____________ replication, data from two or more databases are combined to form a single database, thereby contributing to the complexity of using this technique.
a) Transactional b) Snapshot c) Merge d) None
30. The ______________ agent uploads changes from the Subscriber to the Publisher and then downloads changes from the Publisher to the Subscriber to synchronize changes between the publisher and subscribers, ensuring data consistency across all nodes.
a) Replication snapshot b) Log reader c) Distribution d) Merge
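
The three fragmentation queries from Example 1 of the written practical section can be tried out on a laptop. The following is only a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database. SQLite does not support SELECT ... INTO, so each fragment is created with CREATE TABLE ... AS SELECT instead, which is a substitution and not the syntax the exam answers use. The helper name build_fragments and the constants EMP_ROWS and FRAGMENT_SQL are made up for illustration.

```python
import sqlite3

# Rows of Emp_Table exactly as given in Example 1.
EMP_ROWS = [
    (101, "Andre", "M", 3000, 1),
    (102, "Bob", "M", 4000, 1),
    (103, "Casey", "F", 5500, 2),
    (104, "Drew", "M", 5000, 2),
    (105, "Elena", "F", 2500, 2),
]

# Assumption: SQLite stands in for the course DBMS. It has no
# SELECT ... INTO, so each fragment is defined by a plain SELECT and
# materialized below with CREATE TABLE ... AS SELECT.
FRAGMENT_SQL = {
    # Horizontal: all columns, only the rows of Department 1.
    "Emp_Dep1": "SELECT * FROM Emp_Table WHERE Dep = 1",
    # Vertical: all rows, only the Eno and Ename columns.
    "Emp_EnoEname": "SELECT Eno, Ename FROM Emp_Table",
    # Hybrid: chosen columns and chosen rows at the same time.
    "Emp_Dep2_3000": (
        "SELECT Eno, Salary FROM Emp_Table "
        "WHERE Dep = 2 AND Salary > 3000"
    ),
}


def build_fragments():
    """Create Emp_Table in memory, build each fragment, return its rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Emp_Table ("
        "Eno INTEGER PRIMARY KEY, Ename TEXT, Gender TEXT, "
        "Salary INTEGER, Dep INTEGER)"
    )
    conn.executemany("INSERT INTO Emp_Table VALUES (?, ?, ?, ?, ?)", EMP_ROWS)

    result = {}
    for name, query in FRAGMENT_SQL.items():
        conn.execute(f"CREATE TABLE {name} AS {query}")
        rows = conn.execute(f"SELECT * FROM {name} ORDER BY Eno").fetchall()
        result[name] = rows
    conn.close()
    return result


if __name__ == "__main__":
    for name, rows in build_fragments().items():
        print(name, rows)
```

Running the script prints the rows of each fragment, which can be compared by eye with the result tables shown under the Example 1 answer. Every fragment keeps the Eno column, so the rows can always be traced back to the original Emp_Table.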
