Data Management and NoSQL Systems Quiz

Podcast

Play an AI-generated podcast conversation about this lesson

Download our mobile app to listen on the go

Get App

Questions and Answers

Which type of data does a data lake primarily store?

Metadata
Structured data
Semi-structured data
Unstructured data (correct)

What is the main advantage of synthetic training data for AI systems?

Clear semantic reliability
Limited applicability
High variability
Generation of large amounts of data quickly (correct)

In what aspects do NoSQL systems excel compared to relational databases?

ACID transactions
Flexible data model
Powerful SQL query language
Global distribution (correct)

What is the primary advantage of relational databases over NoSQL systems?

ACID transactions (B) Signup and view all the answers

What type of data storage solution is recommended for a university library's historic map digitization project?

Document store (A) Signup and view all the answers

What is the main advantage of a document store over a key-value store?

Query flexibility (B) Signup and view all the answers

In a DHT implementation like Chord, what is the purpose of finger tables?

Maintaining routing complexity (C) Signup and view all the answers

What is the routing complexity of P2P systems like Chord?

$O( ext{log} n)$ (D) Signup and view all the answers

What does ETL stand for in the context of data management?

Extract, Transform, Load (A) Signup and view all the answers

What does a data lake struggle with in data management?

Variety and Veracity (A) Signup and view all the answers

What are some reasons why data-related tasks take so much time and effort in typical machine learning projects?

Tasks like data identification, aggregation, cleaning, labelling, and augmentation are specific to problem domains and require custom solutions. (B) Signup and view all the answers

Which best summarizes the 4V’s of Big Data?

Volume, Velocity, Variety, Veracity: These represent the size, speed, formats, and quality challenges of big data. (C) Signup and view all the answers

How can the 4V’s of Big Data be addressed by current technology?

Current technology offers solutions for managing large volumes, processing data at high speeds, handling diverse data formats, and assessing data quality. (D) Signup and view all the answers

In a DHT using finger tables with a hash range from 0 to 255 organized as a hash ring, if Node 1 covers the hash range 3-55, what would be the entry for distance 32 in Node 1's Chord-style finger table?

87 (C) Signup and view all the answers

Briefly outline the difference between consistency in the CAP theorem and consistency in ACID.

CAP consistency refers to consistency of replicas; ACID consistency refers to overall data consistency when faced with potentially complex transactions. (C) Signup and view all the answers

A new database system claims that it can recover from a fatal hard drive failure by rebuilding all lost data from a log file stored on a different drive, and thus would be available (in the sense of the CAP theorem). Briefly discuss this claim.

The system is not fully available during the rebuild, so it does not satisfy the availability requirement of the CAP theorem. (D) Signup and view all the answers

Can the Two-Phase Commit Protocol recover from a worker who sent a 'ready' answer to a 'prepare', but never answered with an 'acknowledge' after a 'commit'. Why (not)?

Somewhat. The safest approach would be to wait for a timeout and then treat the silence as a failure, sending a rollback to all other workers. (B) Signup and view all the answers

Why does a system like Amazon Dynamo use a Vector Clock instead of regular time stamps?

Vector clocks can differentiate between conflicting versions and outdated versions, addressing issues like partitioning events or concurrent modifications. (A) Signup and view all the answers

In a replicated data storage scenario, a master-slave setup for each replica using locks for write operations would ensure that 'read-your-own-write' conflicts will not happen.

No. Write locks would only fix write-write conflicts, not read-your-own-write conflicts. (A) Signup and view all the answers