Cloud Storage Solutions Quiz
48 Questions

Questions and Answers

Which type of storage is available for mounting in a VM instance and behaves like a local hard drive?

  • File storage
  • Object storage
  • Block storage (correct)
  • Cloud storage

What is a characteristic of temporary and local block storage?

  • Content is lost when VM is shut down (correct)
  • It is persistent across restarts
  • It is suitable for application data storage
  • Content is retained after VM shutdown

What distinguishes persistent block storage from temporary storage?

  • It retains data after VM shutdown (correct)
  • It cannot be mounted to a VM
  • It is faster than temporary storage
  • It only supports local file systems

Which storage solution would be appropriate for storing application data across multiple servers?

  • Relational database

    In which scenario is using a database more advantageous than block storage?

      • When partitioning application data across multiple servers.

    Why might a file system be inappropriate for storing application data in a VM?

      • It is only accessible by a single VM instance.

    What is a limitation of local block storage mentioned in the content?

      • Data may not survive the failure of the instance.

    What must occur for file system integrity in persistent block storage?

      • Clean shutdown

    What is a characteristic feature of NoSQL databases?

      • No schema or only a simple one

    What is the primary advantage of key/value stores in a NoSQL context?

      • Facilitates horizontal scaling by distributing data across servers

    Which statement is true regarding relational databases?

      • They can be deployed as independent small-data instances.

    What does the term 'elasticity' refer to in the context of NoSQL databases?

      • The ability to scale in and out according to demand

    What is a potential drawback of using relational databases compared to NoSQL?

      • Inability to scale horizontally beyond a certain limit

    How do key/value stores manage the distribution of data?

      • Employing consistent hashing principles

    Which of the following is a common reason for opting for NoSQL databases over traditional relational databases?

      • Requirement for simpler query models and flexibility

    What is the design goal of horizontal scaling in NoSQL databases?

      • Distribute the load across multiple servers

    What is a key characteristic of object stores in the context of NoSQL databases?

      • They store immutable data that cannot be changed once published.

    Which of the following is a known deployment example of Apache Cassandra?

      • Apple with 100,000+ servers

    What type of data does Apache Cassandra primarily focus on managing?

      • Binary Large Objects (BLOBs)

    What advantage do object stores offer when resources are accessed by clients?

      • Clients can send GET requests directly using HTTP.

    Which Amazon S3 feature allows users to manage multiple versions of an object?

      • Using version numbers in object identifiers

    What is a benefit of using CDNs in conjunction with object stores?

      • They allow for caching and distributed resource representation.

    What is the primary purpose of creating a bucket in Amazon S3?

      • To serve as a namespace for storing objects.

    Which mechanism is used for securing access to objects in Amazon S3?

      • Amazon-provided encryption and access-control mechanisms

    What must be done atomically to prevent inconsistency in a database when managing customer orders?

      • Updating the database and publishing an event

    Why is it important for each individual database to be fault-tolerant?

      • To maintain consistency despite service failures

    What issue can occur if a service fails after updating the database?

      • The event may not be received by the destination microservice

    What happens to the database if there is a crash after an event is published but before it is processed?

      • The database retains consistency with previously published events

    What is a crucial step in managing customer credit lines before allowing an order?

      • Reducing the credit line by the value of the order

    What does the term 'event-sourcing' refer to in the context of the provided content?

      • Recording all changes as a sequence of events

    What is a consequence of not publishing an event after updating a database?

      • Other services will be unaware of the database update

    How can services coordinate credit line checks when processing orders?

      • By publishing events to notify other services

    What functionality does Dropbox provide to synchronize files?

      • It maintains a local copy on users’ devices and synchronizes with the cloud.

    How are files managed within Dropbox in terms of chunking?

      • Each chunk is an independent object identified by a unique key.

    What is the role of metadata servers in Dropbox's architecture?

      • They handle ownership details and maintain a list of file chunks.

    What was Dropbox's storage solution before it moved to on-premises storage?

      • Amazon EC2 and S3 object store.

    How does Dropbox handle notifications of changes to files?

      • Notifications are received using delayed HTTP calls.

    Which protocol does Dropbox use for its API between clients and servers?

      • REST over HTTPS but not RESTful.

    In terms of file system abstraction, what does Dropbox present to the users?

      • Files, directories, and links.

    What happens when a change is made to a file on Dropbox?

      • Only the differences in chunk content are sent as compressed binary diffs.

    What is a key problem associated with tailing the database log?

      • Exposes the internal schema of the tables/values

    In the context of using a database as a message queue, what is a major challenge?

      • Event publishing must align with business logic

    What role does a log tailer play in tailing the database log approach?

      • Reads the transaction log and publishes events

    Why is the implementation of a separate EVENT table in each microservice potentially problematic?

      • It complicates the polling process for event publishing

    What does the database transaction log primarily record?

      • Changes and updates made to the database

    In which scenario is the synchronization of event publishing most critical?

      • Polling the EVENT table for changes

    What is one significant disadvantage of table-level changes in database log tailing?

      • They can be easily misinterpreted as business events

    What is the primary advantage of maintaining an EVENTS database table on each microservice?

      • It allows for individual event processing in isolation

    Study Notes

    Cloud Computing - Storage and State Management

    • Announcements:
      • Feedback for the first quiz will be available after the course.
      • The second quiz will be available on Moodle at 12:45 today.
      • Quiz deadline: Wednesday, October 16, 10:45.
      • Review deadline: Wednesday, October 23, 10:45.

    Objectives

    • Present storage solutions for IaaS environments.
    • Discuss the scalability of relational databases (SQL) and introduce NoSQL databases.
    • Describe the Dropbox service as an example.
    • Explain techniques for state management in microservices applications.

    Block Storage

    • Virtual disks are available for mounting in a VM instance.
    • These disks are visible as block devices, similar to local hard drives or SSDs.
    • The guest operating system needs to mount the storage using a file system.
    • Temporary and local block storage is attached to the host or via a local SAN (Storage Area Network).
    • Content stored in temporary/local storage is lost when the VM is shut down. Example: Amazon EC2 Instance Store.
    • Persistent block storage persists across VM shutdowns/restarts. Example: Amazon EC2 Elastic Block Store.

    Storage for Applications

    • Block storage is used for operating systems, libraries, and application binaries/containers.
    • A file system on block storage is accessible only from a single VM instance, which makes it a poor fit for application data.
    • Sharing that storage between instances is difficult, and its content may be lost when the VM instance fails.
    • Databases are better for handling application data over multiple servers.
    • Containerized applications can have their own databases. A set of containers can use a platform provider's managed service for storage and databases.

    Database Selection

    • Many types of databases are available (relational and NoSQL).
    • Relational databases (SQL) offer ACID properties (Atomicity, Consistency, Isolation, Durability) and complex queries using joins. They are well understood by developers. Examples include Heroku Postgres, Amazon Aurora, and Amazon RDS.
    • Well-known relational databases are often not well-suited for large cloud applications due to issues in scaling and elasticity.

    MySQL Single-Node Scalability

    • A comparison of single-node MySQL on cloud servers against cloud databases shows that performance depends on the thread count.
    • 4GB MySQL cloud servers perform worse than comparable cloud databases as the thread count increases.

    Scaling Relational Databases

    • Enterprise-scale relational databases are often vertically scaled on large machines.
    • This can be expensive.
    • Cloud computing allows scaling horizontally using many commodity servers.

    Scalability Cube

    • The scalability cube illustrates scaling options within microservices environments.

    Horizontally Scaling Relational Databases

    • To handle more workload, split database content over multiple machines (database sharding).
    • Split tables across several database instances.
    • A proxy (e.g., PL/Proxy for PostgreSQL or MySQL Cluster) directs queries to these instances and combines the results.

    MySQL Cluster Sharding

    • This technique splits data based on the primary key, utilizing a hash function for distribution.
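
    The idea of hashing a primary key to pick a shard can be sketched as follows. This is an illustration only: the function name `shard_for`, the key format, and the choice of SHA-256 are assumptions, not how MySQL Cluster implements its partitioning.

```python
import hashlib


def shard_for(primary_key: str, num_shards: int) -> int:
    """Map a primary key to a shard index using a stable hash."""
    digest = hashlib.sha256(primary_key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards


# Route a few hypothetical rows to 4 database instances.
shards = {i: [] for i in range(4)}
for key in ["customer:1", "customer:2", "customer:3", "customer:42"]:
    shards[shard_for(key, 4)].append(key)
```

    A stable hash (rather than Python's built-in `hash`, which is randomized per process for strings) matters here because every proxy must agree on where a row lives.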

    Limits of Sharding

    • Sharding is typically based on primary keys, using a hash function for distribution over different instances.
    • SQL queries frequently require comparisons/merging data from multiple tables, affecting scalability.
    • Automatic sharding therefore does not, by itself, improve scalability for such queries.

    Primary/Secondary Replication

    • Writes to a relational database must go to a primary replica site.
    • Reads can come from multiple read replicas for improved performance.
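
    A toy model of this routing rule, assuming in-memory dictionaries in place of real database servers. The class name `ReplicatedStore` is hypothetical, and replication is done synchronously here only to keep the sketch short.

```python
import itertools
from typing import Optional


class ReplicatedStore:
    """One writable primary plus several read replicas (all in memory)."""

    def __init__(self, num_replicas: int) -> None:
        self.primary = {}
        self.replicas = [{} for _ in range(num_replicas)]
        self._next_replica = itertools.cycle(range(num_replicas))

    def write(self, key: str, value: str) -> None:
        # Every write goes to the primary first.
        self.primary[key] = value
        # Copy to replicas immediately; real systems often replicate
        # asynchronously, so replicas may briefly return stale data.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key: str) -> Optional[str]:
        # Reads are spread round-robin across the replicas.
        return self.replicas[next(self._next_replica)].get(key)
```

    Read throughput grows with the number of replicas, but write throughput is still bounded by the single primary.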

    Sharding and Horizontal Elasticity

    • Careful sharding can result in scalability to a few dozen database instances.
    • Modifying or removing a shard node, however, requires redistributing all data, which reduces throughput while it happens.
    • Elastic sharding for relational databases is not typically practical.
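
    The cost of changing the number of shards can be estimated with a small experiment. The sketch below assumes simple hash-modulo placement (the helper names and key format are made up for illustration) and counts how many keys land on a different shard when a fifth instance is added to four.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Hash-modulo placement of a key on one of num_shards instances."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose shard changes when going from old_n to new_n."""
    moved = sum(1 for k in keys if shard_for(k, old_n) != shard_for(k, new_n))
    return moved / len(keys)


keys = [f"row:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
# With a uniform hash, a key stays put only if (hash mod 4) == (hash mod 5),
# which holds for 4 out of every 20 hash values, so about 80% of keys move.
```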

    Relational Databases Takeaway

    • Relational databases are suitable for many applications with lower data volumes.
    • MySQL/PostgreSQL servers can handle significant transaction throughput on large EC2 instances.
    • Better scalability is needed when dealing with larger data volumes.
    • Lack of join queries or relational schemas can be mitigated in many applications.

    NoSQL Databases

    • "Not only SQL" databases are a family of options with multiple flavors.
    • Common characteristics of NoSQL databases include horizontal scaling, simpler querying, and flexible schemas. There are no explicit relations between sets of data.
    • An example is a key-value store designed for horizontal scaling by splitting the key range into disjoint subsets, with different servers handling their respective subsets. Apache Cassandra is one such store.

    Key/Value Stores

    • A common NoSQL interface, like a global hash map, with put(key, value) and value ← get(key) operations.
    • Design for scaling: Splits subsets of keys across multiple servers using consistent hashing principles.
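
    A minimal sketch of a put/get store that places keys with consistent hashing. The class name, the use of SHA-256, and the number of virtual nodes per server are assumptions chosen for the illustration, not details of any particular product.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashStore:
    """Key/value store whose keys are spread over servers on a hash ring."""

    def __init__(self, servers, vnodes: int = 64) -> None:
        self.data = {name: {} for name in servers}
        # Each server appears at several ring positions (virtual nodes).
        self.ring = sorted(
            (_hash(f"{name}#{i}"), name) for name in servers for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self.ring]

    def server_for(self, key: str) -> str:
        # The first ring position after the key's hash owns the key;
        # the modulo wraps around past the end of the ring.
        idx = bisect.bisect_right(self._positions, _hash(key)) % len(self.ring)
        return self.ring[idx][1]

    def put(self, key: str, value) -> None:
        self.data[self.server_for(key)][key] = value

    def get(self, key: str):
        return self.data[self.server_for(key)].get(key)
```

    Clients only need the ring to find the right server, so requests for different keys can be handled by different servers without coordination.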

    NoSQL Horizontal Scalability

    • Individual servers process requests independently.
    • Multiple servers can share the same portion of the key and data range.
    • New servers can be quickly added by splitting existing key-ranges.

    NoSQL: Adding a Server

    • Adding a server (in the case of key-value stores) dynamically partitions the available keys into new storage partitions for additional servers.
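
    One way to picture this is a range-partitioned store in which a new server takes over the upper half of the fullest existing key range. The class and its split policy (split at the median key) are assumptions made for this sketch.

```python
import bisect


class RangePartitionedStore:
    """Partition i holds keys from boundaries[i] up to the next boundary."""

    def __init__(self) -> None:
        self.boundaries = [""]   # sorted start keys; "" sorts before any key
        self.partitions = [{}]   # one dict per server

    def _index(self, key: str) -> int:
        return bisect.bisect_right(self.boundaries, key) - 1

    def put(self, key: str, value) -> None:
        self.partitions[self._index(key)][key] = value

    def get(self, key: str):
        return self.partitions[self._index(key)].get(key)

    def add_server(self) -> int:
        """Split the fullest partition at its median key; return keys moved."""
        i = max(range(len(self.partitions)), key=lambda j: len(self.partitions[j]))
        keys = sorted(self.partitions[i])
        if len(keys) < 2:
            raise ValueError("partition too small to split")
        split_key = keys[len(keys) // 2]
        old = self.partitions[i]
        # Only keys at or above the split point move to the new server.
        new = {k: old.pop(k) for k in keys if k >= split_key}
        self.boundaries.insert(i + 1, split_key)
        self.partitions.insert(i + 1, new)
        return len(new)
```

    Only the keys in the split range move; every other partition is untouched, which is what makes adding servers cheap compared with re-sharding by hash modulo.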

    Apache Cassandra

    • NoSQL/key-value store with a focus on throughput and linear scalability.

    Object Stores

    • Object storage is a key-value store for immutable data.
    • Immutable data is data that remains constant and shouldn't be changed after being created.
    • Object identifiers can use version numbers.
    • Clients can get resources directly from their URIs.
    • Typically offered as a managed service and combined with caching and CDN distribution. Examples include Amazon S3 and Google Cloud Storage.
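
    The combination of immutable objects and versioned identifiers can be sketched with an in-memory store. The class name and the `?version=` identifier format are hypothetical and do not reflect the API of S3 or Google Cloud Storage.

```python
class ObjectStore:
    """Stores each upload as a new immutable version of the named object."""

    def __init__(self) -> None:
        self._objects = {}  # (name, version) -> bytes
        self._latest = {}   # name -> latest version number

    def put(self, name: str, data: bytes) -> str:
        # Never overwrite: a new upload creates the next version.
        version = self._latest.get(name, 0) + 1
        self._objects[(name, version)] = bytes(data)
        self._latest[name] = version
        return f"/{name}?version={version}"  # hypothetical identifier format

    def get(self, name: str, version=None) -> bytes:
        if version is None:
            version = self._latest[name]
        return self._objects[(name, version)]
```

    Because a given (name, version) pair never changes, caches and CDNs can keep copies without worrying about invalidation.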

    Amazon S3 Example

    • Objects are stored in buckets.
    • Multiple past versions of objects can be maintained.
    • Objects have automated encryption and access controls provided by Amazon.

    Using URI from client

    • Display of object URLs, which can be directly accessed by clients.

    Usage of Object Storage in Project

    • Azure Blob Storage is a similar service to S3.
    • Public URIs for objects are generally preferred to local data for improved accessibility.
    • REST calls can be used for image uploads. Azure Functions can be used for image resizing.
    • Removal of data is possible at the end of a project.

    Document-Oriented NoSQL

    • Document-oriented NoSQL stores structured data in formats such as JSON or YAML.
    • This type of store provides indexing capabilities by field and/or value.
    • It supports collections and hierarchies. MongoDB is a popular choice.
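
    Indexing documents by field value can be illustrated with plain dictionaries. This is a toy, insert-only sketch under assumed names; it is not MongoDB's API, and it requires indexed values to be hashable.

```python
from collections import defaultdict


class DocumentCollection:
    """Holds JSON-like documents and keeps an index per chosen field."""

    def __init__(self, indexed_fields) -> None:
        self.docs = {}
        # field -> value -> set of document ids
        self.indexes = {field: defaultdict(set) for field in indexed_fields}

    def insert(self, doc_id: str, doc: dict) -> None:
        self.docs[doc_id] = doc
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def find(self, field: str, value) -> list:
        # Look up matching ids in the index instead of scanning all documents.
        ids = self.indexes[field].get(value, set())
        return [self.docs[i] for i in sorted(ids)]
```

    Documents in the same collection do not need the same fields; a document that lacks an indexed field is simply absent from that index.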

    Other NoSQL Databases

    • Graph databases store relationships between keys (as in social networking graphs such as friends). An example is Neo4j.
    • Column stores, such as Apache HBase and Google Bigtable, store attribute values as full columns for efficient aggregation and search.

    Popularity (DB-Engines Ranking)

    • This display shows popularity of different database engines (such as MySQL, Oracle, PostgreSQL, MongoDB) over time.

    Case Study: Dropbox

    • Dropbox is a cloud storage service that provides personal file storage synced to local devices. It is accessible via a web interface and has an API.

    Cloud Storage - Dropbox Client Side

    • The client application monitors changes to local files.
    • Any changes are compressed and sent to the cloud.
    • Clients are notified of remote changes through delayed HTTP calls (long polling): the server may hold a response open for around 60 seconds until there is a change to report. Local changes are written to the cloud as soon as they are registered in the local file system.
    • Chunking of files is used for efficiency and resilience.
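
    Chunking can be sketched as splitting a file into fixed-size pieces, each stored under its own key, so that only chunks the server does not already hold are uploaded. The tiny chunk size, the use of a SHA-256 digest as the key, and the function names are assumptions for this illustration. The sketch uploads whole changed chunks and does not attempt the compression or binary diffs a real client would apply.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def chunk_keys(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into chunks; return (ordered keys, key -> chunk)."""
    order, chunks = [], {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        key = hashlib.sha256(chunk).hexdigest()  # assumed content-derived key
        chunks[key] = chunk
        order.append(key)
    return order, chunks


def chunks_to_upload(new_data: bytes, known_keys):
    """Return the file's chunk list and only the chunks not yet stored."""
    order, chunks = chunk_keys(new_data)
    missing = {k: c for k, c in chunks.items() if k not in known_keys}
    return order, missing
```

    The ordered key list plays the role of the per-file chunk list kept as metadata, while the chunks themselves go to the data store.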

    Dropbox Implementation

    • Dropbox maintains metadata on its own private cloud, distinct from file data.
    • File content is stored in an object store, originally Amazon S3.
    • The Dropbox API uses REST-style calls over HTTPS but is not strictly RESTful.

    Dropbox Architecture

    • This illustration displays the overall architecture design for the Dropbox service incorporating metadata and data storage separation. Different services exist on servers in charge of specific tasks like metadata storage, processing, and notification.

    Example Dropbox Interaction

    • Clients and servers interact over HTTPS.
    • Separate services handle interactions with the primary and secondary instances of each database.

    Interaction With Storage Servers

    • Dropbox clients obtain the IP addresses of the storage servers to connect to through a load balancer.

    Why Use Two Different Stores?

    • Metadata and data storage have distinct requirements.
    • Metadata must be highly consistent.
    • File/data distribution should not be visible to the client.
    • Data storage should be efficient, cheap, and scalable.

    Some Numbers from 2013

    • Dropbox grew to a significant number of users and data volume rapidly.
    • It used thousands of physical servers and Amazon services for storage coordination.

    Dropbox Move Away from Amazon

    • Dropbox transitioned from Amazon EC2 and S3 to its own storage infrastructure.

    Storage and Microservices

    • This slide emphasizes a polyglot persistence architecture.

    Polyglot Architecture

    • Each service maintains its own state.
    • Database selection depends on each individual service's needs.

    State Management with Microservices

    • Microservices only maintain a subset of the entire enterprise database (e.g., Customer billing information).
    • Individual services are independent and can scale independently.
    • Microservices use events to coordinate operations across multiple databases.

    Service Decomposition

    • Example diagrams show how different microservice components decompose data into classes.

    Domain Model Pattern

    • This pattern decomposes data into classes which match specific microservice components.

    Problems

    • Data in one database often references data in another database.
    • Databases are accessible only through service APIs, making it harder in many instances to change database schema.

    Aggregates

    • A set of domain objects grouped together and treated as a single unit for better data consistency throughout the microservice system.
    • Aggregates have unique keys for referencing by other services (e.g., their own URI).

    Foreign Keys for Inter-Aggregate References

    • References between aggregates use foreign keys (the identifier of the referenced aggregate) rather than direct object references.

    Single-Aggregate Transactions

    • Database transactions are limited to a single aggregate. Consistency across aggregates must be implemented through messaging.

    Events and Reliability

    • Service-based publishing and consuming of events is a way to handle reliability and potential service/database failures.

    Preventing Inconsistency

    • The database update and the event publication must happen atomically. If the service fails between the two, other services never learn of the update and the system becomes inconsistent.

    Solution 1 (Tailing the database log)

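    • A log tailer reads the database transaction log and publishes each recorded change as an event. The drawback is that this exposes the internal schema of the tables, and table-level changes can easily be misinterpreted as business events.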

    Solution 2 (Database as a message queue)

    • The service writes each event to an EVENT table in its own database as part of the same local transaction as the update; a separate publisher polls the table and publishes the events. The challenge is keeping event publishing aligned with the business logic.
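
    A minimal sketch of the database-as-message-queue idea using SQLite from the Python standard library. The table layout, the event payload, and the function names are assumptions. The business row and the event row are inserted in one transaction, and a separate polling step publishes pending events.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER)"
)
conn.execute(
    "CREATE TABLE event (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "payload TEXT, published INTEGER DEFAULT 0)"
)


def create_order(order_id: int, customer_id: int, total: int) -> None:
    # "with conn" commits both inserts together or rolls both back.
    with conn:
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_id, customer_id, total)
        )
        payload = json.dumps(
            {"type": "OrderCreated", "order_id": order_id,
             "customer_id": customer_id, "total": total}
        )
        conn.execute("INSERT INTO event (payload) VALUES (?)", (payload,))


def publish_pending(publish) -> int:
    """Poll the event table, publish unpublished rows, mark them as done."""
    rows = conn.execute(
        "SELECT id, payload FROM event WHERE published = 0 ORDER BY id"
    ).fetchall()
    for event_id, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute("UPDATE event SET published = 1 WHERE id = ?", (event_id,))
    return len(rows)
```

    If the publisher crashes after calling `publish` but before marking the row, the event is sent again on the next poll, so consumers of such a scheme should tolerate duplicates.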

    Event Sourcing

    • A method of managing an aggregate as a sequence of events; changes to an aggregate are documented by recording an event for the aggregate.
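
    As an illustration, a customer's credit line can be modelled as an aggregate whose state is derived entirely from its events. The event classes, field names, and the `replay` helper below are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreditReserved:
    order_id: int
    amount: int


@dataclass(frozen=True)
class CreditReleased:
    order_id: int
    amount: int


class Customer:
    """Aggregate whose available credit is rebuilt from recorded events."""

    def __init__(self, credit_limit: int) -> None:
        self.available = credit_limit
        self.events = []

    def apply(self, event) -> None:
        # State changes happen only by applying events.
        if isinstance(event, CreditReserved):
            self.available -= event.amount
        elif isinstance(event, CreditReleased):
            self.available += event.amount

    def reserve_credit(self, order_id: int, amount: int) -> CreditReserved:
        if amount > self.available:
            raise ValueError("credit limit exceeded")
        event = CreditReserved(order_id, amount)
        self.apply(event)
        self.events.append(event)  # the event log is the source of truth
        return event

    @classmethod
    def replay(cls, credit_limit: int, events) -> "Customer":
        customer = cls(credit_limit)
        for event in events:
            customer.apply(event)
            customer.events.append(event)
        return customer
```

    Replaying the stored events on a fresh aggregate reproduces the current state, which is also what makes the event log useful for debugging.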

    Advantages of Event Sourcing

    • Event publishing is atomic, ensuring data consistency.
    • Other services can receive/process events to keep track of data changes.
    • Provides a method for debugging events.
    • Data updates are more structured.

    Example of Use of Event Sourcing

    • Example diagrams illustrate the flow of events between services.
    • Order Service (OM) creates events, and Customer Service (CM) updates Customer data upon receiving the events from OM.

    CQRS (Command Query Responsibility Segregation)

    • Command (update) handling is separated from query handling so that queries can be served efficiently without adding overhead to updates.
    • Queries are materialized as database views, simplifying queries when they are commonly used.
    • Microservices that are commonly queried can cache results, reducing repeated database calls.
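
    The split between a command side and a query side can be sketched as follows. The class names, the event shape, and the in-process subscriber list are assumptions. In a real deployment the events would travel through a message broker and the view would live in its own store.

```python
from collections import defaultdict


class OrderCommands:
    """Command side: changes state and emits an event for each change."""

    def __init__(self, subscribers) -> None:
        self.orders = {}
        self.subscribers = subscribers

    def create_order(self, order_id: int, customer_id: int, total: int) -> None:
        self.orders[order_id] = {"customer_id": customer_id, "total": total}
        event = {"type": "OrderCreated", "order_id": order_id,
                 "customer_id": customer_id, "total": total}
        for handle in self.subscribers:
            handle(event)


class CustomerOrderView:
    """Query side: a materialized view kept up to date from events."""

    def __init__(self) -> None:
        self.totals = defaultdict(int)
        self.order_ids = defaultdict(list)

    def handle(self, event: dict) -> None:
        if event["type"] == "OrderCreated":
            self.totals[event["customer_id"]] += event["total"]
            self.order_ids[event["customer_id"]].append(event["order_id"])

    def total_spent(self, customer_id: int) -> int:
        # Answered from the precomputed view, with no join or scan of orders.
        return self.totals.get(customer_id, 0)
```

    The view can be shaped for exactly the queries that are common, at the cost of being updated slightly after the command side.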

    Conclusion

    • Application state management in the cloud relies on databases (various types).
    • Managing state with microservices is more complex than with a single database and transactions.
    • An event sourcing approach ensures the consistent and proper handling of events within and between microservices systems.

    Description

    This quiz explores various types of storage solutions available for virtual machine instances, including characteristics of temporary and persistent block storage. Test your understanding of when to use databases versus block storage, and identify limitations associated with local block storage in a cloud environment.
