Cloud Computing: Object Storage

Questions and Answers

How does object storage differ from block or file storage in handling data?

  • It stores data in a hierarchical directory structure, like traditional file systems.
  • It stores data as fixed-size blocks, optimized for raw speed.
  • It stores data as objects with associated metadata in a flat namespace, eliminating name collisions using REST HTTP APIs. (correct)
  • It stores data exclusively for archival purposes, with limited accessibility.

In object storage, what benefit does the association of metadata with objects provide?

  • It restricts object access to only authorized users, improving data security.
  • It ensures that objects are stored in a geographically redundant manner.
  • It enhances indexing and management of objects, allowing for more descriptive properties. (correct)
  • It primarily reduces the physical storage space required for each object.

Which of the following is a key characteristic of how object storage systems handle data manipulation?

  • Data manipulation is limited to read-only operations to ensure data integrity.
  • Data manipulation is achieved through specialized APIs that require deep knowledge of the underlying storage hardware.
  • Data manipulation is performed using SQL queries directly on the stored data.
  • Data manipulation is conducted using REST APIs with operations like GET, PUT, DELETE, and UPDATE. (correct)

How does object storage typically ensure data durability?

  • By creating multiple copies of objects and employing erasure coding across different locations. (correct)

What role do 'buckets' play in Amazon S3's data organization?

  • Buckets are storage containers that hold related objects, allowing for organization and separation of data. (correct)

In Amazon S3, how can users control access to their data?

  • By setting permissions on objects through the AWS Management Console. (correct)

If an object needs to be recoverable after accidental deletion in Amazon S3, what feature should be enabled?

  • Versioning (correct)

What is the purpose of 'keys' in organizing data within Amazon S3?

  • Keys serve as resource identifiers, providing the path to an object, including an optional directory path and object name. (correct)

Which of the following is a primary feature of Amazon S3 that helps in protecting data against failures?

  • Data replication across multiple storage devices (correct)

What does the term 'swift partitions' refer to in the context of OpenStack Swift?

  • Logical divisions within a Swift storage system that determine where data is located. (correct)

In OpenStack Swift, what is the functional purpose of a 'Ring'?

  • A data structure that maps partition space to physical locations on disk. (correct)

How does DynamoDB differ from traditional relational database management systems (RDBMS)?

  • DynamoDB is a NoSQL database that stores data in tables without enforcing a strict schema and does not support joins. (correct)

What is the role of primary keys in DynamoDB?

  • To uniquely identify each item in a table and provide more querying flexibility using secondary indexes. (correct)

Which of the following best describes a 'partition key' in DynamoDB?

  • A simple primary key composed of one attribute used as input to an internal hash function to determine data storage. (correct)

What is the purpose of secondary indexes in DynamoDB?

  • To enable querying data in the table using an alternate key, in addition to queries against the primary key. (correct)

In DynamoDB, how are partitions related to data distribution and storage?

  • Partitions are logical divisions that dictate how DynamoDB stores and distributes data, with management fully handled by DynamoDB. (correct)

Which HTTP methods are commonly used in REST APIs for manipulating data in object storage?

  • GET, PUT, DELETE, UPDATE (correct)

What does it mean for messages to be 'self-descriptive and stateless' in the context of REST APIs and object storage?

  • Messages contain all necessary information to be understood without needing context from previous requests, and they do not retain any client state on the server. (correct)

In the context of object storage, what is erasure coding?

  • A method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces, and stored across a set of different locations, storage media, or a combination of these. (correct)

What are the benefits of using multi-part uploads for large objects in Amazon S3?

  • Multi-part uploads allow for parallel uploading of object parts, maximizing network utilization and providing resilience to failures. (correct)

Flashcards

Object Storage

A prominent approach in building cloud storage systems, allowing users to store data as objects.

Object Storage

Virtualizes the physical implementation behind a flat namespace, which eliminates name collisions, and exposes the data through REST HTTP APIs.

S3 Bucket

A resource in S3 used to store objects.

S3 Object

The fundamental entities stored in S3.

S3 Key

How objects are referenced in S3, similar to a file path.

Object Versioning

Used to maintain different versions of objects in a bucket.

Access Control

Allows users to assign permissions to control access to their objects.

Audit Logs

Lets you monitor access to a bucket, recording who accessed what and when.

Replication

A data redundancy method in S3 designed to provide high availability, ensuring data survival across multiple storage devices.

Swift Partitions

Divide the available storage into locations where data will be placed; partitions are the core of the replication system.

Swift Account

A user of the Swift storage system; accounts create containers and store data in them.

Swift Container

A namespace in which an account creates and stores its objects.

DynamoDB

NoSQL DB available via Amazon Web Services.

DynamoDB Items

Basic units of data storage comprised of attributes.

Primary and Secondary Indexes

Primary keys uniquely identify items; secondary indexes provide additional querying flexibility.

DynamoDB Partition

A storage allocation for table data.

Partition Key

A simple primary key made of one attribute, whose value is hashed to decide where the item is stored.

Partition key and sort key

A composite primary key composed of two attributes: the partition key and the sort key.

Study Notes

Object Storage in Cloud Computing

  • Object storage is a prominent approach used in building cloud storage systems.
  • Object storage is different from block or file storage because it allows for storing data as objects, essentially files in a logical view.
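  • The physical implementation is virtualized behind a flat namespace, which eliminates name collisions; the data is reached through REST HTTP APIs.
  • Object storage manipulates data with GET, PUT, DELETE, and update operations (over HTTP, an update is typically a PUT that overwrites the object, since UPDATE is not a standard HTTP method).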
  • Object storage uses Representational State Transfer (REST) APIs.
  • Resources are identified by Uniform Resource Identifiers (URIs).
  • Object storage also allows individual objects to be addressed and identified by more than just file name and file path.
  • Object storage uses scale-out distributed systems.
  • Each node often runs on a local filesystem.
  • Object storage doesn't need specialized or expensive hardware.
  • The object storage software handles durability by creating multiple copies or by using erasure codes.
  • Management is simplified with a single flat namespace.
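
The durability bullet above mentions erasure codes. As a rough illustration only, the sketch below uses the simplest possible scheme, a single XOR parity fragment, which can rebuild any one lost fragment. Production object stores typically use stronger codes (for example Reed-Solomon); the function names `encode` and `recover` are hypothetical and not taken from any particular system.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list:
    """Split data into k equal-size fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")  # pad so all fragments have equal length
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(fragments: list, lost: int) -> bytes:
    """Rebuild the fragment at index `lost` by XOR-ing all surviving fragments."""
    survivors = [f for i, f in enumerate(fragments) if i != lost]
    return reduce(xor_bytes, survivors)


# Three data fragments plus one parity fragment, each stored on a different node.
pieces = encode(b"object storage!", k=3)
rebuilt = recover(pieces, lost=1)  # pretend the node holding fragment 1 failed
```

With this scheme the storage overhead is one extra fragment rather than a full extra copy, at the cost of tolerating only a single failure.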

Block Storage vs Object Storage

  • Block storage is a bucket of bits with no meaning attached.
  • An object has form; it is not just a string of bits.
  • In block storage, one bit string is no different from the next.
  • Object storage is provided by a software system, called an object store, that sits above the storage subsystem.
  • Block storage has no data intelligence: operations act on gross collections of bits, without knowledge of what the bits represent to the customer.
  • Object storage has knowledge about the data and can perform complex functions against it.

Structure of Objects and Metadata

  • An object consists of a file plus metadata that describes the file.
  • System metadata includes the filename, creation date, and last-modified date.
  • Custom metadata includes fields such as subject, place taken, category, and sharing permissions.
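
To make the object-plus-metadata structure concrete, here is a minimal in-memory sketch, assuming a single dictionary as the flat namespace. The class and method names (`FlatObjectStore`, `put`, `head`, `find`) are hypothetical; they only mirror the PUT, GET, HEAD, and DELETE style of operation described in these notes and are not any vendor's API.

```python
import time
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    system_metadata: dict
    custom_metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy object store: one flat dictionary of keys, no real directories."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **custom_metadata):
        """Create or overwrite an object; system metadata is maintained automatically."""
        now = time.time()
        previous = self._objects.get(key)
        created = previous.system_metadata["created"] if previous else now
        self._objects[key] = StoredObject(
            data=data,
            system_metadata={"filename": key, "created": created, "last_modified": now},
            custom_metadata=dict(custom_metadata),
        )

    def get(self, key):
        return self._objects[key].data

    def head(self, key):
        """Return metadata only, without the object body."""
        obj = self._objects[key]
        return {**obj.system_metadata, **obj.custom_metadata}

    def delete(self, key):
        self._objects.pop(key, None)

    def find(self, **criteria):
        """List keys whose custom metadata matches every given field."""
        return sorted(
            key for key, obj in self._objects.items()
            if all(obj.custom_metadata.get(k) == v for k, v in criteria.items())
        )


store = FlatObjectStore()
store.put("photos/beach.jpg", b"...", category="holiday", place_taken="Lisbon")
store.put("photos/office.jpg", b"...", category="work")
holiday_keys = store.find(category="holiday")
```

Because the custom metadata travels with each object, the store can answer questions such as "all holiday photos" without relying on file names or paths.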

Amazon S3 (Simple Storage Service)

  • Amazon S3 is a highly reliable, available, scalable, and fast cloud storage service.
  • The AWS console (the GUI for AWS) supports the most common operations.
  • Amazon provides a RESTful API with HTTP operations such as GET, PUT, DELETE, and HEAD.
  • Libraries and SDKs for various languages abstract these operations.
  • Several S3 browsers exist that let users explore an S3 account as if it were a directory (or a folder).

Organizing Data in Amazon S3

  • Data is stored as objects within S3.
  • Objects are stored in resources called buckets.
  • S3 objects can be up to 5 terabytes in size, with no limit on the number of objects.
  • Objects in S3 are replicated across multiple geographic locations for resilience.
  • Objects are referred to using keys, which consist of an optional directory path name followed by the object name.
  • Object versioning, when enabled, allows for recovery from deletions and modifications.
  • Buckets keep related objects in one place and separate them from others, with up to 100 buckets per account.
  • Each object has a key, serving as the path to the resource in an HTTP URL.
  • Slash-separated keys establish a directory-like naming scheme for browsing in S3.
  • Users secure their S3 data through access controls and audit logs.
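
The directory-like behavior of slash-separated keys can be sketched as follows. This is an illustration under assumptions: `list_keys` is a hypothetical helper that imitates how a browser might group keys by prefix, and the bucket name `example-bucket` is made up. The URL form shown is the common virtual-hosted style, in which the key becomes the path of the HTTP URL.

```python
from urllib.parse import quote


def list_keys(keys, prefix="", delimiter="/"):
    """Return (objects, sub_prefixes) that sit directly under `prefix`.

    Keys live in a flat namespace; "folders" are just shared key prefixes.
    """
    objects, sub_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # There is another delimiter further down: report the "subfolder".
            sub_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(sub_prefixes)


def object_url(bucket, key):
    """Build a virtual-hosted-style URL; the key is the path to the resource."""
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


keys = [
    "logo.png",
    "reports/2023/q1.pdf",
    "reports/2023/q2.pdf",
    "reports/summary.txt",
]
top_objects, top_folders = list_keys(keys)
report_objects, report_folders = list_keys(keys, prefix="reports/")
url = object_url("example-bucket", "reports/2023/q1.pdf")
```

Nothing in the store is actually nested; the "folders" exist only because several keys happen to share a prefix.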

Security of S3

  • Access control lets users set permissions that allow others to access their objects; this can be done through the AWS Management Console.
  • Audit logs let S3 users turn on logging for a bucket; the complete access logs for that bucket are stored in a different bucket.
  • The audit logs show which AWS account accessed the objects, the time of access, the IP address, and the operations performed.

Data Protection in Amazon S3

  • By default, S3 replicates data across multiple storage devices to survive two replica failures.
  • RRS (Reduced Redundancy Storage) replicates noncritical data twice, designed to survive one replica failure.
  • Data in S3 can be kept in specific geographic locations for performance, legal, and availability reasons.
  • The location is chosen at the bucket level, by selecting a region when the bucket is created.
  • S3 automatically stores the full history of all objects when versioning is enabled on a bucket.
  • Objects can be restored to prior versions and deletions undone.
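
The versioning behavior in the last two bullets can be modeled with a small sketch. It assumes a simplified model in which every write appends to a per-key history and a delete appends a marker instead of erasing anything. The class `VersionedBucket` and its integer version IDs are hypothetical and much simpler than S3's real version identifiers.

```python
class VersionedBucket:
    """Toy versioned bucket: nothing is ever overwritten or physically removed."""

    _DELETE_MARKER = object()

    def __init__(self):
        self._history = {}  # key -> list of versions, oldest first

    def put(self, key, data):
        """Store a new version and return its (hypothetical) integer version id."""
        versions = self._history.setdefault(key, [])
        versions.append(data)
        return len(versions) - 1

    def delete(self, key):
        """Hide the object by appending a delete marker; older versions remain."""
        self._history.setdefault(key, []).append(self._DELETE_MARKER)

    def exists(self, key):
        versions = self._history.get(key, [])
        return bool(versions) and versions[-1] is not self._DELETE_MARKER

    def get(self, key, version=None):
        """Return the latest version, or a specific older one."""
        versions = self._history.get(key, [])
        if not versions:
            raise KeyError(key)
        data = versions[-1] if version is None else versions[version]
        if data is self._DELETE_MARKER:
            raise KeyError(key)
        return data

    def restore(self, key, version):
        """Undo a change or deletion by re-publishing an old version as the newest."""
        return self.put(key, self.get(key, version))


bucket = VersionedBucket()
bucket.put("notes.txt", b"first draft")
bucket.put("notes.txt", b"second draft")
bucket.delete("notes.txt")
gone = not bucket.exists("notes.txt")
bucket.restore("notes.txt", version=0)
current = bucket.get("notes.txt")
```

Restoring is itself a new write, so the full history, including the deletion, stays available afterwards.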

Large Objects and Multi-Part Uploads

  • The client program splits a large object into multiple parts and uploads each part independently to S3.
  • Part uploads can run in parallel to maximize network utilization.
  • If a part fails to upload, only that part needs to be retried.
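
The three steps above can be sketched as follows. This is a simulation under stated assumptions, not the S3 SDK: `upload_part` is a caller-supplied function standing in for a network call, and `flaky_upload` is a made-up stub that fails once to exercise the retry path.

```python
from concurrent.futures import ThreadPoolExecutor


def split_parts(data: bytes, part_size: int) -> list:
    """Cut the object into consecutive parts of at most part_size bytes."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def multipart_upload(data, part_size, upload_part, max_attempts=3, workers=4):
    """Upload parts in parallel; on each round, retry only the parts that failed."""
    parts = split_parts(data, part_size)
    receipts = {}
    pending = list(range(len(parts)))

    def attempt(index):
        try:
            return index, True, upload_part(index, parts[index])
        except IOError:
            return index, False, None

    for _ in range(max_attempts):
        if not pending:
            break
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(attempt, pending))
        for index, ok, receipt in results:
            if ok:
                receipts[index] = receipt
        pending = [i for i in pending if i not in receipts]

    if pending:
        raise IOError(f"parts {pending} failed after {max_attempts} attempts")
    return [receipts[i] for i in range(len(parts))]


# Hypothetical stand-in for the remote service: part 1 fails on its first try.
storage = {}
failed_once = set()


def flaky_upload(index, chunk):
    if index == 1 and index not in failed_once:
        failed_once.add(index)
        raise IOError("simulated network error")
    storage[index] = chunk
    return len(chunk)


sizes = multipart_upload(b"abcdefghij", part_size=4, upload_part=flaky_upload)
reassembled = b"".join(storage[i] for i in sorted(storage))
```

Only part 1 is sent twice here; the parts that succeeded in the first round are never re-uploaded.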

OpenStack Swift

  • OpenStack Swift is another example of object storage.
  • Swift partitions divide the available storage into locations where data and accounts will be placed.
  • Partitions are the core of the replication system.
  • Account: an account is a user of the storage system; accounts create containers.
  • Container: containers are where accounts create and store data.
  • Containers are namespaces used to group objects.
  • Object: the actual data stored on disk.
  • Ring: maps the partition space to physical locations on disk.
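
How a ring might map an object to partitions and then to disks can be sketched as below. This is a loose, simplified model and not Swift's actual ring code: the `ToyRing` class, the device names, and the round-robin assignment of partitions to devices are all assumptions made for illustration. The general idea it shows is hashing the `/account/container/object` path, keeping the top bits as the partition number, and looking up one device per replica.

```python
import hashlib
import struct


class ToyRing:
    """Toy ring: partition number -> one device per replica."""

    def __init__(self, devices, part_power=4, replicas=3):
        if len(devices) < replicas:
            raise ValueError("need at least as many devices as replicas")
        self.replicas = replicas
        self.part_count = 2 ** part_power
        self.part_shift = 32 - part_power
        # table[r][p] is the device holding replica r of partition p.
        # Round-robin placement is an assumption; real rings balance by weight.
        self.table = [
            [devices[(p + r) % len(devices)] for p in range(self.part_count)]
            for r in range(replicas)
        ]

    def partition(self, account, container, obj):
        """Hash the object path and keep the top `part_power` bits."""
        path = f"/{account}/{container}/{obj}".encode()
        top32 = struct.unpack(">I", hashlib.md5(path).digest()[:4])[0]
        return top32 >> self.part_shift

    def devices_for(self, account, container, obj):
        """Return the devices that should hold each replica of the object."""
        part = self.partition(account, container, obj)
        return [self.table[r][part] for r in range(self.replicas)]


ring = ToyRing(devices=["sdb1", "sdc1", "sdd1", "sde1"])
part = ring.partition("AUTH_demo", "photos", "beach.jpg")
targets = ring.devices_for("AUTH_demo", "photos", "beach.jpg")
```

Because placement depends only on the hash and the table, any node holding a copy of the ring can locate an object without consulting a central directory.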

NoSQL Database - DynamoDB

  • DynamoDB is a fully managed NoSQL cloud database available through Amazon Web Services (AWS).
  • Data is stored in tables that must be created and defined in advance.
  • Users must define aspects of the tables, mainly keys and local secondary indexes.
  • Apart from these definitions, DynamoDB remains largely schemaless.
  • DynamoDB can query data using secondary indexes, but it offers only item-level consistency, analogous to row-level consistency in an RDBMS.
  • DynamoDB is not the best choice if consistency across items is a necessity.
  • Tables are the highest level at which data can be grouped and manipulated. Any join-style capabilities you need must be implemented on the application side.

DynamoDB Data Structures

  • DynamoDB tables contain items, and each item has attributes.
  • DynamoDB uniquely identifies each item in a table by its primary key and uses secondary indexes to provide more querying flexibility.
  • There are two kinds of primary key: a partition key alone, and a partition key combined with a sort key.

DynamoDB Partition and Sort Keys

  • A partition key is a simple primary key composed of one attribute.
  • DynamoDB uses the partition key's value as input to an internal hash function.
  • The output of the hash function determines the physical storage location internal to DynamoDB.
  • A partition key and sort key together form a composite primary key composed of two attributes.
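
The role of the two key attributes can be illustrated with a toy table. This is a sketch under assumptions: DynamoDB's hash function is internal, so SHA-256 is used here purely as a stand-in, and `ToyTable` with its fixed number of partitions is a hypothetical model rather than the service's real behavior.

```python
import hashlib


class ToyTable:
    """Toy table with a composite primary key (partition key + sort key)."""

    def __init__(self, n_partitions=4):
        # Each partition maps partition_key -> list of (sort_key, item), kept sorted.
        self.partitions = [dict() for _ in range(n_partitions)]

    def partition_for(self, partition_key):
        """Hash the partition key value to choose a storage partition."""
        digest = hashlib.sha256(str(partition_key).encode()).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def put_item(self, partition_key, sort_key, item):
        rows = self.partitions[self.partition_for(partition_key)].setdefault(partition_key, [])
        # The same composite key identifies the same item, so replace it.
        rows[:] = [row for row in rows if row[0] != sort_key]
        rows.append((sort_key, item))
        rows.sort(key=lambda row: row[0])

    def query(self, partition_key):
        """Return all items sharing a partition key, ordered by sort key."""
        rows = self.partitions[self.partition_for(partition_key)].get(partition_key, [])
        return [item for _, item in rows]


table = ToyTable()
table.put_item("user#1", "2024-03", {"total": 30})
table.put_item("user#1", "2024-01", {"total": 10})
table.put_item("user#1", "2024-02", {"total": 20})
table.put_item("user#1", "2024-02", {"total": 25})  # overwrites the earlier item
totals = [item["total"] for item in table.query("user#1")]
```

Items with the same partition key land in the same partition and are stored together in sort-key order, which is what makes range-style queries within one partition key cheap.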

DynamoDB Secondary Indexes and Partitions

  • DynamoDB users can create one or more secondary indexes on their tables.
  • A secondary index lets users query the data in the table using an alternate key, in addition to queries against the primary key.
  • DynamoDB stores data in partitions; each partition is an allocation of storage for a table, backed by solid-state drives (SSDs).
  • The partitions are automatically replicated across multiple Availability Zones within an AWS Region.
  • DynamoDB handles the partition management.
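
A secondary index can be pictured as a second lookup structure maintained alongside the table. The sketch below is a hypothetical in-memory model (`IndexedTable`, `query_index`, and the attribute names are made up), meant only to show how an alternate key can be queried in addition to the primary key.

```python
from collections import defaultdict


class IndexedTable:
    """Toy table with a primary key and one secondary index on another attribute."""

    def __init__(self, primary_key, index_attribute):
        self.primary_key = primary_key
        self.index_attribute = index_attribute
        self.items = {}                 # primary key value -> item
        self.index = defaultdict(set)   # alternate key value -> primary key values

    def put_item(self, item):
        pk = item[self.primary_key]
        old = self.items.get(pk)
        if old is not None and self.index_attribute in old:
            # Keep the index in step when an item is overwritten.
            self.index[old[self.index_attribute]].discard(pk)
        self.items[pk] = dict(item)
        if self.index_attribute in item:
            self.index[item[self.index_attribute]].add(pk)

    def get_item(self, pk):
        """Look up by primary key."""
        return self.items.get(pk)

    def query_index(self, value):
        """Look up by the alternate key through the secondary index."""
        return [self.items[pk] for pk in sorted(self.index.get(value, ()))]


songs = IndexedTable(primary_key="song_id", index_attribute="artist")
songs.put_item({"song_id": "s1", "artist": "Ada", "title": "First"})
songs.put_item({"song_id": "s2", "artist": "Bob", "title": "Second"})
songs.put_item({"song_id": "s3", "artist": "Ada", "title": "Third"})
ada_titles = [item["title"] for item in songs.query_index("Ada")]
```

Items are not required to carry the indexed attribute at all, which fits the schemaless character of the table: such items are simply absent from the index.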
