Questions and Answers
How does object storage differ from block or file storage in handling data?
- It stores data in a hierarchical directory structure, like traditional file systems.
- It stores data as fixed-size blocks, optimized for raw speed.
- It stores data as objects with associated metadata in a flat namespace, eliminating name collisions using REST HTTP APIs. (correct)
- It stores data exclusively for archival purposes, with limited accessibility.
In object storage, what benefit does the association of metadata with objects provide?
- It restricts object access to only authorized users, improving data security.
- It ensures that objects are stored in a geographically redundant manner.
- It enhances indexing and management of objects, allowing for more descriptive properties. (correct)
- It primarily reduces the physical storage space required for each object.
Which of the following is a key characteristic of how object storage systems handle data manipulation?
- Data manipulation is limited to read-only operations to ensure data integrity.
- Data manipulation is achieved through specialized APIs that require deep knowledge of the underlying storage hardware.
- Data manipulation is performed using SQL queries directly on the stored data.
- Data manipulation is conducted using REST APIs with operations like GET, PUT, DELETE, and UPDATE. (correct)
How does object storage typically ensure data durability?
What role do 'buckets' play in Amazon S3's data organization?
In Amazon S3, how can users control access to their data?
If an object needs to be recoverable after accidental deletion in Amazon S3, what feature should be enabled?
What is the purpose of 'keys' in organizing data within Amazon S3?
Which of the following is a primary feature of Amazon S3 that helps in protecting data against failures?
What does the term 'swift partitions' refer to in the context of OpenStack Swift?
In OpenStack Swift, what is the functional purpose of a 'Ring'?
How does DynamoDB differ from traditional relational database management systems (RDBMS)?
What is the role of primary keys in DynamoDB?
Which of the following best describes a 'partition key' in DynamoDB?
What is the purpose of secondary indexes in DynamoDB?
In DynamoDB, how are partitions related to data distribution and storage?
Which HTTP methods are commonly used in REST APIs for manipulating data in object storage?
What does it mean for messages to be 'self-descriptive and stateless' in the context of REST APIs and object storage?
In the context of object storage, what is erasure coding?
What are the benefits of using multi-part uploads for large objects in Amazon S3?
Flashcards
Object Storage
A prominent approach in building cloud storage systems, allowing users to store data as objects.
Object Storage
Storage whose physical implementation is virtualized behind a flat namespace, which avoids name collisions; objects are accessed through REST HTTP APIs.
S3 Bucket
A resource in S3 used to store objects.
S3 Object
S3 Key
Object Versioning
Access Control
Audit Logs
Replication
Swift Partitions
Swift Account
Swift Container
DynamoDB
DynamoDB Items
Primary and Secondary Indexes
DynamoDB Partition
Partition Key
Partition key and sort key
Study Notes
Object Storage in Cloud Computing
- Object storage is a prominent approach used in building cloud storage systems.
- Object storage differs from block and file storage in that it stores data as objects, which are essentially files in the logical view.
- The physical implementation is virtualized behind a flat namespace; objects are addressed through REST HTTP APIs, which avoids name collisions.
- Object storage manipulates data with operations such as GET, PUT, DELETE, and UPDATE (UPDATE is not a standard HTTP method; over REST, updates are normally sent as PUT or POST).
- Object storage uses Representational State Transfer (REST) APIs.
- Resources are identified by Uniform Resource Identifiers (URIs).
- Object storage also allows individual objects to be addressed and identified by more than just file name and file path.
- Object storage uses scale-out distributed systems.
- Each node often stores its data on a local filesystem.
- Object storage doesn't need specialized or expensive hardware.
- The object storage software handles durability, either by creating multiple copies of the data or by using erasure codes.
- Management is simplified with a single flat namespace.
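The flat namespace and the REST-style operations above can be illustrated with a toy in-memory store. This is a minimal sketch, not how any real object store is implemented; the class name `FlatObjectStore` and its methods are hypothetical and simply mirror the PUT, GET, and DELETE verbs.

```python
class FlatObjectStore:
    """Toy in-memory object store: one flat dict, no real directories."""

    def __init__(self):
        self._objects = {}  # key -> (data, metadata)

    def put(self, key, data, metadata=None):
        # PUT creates the object, or replaces it wholesale if the key exists.
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        # GET returns (data, metadata); raises KeyError if the key is absent.
        return self._objects[key]

    def delete(self, key):
        # DELETE removes the object; deleting a missing key is a no-op here.
        self._objects.pop(key, None)


store = FlatObjectStore()
# The slashes are just characters in the key; there is no folder tree.
store.put("photos/2024/cat.jpg", b"\xff\xd8", {"category": "pets"})
data, meta = store.get("photos/2024/cat.jpg")
store.delete("photos/2024/cat.jpg")
```

Because every object lives under one unique key in a single mapping, two objects can never collide the way two files with the same name in different mounted volumes might.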
Block Storage vs Object Storage
- Block storage is a bucket of bits with no meaning attached.
- An object has form; it is not just a string of bits.
- In block storage, one bit string is no different from the next.
- Object storage is provided by a software system called an object store.
- Block storage has no data intelligence: operations act on gross collections of bits, with no knowledge of what the bits represent to the customer.
- The object store itself has knowledge about the data and can perform complex functions against it.
Structure of Objects and Metadata
- An object consists of a file plus metadata that describes the file.
- System metadata includes the filename, creation date, and last-modified date.
- Custom metadata includes, for example, the subject, place taken, category, and sharing permissions.
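As a rough illustration of the split between system and custom metadata, an object could be modelled as below. The `StoredObject` class, the `find_by_custom` helper, and the sample values are all made up for this sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StoredObject:
    data: bytes
    # System metadata, maintained by the store itself.
    filename: str
    created: datetime
    last_modified: datetime
    # Custom metadata, supplied by the user as free-form name/value pairs.
    custom: dict = field(default_factory=dict)


def find_by_custom(objects, name, value):
    """Return filenames of objects whose custom metadata has name == value."""
    return [o.filename for o in objects if o.custom.get(name) == value]


now = datetime.now(timezone.utc)
photos = [
    StoredObject(b"...", "beach.jpg", now, now,
                 {"Place Taken": "Goa", "Category": "travel"}),
    StoredObject(b"...", "cat.jpg", now, now, {"Category": "pets"}),
]
travel = find_by_custom(photos, "Category", "travel")
print(travel)  # ['beach.jpg']
```

Searching on custom metadata like this is one way the descriptive properties can help with indexing and managing objects.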
Amazon S3 (Simple Storage Service)
- Amazon S3 is a highly reliable, available, scalable, and fast cloud storage service.
- The AWS console (a GUI for AWS) supports the most common operations.
- Amazon provides a RESTful API with HTTP operations such as GET, PUT, DELETE, and HEAD.
- Libraries and SDKs for various languages abstract these operations.
- Several S3 browsers exist that let users explore an S3 account as if it were a directory (or folder).
Organizing Data in Amazon S3
- Data is stored as objects within S3.
- Objects are stored in resources called buckets.
- S3 objects can be up to 5 terabytes in size, with no limit on the number of objects.
- Objects in S3 are replicated across multiple geographic locations for resilience.
- Objects are referred to using keys, which consist of an optional directory path name followed by the object name.
- Object versioning, when enabled, allows for recovery from deletions and modifications.
- Buckets keep related objects in one place and separate them from others, with up to 100 buckets per account.
- Each object has a key, serving as the path to the resource in an HTTP URL.
- Slash-separated keys establish a directory-like naming scheme for browsing in S3.
- Users secure their S3 data through access controls and audit logs.
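The directory-like browsing that slash-separated keys make possible can be sketched in plain Python. The function below is a hypothetical helper, loosely modelled on listing keys with a prefix and a delimiter; it is not the S3 API itself, and the sample keys are invented.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split the keys under `prefix` into direct objects and 'subfolder' prefixes."""
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter behaves like a subfolder.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(common_prefixes)


keys = ["readme.txt", "photos/cat.jpg", "photos/2024/beach.jpg", "logs/app.log"]
print(list_level(keys))             # (['readme.txt'], ['logs/', 'photos/'])
print(list_level(keys, "photos/"))  # (['photos/cat.jpg'], ['photos/2024/'])
```

The bucket itself stays flat; the "folders" exist only as shared key prefixes computed at listing time.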
Security of S3
- Access control lets users set permissions that allow others to access their objects; this is done through the AWS Management Console.
- Audit logging can be turned on for a bucket, in which case S3 stores complete access logs for that bucket in a different bucket.
- The audit logs show which AWS account accessed the objects, the time of access, the IP address, and the operations performed.
Data Protection in Amazon S3
- By default, S3 replicates data across multiple storage devices to survive two replica failures.
- Reduced Redundancy Storage (RRS) replicates noncritical data twice and is designed to survive one replica failure.
- Data in S3 can be placed in specific geographic locations for performance, legal, and availability reasons.
- The location is chosen at the bucket level, by selecting a region when the bucket is created.
- S3 automatically stores the full history of all objects when versioning is enabled on a bucket.
- Objects can be restored to prior versions and deletions undone.
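The idea of keeping full history so that overwrites and deletions can be undone might be sketched as follows. `VersionedBucket` is a hypothetical toy class; the integer version IDs and the in-memory delete marker are assumptions made for illustration, not S3's actual data model.

```python
import itertools


class VersionedBucket:
    """Toy bucket that keeps every version of every key."""

    _DELETED = object()  # stands in for a delete marker

    def __init__(self):
        self._history = {}               # key -> list of (version_id, data)
        self._ids = itertools.count(1)   # simple increasing version IDs

    def put(self, key, data):
        version_id = next(self._ids)
        self._history.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key):
        # Deleting appends a marker instead of erasing the history.
        return self.put(key, self._DELETED)

    def get(self, key, version_id=None):
        versions = self._history.get(key, [])
        if version_id is None:
            # Latest version; a delete marker on top means "not found".
            if not versions or versions[-1][1] is self._DELETED:
                raise KeyError(key)
            return versions[-1][1]
        for vid, data in versions:
            if vid == version_id and data is not self._DELETED:
                return data
        raise KeyError((key, version_id))

    def restore(self, key, version_id):
        # Restoring writes the old data back as the newest version.
        return self.put(key, self.get(key, version_id))


bucket = VersionedBucket()
v1 = bucket.put("report.txt", b"draft")
v2 = bucket.put("report.txt", b"final")
bucket.delete("report.txt")
bucket.restore("report.txt", v1)
print(bucket.get("report.txt"))  # b'draft'
```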
Large Objects and Multi-Part Uploads
- A program splits a large object into multiple parts and uploads each part to S3 independently.
- Uploads can be parallelized to maximize network utilization.
- If a part fails to upload, only that part needs to be retried.
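The split, parallel upload, and per-part retry steps can be sketched with the standard library alone. Everything here is illustrative: `upload_part` is a hypothetical callable standing in for whatever actually sends a part to storage, and `flaky_upload_part` simulates one transient failure.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(data, part_size):
    """Return (part_number, chunk) pairs, numbered from 1."""
    return [
        (i // part_size + 1, data[i:i + part_size])
        for i in range(0, len(data), part_size)
    ]


def upload_with_retry(upload_part, part_number, chunk, attempts=3):
    """Retry only this part; the other parts are unaffected by its failure."""
    for attempt in range(1, attempts + 1):
        try:
            return upload_part(part_number, chunk)
        except IOError:
            if attempt == attempts:
                raise


def multipart_upload(data, part_size, upload_part, workers=4):
    parts = split_into_parts(data, part_size)
    # Parts are uploaded concurrently to make better use of the network.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(upload_with_retry, upload_part, n, c)
                   for n, c in parts]
        return [f.result() for f in futures]


# Simulated destination: part 2 fails once, then succeeds on retry.
received = {}
failed_once = set()


def flaky_upload_part(part_number, chunk):
    if part_number == 2 and part_number not in failed_once:
        failed_once.add(part_number)
        raise IOError("simulated network error")
    received[part_number] = chunk
    return part_number


multipart_upload(b"abcdefghij", 4, flaky_upload_part)
assembled = b"".join(received[n] for n in sorted(received))
print(assembled)  # b'abcdefghij'
```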
OpenStack Swift
- OpenStack Swift is another example of object storage.
- Swift partitions divide the available storage into locations where data and accounts will be located.
- Partitions are the core of the replication system.
- Account: an account represents a user in the storage system.
- Container: containers are where accounts create and store data.
- Containers are namespaces used to group objects.
- Object: an object is the actual data stored on disk.
- Ring: the ring maps the partition space to physical locations on disk.
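A much-simplified picture of how a ring could map an object to a partition and then to devices is sketched below. The partition count, device names, and round-robin placement are assumptions for illustration only; a real Swift ring is built with device weights and zones and is considerably more involved.

```python
import hashlib
import struct

PART_POWER = 4                   # 2**4 = 16 partitions (tiny, for illustration)
PART_SHIFT = 32 - PART_POWER
DEVICES = ["sdb1", "sdc1", "sdd1", "sde1"]   # hypothetical device names
REPLICAS = 3


def partition_for(account, container, obj):
    """Hash the object's path and keep the top PART_POWER bits."""
    path = f"/{account}/{container}/{obj}".encode()
    top32 = struct.unpack(">I", hashlib.md5(path).digest()[:4])[0]
    return top32 >> PART_SHIFT


def devices_for(partition):
    """Pick REPLICAS distinct devices for a partition (simple round-robin)."""
    return [DEVICES[(partition + r) % len(DEVICES)] for r in range(REPLICAS)]


part = partition_for("AUTH_demo", "photos", "cat.jpg")
replica_devices = devices_for(part)
print(part, replica_devices)
```

The same path always hashes to the same partition, so any node holding a copy of the mapping can locate an object's replicas without a central lookup.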
NoSQL Database - DynamoDB
- DynamoDB is a fully managed NoSQL cloud database available through Amazon Web Services (AWS).
- Data is stored in tables that must be created and defined in advance.
- Users must define aspects of the tables, mainly keys and local secondary indexes.
- DynamoDB nevertheless retains a schemaless flavor: attributes other than the keys need not be declared in advance.
- DynamoDB can query data based on secondary indexes, but it offers only item-level consistency, analogous to row-level consistency in an RDBMS.
- DynamoDB is not the best choice if consistency across items is a necessity.
- Tables are the highest level at which data can be grouped and manipulated; any join-style capabilities must be implemented on the application side.
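An application-side join of the kind mentioned above might look like the following. The two "tables" are plain Python structures with invented contents; in practice the items would first be fetched from separate DynamoDB tables.

```python
# Items as they might come back from two separate tables (made-up data).
customers = {"c1": {"name": "Ada"}, "c2": {"name": "Lin"}}
orders = [
    {"order_id": "o1", "customer_id": "c1", "total": 30},
    {"order_id": "o2", "customer_id": "c2", "total": 12},
    {"order_id": "o3", "customer_id": "c1", "total": 7},
]


def join_orders_with_customers(orders, customers):
    """Attach the customer's name to each order, in application code."""
    return [
        {**order, "customer_name": customers[order["customer_id"]]["name"]}
        for order in orders
        if order["customer_id"] in customers
    ]


joined = join_orders_with_customers(orders, customers)
print(joined[0])
# {'order_id': 'o1', 'customer_id': 'c1', 'total': 30, 'customer_name': 'Ada'}
```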
DynamoDB Data Structures
- DynamoDB tables contain items, and each item has attributes.
- DynamoDB uniquely identifies each item in a table by its primary key and uses secondary indexes to provide more querying flexibility.
- There are two kinds of primary key: a partition key alone, and a partition key combined with a sort key.
DynamoDB Partition and Sort Keys
- A partition key is a simple primary key composed of one attribute.
- DynamoDB uses the partition key's value as input to an internal hash function.
- The output of the hash function determines the partition (physical storage internal to DynamoDB) in which the item is stored.
- A partition key and sort key together form a composite primary key, composed of two attributes.
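How a partition key selects a partition, and how a sort key orders items within it, can be sketched with a toy table. DynamoDB's internal hash function is not public, so SHA-256 and the fixed partition count below are stand-ins; `ToyTable` and its methods are hypothetical.

```python
import hashlib

NUM_PARTITIONS = 4  # assumption; DynamoDB manages the real number itself


def partition_of(partition_key):
    """Stand-in hash: map a partition key value to a partition index."""
    digest = hashlib.sha256(str(partition_key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class ToyTable:
    def __init__(self):
        # One dict per partition, keyed by the composite (pk, sk) primary key.
        self.partitions = [dict() for _ in range(NUM_PARTITIONS)]

    def put_item(self, pk, sk, attributes):
        # Writing the same (pk, sk) again overwrites the existing item.
        self.partitions[partition_of(pk)][(pk, sk)] = attributes

    def query(self, pk):
        """All items sharing a partition key, ordered by sort key."""
        part = self.partitions[partition_of(pk)]
        matches = [(sk, attrs) for (k, sk), attrs in part.items() if k == pk]
        return sorted(matches, key=lambda pair: pair[0])


table = ToyTable()
table.put_item("alice", "2024-03-01", {"total": 30})
table.put_item("alice", "2024-01-15", {"total": 12})
table.put_item("bob", "2024-02-02", {"total": 7})
print(table.query("alice"))
# [('2024-01-15', {'total': 12}), ('2024-03-01', {'total': 30})]
```

All items with the same partition key land in the same partition, which is what makes a query on one partition key value cheap.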
DynamoDB Secondary Indexes and Partitions
- DynamoDB users can create one or more secondary indexes on their tables.
- A secondary index lets users query the data in the table using an alternate key, in addition to queries against the primary key.
- DynamoDB stores data in partitions, with each partition containing an allocation of storage for a table, backed by solid state drives (SSDs).
- The partitions are automatically replicated across multiple Availability Zones within an AWS Region.
- DynamoDB handles the partition management.
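The role of a secondary index, querying by an alternate key, can be illustrated with a small in-memory table. `IndexedTable`, the `genre` attribute, and the sample items are invented for this sketch and say nothing about how DynamoDB stores its indexes.

```python
from collections import defaultdict


class IndexedTable:
    """Toy table with one secondary index on a chosen attribute."""

    def __init__(self, index_attribute):
        self.items = {}                    # primary key -> item
        self.index_attribute = index_attribute
        self.index = defaultdict(set)      # attribute value -> primary keys

    def put_item(self, primary_key, item):
        # Keep the index in step with the table when an item is replaced.
        old = self.items.get(primary_key)
        if old is not None and self.index_attribute in old:
            self.index[old[self.index_attribute]].discard(primary_key)
        self.items[primary_key] = item
        if self.index_attribute in item:
            self.index[item[self.index_attribute]].add(primary_key)

    def query_index(self, value):
        """Look items up by the alternate key instead of the primary key."""
        return [self.items[pk] for pk in sorted(self.index.get(value, ()))]


songs = IndexedTable("genre")
songs.put_item("song#1", {"title": "Blue", "genre": "jazz"})
songs.put_item("song#2", {"title": "Red", "genre": "rock"})
songs.put_item("song#3", {"title": "Green", "genre": "jazz"})
jazz_titles = [s["title"] for s in songs.query_index("jazz")]
print(jazz_titles)  # ['Blue', 'Green']
```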