Hashing Techniques and Collision Resolution
18 Questions
0 Views

Choose a study mode

Play Quiz
Study Flashcards
Spaced Repetition
Chat to lesson

Podcast

Play an AI-generated podcast conversation about this lesson

Questions and Answers

What is a hash function?

A randomizing function applied to the hash field value of a record, yielding the address of the disk block of the stored record.

What is a common hash function?

  • h(K) = K + M
  • h(K) = K - M
  • h(K) = K * M
  • h(K) = K mod M (correct)
  • Hash functions guarantee that distinct values will hash to distinct addresses.

    False

    What is the problem with most hashing functions?

    <p>They do not guarantee distinct mappings.</p> Signup and view all the answers

    What is collision resolution in hashing?

    <p>The process of finding another position for a record when a hash collision occurs.</p> Signup and view all the answers

    Which of the following is a type of collision resolution?

    <p>Open Addressing</p> Signup and view all the answers

    What does the function h(k)=k mod m return when k=12 and m=10?

    <p>2</p> Signup and view all the answers

    Using hash functions can result in empty spaces in a hash table.

    <p>True</p> Signup and view all the answers

    What happens to the structure of the directory in extendible hashing as the global depth increases?

    <p>The directory doubles its entries for each increase in global depth.</p> Signup and view all the answers

    Which of the following best describes dynamic hashing?

    <p>It maintains a tree-structured directory that allows for efficient storage expansion.</p> Signup and view all the answers

    What is a unique feature of linear hashing compared to other dynamic hashing techniques?

    <p>It allows for both expansion and reduction in the number of buckets without needing a new directory.</p> Signup and view all the answers

    In the initial insertion of the keys into a hash table using linear probing, if two keys hash to the same index, what happens?

    <p>The latter key is placed in the next available index in the table.</p> Signup and view all the answers

    Given the hash function h(k) = k mod 10, which key among the following will first collide with a previously entered key during insertion?

    <p>15</p> Signup and view all the answers

    What occurs when a bucket's local depth d' equals the global depth d, and the bucket overflows?

    <p>Doubling the number of entries in the directory array</p> Signup and view all the answers

    Which statement accurately reflects the impact of RAID technology on disk access?

    <p>RAID can potentially increase the amount of data read in a given time.</p> Signup and view all the answers

    In the context of data striping in RAID, what is the difference between bit-level and block-level striping?

    <p>Bit-level striping spreads individual bits across multiple disks, while block-level striping handles blocks of data.</p> Signup and view all the answers

    What is the result of halving the global depth d in extendible hashing?

    <p>It reduces the number of entries in the directory only if all buckets have a local depth less than d.</p> Signup and view all the answers

    Which of the following best describes the role of data striping in RAID architectures?

    <p>It disperses data across multiple disks to enhance speed and efficiency.</p> Signup and view all the answers

    Study Notes

    Hashing Techniques

    • Hash function takes a hash field value of a record and yields an address (disk block) of the stored record
    • Hash file is a data structure that uses hashing
    • Search conditions in a hash file involve equality conditions on hash fields
    • Key fields are commonly used for hashing
    • Internal hashing: a hash table resides in main memory
    • A hash function converts a hash field value into an integer between 0 and M-1
    • A common hash function is h(K) = K mod M, where K is the hash field value and M is the number of available addresses

    Collisions and Resolution

    • Collisions occur when distinct hash field values map to the same address
    • Collision resolution strategies aim to find alternative positions for records that collide
    • Open addressing: checks subsequent positions in order until an empty position is found
      • Deletion and searching can be problematic with open addressing
    • Chaining: extends the array by adding overflow positions
    • Multiple hashing: employs additional hash functions when collisions occur, using open addressing if necessary
    • Empty spaces may occur even when the number of slots equals the number of records, due to the nature of hash functions

    Hash Function Example

    • h(k) = k mod m, where h(k) represents the hash function applied to key k, and m is the number of available addresses or slots in memory
    • Example: if m = 10 (10 slots), keys k1 = 12 and k2 = 22 both hash to address 2:
      • h(12) = 12 mod 10 = 2
      • h(22) = 22 mod 10 = 2
    • This illustrates a hash collision, where distinct keys are mapped to the same address
    • Good hashing functions aim to distribute records uniformly to minimize collisions and efficiently locate records while maximizing bucket usage

    Open addressing

    • Inserts a new record into the first empty spot in the hash table. This spot may be in a different location than the one calculated by the hash function.
    • Linear probing is a common type of open addressing.

    Hashing Techniques

    • Extendible hashing dynamically expands the hash table as needed.
    • Dynamic hashing utilizes a hierarchical tree structure.
    • Linear hashing allows buckets to grow or shrink without changing the directory.

    Extendible Hashing

    • Uses a directory that is an array of bucket addresses.
    • The directory has a global depth (d) that determines the number of bits used to address the buckets.
    • The hash function generates a hash value and the first d bits of this value map to a location in the directory.
    • As the hash table grows, the global depth (d) increases and the directory doubles in size.
    • When buckets are sparsely populated, the global depth (d) can be reduced and the directory size is halved.

    RAID Level 0

    • Data is striped across multiple disks without any redundancy.
    • Improves performance by spreading data across multiple disks.
    • No redundancy, so if one disk fails, all data is lost.

    RAID Level 1

    • Uses mirrored disks for data redundancy.
    • Creates a copy of the data on a separate disk.
    • If a disk fails, the mirrored disk provides a backup.

    Collision Resolution Techniques

    • Open addressing:
      • Inserts a new record into the first empty spot in the hash table. This spot may be in a different location than the one calculated by the hash function.
      • Linear probing is a common type of open addressing.
    • Chaining:
      • Uses a linked list to handle collisions. Each entry in the hash table points to a linked list.
    • Multiple hashing:
      • Uses multiple hash functions to prevent collisions. If the first hash function results in a collision, the program uses a second hash function. This process can repeat until a free slot is found.

    Hashing Functions

    • Goal of a good hash function:
      • Uniform distribution: To evenly distribute records across the address space, minimizing collisions.
      • Efficiency: To use as much of the allocated space as possible, minimising empty spots.

    RAID Levels

    • RAID level 2:
      • Uses Hamming codes for error detection and correction.
      • Suitable for applications where data integrity is critical.
    • RAID level 3:
      • Uses a single parity disk to provide redundancy.
    • RAID levels 4 and 5:
      • Use block-level data striping to distribute data across multiple disks.
      • RAID Level 5 distributes parity data across all disks for redundancy, making it more efficient than RAID Level 4, which stores the parity information on a single disk.
    • RAID level 6:
      • Uses P+Q redundancy scheme with two redundant disks, providing protection against up to two disk failures.

    Disk Performance with RAID

    • RAID Level 1 is the easiest level to rebuild after a disk failure.
    • RAID Levels 3 and 5 are often used for large volume storage because they offer good performance, especially during read operations.

    Data Striping

    • Bit-level striping:
      • Each disk stores a single bit from each byte.
      • Improves both performance and parallel read/write speed.
    • Block-level striping:
      • Each disk stores a block of data.
      • Improves performance by splitting data into blocks and distributing them across multiple disks.

    Studying That Suits You

    Use AI to generate personalized quizzes and flashcards to suit your learning preferences.

    Quiz Team

    Related Documents

    Lecture-03_mai.pdf

    Description

    Explore the fundamentals of hashing techniques and collision resolution in data structures. This quiz covers key concepts such as hash functions, internal hashing, and the strategies for handling collisions like open addressing and chaining. Test your understanding of how these methods are implemented in computer science.

    More Like This

    Master the Art of Hashing Techniques
    5 questions
    Hashing Techniques in Data Structures
    10 questions

    Hashing Techniques in Data Structures

    SelfSufficiencyHammeredDulcimer avatar
    SelfSufficiencyHammeredDulcimer
    Use Quizgecko on...
    Browser
    Browser