Summary

This presentation introduces hash tables, which store a collection of records with keys. It explains how the location of a record depends on the hash value of the record's key. It also covers collision resolution techniques like chaining and open addressing, noting that searching for a particular key is generally quick.

Full Transcript

Hash Tables  This chapter discusses several ways of storing information in an array, and later searching for the information.  Hash tables are a common approach to the...

Hash Tables  This chapter discusses several ways of storing information in an array, and later searching for the information.  Hash tables are a common approach to the storing/searching problem.  This presentation introduces hash tables. What is a Hash Table ?  The simplest kind of hash table is an array of records.  This example has 701 records. [ 700]... An array of records What is a Hash Table ? Number 506643548  Each record has a special field, called its key.  In this example, the key is a long integer field called Number. [ 700]... What is a Hash Table ? Number 506643548  The number might be a person's identification number, and the rest of the record has information about the person. [ 700]... What is a Hash Table ?  When a hash table is in use, some spots contain valid records, and other spots are "empty". [ 700] Number 281942902 Number 233667136 Number 506643548 Number 155778322... Inserting a New Record Number 580625685  In order to insert a new record, the key must somehow be converted to an array index.  The index is called the hash value of the key. [ 700] Number 281942902 Number 233667136 Number 506643548 Number 155778322... Inserting a New Record Number 580625685  Typical way create a hash value: (Number mod 701) What is (580625685 mod 701) ? [ 700] Number 281942902 Number 233667136 Number 506643548 Number 155778322... Inserting a New Record Number 580625685  Typical way to create a hash value: (Number mod 701) 3 What is (580625685 mod 701) ? [ 700] Number 281942902 Number 233667136 Number 506643548 Number 155778322... Inserting a New Record Number 580625685  The hash value is used for the location of the new record. [ 700] Number 281942902 Number 233667136 Number 506643548 Number 155778322... Inserting a New Record  The hash value is used for the location of the new record. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322... Collisions Number 701466868  Here is another new record to insert, with a hash value of 2. My hash value is. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322... Collisions Number 701466868  This is called a collision, because there is already another valid record at. When a collision occurs, move forward until you find an empty spot. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322... Collisions Number 701466868  This is called a collision, because there is already another valid record at. When a collision occurs, move forward until you find an empty spot. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322... Collisions Number 701466868  This is called a collision, because there is already another valid record at. When a collision occurs, move forward until you find an empty spot. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322... Collisions  This is called a collision, because there is already another valid record at. The new record goes in the empty spot. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322... Collision Resolution  If, when an element is inserted, it hashes to the same value as an already inserted element, then we have a collision and need to resolve it.  There are several methods for dealing with this:  Separate chaining  Open addressing 16 Collision Resolution with Separate Chaining  The idea is to keep a list of all elements that hash to the same value.  The array elements are pointers to the first nodes of the lists.  A new item is inserted to the front of the list.  Advantages:  Better space utilization for large items.  Simple collision handling: searching linked list.  Overflow: we can store more items than the hash table size.  Deletion is quick and easy: deletion from the linked list. 17 Example Keys: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 hash(key) = key % 10. 0 0 1 81 1 2 3 4 64 4 5 25 6 36 16 7 8 9 49 9 18 Collision Resolution with Open Addressing  Separate chaining has the disadvantage of using linked lists.  Requires the implementation of a second data structure.  In an open addressing hashing system, all the data go inside the table.  Thus, a bigger table is needed.  If a collision occurs, alternative cells are tried until an empty cell is found. 19 Open addressing Searching for a Key Number 701466868  The data that's attached to a key can be found fairly quickly. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322... Open addressing Searching for a Key Number 701466868  Calculate the hash value.  Check that location of the array for the key. My hash value is. Not me. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322... Open addressing Searching for a Key Number 701466868  Keep moving forward until you find the key, or you reach an empty spot. My hash value is. Not me. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322... Open addressing Searching for a Key Number 701466868  Keep moving forward until you find the key, or you reach an empty spot. My hash value is. Not me. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322... Open addressing Searching for a Key Number 701466868  Keep moving forward until you find the key, or you reach an empty spot. My hash value is. Yes! [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322... Open addressing Searching for a Key Number 701466868  When the item is found, the information can be copied to the necessary location. My hash value is. Yes! [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322... Deleting a Record  Records may also be deleted from a hash table. Please delete me. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322... Deleting a Record  Records may also be deleted from a hash table.  But the location must not be left as an ordinary "empty spot" since that could interfere with searches. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 701466868 Number 155778322... Deleting a Record  Records may also be deleted from a hash table.  But the location must not be left as an ordinary "empty spot" since that could interfere with searches.  The location must be marked in some special way so that a search can tell that the spot used to have something in it. [ 700] Number 281942902 Number 233667136 Number 580625685 Number 701466868 Number 155778322... Summary  Hash tables store a collection of records with keys.  The location of a record depends on the hash value of the record's key.  When a collision occurs, the next available location is used (chaining) or by Open addressing.  Searching for a particular key is generally quick.  When an item is deleted, the location must be marked in a special way, so that the searches know that the spot used to be used. Questions ? THE END