Mappings and Hash Tables

Choose a study mode

Play Quiz
Study Flashcards
Spaced Repetition
Chat to Lesson

Podcast

Play an AI-generated podcast conversation about this lesson
Download our mobile app to listen on the go
Get App

Questions and Answers

Which of the following best describes the role of a hash function in a hash table?

  • To resolve collisions by comparing keys directly.
  • To map keys to specific indices in the hash table. (correct)
  • To sort keys in ascending order before insertion.
  • To encrypt the keys for secure storage.

In Python dictionaries, keys must be mutable to allow for modification after creation.

False (B)

What is the primary goal of a well-designed hash function?

minimize collisions

A hash collision occurs when two different ______ produce the same hash value.

<p>keys</p> Signup and view all the answers

Match the following collision resolution techniques with their descriptions:

<p>Separate Chaining = Storing colliding keys in a data structure (e.g., a list) at the same index. Linear Probing = Searching sequentially for the next available slot in the hash table. Quadratic Probing = Using a quadratic function to determine the next slot to check. Double Hashing = Using a secondary hash function to determine the step size for probing.</p> Signup and view all the answers

Which of the following is NOT a basic operation of the mapping ADT?

<p>sort() (B)</p> Signup and view all the answers

Linear search has a time complexity of O(log n).

<p>False (B)</p> Signup and view all the answers

What is the time complexity of accessing an element in a dictionary using its key, assuming a good hash function and minimal collisions?

<p>O(1)</p> Signup and view all the answers

The process of increasing the size of a hash table and remapping the keys is known as ______.

<p>rehashing</p> Signup and view all the answers

Match the following types to whether they are hashable or non-hashable.

<p>int = hashable string = hashable list = non-hashable tuple = hashable</p> Signup and view all the answers

What is the purpose of the len() operation in a mapping ADT?

<p>Return the number of key-value pairs in the mapping. (A)</p> Signup and view all the answers

A binary search has a time complexity of O(n).

<p>False (B)</p> Signup and view all the answers

In the context of hash tables, what is a 'bucket'?

<p>an array location</p> Signup and view all the answers

The ratio of the number of entries to the table size in a hash table is known as the ______ factor.

<p>load</p> Signup and view all the answers

Match the ADT operation with its intended purpose

<p>put(key, value) = Adds a key:value pair get(key) = Returns value (KeyError) = is raised if the given key is not present</p> Signup and view all the answers

Why is it important for keys in a hash table to be immutable?

<p>To ensure the hash function consistently returns the same index for a key. (D)</p> Signup and view all the answers

Using a simple array is not an efficient way to implement a hash table.

<p>True (A)</p> Signup and view all the answers

What is the purpose of separate chaining?

<p>collision resolution</p> Signup and view all the answers

When a collision occurs, ______ probing searches for the next available index.

<p>linear</p> Signup and view all the answers

Match items to being hashable or not hashable.

<p>string = hashable set = non-hashable</p> Signup and view all the answers

Which collision-handling scheme stores colliding keys outside the table at the same index?

<p>Separate Chaining (B)</p> Signup and view all the answers

Quadratic probing is always guaranteed to find an empty slot if one exists.

<p>False (B)</p> Signup and view all the answers

What is the purpose of rehashing in a dynamic hash table?

<p>resize</p> Signup and view all the answers

When the load factor of a hash table increases, the average time complexity of get and put operations tends to ______.

<p>increase</p> Signup and view all the answers

Match what each ADT operation does:

<p>del(key) = remove key:value pair keys = return set of keys in mapping values = return set of values in mapping</p> Signup and view all the answers

What is the primary advantage of using a hash table for storing and retrieving data?

<p>Efficient average-case time complexity for <code>get</code> and <code>put</code> operations. (A)</p> Signup and view all the answers

Separate chaining is an example of closed hashing.

<p>False (B)</p> Signup and view all the answers

What does a Hash Collision indicate?

<p>same integer</p> Signup and view all the answers

0(1) is the rehashing cost when using ______ update.

<p>lazy</p> Signup and view all the answers

Match the python function with tuple, list, or set.

<p>hash((1,2,3,4)) = tuple hash([1,2,3,4]) = list hash({1,2,3,4}) = set</p> Signup and view all the answers

Given a hash table with a load factor close to 1, which of the following techniques would be most effective in improving performance?

<p>Rehashing with a larger table size. (B)</p> Signup and view all the answers

In open addressing, the number of possible probe sequences is limited by table size.

<p>True (A)</p> Signup and view all the answers

What is the time complexity of accessing an element using separate chaining collision-resolution?

<p>O(n)</p> Signup and view all the answers

With a direct-access table, there is ______ wasted space.

<p>too much</p> Signup and view all the answers

Match these properties to whether they are advantages or disadvantages of using keys as indexes.

<p>I can <code>put</code> and <code>get</code> with 0 (1) = advantage Need a slot for every possible key = disadvantage</p> Signup and view all the answers

In separate chaining, how does the worst-case time complexity of get and put operations compare to the average-case complexity?

<p>The worst-case complexity is O(n), while the average case is close to O(1). (A)</p> Signup and view all the answers

Any potential collision can be fixed by applying a second hashing function.

<p>False (B)</p> Signup and view all the answers

Other than using a linked list, what is another example of a data structure to minimize collision?

<p>list</p> Signup and view all the answers

With quadratic probing, quadratic provides the ______ probes.

<p>interval</p> Signup and view all the answers

Match the following Linear and Quadratic probing values.

<p>Linear = Lineal Quadratic = Quadratic</p> Signup and view all the answers

Flashcards

What is mapping?

Association between two sets of things

Key-value pair

Associates a value to a key

What are buckets?

Data stored in a hash table

What is a hash function?

Mapping between key and bucket

Signup and view all the flashcards

put(key,value)

Adds a key-value pair

Signup and view all the flashcards

get(key)

Retrieves value by key

Signup and view all the flashcards

What is KeyError?

Raises error if no associated key

Signup and view all the flashcards

del(key)

Removes a key-value pair

Signup and view all the flashcards

What is len()?

Number of keys in mapping

Signup and view all the flashcards

What are keys?

Set of keys in mapping

Signup and view all the flashcards

What are values?

Set of values in mapping

Signup and view all the flashcards

What are items?

Key-value pairs in mapping

Signup and view all the flashcards

Using keys as indices

Direct-access table or map

Signup and view all the flashcards

Using f(key)

Mapping entire key space into a small set

Signup and view all the flashcards

What is a hash function?

Converts key to index

Signup and view all the flashcards

What is a hash collision?

Having same hash value

Signup and view all the flashcards

Separate chaining

Storing colliding keys in the same slot (linked list etc..)

Signup and view all the flashcards

Open addressing

Storing colliding key in different slot

Signup and view all the flashcards

Linear probing

Search table for closest free cell

Signup and view all the flashcards

Quadratic probing

Prob i^2'th bucket in i'th iteration

Signup and view all the flashcards

What is double hashing?

Algorithmic approach to find the next available slot

Signup and view all the flashcards

Dictionaries rehashing

Dictionaries support O(1) for put and get

Signup and view all the flashcards

When to Rehash?

Load factor triggers resizing

Signup and view all the flashcards

What is load factor?

Ratio of entries number/table size

Signup and view all the flashcards

What is object?

Object whose hash value is returned

Signup and view all the flashcards

What are hashable types?

bool, int, long, float, string, tuple

Signup and view all the flashcards

What are non-hashable types?

Lists, set, dictionary

Signup and view all the flashcards

Study Notes

Chapter 15: Mapping and Hash Tables

  • Objectives include understanding mapping vocabulary, listing basic operations of mapping ADTs, explaining hashing, listing hash function requirements, explaining key immutability, and implementing mapping ADTs.

Search Algorithms

  • Linear search has a time complexity of O(n).
  • Binary search has a time complexity of O(log n).
  • Hashing can achieve search operations in O(1) time.

Dictionaries

  • Python dictionaries store elements with keys of any hashable types and values of any types.
  • Dictionaries store unordered collections of unique values within key-value pairs.
  • Keys in dictionaries must be unique.
  • Dictionaries are mutable, enabling adding, modifying, and removing key-value pairs after creation.

Mapping

  • A mapping is an association between two sets of things, specifically associating a value to a key.
  • These associated pairs are referred to as key-value pairs.
  • Keys within a mapping must be unique.

Concept of Hashing

  • Information to be retrieved is stored in a hash table, which is visualized as an array of "m" locations termed buckets.
  • A hash function is a mapping between a key and a bucket.
  • The time required to store and retrieve data is proportional to hash function computation time.

Mapping ADT (Abstract Data Type)

  • Required operations:
    • put(key, value): Adds a key-value pair.
    • get(key): Returns the value, raising a KeyError if the key is absent.
  • Optional operations:
    • del(key): Removes a key-value pair.
    • len(): Returns the number of keys.
    • keys(): Returns a set of keys.
    • values(): Returns a set of values.
    • items(): Returns a set of key-value pairs.

Direct Addressing

  • Idea: If we know exactly where a key should be, keys can be directly used as indices.
  • This approach assigns a unique index to each key.
  • Problem: Potentially too much wasted space

Hash Functions and Tables

  • Instead of directly using the key as an index, hash functions map the entire key space into a smaller set of keys to use just the right sized array.
  • A hash function (f) converts a key to an index; for example, h(x) = x % Table_size.
  • Modify hash function.

Hash Collision

  • This occurs when two different keys have the same hash value.
  • Mathematically, h(k1) = h(k2) when k1 ≠ k2.

Collision Handling Schemes

  • Collision occurs when different elements are mapped to the same cell.
  • Collision resolution methods incluse separate chaining and open addressing.
  • Open addressing involves linear probing, quadratic probing, and double hashing.

Separate Chaining (Open Hashing)

  • Keys are stored in the same slot using a list, a linked list or by some other data structure .

Open Addressing (Closed Hashing)

  • Open addressing is a collision resolution strategy where collisions are resolved by storing the colliding key in a different location when the natural choice is full.
  • The methods are:
    • Linear Probing: Search the table for the closest following free location and insert the new key there.
    • Quadratic Probing: When collision occurs, probe for i²'th bucket in ith iteration.
    • Double Hashing: Use a secondary hash function algorithm to find the next free available slot where it is very unlikely that for some key, h(key) equals g(key).

Rehashing

  • Dictionaries support O(1) for put and get, regardless of n
  • Achieved when:
    • bucket = hash(key) % m, where keys are split into m buckets used with ~m items.
  • If n gets too big/small, Rehash is used:
    • Create a list with p buckets
    • Rehash every key: bucket = hash(key) % p
    • Cost: O(n), Average cost: O(1); Lazy Update!
  • Rehashing is prompted by load factor.

Load Factor

  • Load Factor (α) is the Ratio of the number of entries in the table to table size.
  • If n is the total number of (key, value) pairs stored in the table and c is the capacity of the table (i.e., array), where α = n/c
  • This decides when to increase the HashMap capacity to maintain the get() and put() operation complexity of 0(1).

Python's hash() Function

  • Uses only one parameter: the object whose hash value is to be returned.
  • Only immutable objects can be hashed.
  • Hashable objects: bool, int, long, float, string, tuple, etc.
  • Non-hashable objects incluse lists, sets, and dictionaries.

Implementing the Mapping ADT

  • Involves using ListMapping and HashMapping classes.

Studying That Suits You

Use AI to generate personalized quizzes and flashcards to suit your learning preferences.

Quiz Team

Related Documents

More Like This

CS223: Data Structures - Hash Tables
24 questions
Hash Tables and Algorithm Analysis
5 questions
Use Quizgecko on...
Browser
Browser