ch07_Space and Time Trade-Offs_StringSearch-Hashing.pdf

Full Transcript

Algorithms Analysis and Design Chapter 7: Space and Time Trade-Offs A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 3 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 1 Space –for-time trad...

Algorithms Analysis and Design Chapter 7: Space and Time Trade-Offs A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 3 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 1 Space –for-time tradeoffs Two varieties of space-for-time algorithms:  input enhancement — preprocess the input (or its part) to store some info to be used later in solving the problem counting sorts string searching algorithms  prestructuring — preprocess the input to make accessing its elements easier hashing indexing schemes (e.g., B-trees) A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 2 Review: String searching by brute force pattern: a string of m characters to search for text: a (long) string of n characters to search in Brute force algorithm Step 1 Align pattern at beginning of text Step 2 Moving from left to right, compare each character of pattern to the corresponding character in text until either all characters are found to match (successful search) or a mismatch is detected Step 3 While a mismatch is detected and the text is not yet exhausted, realign pattern one position to the right and repeat Step 2 A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 3 String searching by preprocessing Several string searching algorithms are based on the input enhancement idea of preprocessing the pattern  Knuth-Morris-Pratt (KMP) algorithm preprocesses pattern left to right to get useful information for later searching  Boyer -Moore algorithm preprocesses pattern right to left and store information into two tables  Horspool’s algorithm simplifies the Boyer-Moore algorithm by using just one table A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 4 Horspool’s Algorithm A simplified version of Boyer-Moore algorithm: preprocesses pattern to generate a shift table that determines how much to shift the pattern when a Character Shift mismatch occurs A B … … always makes a shift based on the text’s character c aligned with the last character in the pattern according to the shift table’s entry for c A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 5 How far to shift? Look at first (rightmost) character in text that was compared: 1. The character is not in the pattern.....C...................... BAOBAB 2. The character is in the pattern (but not the rightmost) .....O...................... (O occurs once in pattern) BAOBAB .....A...................... (A occurs twice in pattern) BAOBAB 3. The rightmost characters do match.....B...................... BAOBAB A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 6 Shift table (1) The length of the shift is determined by the shift table. shift[c] is defined for all c ∈ Σ: (Σ: alphabet set) 1. If c does not occur in P, shift[c] = m. (m: pattern length) 2. Otherwise, shift[c] = m − 1 − i, where P[i] = c is the last occurrence of c in P[0..m − 2]. 1. The character is not in the pattern Character Shift......C.................. - 6 BAOBAB 2. The character is in the pattern ......A...................... (O occurs once or more in pattern) BAOBAB Character Shift A 1 012345 shift[‘A’]= 6-1-4=1 A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 7 Shift table (2)  The length of the shift is determined by the shift table. shift[c] is defined for all c ∈ Σ: (Σ: alphabet set) If c does not occur in P, shift[c] = m. (m: pattern length) Otherwise, shift[c] = m − 1 − i, where P[i] = c is the last occurrence of c in P[0..m − 2]. 01234567 m=8 A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 8 Shift table (3)  Shift sizes can be precomputed by the formula - distance from c’s rightmost occurrence in pattern among its first m-1 characters to its right end t (c) = - pattern’s length m, otherwise by scanning pattern before search begins and stored in a table called shift table Shift table is indexed by text and pattern alphabet Example: for P=“BAOBAB”, shift[‘A’]=1 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 1 2 6 6 6 6 6 6 6 6 6 6 6 6 3 6 6 6 6 6 6 6 6 6 6 6 A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 9 Horspool’s Algorithm: Example 1 Text: “BARD LOVED BANANAS”, P:”BAOBAB” A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _ 1 2 6 6 6 6 6 6 6 6 6 6 6 6 3 6 6 6 6 6 6 6 6 6 6 6 6 BARD LOVED BANANAS BAOBAB ……(No matchshift[‘L’]=6) Space BARD LOVED_BANANAS BAOBAB ……(No match shift[‘B’]=2) BARD LOVED BANANAS BAOBAB ……(No match shift[‘N’]=6) BAOBAB (unsuccessful search) A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 10 Horspool’s Algorithm: Example 2 11 Boyer-Moore Horspool Example A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 12 Boyer-Moore Horspool Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 13 Boyer-Moore Horspool Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 14 Boyer-Moore Horspool Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 15 Boyer-Moore Horspool Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 16 Boyer-Moore Horspool Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 17 Boyer-Moore Horspool Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 18 Boyer-Moore Horspool Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 19 Boyer-Moore Horspool Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 20 Hashing prestructuring — preprocess the input to make accessing its elements easier A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 21 Hashing  A very efficient method for implementing a dictionary, i.e., a set with the operations: – find – insert – delete  Based on representation-change and space-for-time tradeoff ideas  Important applications: – symbol tables – databases (extendible hashing) A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 22 Hash tables and hash functions The idea of hashing is to map keys of a given file of size n into a table of size m, called the hash table, by using a predefined function, called the hash function, h: K location (cell) in the hash table Example: student records, key K= SSN: h(K) = K mod m where m is some integer (typically, prime) e.g.: Given m = 1000, where is record with K= 314159265 stored? Generally, a hash function should: be easy to compute distribute keys about evenly throughout the hash table A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 23 Collisions If h(K1) = h(K2), there is a collision  Good hash functions result in fewer collisions but some collisions should be expected (birthday paradox)  Two principal hashing schemes handle collisions differently: Open hashing – each cell is a header of linked list of all keys hashed to it Closed hashing – one key per cell – in case of collision, finds another cell by – linear probing: use next free bucket – double hashing: use second hash function to compute increment A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 24 Open hashing (Separate chaining) Keys are stored in linked lists outside a hash table whose elements serve as the lists’ headers. Example: A, FOOL, AND, HIS, MONEY, ARE, SOON, PARTED h(K) = ((sum of K’s letters’ positions in the alphabet) MOD 13) Key A FOOL AND HIS MONEY ARE SOON PARTED h(K) 1 9 6 10 7 11 11 12 0 1 2 3 4 5 6 7 8 9 10 11 12 A AND MONEY FOOL HIS ARE PARTED SOON Search for“Introduction A. Levitin KID to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 25 Closed hashing (Open addressing) Keys are stored inside a hash table. Key A FOOL AND HIS MONEY ARE SOON PARTED h(K) 1 9 6 10 7 11 11 12 The table is treated as a circular array 0 1 2 3 4 5 6 7 8 9 10 11 12 A A FOOL A AND FOOL A AND FOOL HIS A AND MONEY FOOL HIS A AND MONEY FOOL HIS ARE A AND MONEY FOOL HIS ARE SOON PARTED A AND MONEY FOOL HIS ARE A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 26 Hashing - Example ASCII start=65 cap , ASCII=97 small 11 A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 27 Hashing – Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 28 Hashing – Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 29 Hashing – Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 30 Hashing – Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 31 Hashing – Example … cont. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 32 Hashing Algorithms A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 33 A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 34 Closed Hashing – Linear Probing A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 35 Closed Hashing – Linear Probing A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 36 Closed Hashing – Linear Probing A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 37 Closed Hashing – Linear Probing A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 38 Closed Hashing – Linear Probing A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 39 Closed Hashing – Linear Probing A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 40 Open Hashing – Separate Chaining A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 41 Open Hashing – Separate Chaining A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 42 Open Hashing – Separate Chaining A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 43 Open Hashing – Separate Chaining A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 44 Open Hashing – Separate Chaining A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 45 Open Hashing – Separate Chaining A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 46 Open Hashing – Separate Chaining A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 47 Hashing - Summary A. Levitin “Introduction to the Design & Analysis of Algorithms,” 3rd ed., Ch. 7 ©2012 Pearson Education, Inc. Upper Saddle River, NJ. All Rights Reserved. 48

Use Quizgecko on...
Browser
Browser