Hash tables
B. Groz

Chapter contents:
- Hashing: general properties
- Families of hash functions
- Strategies to build hash tables
- Hashing extensions: Consistent hashing

Hashing: definition
A hash table consists of:
- a hash function h : U → [0..m − 1],
- an array v of size m.
Generally:
- h must be fast (evaluation in O(1));
- U = [0..L] with L ≫ m, possibly unbounded;
- similar techniques apply when U consists of tuples (fixed length) or strings (variable length)...

Use cases for hashing
- Dictionaries and unordered sets in programs
- Indexing data (DBMS indexes, data mining)
- Managing a cache (e.g. Linux kernel, DBMS)
- Distributing data: partitioning, join evaluation (databases)
- Symbol tables in compilers
- Fingerprints (Git, identifying duplicates)
- Data mining: fingerprints, projections
- Cryptography: checking passwords, authentication, data integrity (checksums)

Strategies for hashing
Many possible strategies to deal with collisions:
- Static hashing (the number of buckets is fixed):
  - Closed addressing: collisions solved through chaining.
  - Open addressing: collisions solved by probing locations (linear probing, quadratic probing, double hashing).
  - Cuckoo hashing: not strictly perfect, but O(1) access too.
  - Perfect hashing: builds collision-free hash functions (FKS).
- Dynamic hashing (the number of buckets can vary): extendible hashing, linear hashing.
Some people use different denominations: open hashing = separate chaining, closed hashing = open addressing.

Families H of hash functions: independence
H = {h : U → [0..m − 1]}
- Universal family: ∀x, y ∈ U, x ≠ y : Pr_{h←H}[h(x) = h(y)] ≤ 1/m.
- k-independent = k-universal = strongly universal_k: ∀ distinct x1, ..., xk ∈ U, ∀y1, ..., yk ∈ [0..m − 1] : Pr_{h←H}[h(x1) = y1 ∧ ... ∧ h(xk) = yk] = 1/m^k.
- Special case k = 2 ↔ pairwise independent.
In practice this is often a bit too strong: either use (µ, k)-independence, i.e. Pr(...) ≤ µ/m^k, or prove that the family is close to a k-independent one.

Families H of hash functions: independence (2)
How to build universal or k-independent hash functions efficiently?
- degree-k polynomials with random coefficients, modulo some prime (Carter–Wegman) — see the sketch below;
- combinations of multiplications and bit-shifting... (Dietzfelbinger);
- tabulation hashing: 3-independent;
- Siegel.
How much independence is needed?
- chaining: universal is enough;
- linear probing: 5-independence, or tabulation;
- cuckoo hashing: log n-independence (? > 6), or tabulation if static.
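A small Python sketch of the Carter–Wegman construction mentioned above. The prime p = 2^61 − 1, the table size m and the function names are illustrative choices, not the lecture's; note that reducing mod m only gives an approximately k-independent family, as remarked above.

import random

P = (1 << 61) - 1   # an arbitrary Mersenne prime, larger than the keys we hash

def make_carter_wegman_hash(k, m, p=P):
    """Draw one function from a (close to) k-independent family U -> [0..m-1]:
    a polynomial with k random coefficients, evaluated mod p, then reduced mod m."""
    coeffs = [random.randrange(p) for _ in range(k)]
    def h(x):
        acc = 0
        for a in coeffs:            # Horner evaluation of the polynomial mod p
            acc = (acc * x + a) % p
        return acc % m
    return h

# k = 2 gives a (nearly) universal family -- enough for chaining.
h = make_carter_wegman_hash(k=2, m=1024)
print(h(42), h(4242))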
Building hash functions in practice
Typically use bit operations: XOR, multiply, shift.
Metrics to evaluate "quality":
- chi-square test
- avalanche test
- collisions
- number of CPU cycles
[https://en.wikipedia.org/wiki/List_of_hash_functions]

unsigned int MurmurHash2 ( const void * key, int len, unsigned int seed )
{
    // 'm' and 'r' are mixing constants generated offline.
    // They're not really 'magic', they just happen to work well.
    const unsigned int m = 0x5bd1e995;
    const int r = 24;

    // Initialize the hash to a 'random' value
    unsigned int h = seed ^ len;

    // Mix 4 bytes at a time into the hash
    const unsigned char * data = (const unsigned char *)key;
    while(len >= 4)
    {
        unsigned int k = *(unsigned int *)data;
        k *= m;
        k ^= k >> r;
        k *= m;
        h *= m;
        h ^= k;
        data += 4;
        len -= 4;
    }
    // [cut some code here]...
    return h;
}

Python built-in hash() function
- hash() is not very random, but is meant to have good locality when used in a dict with real data.
- For (small) integers: the identity mapping.
- hash() returns an integer (its width depends on the interpreter/platform).
- Mutable types such as lists are not hashable.
- Custom-defined types are hashable: the hash is based on id(), unless you redefine __hash__().

>>> map(hash, [0,1,2])
[0, 1, 2]
>>> hash(-.5)
-1152921504606846976
>>> hash(.5)
1152921504606846976
>>> map(hash, ("namea", "nameb", "namec"))
[-1658398457, -1658398460, -1658398459]
""" hashes of string/bytes objects vary from one session to another,
    unless you set PYTHONHASHSEED """
>>> hash([1,3,4])
TypeError: unhashable type: 'list'

Hash function implementation in Python

static long
tuplehash(PyTupleObject *v)
{
    register long x, y;
    register Py_ssize_t len = Py_SIZE(v);
    register PyObject **p;
    long mult = 1000003L;
    x = 0x345678L;
    p = v->ob_item;
    while (--len >= 0) {
        y = PyObject_Hash(*p++);
        if (y == -1)
            return -1;
        x = (x ^ y) * mult;
        mult += (long)(82520L + len + len);
    }
    x += 97531L;
    if (x == -1)
        x = -2;
    return x;
}

static long
string_hash(PyStringObject *a)
{
    register Py_ssize_t len;
    register unsigned char *p;
    register long x;

    if (a->ob_shash != -1)
        return a->ob_shash;
    len = Py_SIZE(a);
    p = (unsigned char *) a->ob_sval;
    x = *p << 7;
    while (--len >= 0)
        x = (1000003*x) ^ *p++;
    x ^= Py_SIZE(a);
    if (x == -1)
        x = -2;
    a->ob_shash = x;
    return x;
}

[https://svn.python.org/projects/python/trunk/Objects]
For secure hash algorithms (SHA family, etc.): use the hashlib library.

Hash function implementation in Java

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

[http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/lang/String.java]
See also other mechanisms (secure hash functions and hash codes):
[https://github.com/google/guava/tree/master/guava/src/com/google/common/hash]

Families H of hash functions: cryptographic hash functions
Just the function, no array for storage. Required properties:
- avalanche;
- fast to evaluate;
- one-way function:
  - resists preimage attacks: from h and c, finding m such that h(m) = c must be infeasible;
  - resists second-preimage attacks: from h, c, and m1 such that h(m1) = c, finding another m2 such that h(m2) = c must be infeasible.
(Here infeasible means (provably) extremely hard... for current computers.)
"Much more than encryption algorithms, one-way hash functions are the workhorses of modern cryptography" (Bruce Schneier)

Families H of hash functions: locality sensitive hashing
H = {h : M → S}, where M = (M, d) is a metric space, S a set of buckets, r1 < r2 thresholds, and P1 > P2 probabilities.
H is (r1, r2, P1, P2)-sensitive if:
- d(p, q) ≤ r1 ⟹ Pr_{h←H}[h(p) = h(q)] ≥ P1;
- d(p, q) ≥ r2 ⟹ Pr_{h←H}[h(p) = h(q)] ≤ P2.
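As an illustration of this definition (not taken from the slides): the classic bit-sampling family for Hamming distance on d-bit vectors, where h returns one randomly chosen coordinate. Two vectors at Hamming distance r collide with probability 1 − r/d, so the family is (r1, r2, 1 − r1/d, 1 − r2/d)-sensitive.

import random

def make_bit_sampling_hash(d):
    """One member of the bit-sampling LSH family for Hamming distance on
    d-bit vectors: return the value of one randomly chosen coordinate."""
    i = random.randrange(d)
    return lambda v: v[i]

# Two vectors at Hamming distance 1 (out of d = 8 coordinates) collide with
# probability 1 - 1/8 = 0.875 under a random member of the family.
d = 8
p = (1, 0, 1, 1, 0, 0, 1, 0)
q = (1, 0, 1, 1, 0, 0, 1, 1)      # differs from p only in the last coordinate
trials = [make_bit_sampling_hash(d) for _ in range(10000)]
print(sum(h(p) == h(q) for h in trials) / len(trials))   # close to 0.875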
Hash tables
- In databases, a bucket = a page containing multiple keys.
- Typical default strategy: closed addressing. But overflow chains can spoil performance.
- Load factor: #keys / #buckets. A load factor around 80% is generally considered best.
- Java 10 HashMap default load factor: 0.75.

Java 8 HashMap: closed addressing
The hash function used is:

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

- Collisions are solved through chaining: the table is an array of bins, and each bin contains a collection of entries.
- Start with 16 bins. Resize when the load factor is exceeded (#slots *= 2).
- When a bin gets overpopulated, Java switches its content to a tree (a red-black tree) to allow faster lookups.
- Not thread-safe, because of resizing.
[http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/HashMap.java]

Linear probing
General idea: use a hash function h0. The table tries to insert w at h0(w); if that slot is already occupied, it inserts w in the next available bucket.
Example: after inserting x, u, s, insert w, then z, with
h0: x ↦ 5, u ↦ 6, s ↦ 7, w ↦ 5, z ↦ 2.
[Figure: a table of 20 slots; x, u, s occupy slots 5, 6, 7, so w (hashing to 5) ends up in slot 8 after probing 5, 6, 7; z goes directly to slot 2.]

Linear probing: deletions
Two possible strategies:
- lazy deletion, leaving a special flag (tombstone) in the slot, or
- shifting the next cells back in cascade if needed.
[Figure: deleting s at position 7 with lazy deletion leaves a tombstone in slot 7.]
Quadratic probing: similar to linear probing, but the i-th probe checks h0(x) + c1·i + c2·i^2 mod m instead of h0(x) + i mod m.

Linear probing: guarantees
Search, Delete, Insert: O(1) expected,
as long as the load factor is not too high, and provided we use 5-independent hash functions or tabulation hashing.
In practice, some popular hash functions achieve reasonably low collisions even without 5-independence.
Linear probing is typically among the fastest schemes because of its good locality. A small sketch follows.
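A minimal, illustrative sketch of linear probing with lazy (tombstone) deletion; the class and method names are ours, and resizing, which a real table needs to keep the load factor low, is left out.

class LinearProbingTable:
    """Toy open-addressing table: linear probing with lazy (tombstone) deletion."""
    _TOMBSTONE = object()              # flag left in place of deleted keys

    def __init__(self, m=16):
        self.m = m
        self.slots = [None] * m        # None = never used

    def _probe(self, key):
        """Yield the probe sequence h0(key), h0(key)+1, ... modulo m."""
        h0 = hash(key) % self.m
        for i in range(self.m):
            yield (h0 + i) % self.m

    def insert(self, key):
        for j in self._probe(key):
            if self.slots[j] == key:   # already present
                return
            if self.slots[j] is None:  # first truly free slot on the path
                self.slots[j] = key
                return
            # occupied slot or tombstone: keep probing
            # (a real table would remember the first tombstone and reuse it)
        raise RuntimeError("table full -- a real table would resize before this")

    def contains(self, key):
        for j in self._probe(key):
            if self.slots[j] is None:  # an empty slot ends the probe path
                return False
            if self.slots[j] == key:
                return True
        return False

    def delete(self, key):
        for j in self._probe(key):
            if self.slots[j] is None:
                return
            if self.slots[j] == key:
                self.slots[j] = self._TOMBSTONE   # lazy deletion
                return

t = LinearProbingTable()
for k in ("x", "u", "s", "w", "z"):
    t.insert(k)
t.delete("s")
print(t.contains("w"), t.contains("s"))    # True False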
Python 2.7 dictionaries: probing, but not linearly
- The table is an array of (hash, key, value) triples.
- Start with 8 slots. The table is resized each time the dictionary is 2/3 full (#slots *= 4 for small dicts, *= 2 for large ones; the dictionary may also be rebuilt if it is too sparse, which is checked only during insertions).
- The initial index is basically the "low-order bits of hash(key)" (not random, but fast, with good locality for real-world inputs, which are often sequences).
- Collisions are solved with a pseudo-random probing sequence that is guaranteed to visit all slots eventually (a pure-Python model of this probe sequence is sketched below, after the Python slides):

""" To compute the index of the next slot when starting from slot j """
j = (5*j) + 1 + perturb;
perturb >>= PERTURB_SHIFT;
# Remark: eventually, perturb will be 0
# now use j % 2**i as the next table index;

[https://stackoverflow.com/questions/327311/how-are-pythons-built-in-dictionaries-implemented]
[https://github.com/wklken/Python-2.7.8/blob/master/Objects/dictobject.c]
The current Python dict = the above + 2 optimizations: compact layout + key-sharing dicts.

Python 3.6+ dictionaries: adding one level of indirection
The Python 2 dictionary layout is as follows:

entries = [['--', '--', '--'],
           [-8522787127447073495, 'barry', 'green'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           [-9092791511155847987, 'timmy', 'red'],
           ['--', '--', '--'],
           [-6480567542315338377, 'guido', 'blue']]

Python 3 instead inserts new keys sequentially into a dense table dk_entries. A second, sparse table dk_indices locates the existing entries (None below is actually stored as -1 or -2):

indices = [None, 1, None, None, None, 0, None, 2]
entries = [[-9092791511155847987, 'timmy', 'red'],
           [-8522787127447073495, 'barry', 'green'],
           [-6480567542315338377, 'guido', 'blue']]

The entries table keeps insertion order.
✓ Compression (≥ 30% less space)  ✓ Faster resize  ✓ Faster iterations

Python dictionaries: optimizations
User-defined dictionaries (explicit dict()) are as above: combined-table dicts. However, Python 3.4+ stores object attributes in split-table dicts. In split-table dictionaries the values are split from the keys: values are kept in a separate array for each object, so that multiple dicts can share the same key data structure.
Python 3 dictionary layout (supports both split and combined dicts):
- dk_refcnt: reference count (1 for a combined table, > 1 for a split table)
- dk_size: length of dk_indices
- dk_lookup: function to search an item by key
- dk_usable: number of unused (-1) slots in indices
- dk_nentries: number of entries
- dk_indices, dk_entries: the "true" hash table discussed above
[https://github.com/python/cpython/blob/master/Objects/dictobject.c]

More about hashing in Python...
- Python set() (at least up to Python 3.11 in 2023) adopts a layout roughly similar to (but distinct from) Python 2.7's dict. Why not use the compact layout? (i) The typical usage pattern differs, (ii) it would not save as much space, and (iii) it is potentially incompatible with the current implementation of probing in sets (a hybrid between a strictly linear and a randomized sequence).
  Consequence: set iteration is not based on insertion order!
  list(set(['aa','c','ab']))  # need not be ['aa','c','ab']
  list(dict.fromkeys(['c','ab','d','c','e'])) == ['c','ab','d','e']
- How can we fix hash seeds if needed?
  - set the PYTHONHASHSEED environment variable before running Python, e.g. export PYTHONHASHSEED=0;
  - or hash objects with a custom hash (or use hashlib).
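A pure-Python model of the probing recurrence quoted on the Python 2.7 dictionaries slide above; PERTURB_SHIFT = 5 matches CPython's dictobject.c, while the table size and the hash value in the demo are arbitrary examples.

from itertools import islice

PERTURB_SHIFT = 5      # the value used in CPython's dictobject.c

def probe_sequence(h, table_size):
    """Model of CPython's dict probing: yield the slot indices visited for an
    item of hash h in a table of table_size = 2**i slots."""
    mask = table_size - 1              # table_size must be a power of two
    perturb = h
    j = h & mask                       # initial index: low-order bits of the hash
    while True:
        yield j
        perturb >>= PERTURB_SHIFT      # eventually perturb becomes 0, and
        j = (5 * j + 1 + perturb) & mask   # j = 5*j + 1 (mod 2**i) alone is
                                           # guaranteed to visit every slot

# The first 10 slots probed for an item of hash 0x51d2f3a4 in an 8-slot table:
print(list(islice(probe_sequence(0x51d2f3a4, 8), 10)))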
Cuckoo hashing
General idea: use two hash functions, each on a distinct array of size m/2 (or a distinct part of one array).
- When inserting, if one of the two possible locations is empty, insert there.
- Otherwise, displace (remove and re-insert) one of the items.
[Photos: a cuckoo chick ejects an egg; a reed warbler feeds the cuckoo.]
https://www.cs.tau.ac.il/~shanir/advanced-seminar-data-structures-2009/bib/pagh01cuckoo.pdf

Cuckoo hashing: example
Hash functions (on a table with slots 0..7):
h0: x ↦ 0, u ↦ 4, s ↦ 0, w ↦ 4, z ↦ 4, y ↦ 0
h1: x ↦ 5, u ↦ 2, s ↦ 1, w ↦ 1, z ↦ 1, y ↦ 2
- Insert x, then u: both go to their h0 slots (0 and 4).
- Insert s: h0(s) = 0 is taken, but h1(s) = 1 is free, so s is inserted there.
- Insert w: neither of its two positions (h0(w) = 4, h1(w) = 1) is free, so w is inserted arbitrarily in one of them (slot 4). u is pushed out of its current location by w and is pushed into its alternative location h1(u) = 2.
- Insert z: w is pushed out of its current location (slot 4) by z and is pushed into its alternative location, which triggers other relocations in cascade (w evicts s from slot 1, s evicts x from slot 0, and x ends up in h1(x) = 5).
- Insert y: s is pushed out of its current location by y, but the relocations create a cycle, so the hash table is rebuilt with new hash functions.

Cuckoo hashing: guarantees
Search: O(1); Delete: O(1); Insert: O(1) expected (amortized).
Typically 20–30% slower than linear probing. A toy implementation is sketched below.
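A toy cuckoo insertion routine matching the walkthrough above; it is a sketch, not library code: a single array, two seeded hash functions, a bounded number of displacements standing in for cycle detection, and a rebuild with fresh functions when insertion fails.

import random

class CuckooTable:
    """Toy cuckoo hash table: every key sits in one of its two candidate slots."""

    def __init__(self, m=16):
        self.m = m
        self.slots = [None] * m
        self._new_seeds()

    def _new_seeds(self):
        # Two "hash functions", obtained by salting Python's hash with random seeds.
        self.seeds = (random.random(), random.random())

    def _positions(self, key):
        return [hash((seed, key)) % self.m for seed in self.seeds]

    def lookup(self, key):
        return any(self.slots[p] == key for p in self._positions(key))

    def insert(self, key, max_kicks=32):
        p0, p1 = self._positions(key)
        if self.slots[p0] == key or self.slots[p1] == key:
            return                                  # already present
        if self.slots[p0] is None:
            self.slots[p0] = key
            return
        if self.slots[p1] is None:
            self.slots[p1] = key
            return
        # Both candidate slots are taken: displace residents, cuckoo-style.
        pos = p0
        for _ in range(max_kicks):
            key, self.slots[pos] = self.slots[pos], key     # evict the resident
            if key is None:                                  # landed in a free slot
                return
            q0, q1 = self._positions(key)
            pos = q1 if pos == q0 else q0                    # resident's other slot
        # Too many displacements (probably a cycle): rebuild with new functions.
        self._rebuild(pending=key)

    def _rebuild(self, pending):
        # May recurse if the new functions are unlucky; fine for a toy at low load.
        old = [k for k in self.slots if k is not None] + [pending]
        self.slots = [None] * self.m
        self._new_seeds()
        for k in old:
            self.insert(k)

t = CuckooTable()
for k in ("x", "u", "s", "w", "z", "y"):
    t.insert(k)
print(all(t.lookup(k) for k in ("x", "u", "s", "w", "z", "y")))   # True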
Extendible hashing
General idea: organize hashes into a trie (radix search tree). A simple extendible hashing scheme maintains:
- a dynamic set of buckets, where each bucket b has a depth db;
- a bucket directory with depth d, where d ≥ db for every bucket b.
Hashes are viewed as bit strings (in LSB order), and the trie is collapsed into an array: the directory.
[Figure: a directory of depth 2 with entries 00, 01, 10, 11 pointing to buckets of depth 2; the buckets show h(k) rather than k, e.g. 8, 20, 12 in the bucket for hashes ending in 00.]
http://cgi.di.uoa.gr/~ad/M149/p315-fagin.pdf

Extendible hashing: bucket overflow
What happens when bucket capacity is exceeded?
- When the overflowing bucket has depth db = d: split it, relocate the corresponding entries, and double the directory.
- When it has depth db < d: split it and relocate the corresponding entries (no directory doubling).
[Figures: inserting 21 and 5 overflows the depth-2 bucket for hashes ending in 01, so the directory doubles to depth 3 and the bucket splits; inserting 4 and 28 then overflows the depth-2 bucket for hashes ending in 00, which splits into two depth-3 buckets without doubling the directory.]

Extendible hashing: remarks
For deletions: if a deletion empties a bucket, try to merge it with its "twin" bucket and adapt the depths.
✓ At most 2 block accesses to reach the data.
✓ Grows and shrinks gracefully as the data evolves.
✗ When the data is skewed and considering bit d + 1 does not prevent overflow, extendible hashing resorts to overflow chains.
✗ Needs a separate directory data structure.
Used in file systems such as GFS2.

Linear hashing
General idea:
- Global variables: N = #buckets (initially), i = current level, b0 = current bucket for the round-robin splits.
- Use a family of hash functions h0, ..., hk such that h_{i+1} splits each bucket of h_i in two: b and b + N·2^i.
- Initially, use h0. Whenever a bucket b "overflows":
  - use an overflow chain for that bucket b;
  - split bucket b0 (not necessarily b) in round-robin order, using the next-level function to rehash b0, then increment b0.
- Eventually, all buckets (in particular the overflow buckets with their chains) will have been split, and we move to the next level.
- When searching for an item, evaluate the first hash function; then, depending on the position of the bucket with respect to b0, we may need to rehash with the next-level function (two possible locations in that case).
Used in hash indexes (PostgreSQL).
http://infolab.cs.unipi.gr/pre-eclass/courses/db/db-post/readings/Litwin-VLDB80.pdf

Bibliography...
Nice lectures, with theoretical analysis of performance (probing, cuckoo):
https://web.stanford.edu/class/cs166/
Dynamic hashing (extendible and linear hashing):
https://db.inf.uni-tuebingen.de/staticfiles/teaching/ss13/db2/db2-06-1up.pdf
Some surveys:
http://www.cse.fau.edu/~xqzhu/papers/ACS.Chi.2017.Hashing.pdf
http://romania.amazon.com/techon/presentations/PracticalSurveyHashTables_AurelianTutuianu.pdf

Table of contents
Hash tables: Hashing: general properties; Families of hash functions; Strategies to build hash tables; Hashing extensions: Consistent hashing; A key-value store: Redis.

Distributed hash tables: consistent hashing
(See also the related terms "rendezvous hashing" / "Highest Random Weight", and "key ring".)
Goal: avoid redistributing too many keys when a server is added or removed. Keys are assigned to servers based on a random partitioning of the hash space.
The figure assumes the hash space is 0..2^64 (2^64 − 1 would make more sense, actually), with N = 3 nodes n1, n2, n3 and
h0: x ↦ 12, u ↦ 28, s ↦ 30, t ↦ 36.
[Figure: a ring of hash values 0..2^64 split into fragments owned by n1, n2, n3; key t is stored on n1, x and u on n3. After adding node n4, key t is moved to the new node n4.]
Expected: O(k/N) keys moved, with k = #keys and N = #servers.

Consistent hashing: improving repartition
Each server takes charge of multiple fragments on the ring, using virtual copies of the nodes.
[Figure: the same ring with several fragments per node; key t is moved to the new node n4.]
This reduces the variance; taking k ≈ log N copies "guarantees" reasonably balanced loads. (A toy ring in Python is sketched below, after the usage slide.)

Consistent hashing: usage
Teradata (1986), Akamai (1998), Chord (2001), Amazon's Dynamo (2007), Cassandra, Discord...
To combine partitioning and replication, DynamoDB copies the keys of a fragment onto the next x fragments of the ring. They also designed heuristics for building the fragments efficiently.
[https://www.microsoft.com/en-us/research/wp-content/uploads/2017/02/HRW98.pdf]
[https://www.akamai.com/us/en/multimedia/documents/technical-publication/consistent-hashing-and-randomtrees-distributed-caching-protocols-for-relieving-hot-spots-on-the-world-wide-web-technical-publication.pdf]
[https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf]
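A minimal consistent-hashing ring in Python, illustrating the previous slides. The node names, the number of virtual copies and the use of SHA-1 to place points on a 2^64 ring are illustrative choices, not the constructions used by the systems cited above.

import bisect
import hashlib

class ConsistentHashRing:
    """Toy consistent hashing: a key is owned by the first node point found
    clockwise from the key's position on the ring."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes        # virtual copies per node, to reduce variance
        self._points = []           # sorted ring positions
        self._owner = {}            # ring position -> node name
        for n in nodes:
            self.add_node(n)

    @staticmethod
    def _point(s):
        # Map a string to a position in [0, 2**64) using SHA-1.
        return int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")

    def add_node(self, node):
        for i in range(self.vnodes):
            p = self._point(f"{node}#{i}")
            bisect.insort(self._points, p)
            self._owner[p] = node

    def remove_node(self, node):
        for p in [q for q, n in self._owner.items() if n == node]:
            self._points.remove(p)
            del self._owner[p]

    def node_for(self, key):
        p = self._point(key)
        i = bisect.bisect_right(self._points, p)       # first node point after p
        return self._owner[self._points[i % len(self._points)]]   # wrap around

ring = ConsistentHashRing(["n1", "n2", "n3"])
keys = [f"key{i}" for i in range(10000)]
before = {k: ring.node_for(k) for k in keys}
ring.add_node("n4")                                    # add one server
moved = sum(before[k] != ring.node_for(k) for k in keys)
print(moved / len(keys))       # roughly 1/4 of the keys move: O(k/N)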
Consistent hashing for CDN
P: set of pages; C: set of available servers (n servers).
For each page:
- a balanced d-ary tree with n nodes;
- a mapping h : P × {1, ..., n} → C (consistent hashing);
- the root of each tree is mapped by h to the server that stores the page.
For each page request:
1. Browser: picks a random leaf-to-root path and maps this path to a sequence of machines using h. It sends a request to the leaf machine containing: the sequence of machines, the page request, and the browser id.
2. Cache machine: if it is caching the page, it returns it. Otherwise it increments a counter for the page, asks the next machine in the sequence, and returns the page to the browser when it gets it. When the counter for a page exceeds some threshold, the machine caches a copy of the page.
[https://www.akamai.com/us/en/multimedia/documents/technical-publication/consistent-hashing-and-randomtrees-distributed-caching-protocols-for-relieving-hot-spots-on-the-world-wide-web-technical-publication.pdf]

Key-value store
Essentially a simple dictionary (key-value pairs, accessed via the key). Core operations (others depend on the store):
- put(k,v) (SET in Redis)
- get(k)
- delete(k)
Very fast lookups: the data is indexed by key. And, like most NoSQL systems:
- can generally be distributed, (very) scalable;
- interaction through an API rather than a query language;
- schema-on-read (values need not follow the same schema).
Examples: Oracle BerkeleyDB, Riak, Amazon DynamoDB, LevelDB, Memcached, Redis.

Redis (Remote Dictionary Server)
- Key-value store ("in-memory data structure store").
- Open source, written in C; first version in 2009 by Salvatore Sanfilippo [http://antirez.com/latest/0]. Now an open-source project backed by Redis Ltd.
- Applications: cache, real-time storage; possibly also streaming engine, message broker, database.
- In-memory (though it can persist writes in a log or with periodic dumps).
- Used in: ...
- Interaction with a Redis server via a command-line interface, redis-cli, or through clients that have been developed for most common languages, e.g. redis-py for Python.

Redis (open source) features
- Client-server architecture, TCP connection (Redis implements its own event library).
- Mostly single-threaded execution (multithreaded networking from Redis 6.0).
- Features:
  - some transactions (operations can be grouped);
  - replication;
  - sharding;
  - server-side Lua scripts (like stored procedures);
  - PUB/SUB;
  - "tracking" (client-side caching, with the server sending invalidation messages).

Redis: data model
Stores a collection of key-value pairs.
- The key is any binary sequence (String). Typically: not too long (it is more efficient to use a hash of large keys), informative, and following a schema, e.g. "user:1000" or "comment:4321:reply-to".
- The value is a core Redis data type, or an extension provided by a module or UDF (server-side function). Core data types:
  - String (an array of bytes/chars), max size 512 MB by default;
  - containers of strings: Lists (doubly linked, in insertion order), Sets, Sorted sets (strings with a score), Hashes (a hash table);
  - others: streams, geospatial, HLL, bitmaps, bitfields.

Redis: String operations
Most operations are O(1):

> SET user:1 salvatore
OK
> GET user:1
"salvatore"
> SET ticket:27 "\"{'username': 'priya', 'ticket_id': 321}\"" EX 100
> INCRBY views:page:2 10
(integer) 10

[https://redis.io/docs/data-types/strings/]

Redis: Hash operations
Most operations are O(1):

> HSET user:123 username martina firstName Martina lastName Elisa country GB
(integer) 4
> HGET user:123 username
"martina"
> HGETALL user:123
1) "username"
2) "martina"
...
8) "GB"

[https://redis.io/docs/data-types/hashes/]
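The same operations can be issued from Python with the redis-py client mentioned earlier; this snippet assumes a Redis server listening on localhost:6379.

import redis  # pip install redis (the redis-py client mentioned above)

# Assumes a Redis server is running locally on the default port.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# String operations (cf. the SET/GET/INCRBY examples above).
r.set("user:1", "salvatore")
print(r.get("user:1"))                    # "salvatore"
r.set("ticket:27", '{"username": "priya", "ticket_id": 321}', ex=100)
print(r.incrby("views:page:2", 10))       # 10 on a fresh counter

# Hash operations (cf. the HSET/HGET/HGETALL examples above).
r.hset("user:123", mapping={
    "username": "martina", "firstName": "Martina",
    "lastName": "Elisa", "country": "GB",
})
print(r.hget("user:123", "username"))     # "martina"
print(r.hgetall("user:123"))              # all fields and values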
Redis Cluster: sharding
Design priorities: high performance (scales to 1000 nodes) > best-effort write safety > availability.
- Client-server, TCP connection (Redis implements its own event library).
- Cluster nodes are connected to each other (full mesh) through a TCP cluster bus (gossip protocol/heartbeats to discover new nodes and detect failures).
- Redis assigns keys to nodes based on hash slots (fixed hash function; a slot may contain several key-value pairs).
- Each master (and its replicas) is responsible for a range of hash slots.
- No proxies: clients can contact any server (including replicas), and the reply indicates where to redirect the query if needed.
- Asynchronous replication (roughly at the same time as the acknowledgement of a write). The rule for merging conflicts is: last failover wins.
- Replicas are not attached to a single master. If a master fails, a replica replaces it. When a master ends up without replicas, a replica migrates from a master with multiple replicas (the one that has the most replicas) to the orphaned master.
- After a failure/split, the part of the cluster that has a minority of masters eventually stops accepting writes, but writes may be lost in the meantime.

Redis: beyond Redis open source
Modules: many dynamic libraries (modules), mostly proprietary with source available (Redis Source Available License):
- data models (JSON data types, graph database, time series...);
- indexing (secondary indexes, full-text search, Bloom filters, cuckoo filters, spatial indexes);
- AI (neural networks...).
Redis Stack extends Redis with other data models and processing tools (JSON, graph, time series). It bundles the corresponding modules and Redis Insight (a GUI for visualization).
Redis Enterprise: commercial, available on premises or deployed on AWS, GCP, Azure (possibly multi-cloud). Improves scalability, availability, security features, support... Mostly developed by Redis.

References on Redis
- https://redis.io/docs/
- http://b3d.bdpedia.fr/redis.html
- https://sebastien.combefis.be/files/ecam/nosql/ECAM-NoSQL4MIN-Cours2-Slides.pdf
- https://cs.brown.edu/courses/cs227/archives/2011/slides/mar07-redis.pdf (outdated)
