Summary

This document explains checksums and how they are used to ensure data integrity in distributed systems. It outlines how a checksum is calculated and stored with the data, and how it is verified on retrieval to detect corruption. It also names the cryptographic hash functions (MD5, SHA-1, SHA-256, SHA-512) used to calculate checksums.

Full Transcript

117 Checksum (New)

Let's learn about checksum and its usage.

Background

In a distributed system, while moving data between components, it is possible that the data fetched from a node may arrive corrupted. This corruption can occur because of faults in a storage device, network, software, etc.

How can a distributed system ensure data integrity, so that the client receives an error instead of corrupt data?

Solution

Calculate a checksum and store it with data.

To calculate a checksum, a cryptographic hash function like MD5, SHA-1, SHA-256, or SHA-512 is used. The hash function takes the input data and produces a string (containing letters and numbers) of fixed length; this string is called the checksum.

When a system is storing some data, it computes a checksum of the data and stores the checksum with the data.

When a client retrieves data, it verifies that the data it received from the server matches the checksum stored. If not, then the client can opt to retrieve that data from another replica.
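
The store-and-verify flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the design of any particular system: it assumes SHA-256 from the standard `hashlib` module, models each replica as an in-memory dictionary, and the names `compute_checksum`, `store`, and `read_verified` are hypothetical helpers introduced for this example.

```python
import hashlib


def compute_checksum(data: bytes) -> str:
    """Return the SHA-256 digest of data as a fixed-length hex string."""
    return hashlib.sha256(data).hexdigest()


def store(replica: dict, key: str, data: bytes) -> None:
    """Write path: save the data together with its checksum.

    The replica is a plain dict here (an assumption for illustration);
    a real system would write to disk or to a remote node.
    """
    replica[key] = (data, compute_checksum(data))


def read_verified(replicas: list, key: str) -> bytes:
    """Read path: return the first copy whose checksum still matches.

    Each replica is tried in order. A copy that is missing or whose
    recomputed checksum differs from the stored one is skipped, so the
    caller gets an error instead of corrupt data.
    """
    for replica in replicas:
        entry = replica.get(key)
        if entry is None:
            continue
        data, stored_checksum = entry
        if compute_checksum(data) == stored_checksum:
            return data
    raise IOError(f"no replica holds an uncorrupted copy of {key!r}")
```

For example, if the same block is stored on two replicas and the bytes on the first one are later altered, `read_verified([first, second], key)` skips the first copy and returns the intact data from the second. If every copy fails verification, the call raises an error rather than returning corrupt data.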
