Modern Symmetric Encryption Methods - PDF
Document Details
Uploaded by GenialTuba
University of Nottingham
Tags
Summary
This document provides an overview of modern symmetric encryption methods, explaining how principles like XOR and substitutions work in modern computer systems. It covers both basic operations and more advanced aspects, like substitution boxes. The document also indicates how this type of encryption relates to real-world applications.
Full Transcript
Modern Symmetric Encryption Methods How are the principles of symmetric encryption applied in modern computer systems? Basic operations XOR, substitutions, permutations Modern Encryption Methods So far we have looked at classical encryption methods because they are simple and easier to u...
Modern Symmetric Encryption Methods How are the principles of symmetric encryption applied in modern computer systems? Basic operations XOR, substitutions, permutations Modern Encryption Methods So far we have looked at classical encryption methods because they are simple and easier to understand. The principles behind these methods are used as the basis for modern symmetrical encryption methods used in computing and electronic communications. All of the classical encryption methods (including the Enigma machine) are based on a substitution operation One basic unit of data (a letter) is substituted with another according to some rules (E) and the key (K E). Each substitution is a 1:1 mapping between the possible characters in pt and ct The substitution can be fixed (mono-alphabetic) or change over time (poly-alphabetic) Letter/byte substitutions for digital data - XOR XOR is the equivalent of Caesar’s cipher for digital systems In digital systems, the basic unit of data is a bit, with the bits usually grouped into bytes. The most common operation to combine the bits/bytes of plaintext and key is XOR: Character h e l l o ASCII Code (Hex) 0x68 0x65 0x6C 0x6C 0x6F ASCII Code (Binary) 0 1 1 0 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 0 1 1 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 Key (20 0s, 20 1s) 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ct = XOR (pt, key) 1 0 0 1 0 1 1 1 1 0 0 1 1 0 1 0 1 0 0 1 1 1 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 Here both the message (“hello”) and key are 5 bytes or 40bits long The operation is easily reversed by a second XOR: ct 1 0 0 1 0 1 1 1 1 0 0 1 1 0 1 0 1 0 0 1 1 1 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 Key 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 pt = XOR(ct, key) 0 1 1 0 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 0 1 1 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 A simple XOR cipher is extremely vulnerable to a known-plaintext attack, since Key = XOR(ct, pt) pt 1 0 0 1 0 1 1 1 1 0 0 1 1 0 1 0 1 0 0 1 1 1 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 ct 0 1 1 0 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 0 1 1 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 key = XOR(ct,pt) 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Because XOR alone is weak, it is usually combined with other operations. Substitution Boxes (S-Boxes) Federal Information Processing Standards (FI Publication 197, 2001 November 26 “Announcing the ADVANCED ENCRYPTION STANDARD (AES)” Perform a substitution on smaller units of data (bytes): they take one byte and substitute it with another. Could use a function (e.g. ), but usually use lookup tables to avoid a linear relationship between input and output bytes. This helps protect against linear cryptanalysis. S-Boxes can be fixed – they do not depend on the key (e.g DES and AES), or they can be generated from the key before encryption begins (e.g. Blowfish and Twofish algorithms) S-Boxes usually operate on single bytes to minimise the memory required to store the lookup table. 1byte S-Box – 256 possible values; Second digit of hexadecimal byte 8byte S-Box – 18,446,744,073,709,551,616 possible values! 0 1 2 3 4 5 6 7 8 9 a b c d e f 0 63 7c 77 7b f2 6b 6f c5 30 1 67 2b fe d7 ab 76 The 1-byte S-Boxes are used in parallel to encrypt more than 1 byte at 1a ca time. 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 c0 First digit of hexadecimal byte 2 b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15 3 4 c7 23 c3 18 96 5 9a 7 12 80 e2 eb 27 b2 75 4 9 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84 5 53 d1 0 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf h e l l o 6 d0 ef aa fb 43 4d 33 85 45 f9 2 7f 50 3c 9f a8 7 51 a3 40 8f 92 9d 38 f5 bc b6 da 21 10 ff f3 d2 8 cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73 S-Box S-Box S-Box S-Box S-Box 9 60 81 4f dc 22 2a 90 88 46 ee b8 14 de 5e 0b db a e0 32 3a 0a 49 6 24 5c c2 d3 ac 62 91 95 e4 79 b e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 8 ct ct ct ct ct c ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a d 70 3e b5 66 48 3 f6 0e 61 35 57 b9 86 c1 1d 9e 0 1 processed 2 by S-Boxes 3 4 byte e e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df 5 Bytes of data – each f 8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16 is fed through a single S box in parallel, all S- Boxes are the same. S-Box from the AES algorithm – The byte 0x68 (ASCII ‘h’) would be substituted with 0x45 Permutation boxes (P-Boxes) S-Boxes alone are vulnerable because they have effectively split the encryption into multiple smaller, independent 8-bit encryption processes Breaking these simpler processes is easier that breaking one large process. S-Boxes are usually used in conjunction with P-Boxes that mix the results of the separate substitutions. P-Boxes aim to distribute the outputs of each S-Box in one level, across the inputs of all S-Boxes at the next level. h e l l o h e l l o S-Box S-Box S-Box S-Box S-Box P-Box ct ct ct ct ct S-Box S-Box S-Box S-Box S-Box 0 5 Bytes 1 processed of data 2 by S-Boxes 3 – each4 byte is fed through a single S box in parallel P-Box ct ct ct ct ct P-Boxes 0 Mix the 1 inputs and 2 outputs 3 bits of S-Boxes 4 to ensure changes to any input bit could affect any (and many) output bits Block Ciphers Efficient encryption for bulk data Block Encryption Algorithms In block encryption algorithms, a number of the operations described previously are combined to form an algorithm. This algorithm accepts a “block” of data of fixed size– typically 64, 128, 192 or 256 bits long in real systems. If the message is longer than the block size it is split into multiple blocks, each of which is encrypted separately. 23 bytes of pt in Blocks shorter than the block size are padded. 5 byte/40 bit 2 bytes of blocks padding to make Block 1 Block 2 Block 3 Block 4 Block 5 5 full blocks Each block of plaintext passes 40 bit key through The keysize is encryption Block Encryption usually the same algorithm in turn as, or similar to, to give a Algorithm the blocksize. corresponding block of ciphertext 25 bytes of ct Block 1 Block 2 Block 3 Block 4 Block 5 Example: A 40-bit block encryption process Example Block Encryption Plaintext Block Algorithm - AES Sub-Bytes S- Boxes Shift Rows Round 1 Types of Mix P-Box Columns AES – Advanced Encryption Standard, XOR with Round Round key 1 currently in widespread use. Key Sub-Bytes S- Developed as a result of a competition Boxes around year 2000. Shift Rows Types of Round 2 Block size of 128bits, key size of 128, 192 Mix P-Box Columns or 256 bits XOR with Round Uses rounds – the basic encryption step Round key 2 Key applied multiple times Initial key is expanded into many Round Keys Key For AES there are 10, 12, or 14 rounds Expansio Sub-Bytes S- depending on key size n Boxes Round key n Each round consist of Shift Rows substitution, permutation and Key Types of Round n Mix P-Box XOR steps Columns XOR with Round Decryption is the same but More information about AES in: Federal Information Processing Key uses “inverse” S-Boxes and Standards (FIPS) Publication 197, 2001 November 26 “Announcing the ADVANCED ENCRYPTION STANDARD (AES)” Ciphertext (Available online) Block Example Block Encryption Plaintext Block Algorithm - AES Sub-Bytes S- Boxes Shift Rows Round 1 Types of Mix P-Box Columns The algorithm (like other block algorithms) XOR with Round Round key 1 is designed to produce a ciphertext block Key that appears to be completely random Sub-Bytes S- data Boxes No obvious patterns, just looks like noise. Shift Rows Types of Round 2 Mix P-Box Substitution steps are supposed to Columns eliminate any obvious relationship XOR with Round between pt and ct Round key 2 Key Called confusion effect Key Permutation steps try to ensure that even Expansio very small changes in a pt block (1 bit), Sub-Bytes S- n Boxes Round key n result in a completely different ct block Shift Rows Called diffusion effect Key Types of Round n Mix P-Box Columns Goal is to make cryptanalysis impossible XOR with Round and limit attacks to brute force. Key Ciphertext Block Block ciphers – A potential problem? What is the obvious problem with block encryption? (Think mono-alphabetic!) Block 1 Block 2 Block 3 Block 4 Block 5 40 bit key Block Encryption Algorithm Block 1 Block 2 Block 3 Block 4 Block 5 If pt is much longer than the blocksize (typically 8-32 bytes), many blocks of data are being encrypted with the same EK. This is exactly the same problem seen in classic mono-alphabetic ciphers. Patterns in pt can, and will, appear in ct – even if the encryption algorithm E is highly secure. A good E (block encryption algorithm) can do little on its own about obscuring patterns seen across pt blocks: Two identical pt blocks will always result in identical ct blocks because each block is encrypted using exactly the same process so patterns in pt blocks appear in ct blocks. Block Ciphers - The ECB Penguin There is a quite famous example of this – the ECB Penguin. ECB = Electronic Code Book, refers to way in which the block cipher (AES) is applied The 128-bit AES encryption process (considered secure and used today for https://) is applied to a bitmap image in ECB mode. When the encrypted image is opened, patterns in pt clearly appear in ct. It is possible to tell what the pt was, despite the fact it has been encrypted using a secure algorithm. The problem is the way the algorithm has been applied. To avoid this problem, there are other modes of operation that describe how block ciphers can be used securely. These modes were originally defined for the DES algorithm in 1980. Block Ciphers – Modes of Operation (ECB) Electronic Code Book (ECB) ptblock1 ptblock2 ptblock3 ptblock4 ctblock1 ctblock2 ctblock3 ctblock4 Ke y Block Block Block Block Block Block Block Block Encryptio Encryptio Encryptio Encryptio Decryptio Decryptio Decryptio Decryptio n n n n n n n n Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm ctblock1 ctblock2 ctblock3 ctblock4 ptblock1 ptblock2 ptblock3 ptblock4 Each block of pt is passed through the block encryption process in turn Decryption is straightforward – pass each block of ct back the decryption process in turn. Block Ciphers – Modes of Operation (ECB) Electronic Code Book (ECB) 128-bit AES 192-bit AES Advantages Very simple to implement and understand Blocks encrypted/decrypted independently, allows encryption and decryption to be parallelised on multi-core / multi-processor systems 256-bit Any block of ct can be decrypted without needing to first AES decrypt other blocks (good if you only want to decrypt part of ct – e.g drive encryption) A corrupt block of ct will only result in 1 corrupt block of pt (no error propagation) pt patterns appear in ct – in this Disadvantages encrypted image text is still legible. Just Very insecure when applied to data consisting of multiple increasing the key size setting doesn’t necessarily make things better! blocks, not used for multiple blocks of data for this reason Block Ciphers – Modes of Operation (CBC) = bitwise XOR of blocks Cipher Block Chaining (CBC) ptblock1 ptblock2 ptblock3 ptblock4 ctblock1 ctblock2 ctblock3 ctblock4 IV Ke y Block Block Block Block Block Block Block Block Encryptio Encryptio Encryptio Encryptio Decryptio Decryptio Decryptio Decryptio n n n n n n n n Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm ctblock1 ctblock2 ctblock3 ctblock4 ptblock1 ptblock2 ptblock3 ptblock4 The output (ct) of the preceding block (effectively a random number) is XOR’d with the pt of the current block. For the first block an initialisation vector (IV) is needed as there is no preceding block. During decryption, the output (pt) of the current block must be XOR’d with the input (ct) of the Block Ciphers – Modes of Operation (CBC) Cipher Block Chaining (CBC) Advantages Decryption is still parallelisable on multicore systems Any block of ct can be decrypted independently without needing to first decrypt another block (good if you only want to decrypt part of plaintext in an application like hard-drive encryption) Initialisation vector only affects first block, doesn’t form essential part of key. A corrupt block of ct will only affect the current, and next block of decrypted pt (limited error propagation) 128-bit Disadvantages AES Encryption relies on each block being encrypted in turn – cannot be parallelised, impossible to make changes to individual ct blocks without re-encrypting subsequent blocks Encrypted image looks like noise Initialisation vector only affects first block, doesn’t form essential part of (random data) key. The Cipher Feedback (CFB) mode of operation is very similar to CBC, but the Block Ciphers – Modes of Operation (CTR) = bitwise XOR of blocks Counter (CTR) ptblock1 ptblock2 ptblock3 ptblock4 ctblock1 ctblock2 ctblock3 ctblock4 IV Counte r Ke y Block Block Block Block Block Block Block Block Encryptio Encryptio Encryptio Encryptio Decryptio Decryptio Decryptio Decryptio n n n n n n n n Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm Algorithm ctblock1 ctblock2 ctblock3 ctblock4 ptblock1 ptblock2 ptblock3 ptblock4 The counter and initialisation vector are used to generate a non-random key sequence for each block The counter can produce any sequence, but the both sender and receiver need counters that produce identical sequences This initial key sequence is encrypted using the block algorithm to give a ‘random’ key, the random key is XOR’d with pt to give ct. Block Ciphers – Modes of Operation (CTR) Counter (CTR) Advantages Blocks encrypted/decrypted independently, allows encryption and decryption to be parallelised on multi-core / multi-processor systems Any block of ct can be decrypted without needing to first decrypt another block. Any block of ct can be re-encrypted without needing to modify other blocks. A corrupt block of ct will only result in 1 corrupt block of pt (no 128-bit error propagation) AES Disadvantages Uses weak (XOR) encryption and relies completely on the block Encrypted image looks like noise (again encryption algorithm producing a pseudorandom key sequence random data) from a non-random counter input A missing block of ct will cause a loss of synchronisation. Further reading (Modes of Operation) NIST Special Publication 800-38A: Recommendation for Block, 2001 Edition Cipher Modes of Operation: Methods and Techniques, Morris Dworkin, https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38a.pdf Demonstration – Application of AES using OpenSSL OpenSSL for encryption How were previous encrypted images generated? Format of a Bitmap Encrypting bitmap data using OpenSSL Interpreting encrypted data as an image Applications of Block Ciphers Encryption: Bulk data encryption, e.g. disk encryption Microsoft’s Bitlocker uses AES in either CBC or XTS (a variant of CTR optimised for disk encryption) mode Apple’s FileVault – FileVault2 uses AES in XTS mode Internet communications – e.g. https:// (usually) uses AES for encrypting the data sent when you visit a secure website. Other: Random number generators (e.g. using CTR mode) Hash functions for message validation/authentication (see following) https://blogs.technet.microsoft.com/dubaisec/2016/03/04/bitlocker-aes-xts-new-encryptio e/ Stream Ciphers Encryption for data streams, e.g. communications Stream Ciphers CTR block cipher operating mode is a type of stream cipher. Uses the block algorithm to generate a pseudorandom key for each block. Because key is effectively random, simple XOR encryption can be used at each stage Even if you know pt for a block and so can determine the key for that block, it doesn’t help you decrypt any other block. A disadvantage of using block ciphers in this way is that they require the data to be partitioned into (relatively) large blocks: Requires padding if message are not a multiple of block size What about very short messages? Could be inefficient due to message/padding ratio. What about intermittent/real-time data streams? Cannot transmit until have a complete block of data, or need to constantly transmit padding. Stream ciphers work in a similar way but at a bit (or sometimes byte) level Think of pt as a continuous stream of bits Generate a key as a continuous stream of bits (must be a pseudorandom stream) XOR pt and key streams as each new bit of pt becomes available Synchronised Stream Ciphers Input Data pt stream Key ct stream PRBNG Key stream PRBNG – Pseudo-random Binary Number Generator transmitted Key stream must appear to be random to observer, no predictable patterns, data no repetition Both sender and receiver must be able to generate an identical key stream. PRBNG initialised using a common key Need to keep streams synchronised Electronic equivalent of Enigma machine Bitwise equivalent of CTR mode block cipher Received ct Key stream Output Data Key Decrypted pt PRBNG strea Secure Wifi Wifi obviously even more vulnerable to packet capture than a wired network – you don’t need to be connected to the network or even in the building to be able to intercept data Encryption is essential Wifi uses stream cipher style algorithms for encryption, originally WEP (Wired Equivalent Privacy) Used from 1997 - 2003 64bit (WPA-40) or 128bit (WPA-104) keys WPA-40 key consists of a shared key (40bits) and a random initialisation vector, IV (24 bits). The shared key is your Wifi password The IV is not secret, it just a number agreed by access point and computer for each “conversation” to try and avoid the exact same key being used for all transmissions (to try to limit cryptanalysis). The 64 bit key is then used as the key for a pseudorandom number generator 40-bit Wifi Key PRBNG Key XOR 24-bit IV (RC4) Message Bytes Example – RC4 and WEP RC4 is an example of a Synchronised Stream Cipher that was widely used (including for WEP secure Wifi) RC4 generates a pseudorandom byte sequence – the keystream which is then XOR’d with the message Its internal state is an array of 256 bytes (values 0-255) The starting order of these bytes is determined by the key One byte is then selected from the array to be the next 8 bits in the keystream The byte order then changes according to the algorithm. For RC4 to be secure, it should be impossible to predict the internal state, starting state or next keystream byte – even if you have managed to obtain all previous bytes in the keystream. Source: https://commons.wikimedia.org/wiki/ File:RC4.svg RC4 has been shown to be insecure The keystream it generates is not “random enough” There are correlations between early bytes in the key stream and properties of the key. Example – RC4 and WEP It was used for the WEP secure wi-fi protocol and its non-randomness led to problems! One example: The ciphertext can be seen as it is transmitted across the network The plaintext for certain early bytes sent is known (e.g. standard header bytes) From this, the early keystream bytes can easily be determined (since ks = ct XOR pt) If the keystream was pseudorandom, this wouldn’t be a problem since it would tell an attacker nothing about the internal state of RC4, the key, or future bytes in the stream. Because the keystream is non-random and related to the key, information about the key can be obtained. If enough ct data is obtained, the entire WEP key can be discovered passively. There were other problems too but most stemmed from the fact that the keystream was not random enough or that the same/similar keys were used repeatedly. A non-random or repeating key stream allows analysis of the ciphertext that can reveal information about the key / keystream Idea is the same as the basic cryptanalysis we looked at earlier For more info see: Practical attacks against WEP and WPA, Martin Beck and Erik Tews, 2008 Practical examples at: https://www.aircrack-ng.org/doku.php? id=simple_wep_crack Secure Wifi - WPA WEP no longer used, replaced with: WPA (2003-), WPA2 (2004-) WPA3(2018-) Additional features to ensure message integrity (message authentication codes, see later slides) WPA2 onwards uses AES in various counter-like modes to generate keysteam, WPA used a modified RC4 cipher called TKIP Both use a “master key”, from which transient keys are generated This means key used for cipher is continually changing Other advances in authentication methods for WPA3 Personal / Pre-shared key methods (Wifi Password) Enterprise – Authenticate user using credentials (e.g. when you log into Eduroam), known as EAP Various attacks have been demonstrated, including: KRACK – Key Reinstallation Attack – essentially tricks the cipher into reusing a known key - https://www.krackattacks.com/ Side-channel (analysis of algorithm timing) https://ieeexplore.ieee.org/document/9152782 Details of WPA not covered in this module, but if you are interested then this website Also much less sophisticated attacks, like someone pretending to be a seems to give a relatively clear public wifi network overview: https://networklessons.com/cis These attacks are beyond scope of this module but the important point is co/ccnp-encor-350-401/introdu ction-to-wpa-key-hierarchy that Wifi encryption/authentication methods are not necessarily secure Important to use other methods to encrypt communications – https, etc Notice that both WEP and CSS used a 40-bit cipher Example – DVD Encryption (CSS) This is because the US used to restrict export of any product or software containing encryption algorithms with keys > 40bits DVDs were equipped with a stream-cipher based encryption system Intent was to only allow “legal” players to play discs The disc key is encrypted 40-bit cipher with a number of possible player keys (~400) Each player key has some test data – known data encrypted with that key When the disc is inserted, the Each DVD player The data (video) player tests its keys against on is programmed on each disc is the disc by attempting to with a number encrypted with a decrypt the test data of keys disc key which is unique to that If the player is allowed to play movie the disc, one if its keys will More detail here if interested: match https://insecure.org/news/ Example – DVD Encryption (CSS) CSS wasn’t particularly effective 240 keys isn’t a huge number, brute forcing was possible Using cryptanalysis, only 225 attempts needed to be tried to find all valid keys – even easier Was quickly compromised by Open Source community To make DVD playback software/hardware you needed to buy a key from DVD CCA and adhere to restrictions Open source software didn’t have legal keys DeCSS software developed (at least partially) by Norwegian teenager – implemented a reverse engineered version of CSS algorithm and valid keys Keys did not come from brute force Someone realised that commercial DVD software (XingDVD) contained legal DVD keys that were stored unprotected, these were just copied Resulted in various legal threats/challenges to prevent its distribution and prosecute creators Modern open-source DVD players (e.g. VLC player) use libdvdcss This uses brute-forced keys to play any DVD Possibly not legal, but is tolerated Message Authentication Cryptographic Hashes Used to verify message authenticity Hash functions take a message and generate a highly condensed/shortened and unique ciphertext – a “digital fingerprint” A sender can hash the message they want to send to and transmit the hash alongside the message The receiver applies the same hash function to a message and compares the hash they generated to the one embedded. If they are the same, the message has not been corrupted during transmission Messag Received Sender e message Message Receiver Hash Hash Could be plaintext or Functio ciphertext message Functio n n Compar Received hash e message ok? Cryptographic Hashes 1 way functions Easy to generate signature from plaintext message Impossible to generate/guess plaintext from fingerprint Very similar messages should give very different fingerprints Should be practically impossible to find two messages with the same signature – this outcome is called a collision Common hash functions include: i.e. at least one these has SHA1, MD4, MD5 (old, considered insecure) been shown to be untrue SHA2 (SHA-256, SHA-384, SHA-512) – used in https:// Useful to verify that data hasn’t been corrupted, but an attacker could alter the message and the signature as the hash function will be known. Used in conjunction with public key encryption methods, see later Example: 6246E760 F0600030 471605E9 52007758 C974FF5B 88E74EB6 526DB1BD 8CC1AE03 is the SHA-256 hash of all the text in the above bullet points. It is a 256 bit (64 hexadecimal digit) number The SHA-256 hash of an executable / word document / etc would be the same length Cryptographic hashes are used for password storage Passwords never stored as plaintext If the system is compromised, all passwords will be revealed Hash functions are used to store passwords Create a hash of the password and store this When user enters password, hash their input and compare this with the saved hash Hashing algorithms for password storage are designed to be slow! They don’t have to hash large files Slower algorithms make brute-force attempts to reveal passwords more time- consuming Common algorithms include Bcrypt (based on the Blowfish block cipher) and Scrypt Password Storage and Salts For password storage, a salt (some random but not secret data) is appended to each password before hashing Example: There are 100 different users that have chosen the same password across 10 different websites If all websites use the same password hashing algorithm (not unlikely as there aren’t that many) then all of these passwords result in the same hash An attacker could build a lookup table containing the hash of all common passwords Known as a rainbow table If any website’s server is compromised, the attacker can just scan any password hashes found on the server for any that match hashes in the rainbow table. Password hashing is reversed instantly for these passwords A salt adds some additional information to the password before hashing to make each password hash unique E.g. add the first three letters of the user’s username and the first three letters of the day they registered Could also be a random number that is also stored alongside the password hash This information is known so easy to also add it to the password user enters to login but it Brute-force and other methods for obtaining passwords There is quite an interesting recent article on Microsoft’s website about methods used by attackers to obtain passwords In summary: The most common methods for obtaining passwords are from reused passwords – where 1 person uses the same password across multiple websites but one of these website is weakly protected/untrustworthy and leaks the password; or phishing where you are tricked into giving up your password If an attacker does try to guess passwords, then most of the time they will try very few passwords (e.g. the 2 most common, 5 most common) across many user accounts, rather than trying a full range of passwords for a single account (to improve the guess – success ratio) To break the hashing process described on the previous page, first the attacker has to actually gain access to the system to download the hashed password database. This is difficult and is what limits this type of attack, if the database is downloaded then cryptocurrency mining equipment can be used to break the hashing very quickly using modern cryptocurrency mining equipment: A $20,000 cryptocurrency mining system (as of July 2019) can try 100 Billion passwords per second if the passwords are hashed with SHA256. Lists are available with 500 Million passwords known to be in use, the article estimates this covers about 70% of all passwords in use. If the website’s password database is obtained, even of the passwords and salted and hashed, the plaintext for a password can be obtained in 5ms for 70% of passwords in use! It’s worth noting that this is probably just one person’s opinion on the topic and some of the statistics aren’t fully justified but it is still quite interesting. https://techcommunity.microsoft.com/t5/Azure-Active-Directory-Identity/Your-Pa-word-doesn-t-matter/ba-p/731984, July 2019 Message Authentication Codes MAC functions generate signatures from a pt or ct message, but also take a key as a parameter. Messag Received Sender e message Message Receiver Could be plaintext or ciphertext message MAC MAC key Functio Functio key n n Compar Received hash e message ok? An attacker who has intercepted and modified the message will not be able to generate a new, valid MAC if they don’t know the key MAC functions are usually modified hash functions that combine the key and message during the hashing process – e.g HMAC-SHA256 uses SHA256 hashing function to hash both key and plaintext As with other symmetric methods, both sender and receiver must know the key HashCash Hashing to reduce spam Concept from 1997... : Sending spam is easy, making it harder to send large volumes of emails might reduce spam Add some “cost” (time) to sending an email Normal person sending a few emails, no major effect Server sending millions of spam emails – heavy penalty HashCash is a proof-of-work method You have to prove that you did some work to have your email accepted. HashCash version Principle Date (26/06/2003) Generate a HashCash string for each email you send: Email recipient 0:030626:[email protected]:6470e06d773e05a8 A number The final entry in this string is called a “nonce”, this must be chosen so that the SHA1 hash of the HashCash string is valid Nonce – “A number used once”, a number that happens to be the solution to this particular problem Valid means that the first 20-bits of the hash are zeroes The only way to find a valid nonce is trial and error, the idea is that finding a nonce that will take some time, this was supposed to be around 1s on computers of the time – each email costs 1s to send String attached to email headers, receiver can hash string and check first 20 bits to see if it is valid (very fast process) Reality Didn’t take off, not really used, however http://www.hashcash.org/ https://web.mit.edu/outland/arch/i386_deb50/build/hashcash-1.21/sha1- It is seen as the origins of Bitcoin…. hashcash.html http://www.sha1-online.com/ Hashing in Block Chains / Cryptocurrency Blockchains Transaction How do you create an electronic record (called a leger) of transactions/events that: s Can be trusted # of prev. Can never be modified or corrupted (immutable) block Is not controlled by any one person or entity (decentralised) Transaction Blockchains aim to do this s Sequence of transactions, recorded on the internet, as a permanent and trustworthy record. # of prev. Key principles: block Lots of copies of the leger exist If someone manages to corrupt one then then the consensus of the others will highlight this. Transaction It is impossible to destroy all copies. You cannot break the chain s Every transaction is linked to all previous transactions # of prev. You cannot go back and change one transaction, this would be immediately detectable and invalidate your copy of the record block If one transaction changes, all subsequent transactions need to be recomputed to avoid invalidating the chain This would need to be done across at least 50% of the legers to achieve consensus Transaction The immutable property is achieved using hashes s Each block of transactions contains a hash of the previous block (which itself includes a hash of the # of prev. previous, previous block, and so on) block If you change a transaction block, you change its hash, the next block in your chain is no longer valid Bitcoin Transaction Bitcoin – the transactions in each block represent transfer of bitcoin tokens from one s account to another # of prev. To make a transaction, it is submitted to the network, validated, combined into a block and added to the end of the chain block Anyone can volunteer to generate a new block recording recent transactions Transaction Each new block grants Bitcoin to the person who made the block (the miner) The first transaction in each block generates “Coinbase” which is transferred to the creator s Provides incentive for people to do the work required to operate the network # of prev. Multiple miners will attempt to generate a valid block, different answers might temporarily fork the chain block Consensus is used to confirm a valid direction New blocks will only be added to valid chains – erroneous blocks sit in dead-ends Transaction The entire transaction history for a Bitcoin can be found by reading the ledger. This can confirm ownership. s Verifying a transaction occurred might only involve reading one block – the previous transaction for that coin. # of prev. The validity of this block is confirmed by following the hash-chains backwards (does this block genuinely follow on from past block history) and forwards (if blocks were added afterwards, this is part of the active chain and not a dead fork) Transaction Adding a block is intentionally difficult: Naturally restricts the amount of currency in circulation (control supply/demand and therefore value) s The more time-consuming it is to generate/regenerate blocks, the harder it is to corrupt a leger while making it appear valid # of prev. block Controlling block creation difficulty is done with a proof-of-work https://stateofsecurity.com/bitcoin-proof-of-work-51-of-accountants-agree/ algorithm https://bitcoin.org/bitcoin.pdf Bitcoin – Proof-of-work Transaction Transaction Inspired by HashCash – associate a cost with block s s # of prev. # of prev. creation block block As with HashCash: nonce Transaction Each block contains a nonce (number used to modify the block’s hash) s The SHA-256 hash of a block must be less than a certain number # of prev. This is the difficulty target, smaller the target, harder the problem block The difficulty target is reduced over time to control the effort required to create a block – e.g. in response to improving hardware or just to limit supply Transaction s The history of each Bitcoin can be extracted from the ledger, # of prev. including who it currently belongs to (an account) block Who the coin currently belongs to is determined by following a chain of transactions in the leger Transaction A transaction is only processed if the account owner provides a digital s signature approving the amount # of prev. Digital signatures using public key encryption methods to verify that a certain block person has sent a message – more on this next… Bitcoin – Proof-of-work https://www.blockchain.com/explorer/charts/difficulty Difficulty target modified every 2016 blocks, aim to take ~10minutes to compute hash Bitcoin – Block Structure & Transactions We will come back to the transactions part later…. https://www.blockchain.com/explorer/blocks/btc