University of Glasgow CSC1104 Lecture 3: Internal & External Memory PDF

Singapore
CSC1104 - COMPUTER ORGANISATION & ARCHITECTURE
LECTURE 3: INTERNAL MEMORY AND EXTERNAL MEMORY
Assoc. Prof. Cao Qi [email protected]
School of Computing Science

Acknowledgement
Main contents of CSC1104 - Computer Organisation and Architecture are derived from:
Computer Organization and Architecture: Designing for Performance. Author: William Stallings. Publisher: Pearson. Acknowledgement to: Author and Publisher.
Computer Organization and Design: The Hardware/Software Interface. Authors: D. Patterson and J. Hennessy. Publisher: Morgan Kaufmann. Acknowledgement to: Authors and Publisher.

Lecture Contents
Internal Memory: Semiconductor Main Memory; Memory Organization; Error Correction Code (ECC).
External Memory: Solid State Drives (SSD); Hard Disk Drive (HDD).

Internal Memory

Semiconductor Memory Types
The most common memory is random-access memory (RAM).
It supports both reading data from memory and writing new data into it rapidly.
Volatile: data is lost if there is no power supply.

Read-Only Memory (ROM) and Programmable ROM (PROM)
ROM: data can be read, but not changed or newly written.
Non-volatile: no power is required to maintain the bit values in memory.
PROM: a less expensive alternative to ROM.
Other read-mostly memories: erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory.
Flash memory: erased electrically. Blocks of memory can be erased, but there is no byte-level erasure. Achieves high density.

Semiconductor Main Memory
Common properties of semiconductor memory cells:
They exhibit two stable (or semi-stable) states, which represent binary 1 and 0.
They are capable of being written into, to set the state.
They are capable of being read, to sense the cell's state.
A memory cell has 3 functional terminals (Select, Control, Data in/Sense).
[Figure: memory cell operation. Write: Select selects this cell, Control indicates write, Data in sets the cell state to 1 or 0. Read: Select selects this cell, Control indicates read, Sense outputs the cell's state.]

Dynamic RAM (DRAM)
RAM is divided into dynamic RAM (DRAM) and static RAM (SRAM).
DRAM is made with cells that store data as charge on capacitors.
The presence or absence of electric charge in a capacitor is interpreted as binary 1 or 0.
Refresh operation: capacitors have a natural tendency to discharge, so DRAM requires its charges to be refreshed periodically to maintain the stored data.
That is why it is called dynamic, as opposed to the static storage in an SRAM cell.

SRAM versus DRAM
Both are volatile: power must be continuously supplied to preserve the bit values.
DRAM (dynamic cell): simpler to build and smaller. Denser (smaller cells = more cells per unit area). Less expensive (only 1 transistor per cell). Requires refresh circuitry. Used for main memory.
SRAM (static cell): faster. More expensive (6 transistors per cell). Used for cache memory.

Typical Organization of a 16-Mibit DRAM
DRAMs require a refresh operation, performed row by row. Each row must be refreshed periodically. A refresh counter steps through the row values one by one.
Control signals: RAS: row address select. CAS: column address select. WE: write enable. OE: output enable.
Organization of this 16-Mibit DRAM:
Memory array: 2048 × 2048 × 4 bits = 2^11 × 2^11 × 4 = 16 Mibit, with w = 2^11 = 2,048 rows and columns.
Number of row address bits = log2(w) = 11.
Number of column address bits = log2(w) = 11.
Address bus = 11 bits (row and column addresses are multiplexed on the same lines).
Data bus = 4 bits.

DRAM Memory Cycle Time
Access time: the time from RAS being pulled low (the address is presented to the DRAM) to CAS being pulled high (the data is available for use): from t1 to t2.
Recharge time: the time to recharge all DRAM cells before they can be accessed again: from t2 to t3.
Memory cycle time = access time + recharge time.
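The organization arithmetic and the cycle-time relation above can be checked with a short Python sketch. The function names are hypothetical, chosen for illustration only, and the 60 ns and 40 ns timings are assumed values taken from the worked example that follows in the lecture, not properties of any particular device.

```python
def address_bits(lines: int) -> int:
    """Bits needed to select one of `lines` rows or columns (lines >= 2)."""
    return (lines - 1).bit_length()


def array_capacity_bits(rows: int, cols: int, width: int) -> int:
    """Total capacity of a rows x cols array with `width` bits per location."""
    return rows * cols * width


def memory_cycle_ns(access_ns: float, recharge_ns: float) -> float:
    """Memory cycle time = access time + recharge time."""
    return access_ns + recharge_ns


def max_data_rate_bps(cycle_ns: float, width_bits: int = 1) -> float:
    """Peak data rate in bits per second: `width_bits` bits per memory cycle."""
    return width_bits * 1e9 / cycle_ns


# 16-Mibit DRAM organized as 2048 x 2048 x 4 bits
print(address_bits(2048))                             # row (or column) address bits
print(array_capacity_bits(2048, 2048, 4) // 2**20)    # capacity in Mibit

# Assumed timings: 60 ns access time, 40 ns recharge time
cycle = memory_cycle_ns(60, 40)
print(cycle, max_data_rate_bps(cycle), max_data_rate_bps(cycle, 32))
```

The data rate is computed as bits per cycle divided by the cycle time, so widening the memory system to 32 bits scales the rate by 32 without changing the cycle time.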
[Figure: DRAM read waveform with the access time and the recharge time marked.]

Example 3.1: DRAM Memory Cycle Time
A DRAM read operation waveform is shown below. The access time is from t1 to t2. The recharge time is from t2 to t3.
[Figure: DRAM read waveform; access time (t1 to t2) = 60 ns, recharge time (t2 to t3) = 40 ns.]
a) What is the memory cycle time? What is the maximum data rate of this DRAM, assuming a 1-bit output?
b) What is the data transfer rate if a 32-bit memory system is constructed using these cells in parallel?
Solution:
a) Memory cycle time = access time + recharge time = 60 ns + 40 ns = 100 ns. 1 bit of data needs 100 ns, thus the maximum data rate = 1 / (100 × 10^-9) = 10 Mb/s.
b) 32-bit data needs 32 such DRAM cells connected in parallel. Thus 32 bits of data need 100 ns. The data rate = 32 / (100 × 10^-9) = 320 Mb/s = 40 MB/s.

Example 3.2: DRAM Refresh Time
A memory is built from 64 Ki × 1 DRAM cells. According to the data sheet, the DRAM cell array is organized into 256 rows. Each row must be refreshed at least once every 4 ms.
a) What is the time interval between successive refresh requests for two consecutive rows?
b) How long a refresh address counter do we need?
Solution:
a) There are 256 rows in total, and all rows need to be refreshed at least once in 4 ms. For row-by-row refreshing, the interval between the refreshes of two consecutive rows is: 4 ms / 256 = 4 × 10^-3 / 256 = 15.625 × 10^-6 s = 15.625 μs.
b) The refresh counter needs to address each row one by one. For 256 rows, an 8-bit refresh counter is needed to generate 256 unique addresses, as 2^8 = 256.

Error Correction
Memory errors are categorized as hard failures and soft errors.
Hard failure: a permanent physical defect. The affected memory cell or cells cannot reliably store data; they become stuck at 0 or 1, or switch erratically between 0 and 1. Causes: harsh environmental abuse, manufacturing defects, wear.
Soft error: a random, non-destructive event that alters the contents of one or more memory cells, with no permanent damage to the memory. Causes: power supply problems, alpha particles.

Error-Correcting Code (ECC) Function
[Figure: error-correcting code function.]
When M bits of data are written to memory, a K-bit code is calculated from the data, old K = f(M), and stored together with the data (M + K bits in total).
When the word is read, a new K-bit code is calculated from the M data bits, new K = f(M), and compared with the old K-bit code read from memory.
The comparison has three possible outcomes:
No errors are detected.
An error is detected, and it is possible to correct it: the corrector produces the corrected M bits of data.
An error is detected, but it cannot be corrected: the error is reported.

Hamming Code: Simplest Error-Correcting Code
Hamming code for the 4-bit word "1110" (M = 4): draw 3 intersecting circles, giving 7 compartments.
The 4 data bits are assigned to the inner compartments.
The remaining compartments hold parity bits, chosen so that the total number of 1s in any circle is even.
If an error changes 1 data bit, circle A and circle C have an odd number of 1s, but circle B does not: the error is detected.
The error can be corrected by changing the bit in the compartment shared by circles A and C only.

Syndrome Word of Hamming Code
The two K-bit check codes (stored and recalculated) are compared bit by bit by taking their exclusive-OR. The result is called the syndrome word. (x XOR y = x̄·y + x·ȳ.)
If a bit of the syndrome word = 0, there is no error in that check. If a bit of the syndrome word = 1, there is an error.
A K-bit syndrome word takes values in the range 0 to 2^K - 1, and its value identifies which bit is in error.
To correct a single-bit error in M data bits, the number of check bits K must satisfy: 2^K - 1 ≥ M + K.
E.g., calculate the number of check bits K for 8 data bits (M = 8):
If K = 3: 2^K - 1 = 7; M + K = 11, so 2^K - 1 < M + K.
If K = 4: 2^K - 1 = 15; M + K = 12, so 2^K - 1 ≥ M + K.
Hence K = 4 check bits are needed to correct a single-bit error in 8-bit data.

Example 3.3: Length of Hamming Code Check Bits
How many check bits are needed if the Hamming error correction code is used to detect single-bit errors in a 512-bit and a 1024-bit data word? What is the size overhead caused by the check bits?
Solution:
For 512-bit data (2^9 = 512, and M = 512),
if K = 9, 2^K - 1 = 511; M + K = 512 + 9 = 521, so 2^K - 1 < M + K.
if K = 10, 2^K - 1 = 1023; M + K = 512 + 10 = 522, so 2^K - 1 ≥ M + K.
Hence K = 10: 10 check bits are needed to check errors in 512-bit data.
For 1024-bit data (2^10 = 1024, and M = 1024),
if K = 10, 2^K - 1 = 1023; M + K = 1024 + 10 = 1034, so 2^K - 1 < M + K.
if K = 11, 2^K - 1 = 2047; M + K = 1024 + 11 = 1035, so 2^K - 1 ≥ M + K.
Hence K = 11: 11 check bits are needed to check errors in 1024-bit data.
For 512-bit data, the overhead is (512 + 10) / 512 × 100% - 100% = 1.95%.
For 1024-bit data, the overhead is (1024 + 11) / 1024 × 100% - 100% = 1.07%.

Layout of Data Bits and Check Bits
[Figure: layout of data bits and check bits in a 12-bit word, with position numbers written in binary (weights 2^3, 2^2, 2^1, 2^0).]
Bit positions 12 down to 1 hold: D8 D7 D6 D5 C8 D4 D3 D2 C4 D1 C2 C1.
Check bits are at the positions whose numbers are powers of 2: 2^0 (C1), 2^1 (C2), 2^2 (C4), 2^3 (C8).
Each check bit covers the data bits whose position number contains a 1 in the same bit position as the check bit's own position number.
C1 checks data bits with a 1 in the least significant bit of the position number: 0011 (3), 0101 (5), 0111 (7), 1001 (9), 1011 (11). Thus, C1 = D1⊕D2⊕D4⊕D5⊕D7.
C2 checks data bits with a 1 in the 2nd least significant bit: 0011 (3), 0110 (6), 0111 (7), 1010 (10), 1011 (11). Thus, C2 = D1⊕D3⊕D4⊕D6⊕D7.
C4 checks data bits with a 1 in the 2nd most significant bit: 0101 (5), 0110 (6), 0111 (7), 1100 (12). Thus, C4 = D2⊕D3⊕D4⊕D8.
C8 checks data bits with a 1 in the most significant bit: 1001 (9), 1010 (10), 1011 (11), 1100 (12). Thus, C8 = D5⊕D6⊕D7⊕D8.

Example 3.4: Hamming Code Check for 8-Bit Data
An 8-bit data word is "00111001", with data bit D1 in the rightmost position. Suppose that data bit D3 sustains an error and is changed from 0 to 1. Show how the Hamming code finds this error.
Solution:
Calculate the check bits from the correct data "00111001":
C1 = D1⊕D2⊕D4⊕D5⊕D7 = 1⊕0⊕1⊕1⊕0 = 1.
C2 = D1⊕D3⊕D4⊕D6⊕D7 = 1⊕0⊕1⊕1⊕0 = 1.
C4 = D2⊕D3⊕D4⊕D8 = 0⊕0⊕1⊕0 = 1.
C8 = D5⊕D6⊕D7⊕D8 = 1⊕1⊕0⊕0 = 0.
The data received with the error is "00111101". The recalculated check bits are:
C’1 = D’1⊕D’2⊕D’4⊕D’5⊕D’7 = 1⊕0⊕1⊕1⊕0 = 1.
C’2 = D’1⊕D’3⊕D’4⊕D’6⊕D’7 = 1⊕1⊕1⊕1⊕0 = 0.
C’4 = D’2⊕D’3⊕D’4⊕D’8 = 0⊕1⊕1⊕0 = 0.
C’8 = D’5⊕D’6⊕D’7⊕D’8 = 1⊕1⊕0⊕0 = 0.
The syndrome bits are the exclusive-OR of the stored and recalculated check bits:
C”8 = C8⊕C’8 = 0.
C”4 = C4⊕C’4 = 1.
C”2 = C2⊕C’2 = 1.
C”1 = C1⊕C’1 = 0.
Thus the syndrome word C”8 C”4 C”2 C”1 = 0110 indicates that bit position 6 is in error, which holds D3.

Illustration for Single-Error-Correcting (SEC) Code
[Figure: the 12-bit layout with position numbers in binary (weights 2^3, 2^2, 2^1, 2^0), showing D3 changed from 0 to 1 and the resulting syndrome word 0110.]

Single-Error-Correcting, Double-Error-Detecting (SEC-DED) Code
A single-error-correcting (SEC) code can correct a single error.
More commonly, semiconductor memory is equipped with a single-error-correcting, double-error-detecting (SEC-DED) code.
SEC-DED codes require one additional bit compared with SEC codes.

Hamming SEC-DED Code
[Figure: Hamming SEC-DED code on a Venn diagram. The 4 data bits "0110" (M = 4) are assigned to the inner compartments, with even parity bits. Two errors occur. The parity check seems to locate an error in one bit, and "correcting" it changes that bit, but the error is still detected because 1 more parity bit is used.]

SEC-DED Code
An error-correcting code enhances the reliability of the memory at the cost of added complexity.
With a 1-bit-per-chip organization, an SEC-DED code is generally considered adequate.
E.g., using an 8-bit SEC-DED code for each 64 bits of data in main memory makes the main memory about 12% larger (overhead): (64 + 8) / 64 × 100% - 100% = 12.5%.
Using a 7-bit SEC-DED code for each 32 bits of memory gives a 22% overhead: (32 + 7) / 32 × 100% - 100% = 21.9%.

Synchronous DRAM (SDRAM)
SDRAM and DDR-SDRAM currently dominate the market.
Unlike asynchronous DRAM, SDRAM exchanges data with the CPU synchronized to a clock signal.
It runs at the full speed of the data bus without wait states.
With synchronous access, SDRAM moves data in and out under control of the system clock.
The CPU issues an instruction and an address, which are latched by the SDRAM.
The SDRAM then responds after a set number of clock cycles (latency).

Example Waveforms of Synchronous DRAM (SDRAM)
[Figure: synchronous DRAM waveform (with clock), showing the CAS latency, compared with an asynchronous DRAM waveform (no clock).]

Example Waveform of SDRAM
SDRAM uses a burst mode to eliminate the address setup time and the row and column line precharge time after the first access.
In burst mode, a series of data can be clocked out rapidly once the 1st data item has been accessed. (Burst length: 1, 2, 4, 8, full page.)
SDRAM performs best when transferring large blocks of data sequentially, such as video and audio files.
[Figure: SDRAM burst read waveform, showing the CAS latency.]

Double Data-Rate SDRAM (DDR DRAM)
DDR DRAM: data transfer is synchronized to both the rising and the falling edge of the clock, rather than just the rising edge. This doubles the data rate.
[Figure: clock signal with SDRAM and DDR SDRAM data transfers.]

External Memory

Solid State Drives (SSD)
An SSD is a memory device made of semiconductor electronic circuitry, used as a replacement for a hard disk drive.
SSDs vs. HDDs:
High performance: more input/output operations per second.
Durability: less susceptible to physical shock and vibration.
Longer lifespan: SSDs are not susceptible to mechanical wear.
Lower power consumption: SSDs use considerably less power.
Quieter and cooler: smaller dimensions, lower energy costs.
Lower access times and latency: over 10 times faster than the spinning disks in an HDD.
But an HDD is cheaper per GB and offers larger storage capacity.

Practical Issues of SSD
1) Performance tends to slow down with use.
Flash memory is accessed in blocks, with a typical block size of 512 KB.
Files become fragmented over time, with pages scattered over multiple blocks.
The more space is occupied, the more fragmentation there is.
Writing a new file into multiple blocks becomes slower.
2) Flash memory becomes unusable after a certain number of writes; a typical flash cell lifetime is 100,000 writes.
Most flash devices can estimate their own remaining lifetime, so the system can anticipate failure and take preemptive action.

Magnetic Disk
A disk is a circular platter constructed of a nonmagnetic substrate, coated with a magnetizable material.
Data on the platter surface is organized in a concentric set of rings, called tracks.
There are thousands of tracks per surface.
There are hundreds of sectors per track, with a size of 512 bytes per sector.
Adjacent tracks are separated by intertrack gaps.
Adjacent sectors are separated by intersector gaps.

Nonvolatile RAM within the Memory Hierarchy
SRAM: rapid access time, but the most expensive and the lowest bit density. Suitable for cache memory.
DRAM: cheaper, denser, and slower than SRAM. Suitable for off-chip main memory.
Flash memory and SSD.
Hard disk: very high bit density and very low cost per bit, with relatively slow access times. Suitable for external storage.

Next Lecture
Lecture 4: Input/Output and Operating System
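
As a closing sketch for the error-correction part of this lecture, the check-bit count of Example 3.3 and the syndrome calculation of Example 3.4 can be reproduced in a few lines of Python. The function names are hypothetical and chosen for illustration. The sketch assumes that the stored check bits are themselves error-free and that at most one data bit is in error; it is not a full SEC-DED implementation.

```python
def check_bits_needed(m: int) -> int:
    """Smallest K such that 2^K - 1 >= M + K."""
    k = 1
    while 2 ** k - 1 < m + k:
        k += 1
    return k


def data_positions(m: int) -> list[int]:
    """Positions (1-based) that hold D1..Dm: every position that is not a power of 2."""
    k = check_bits_needed(m)
    return [p for p in range(1, m + k + 1) if (p & (p - 1)) != 0]


def check_bits(data: list[int]) -> dict[int, int]:
    """Map each check-bit position (1, 2, 4, ...) to its even-parity value.

    data[0] is D1, data[1] is D2, and so on.
    """
    positions = data_positions(len(data))
    checks = {}
    for i in range(check_bits_needed(len(data))):
        c = 1 << i
        parity = 0
        for pos, bit in zip(positions, data):
            if pos & c:          # this data position is covered by check bit c
                parity ^= bit
        checks[c] = parity
    return checks


def syndrome(stored: dict[int, int], received: list[int]) -> int:
    """Syndrome as a position number: 0 means no error detected."""
    recomputed = check_bits(received)
    return sum(c for c in stored if stored[c] != recomputed[c])


def bits_d1_first(word: str) -> list[int]:
    """Convert a bit string with D1 in the rightmost position to [D1, D2, ...]."""
    return [int(b) for b in reversed(word)]


stored = check_bits(bits_d1_first("00111001"))      # check bits of the correct word
position = syndrome(stored, bits_d1_first("00111101"))
print(position, "-> D%d" % (data_positions(8).index(position) + 1))
```

The syndrome is returned as an integer position number rather than as a bit string, because summing the positions of the mismatching check bits gives the position of the faulty bit directly.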
