Computer Architecture - BIC10503 - Memory Organization

Summary

This document provides an overview of computer architecture, focusing on memory organization and cache memory. It explains the concept of cache memory and its importance in computing. The presentation was prepared by the Fakulti Sains Komputer dan Teknologi Maklumat (FSKTM), UTHM.

Full Transcript


BIC10503 Computer Architecture
MEMORY ORGANIZATION
3.4 Cache Memory
Fakulti Sains Komputer dan Teknologi Maklumat (FSKTM), UTHM

What is a cache and why do we need one?
The CPU processes instructions and data. To run efficiently, the CPU needs to access instructions and data quickly, and these are generally stored on the hard drive (HDD), in main memory (RAM), and in cache memory. An HDD is slow, cheap bulk storage. RAM is orders of magnitude faster than an HDD but also more expensive, and cache memory is orders of magnitude faster than RAM but is also the most expensive kind of storage. Cache memory is therefore a compromise between speed and cost.

Cache
- A small amount of fast memory
- Sits between normal main memory and the CPU
- May be located on the CPU chip or module

Cache and Main Memory
[Figure: cache and main memory]

Cache/Main Memory Structure
[Figure: cache/main memory structure]
Tag: identifies which particular block of main memory currently occupies a cache line.

Cache Operation - Overview
- The CPU requests the contents of a memory location
- The cache is checked for this data
- If present, the data is delivered from the cache (fast)
- If not present, the required block is read from main memory into the cache
- The data is then delivered from the cache to the CPU
- The cache includes tags to identify which block of main memory is in each cache slot

Cache Read Operation - Flowchart
[Figure: cache read operation flowchart; RA = receive address]
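The read flow above can be made concrete in code. Below is a minimal, runnable C sketch of the flowchart for a small direct-mapped cache; the sizes (MEM_WORDS, NUM_LINES, BLOCK_WORDS), the array standing in for main memory, and the function name cache_read are illustrative assumptions, not details from the slides.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MEM_WORDS   1024          /* words in the stand-in main memory  */
#define NUM_LINES   8             /* lines in the cache                 */
#define BLOCK_WORDS 4             /* words per block / per cache line   */

static uint32_t main_memory[MEM_WORDS];   /* slow bulk storage          */

typedef struct {
    bool     valid;
    uint32_t tag;                 /* identifies the resident block      */
    uint32_t data[BLOCK_WORDS];
} line_t;

static line_t cache[NUM_LINES];

/* CPU requests the contents of memory location ra (receive address). */
static uint32_t cache_read(uint32_t ra)
{
    uint32_t blk  = ra / BLOCK_WORDS;     /* main-memory block number   */
    uint32_t line = blk % NUM_LINES;      /* the only possible line     */
    uint32_t tag  = blk / NUM_LINES;      /* high-order identifier      */
    line_t *l = &cache[line];

    if (!l->valid || l->tag != tag) {     /* miss: read block to cache  */
        for (int w = 0; w < BLOCK_WORDS; w++)
            l->data[w] = main_memory[blk * BLOCK_WORDS + w];
        l->tag = tag;
        l->valid = true;
    }
    return l->data[ra % BLOCK_WORDS];     /* then deliver from cache    */
}

int main(void)
{
    for (uint32_t i = 0; i < MEM_WORDS; i++) main_memory[i] = i * 10;
    printf("%u\n", (unsigned)cache_read(100));  /* miss, filled, 1000   */
    printf("%u\n", (unsigned)cache_read(101));  /* hit, same block, 1010 */
    return 0;
}
```

As in the flowchart, the block is brought into the cache first and the word is then delivered from the cache, so the second read (address 101, in the same block as 100) hits without touching main memory.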
Typical Cache Organization
[Figure: typical cache organization]
When a cache hit occurs, the data and address buffers are disabled and communication is only between processor and cache, with no system bus traffic. When a cache miss occurs, the desired address is loaded onto the system bus and the data are returned through the data buffer to both the cache and the processor. In other words, on a hit the processor immediately reads or writes the data in the cache line; on a miss the desired word is first read into the cache (the cache allocates an entry and copies in the block from main memory) and the request is then fulfilled from the contents of the cache.

Elements of Cache Design
[Figure: elements of cache design]

Cache Address
Almost all modern processors support virtual memory. Virtual memory is a facility that allows programs to address memory from a logical point of view, without regard to the amount of main memory physically available. It allows a program to treat its memory space as a single contiguous block that may be considerably larger than main memory. A memory management unit (MMU) takes care of the mapping between virtual and physical addresses, translating each virtual address into a physical address in main memory.

Dynamic Relocation
[Figure: dynamic relocation of a user program]

Cache Address
A logical cache, also known as a virtual cache, stores data using virtual addresses. The processor accesses the cache directly, without going through the MMU.
Advantage: cache access is faster than for a physical cache, because the cache can respond before the MMU performs its address translation.
Disadvantage: most virtual memory systems supply each application with the same virtual memory address space, so the same virtual address in two different applications can refer to two different physical addresses, and the cache must be flushed (or extended with address-space identifiers) on each application switch.

Cache Address
A physical cache stores data using main memory physical addresses.

Cache Size
Size does matter. The larger the cache, the larger the number of gates involved in addressing it; as a result, large caches tend to be slightly slower than small ones.
- Cost: more cache is expensive
- Speed: more cache is faster (up to a point), but checking the cache for data takes time

Comparison of Cache Sizes

Processor       | Type                          | Year of Introduction | L1 cache      | L2 cache        | L3 cache
IBM 360/85      | Mainframe                     | 1968                 | 16 to 32 KB   | —               | —
PDP-11/70       | Minicomputer                  | 1975                 | 1 KB          | —               | —
VAX 11/780      | Minicomputer                  | 1978                 | 16 KB         | —               | —
IBM 3033        | Mainframe                     | 1978                 | 64 KB         | —               | —
IBM 3090        | Mainframe                     | 1985                 | 128 to 256 KB | —               | —
Intel 80486     | PC                            | 1989                 | 8 KB          | —               | —
Pentium         | PC                            | 1993                 | 8 KB/8 KB     | 256 to 512 KB   | —
PowerPC 601     | PC                            | 1993                 | 32 KB         | —               | —
PowerPC 620     | PC                            | 1996                 | 32 KB/32 KB   | —               | —
PowerPC G4      | PC/server                     | 1999                 | 32 KB/32 KB   | 256 KB to 1 MB  | 2 MB
IBM S/390 G4    | Mainframe                     | 1997                 | 32 KB         | 256 KB          | 2 MB
IBM S/390 G6    | Mainframe                     | 1999                 | 256 KB        | 8 MB            | —
Pentium 4       | PC/server                     | 2000                 | 8 KB/8 KB     | 256 KB          | —
IBM SP          | High-end server/supercomputer | 2000                 | 64 KB/32 KB   | 8 MB            | —
CRAY MTA        | Supercomputer                 | 2000                 | 8 KB          | 2 MB            | —
Itanium         | PC/server                     | 2001                 | 16 KB/16 KB   | 96 KB           | 4 MB
SGI Origin 2001 | High-end server               | 2001                 | 32 KB/32 KB   | 4 MB            | —
Itanium 2       | PC/server                     | 2002                 | 32 KB         | 256 KB          | 6 MB
IBM POWER5      | High-end server               | 2003                 | 64 KB         | 1.9 MB          | 36 MB
CRAY XD-1       | Supercomputer                 | 2004                 | 64 KB/64 KB   | 1 MB            | —

Mapping Function
The cache must decide where a copy of a particular entry of main memory will go, and to make room for a new entry on a cache miss it generally has to remove one of the existing entries. The fundamental problem with any replacement policy is that it must predict which existing cache entry is least likely to be used in the future. So a mapping function is needed: an algorithm for determining which main memory block currently occupies a cache line. Three techniques are used: direct mapping, associative mapping, and set-associative mapping.

Division of the Main Memory Address
For direct mapping, an (s + w)-bit main memory address is viewed as three fields:

  Tag (s - r bits) | Line (r bits) | Word (w bits)

- Tag (s - r): the unique identifier for the block currently stored in that cache line
- Line (r): identifies the cache line
- Word (w): the least significant bits, which uniquely identify a word within a block of main memory

Direct Mapping
The simplest technique, known as direct mapping, maps each block of main memory into only one possible cache line. The mapping is expressed as

  i = j mod m

where
- i = cache line number
- j = main memory block number
- m = number of lines in the cache

Direct Mapping
A direct-mapped cache provides a specific cache location for any address in main memory, which makes it a very simple, very fast caching algorithm. Since each location in main memory maps to only a single cache location, whenever a collision occurs the data already stored in the cache is simply replaced.

Direct Mapping from Cache to Main Memory
[Figure: direct mapping from cache to main memory]
Each block of main memory maps into one unique line of the cache. Successive blocks of main memory map into the cache in the same fashion; that is, block Bm of main memory maps into line L0 of the cache, block Bm+1 maps into line L1, and so on.

Direct Mapping
Advantages:
- Simple to implement
Disadvantages:
- The direct-mapped cache is the least flexible of the three cache types, so there is a high likelihood of collisions, resulting in significantly slower CPU/cache performance
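As a worked illustration of the address division and the i = j mod m rule, the short C program below splits an address into its tag, line, and word fields with shifts and masks. The field widths chosen here (w = 2, r = 14, so m = 2^14 lines) are arbitrary values for the example, not figures from the slides; note that because m is a power of two, taking j mod m is the same as keeping the low r bits of the block number.

```c
#include <stdint.h>
#include <stdio.h>

#define W_BITS 2            /* word within block                        */
#define R_BITS 14           /* line field: m = 2^14 = 16384 cache lines */

int main(void)
{
    uint32_t addr = 0x00AB1234;   /* an arbitrary (s + w)-bit address   */

    uint32_t word = addr & ((1u << W_BITS) - 1);  /* low w bits         */
    uint32_t j    = addr >> W_BITS;               /* block number       */
    uint32_t line = j & ((1u << R_BITS) - 1);     /* i = j mod m        */
    uint32_t tag  = j >> R_BITS;                  /* remaining s-r bits */

    printf("tag=%u line=%u word=%u\n",
           (unsigned)tag, (unsigned)line, (unsigned)word);
    return 0;
}
```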
Associative Mapping
In a fully associative cache, any location from main memory can be stored in any location within the cache. When the cache is full, a replacement policy is used to determine which row to replace. Replacement policies that could be used include random, least recently used, least frequently used, and first in first out, among many others. The efficiency of a fully associative cache depends largely on the replacement policy that is chosen.

Associative Mapping
- Overcomes the disadvantage of direct mapping: a main memory block can load into any line of the cache
- The memory address is interpreted as a tag and a word
- The tag uniquely identifies a block of memory
- Every line's tag is examined for a match
- Cache searching gets expensive

Associative Mapping from Cache to Main Memory
[Figure: associative mapping from cache to main memory]

Associative Mapping
Advantages:
- Flexibility as to which block to replace when a new block is read into the cache
Disadvantages:
- The complex circuitry required to examine the tags of all cache lines in parallel

Set-Associative Mapping
This type of cache is a hybrid between the direct-mapped cache and the fully associative cache that has proven to be efficient and cost-effective. To check the cache for a hit, the index is first used to find the proper set, just as in a direct-mapped cache. Next, the set is searched and the data returned or, in the event of a cache miss, a replacement policy is used to determine which row of the set to replace, just as in a fully associative cache.

Set-Associative Mapping from Cache to Main Memory
[Figure: set-associative mapping from cache to main memory]
Example address division: Tag (9 bits) | Set (13 bits) | Word (2 bits)

Set-Associative Mapping
Parking lot analogy:
- Suppose we have 1000 parking spots. This time, instead of using a 3-digit number for each parking spot, we use 2 digits, so the spots are numbered 00 up to 99. However, instead of one parking spot per number, we have 10 for each number: there are ten spots numbered 00, ten numbered 01, ..., and ten numbered 99.
- If your parking number is 01, you have up to 10 different spots you can park in, which gives you some flexibility about where to park.

Set-Associative Mapping
Advantages:
- A very functional hybrid; it has some of the flexibility of the fully associative cache when it comes to handling collisions
Disadvantages:
- More complex than a direct-mapped cache, since all the tags in a set must be compared in parallel

Replacement Algorithms
Direct mapping:
- No choice: each block maps to only one line, so replace that line

Replacement Algorithms
Associative and set-associative mapping use hardware-implemented algorithms (for speed), one of which is illustrated in the sketch after this list:
- Least recently used (LRU): replace the block in the set that has been in the cache longest with no reference to it
- First in first out (FIFO): replace the block in the set that has been in the cache longest
- Least frequently used (LFU): replace the block in the set that has experienced the fewest references
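The following C fragment shows one way LRU victim selection might work inside a single set of a 4-way set-associative cache: each way keeps a timestamp of its last reference, and on a miss the way with the oldest timestamp (in the cache longest with no reference to it) is replaced. This is a behavioural sketch under assumed names (WAYS, last_used, set_access), not a description of the hardware circuit.

```c
#include <stdbool.h>
#include <stdint.h>

#define WAYS 4                    /* lines (ways) per set              */

typedef struct {
    bool     valid;
    uint32_t tag;
    uint64_t last_used;           /* "time" of the last reference      */
} way_t;

typedef struct { way_t way[WAYS]; } set_t;

static uint64_t now;              /* global reference counter          */

/* Returns the index of the way holding tag, loading it on a miss. */
int set_access(set_t *s, uint32_t tag)
{
    int victim = 0;
    for (int i = 0; i < WAYS; i++) {
        if (s->way[i].valid && s->way[i].tag == tag) {
            s->way[i].last_used = ++now;   /* hit: refresh LRU stamp   */
            return i;
        }
        if (!s->way[i].valid)              /* prefer an empty way      */
            victim = i;
        else if (s->way[victim].valid &&
                 s->way[i].last_used < s->way[victim].last_used)
            victim = i;                    /* older reference: better victim */
    }
    /* Miss: replace the least recently used (or an empty) way. */
    s->way[victim].valid = true;
    s->way[victim].tag = tag;
    s->way[victim].last_used = ++now;
    return victim;
}

int main(void)
{
    set_t s = {0};
    set_access(&s, 7);   /* miss: fills an empty way                   */
    set_access(&s, 9);   /* miss                                       */
    set_access(&s, 7);   /* hit: tag 7 becomes most recently used      */
    return 0;
}
```

A FIFO policy would differ only in not refreshing last_used on a hit; an LFU policy would count references instead of timestamping them.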
Number of Caches
Multilevel caches:
- High logic density enables caches on the processor chip: faster than bus access, and it frees the bus for other transfers
- It is common to use both on-chip and off-chip caches: L1 on chip, L2 off chip in static RAM
- L2 access is much faster than DRAM or ROM access, and L2 often uses a separate data path
- L2 may now be on chip, resulting in an L3 cache, accessed over the bus or now also on chip

Number of Caches
Unified vs split cache: a unified design uses one cache for both data and instructions; a split design uses two caches, one for data and one for instructions.
Advantages of a unified cache:
- Higher hit rate: it balances the load between instruction and data fetches, and there is only one cache to design and implement
Advantages of a split cache:
- It eliminates cache contention between the instruction fetch/decode unit and the execution unit, which is important in pipelining

Thank You
Q&A
