Computer Architecture Midterm - Memory Hierarchy
Summary
This document discusses the memory hierarchy in computer architecture, focusing on the concepts of temporal and spatial locality and their impact on system performance. It also explores different cache memory organizations and replacement algorithms.
Full Transcript
**COM ARCHI LEC** **CHAPTER 4** **The Memory Hierarchy: Locality and Performance**

### **Principle of Locality in Computer Systems**

The **principle of locality**, or **locality of reference**, is a foundational concept in computer architecture and memory management. It reflects the observation that **memory references made by a program during execution tend to cluster**. This principle is critical for optimizing memory hierarchies and system performance.

**KEY CONCEPT OF LOCALITY OF REFERENCE**

- During the course of execution of a program, memory references by the processor, for both instructions and data, tend to cluster.
- **During any interval of time**, some units of memory are more likely to be accessed than others.

**TWO FORMS OF LOCALITY**

1. **Temporal Locality --** the tendency of a program to reference in the near future those units of memory referenced in the recent past.
2. **Spatial Locality --** the tendency of a program to reference units of memory whose addresses are near one another, for example by accessing data locations sequentially.

**Exploiting Locality (filing-clerk analogy)**

- Temporal locality: having just used a document from a file, the clerk will likely need to read or write one of the documents in that file again in the near future.
- Spatial locality: when the clerk retrieves a folder from the filing cabinets, it is likely that in the near future he will need some of the nearby folders as well.

**For cache memory, temporal locality** is traditionally exploited by keeping recently used instruction and data values in cache memory. **Spatial locality** is generally exploited by using larger cache blocks and by incorporating prefetching mechanisms (fetching items of anticipated use) into the cache control logic. Both forms are visible in the sketch below.
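As an illustration (not part of the original lecture; the array name and sizes are arbitrary), here is a minimal C sketch showing both forms of locality: the loop variables exhibit temporal locality, while the sequential array walk exhibits spatial locality.

```c
#include <stdio.h>

#define N 1024

int main(void) {
    static int a[N];

    /* Spatial locality: a[0], a[1], ... occupy consecutive addresses,
       so each cache block fetched on a miss serves several of the
       following accesses (and prefetching can hide the rest). */
    for (int i = 0; i < N; i++)
        a[i] = i;

    /* Temporal locality: sum and i are referenced on every iteration,
       so they remain in registers or in the cache for the whole loop. */
    long sum = 0;
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %ld\n", sum);
    return 0;
}
```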
**Locality is dual:** there is data spatial locality and instruction spatial locality. And, of course, temporal locality exhibits this same dual behavior: data temporal locality and instruction temporal locality.

![](media/image2.png)

**THE MEMORY HIERARCHY**

- The designer would like to use memory technologies that provide for large-capacity memory, both because the capacity is needed and because the cost per bit is low.
- At the same time, to meet performance requirements, the designer would like to use expensive, relatively lower-capacity memories with short access times.

### **Design Principles for the Memory Hierarchy**

**Locality**

1. Programs exhibit **spatial locality** (access data near recently accessed addresses) and **temporal locality** (reuse recently accessed data).
2. Locality allows frequently accessed data to reside in faster, smaller caches, reducing the average access time (see the worked example after this list).

**Inclusion**

1. Ensures that data in a faster, smaller memory level (e.g., L1) also exists in the larger, slower memory levels (e.g., L2, L3, L4).
2. Expressed mathematically as **Mi ⊆ Mi+1**.
3. Data is **copied** between memory levels, not moved, so multiple copies can exist.

**Coherence**

1. Ensures consistency across different levels and among caches shared by multiple cores.
2. Two requirements:
   - **Vertical Coherence**: Changes in a lower-level cache (e.g., L2) must propagate to higher levels (e.g., L3). If a core updates data in its L2 cache, the update must propagate to the shared L3 cache before another core can access the data.
   - **Horizontal Coherence**: Caches at the same level (e.g., the private caches of different cores) must present a consistent view of any data they share.
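As a worked illustration of why locality pays off (the numbers are invented for this example, not from the lecture): suppose a cache hit takes 1 ns, a miss costs an additional 50 ns penalty, and 95% of accesses hit. The average access time is then 1 ns + 0.05 × 50 ns = 3.5 ns, far closer to the speed of the small, fast cache than to that of the large, slow memory behind it. The better the locality, the higher the hit rate and the closer the average gets to pure cache speed.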
**CHAPTER 5**
**CACHE MEMORY**

**CACHE MEMORY PRINCIPLES**

This chapter explains the structure and functioning of cache memory in computer systems, along with the concept of multiple levels of cache (L1, L2, L3) and how data is transferred between cache and main memory.

Here's a breakdown of the key concepts. When the processor attempts to read a word of memory, a check is made to determine whether the word is in the cache. If so, the word is delivered to the processor. If not, a **block of main memory**, consisting of some fixed number of words, is read into the cache and then the word is delivered to the processor.

- **Block** - refers both to the unit of data transferred and to the physical location in main memory or cache.
- **Line** - A portion of cache memory capable of holding one block.
- **Tag** - A portion of a cache line that is used for addressing purposes, as explained subsequently. Each line includes a tag that identifies which particular block is currently being stored.
- **Dirty bit** - A control bit indicating whether the line has been modified since being loaded into the cache.
- **Line Size** - The length of a line, not including tag and control bits.

![](media/image5.png)

![](media/image7.png)

**Replacement Algorithms**

- **LRU --** Least Recently Used
- **FIFO --** First-In-First-Out
- **LFU --** Least Frequently Used
- **Random --** A technique not based on usage (i.e., not LRU, LFU, FIFO, or some variant) is to pick a line at random from among the candidate lines.

**Write Policies**

- **Write Through -** Using this technique, all write operations are made to main memory as well as to the cache, ensuring that main memory is always valid.
- **Write Back -** Minimizes memory writes: updates are made only in the cache. When an update occurs, a **dirty bit**, or **use bit**, associated with the line is set; the block is written back to main memory only when the line is replaced.

**TWO ALTERNATIVES IN THE EVENT OF A WRITE MISS:**

- **Write Allocate -** The block containing the word to be written is fetched from main memory (or next-level cache) into the cache, and the processor proceeds with the write cycle.
- **No Write Allocate -** The block containing the word to be written is modified in main memory and not loaded into the cache.

A minimal sketch contrasting the two write policies follows.
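This C sketch is not from the lecture; the structure name, sizes, and single-line framing are assumptions made for illustration. It shows the mechanics described above: write back touches only the cache and sets the dirty bit, deferring the memory update until eviction, while write through updates main memory on every store, so no dirty bit is needed.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define LINE_WORDS 4        /* words per block; an arbitrary toy size */

int memory[64];             /* toy "main memory" */

/* One cache line: a block of data plus the control state described above. */
struct cache_line {
    bool     valid;
    bool     dirty;         /* set when the line is modified (write back) */
    unsigned tag;           /* identifies which block is currently stored */
    int      data[LINE_WORDS];
};

/* Write back: update only the cache and set the dirty bit.
   Main memory is brought up to date only when the line is evicted. */
void write_back_store(struct cache_line *line, unsigned word, int value) {
    line->data[word] = value;
    line->dirty = true;
}

/* On replacement, a dirty line must first be flushed to main memory. */
void evict(struct cache_line *line) {
    if (line->valid && line->dirty)
        memcpy(&memory[line->tag * LINE_WORDS], line->data, sizeof line->data);
    line->valid = false;
    line->dirty = false;
}

/* Write through, for contrast: every store also goes to main memory,
   so memory is always valid and no dirty bit is needed. */
void write_through_store(struct cache_line *line, unsigned word, int value) {
    line->data[word] = value;
    memory[line->tag * LINE_WORDS + word] = value;
}

int main(void) {
    struct cache_line line = { .valid = true, .tag = 2 };

    write_back_store(&line, 1, 42);   /* memory[9] is stale here... */
    evict(&line);                     /* ...and valid again after the flush */
    printf("memory[9] = %d\n", memory[9]);
    return 0;
}
```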
**POSSIBLE APPROACHES TO CACHE COHERENCY INCLUDE THE FOLLOWING:**

1. **Bus watching with write through -** Each cache controller monitors the address lines to detect write operations to memory by other bus primaries. If another primary writes to a location in shared memory that also resides in the cache memory, the cache controller invalidates that cache entry. This strategy depends on the use of a write-through policy by all cache controllers.
2. **Hardware transparency -** Additional hardware is used to ensure that all updates to main memory via cache are reflected in all caches. Thus, if one processor modifies a word in its cache, this update is written to main memory. In addition, any matching words in other caches are similarly updated.
3. **Noncacheable memory -** Only a portion of main memory is shared by more than one processor, and this is designated as noncacheable. In such a system, all accesses to shared memory are cache misses, because the shared memory is never copied into the cache. The noncacheable memory can be identified using chip-select logic or high-address bits.

**MULTILEVEL CACHES**

**UNIFIED VERSUS SPLIT CACHES**

- For a given cache size, a unified cache has a higher hit rate than split caches because it balances the load between instruction and data fetches automatically. That is, if an execution pattern involves many more instruction fetches than data fetches, the cache will tend to fill up with instructions; if an execution pattern involves relatively more data fetches, the opposite will occur.
- Only one cache needs to be designed and implemented.

**INCLUSIVE POLICY -** dictates that a piece of data in one cache is guaranteed to also be found in all lower levels of caches.

**EXCLUSIVE POLICY -** dictates that a piece of data in one cache is guaranteed not to be found in any lower level of cache.

**NONINCLUSIVE POLICY -** a piece of data in one cache may or may not be found in lower levels of caches.

**INTEL CACHE EVOLUTION**