Cache and Memory Hierarchy


Questions and Answers

Which memory level in the memory hierarchy has the fastest access time?

  • L1 Cache (correct)
  • Main Memory
  • Local Secondary Storage
  • L3 Cache

Remote secondary storage, like web servers, is generally faster and more expensive per byte than main memory.

False

In a cache system, what is the purpose of the 'valid' bit?

To indicate whether the data in the cache line is valid or not.

In a direct-mapped cache, if the tag does not match, the old line is ________.

evicted

Match the cache types with their descriptions:

  • Direct-Mapped Cache = Each memory location maps to a specific location in the cache.
  • Set-Associative Cache = Each memory location can map to one of a set of locations in the cache.
  • Fully Associative Cache = Each memory location can map to any location in the cache.

Which type of cache miss occurs when the cache is initially empty?

Cold (compulsory) miss

In a write-through cache, data is only written to the cache and not to the main memory.

False

What is 'thrashing' in the context of cache memory?

Repeated eviction and replacement of cache contents, severely impacting performance.

The number of sets in a cache is determined by the ________ bits in the memory address.

index

Match the terms related to cache memory with their definitions:

  • Cache Line = A block of memory in the cache.
  • Tag = A portion of the memory address used to identify the cache line.
  • Index = A portion of the memory address used to select the cache set.
  • Offset = A portion of the memory address used to select a byte within the cache line.

What is the primary advantage of using a set-associative cache over a direct-mapped cache?

Reduced conflict misses

Spatial locality refers to the tendency of a program to access the same memory location repeatedly over a short period.

False

Briefly explain the difference between a 'write-through' and a 'write-back' cache.

Write-through: data is written to the cache and main memory simultaneously. Write-back: data is written to the cache, and the update to main memory is deferred until the line is evicted.

In the context of cache writes, ________ means to defer writing to the next level until the line is evicted.

write-back

Match the cache write policies with their descriptions:

  • Write-Through = Data is written to both the cache and main memory simultaneously.
  • Write-Back = Data is written only to the cache, and main memory is updated when the cache line is evicted.
  • Write-Allocate = On a write miss, the cache line is loaded into the cache before being written to.
  • No-Write-Allocate = On a write miss, the data is written directly to main memory without loading it into the cache.

What is the purpose of the Least Recently Used (LRU) replacement policy in a cache?

To replace the least recently used cache line

Increasing the associativity of a cache always reduces the miss rate.

False

Explain how conflict misses can occur even when a cache is not full. Provide an example.

Conflict misses occur because multiple memory locations map to the same cache location, even while other lines sit unused. Example: alternating accesses to addresses 0 and 8 in a small direct-mapped cache where both map to set 0.

In a direct-mapped cache with a block size of 8 bytes, the least significant 3 bits of the address are used as the ________.

block offset

Match the following cache concepts with their impact on cache performance:

  • Temporal Locality = Reduces cache misses by reusing recently accessed data.
  • Spatial Locality = Reduces cache misses by accessing nearby data.
  • Conflict Misses = Increases cache misses due to address collisions.
  • Cache Thrashing = Severely degrades performance due to continuous cache eviction and replacement.

A program exhibits poor spatial locality. Which of the following cache optimizations would be MOST effective in reducing the miss rate?

Increasing the block size

A 'dirty bit' in a cache line is used to indicate that the data in the cache line is inconsistent with main memory and needs to be written back.

True

Explain the concept of cache-friendly code and provide one specific example of how to write code that improves cache performance.

Cache-friendly code maximizes data locality to reduce cache misses. Example: accessing array elements sequentially (stride-1).

When the set of active cache blocks is larger than the cache itself, a ________ miss occurs.

capacity

Match the cache hierarchy levels with their typical characteristics:

  • L1 Cache = Smallest size, fastest access time, per-core.
  • L2 Cache = Larger size, slower access time than L1, per-core.
  • L3 Cache = Largest size, slowest access time among the cache levels, shared by all cores.

In a multi-core processor, what is the main advantage of having a shared L3 cache?

Efficient sharing of data between cores

Write-allocate is beneficial when there are few writes to a particular memory location.

False

What is the significance of 'stride-1 reference patterns' in the context of cache performance?

Stride-1 reference patterns exhibit good spatial locality, leading to fewer cache misses.

In E-way set associative caches, an associative memory is an array of ________ and ________ pairs

key, value

Match the cache access outcomes with their corresponding actions:

  • Cache Hit = Data is retrieved from the cache.
  • Cache Miss = Data is fetched from main memory and placed in the cache.
  • Conflict Miss = A valid cache line is evicted due to an address collision.
  • Capacity Miss = A cache line is evicted because the cache is full.

Consider the following code snippet:

for (int i = 0; i < N; i++) {
  for (int j = 0; j < N; j++) {
    a[j][i] += 1;
  }
}

Assuming a is a large 2D array, what can be done to improve the cache performance of this code?

Swap the inner and outer loops

In the memory hierarchy, CPU registers hold words retrieved from the L2 cache.

False

What is the purpose of cache simulation?

To analyze and predict the performance of a cache design without building the actual hardware.

In a direct-mapped cache, the number of lines per set is always _____.

one

Match the following cache terminology with its corresponding description:

  • Address of word = Composed of tag bits, set index bits, and block offset bits.
  • Set index = Used to locate a specific set within the cache.
  • Block offset = Identifies the location of data within the cache block.
  • Tag = Used to identify the data stored in the cache line.

Consider a direct-mapped cache with 4 sets and a block size of 2 bytes. Given the following memory access sequence: 0, 1, 7, 8. Which accesses result in a cache miss?

0, 7, 8 (accesses 0 and 7 are cold misses; access 1 is a hit because addresses 0 and 1 share the same 2-byte block; access 8 maps to set 0 and evicts the block holding address 0)

Temporal locality in cache memory refers to accessing data elements within close physical proximity in memory.

False

Why is it important to choose the correct cache block size?

The block size affects the hit rate: too small and the benefits of spatial locality are lost; too large and the block is filled with unused data.

To improve the efficiency of matrix multiplication, loops should be arranged to enhance data ________.

locality

Match the cache types with their eviction policies:

  • Direct-Mapped = No choice; the corresponding cache line is always replaced.
  • Set-Associative = LRU, Random, or other policies choose among the lines in the set.
  • Fully Associative = LRU, Random, or other policies can select any cache line for replacement.

Assuming an 8 bit memory address, what is the set index for address 0001 1100 if there are 2 sets and a block size of 16 bytes?

1

A cache that has 2 blocks and 4 sets is how many way set associative?

It doesn't have set associativity

Flashcards

Memory Hierarchy

A hierarchical system that utilizes multiple levels of memory with varying speeds and costs to optimize performance.

CPU Registers

Small, fast storage locations within the CPU used to hold data and instructions for immediate use.

Cache Memory

A small, fast memory component that stores frequently accessed data to reduce the average time to access memory.

Main Memory (DRAM)

The main system memory that the CPU can access directly, typically using DRAM (Dynamic Random-Access Memory).


Local Secondary Storage

Non-volatile storage used to store large amounts of data, slower and cheaper than main memory (local disks).


Remote Secondary Storage

External storage accessed over a network, such as web servers, offering large capacity and accessibility.


Direct-Mapped Cache

A cache organization where each memory location maps to exactly one location in the cache.


Lines per Set (E)

The number of cache lines in each set; higher E means higher associativity.


Cache Block Size (B)

The number of bytes of data stored in each cache line (block).


Valid Bit

A binary value indicating whether a cache line contains valid data.


Tag (Cache)

Portion of the memory address used to identify the cache line storing the data.


Set Index

Portion of the address used to select a specific set within the cache.


Block Offset

Portion of the address used to select a specific byte within the cache block.


Cold (Compulsory) Miss

Occurs on first reference to a data item, cache is empty.


Conflict Miss

Occurs when multiple data objects map to same cache location.


Capacity Miss

Occurs when the set of active blocks is larger than the cache.


Thrashing

A situation where constantly replaced content reduces performance.


Set-Associative Cache

A cache organization where each memory location can map to multiple locations in the cache set.


Replacement Policy

Choosing which cache line to evict when a new block needs to be stored.


LRU (Least Recently Used)

A replacement policy where the least recently used item is evicted first.


Write-Through

Writing the data immediately to the next level.


Write-Back

Defer writing until cache line is evicted.


Write-Allocate

On a write miss, the block is loaded into the cache and the line is updated there.


No-Write-Allocate

On a write miss, the data is written straight to memory without loading the block into the cache.


Temporal Locality

Recently accessed data is likely to be accessed again soon (e.g. repeated references to the same variable).


Spatial Locality

Data near recently accessed locations is likely to be accessed soon (e.g. stride-1 array accesses).


Study Notes

  • Systems Programming: Cache II presented on November 18, 2024.

Caching & The Memory Hierarchy

  • Memory hierarchy consists of multiple levels of storage.
  • Registers sit at L0, followed by L1 cache (SRAM), L2 cache (SRAM), L3 cache (SRAM), main memory (DRAM), local secondary storage (local disks), and remote secondary storage (Web servers).
  • As the levels go up: memory becomes smaller, faster, and more costly per byte.
  • As the levels go down: memory becomes larger, slower, and cheaper per byte.
  • CPU registers hold words retrieved from the L1 cache.
  • L1 cache holds cache lines retrieved from the L2 cache.
  • L2 cache holds cache lines retrieved from the L3 cache.
  • L3 cache holds cache lines retrieved from main memory.
  • Main memory holds disk blocks retrieved from local disks.
  • Local disks hold files retrieved from disks on remote servers.

General Cache Organization: (S, E, B)

  • S represents the number of sets in the cache, where S = 2^s (s = number of set index bits).
  • E represents the number of lines per set; each line holds one block of data plus its tag and valid bit.
  • B represents the number of bytes per cache block, where B = 2^b.
  • Cache size is C = S × E × B data bytes.
  • Each cache line includes a valid bit and a tag.

Cache Read Operation

  • The process involves locating the appropriate set, checking for a matching tag within any line of that set.
  • Hit occurs when a matching tag is found and the line is valid.
  • Data retrieval starts at a specific offset within the cache line.
  • Address is divided into tag bits (t bits), set index bits (s bits), and block offset bits (b bits).

Direct-Mapped Cache (E = 1)

  • Direct-mapped cache has only one line per set.
  • Cache block size is assumed to be 8 bytes.
  • To find the desired set, the address is partitioned into t tag bits, s set index bits, and b block offset bits; the set index bits (e.g. 0...01) select the set, and the block offset (e.g. 100) selects the byte within the block.

Direct-Mapped Cache Operation

  • If the tag matches and the line is valid, it's a hit.
  • The data is retrieved using the block offset.
  • If the tag doesn't match, the old line is evicted and replaced.

Direct-Mapped Cache Simulation Example

  • A direct-mapped cache with (S, E, B, m) = (4, 1, 2, 4) is considered.
  • "Thrashing" happens with repeated eviction and replacement of the contents of a particular set in the cache.
  • Example: an access pattern like addresses 0, 8, 0, 8, 0, and 8 cause this repeated eviction.

Types of Cache Misses

  • Cold (compulsory) miss: This occurs when the cache is empty.
  • Conflict miss: Most caches limit blocks at level k+1 to a small subset of the block positions at level k.
    • It can occur when multiple data objects map to the same level k block.
  • Capacity miss: The set of active cache blocks (working set) is larger than the cache.

Conflict Misses in Direct-Mapped Caches

  • Conflict misses are common in real programs and can cause performance issues.
  • They often occur in direct-mapped caches when programs access arrays whose sizes are powers of 2.
  • Example: In a function dotprod(float x[8], float y[8]), floats are 4 bytes, a block holds 16 bytes, and the cache has 2 sets (total cache size: 32 bytes).
  • If x starts at address 0 and y starts immediately after x at address 32, then x[i] and y[i] map to identical cache sets, which can cause conflict misses.

Programmatic Solutions to Conflict Misses

  • Pad array sizes away from powers of 2 to prevent direct mapping conflicts.
  • Use more lines per set to increase associativity and reduce conflicts.

E-Way Set Associative Cache

  • An E-way set associative cache has E lines per set; the running example uses E = 2 (two lines per set) with a block size of 8 bytes.
  • Associative memory is an array of (key, value) pairs.
  • Keys are a concatenation of Tag & Valid bits.

E-Way Set Associative Cache: Read Operation

  • E = 2 implies two lines per set.
  • Block size is assumed to be 8 bytes.
  • Compare the tag against both lines in the set; if a line is valid and its tag matches, it's a hit.

E-Way Set Associative Cache: No Match

  • On no match, one line in the set is selected for eviction and replacement.
  • A replacement policy is required, e.g. random or LRU (least recently used).

2-Way Set Associative Cache: Simulation

  • A 2-way set associative cache with (S, E, B, m) = (2, 2, 2, 4) is considered.
  • M = 16 bytes of address space, m = 4 bits; B = 2 bytes/block; S = 2 sets.
  • E = 2 blocks/set.

Intel Core i7 Cache Hierarchy

  • L1 i-cache and d-cache: 32 KB, 8-way, Access: 4 cycles
  • L2 unified cache: 256 KB, 8-way, Access: 10 cycles.
  • L3 unified cache: 8 MB, 16-way, Access: 40-75 cycles
  • Block size: 64 bytes for all caches.

Writing Cache Friendly Code

  • Make the common case go fast by focusing on the inner loop.
  • Minimize the cache misses in the inner loop.
  • Repeated references to variables creates good temporal locality.
  • Stride-1 reference patterns creates good spatial locality.
  • Key Idea: Qualitative notion of locality is quantified through our understanding of the cache.

Cache Write Strategies

  • Strategies are needed to keep the multiple copies of data in L1, L2, possibly L3, and main memory consistent.
  • Write-through: Write immediately to next level.
  • Write-back: Defer write to the next level until line is evicted.
    • Must track which cache lines are modified (dirty bit).
  • What to do on a write miss?
    • Write-allocate: Load into cache, update line in cache.
      • Works if more writes to location follow.
    • No-write-allocate: Writes straight to memory as such data is not loaded into cache.
  • Typical policy pairings:
    • Write-through + No-write-allocate.
    • Write-back + Write-allocate.

Rearranging Loops for Locality

  • Matrix multiplication is the canonical example: multiply two N × N matrices whose elements are doubles.
    • O(N³) total operations; N reads per source element.
    • N values summed per destination element.
