(10pts) You are evaluating three cache designs for the instruction cache.
The three designs are: direct mapped with one instruction per block,
direct mapped with 4 instructions per block, and 2-way set associative
with 4 instructions per block. The instructions being executed are:
Address  Instruction
=======  ============================
4000     Loop: beq $s0, $zero, Exit   # imm = 6, offset to Exit
4004           add $t0, $s0, $s2      # compute read address
4008           add $t1, $s0, $s3      # compute write address
4012           lw  $t2, 0($t0)        # read data
4016           sw  $t2, 0($t1)        # write data
4020           sub $s0, $s0, $s1      # subtract offset
4024           j   Loop               # imm = 1000 which is 4000/4
4028     Exit:
The layout for each cache will follow in the individual questions for
each cache. To determine where to store the instruction in the cache,
convert the address into a 32 bit binary number. Ignore the lower 2
bits of the memory address (since each instruction is 4 bytes long and
the address is a multiple of 4, the lower 2 bits will always be 00).
Use the remaining 30 bits to calculate the cache row address and tag
for each cache design.
- Direct mapped cache with 1 instruction per block
The cache row address and tag for this cache design will be calculated
as follows:
 31 ... 6|5 ... 2|1 0
----------------------
|  Tag   |  Row  |0 0| instruction address
----------------------
                  /|\
                   |
           ignore these bits
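As a rough illustration of the bit layout above, the following Python
sketch splits an address into its tag and row fields. The function name
split_direct_mapped is an assumption made for this example and is not
part of the assignment.

```python
def split_direct_mapped(addr: int) -> tuple[int, int]:
    """Return (tag, row) for the 16-row, 1-instruction-per-block layout.

    Hypothetical helper for illustration only.
    """
    row = (addr >> 2) & 0xF  # address[5:2]; the shift drops the two 00 bits
    tag = addr >> 6          # address[31:6]
    return tag, row


tag, row = split_direct_mapped(4000)  # address of the first instruction
print(f"tag={tag} row={row}")         # prints: tag=62 row=8
```

Here 4000 is 111110 1000 00 in binary, so the tag is 111110 (62) and the
row is 1000 (8).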
Fill in the cache and state how many cache misses this design has,
assuming the code starts executing at the Loop: label and that it
executes for two iterations.
Direct Mapped Cache - 1 instruction per block
Row (4 bits) | Valid | Tag (26 bits) | Data (1 instruction) |
-------------------------------------------------------------
0000 (0)     |       |               |                      |
0001 (1)     |       |               |                      |
0010 (2)     |       |               |                      |
0011 (3)     |       |               |                      |
0100 (4)     |       |               |                      |
0101 (5)     |       |               |                      |
0110 (6)     |       |               |                      |
0111 (7)     |       |               |                      |
1000 (8)     |       |               |                      |
1001 (9)     |       |               |                      |
1010 (10)    |       |               |                      |
1011 (11)    |       |               |                      |
1100 (12)    |       |               |                      |
1101 (13)    |       |               |                      |
1110 (14)    |       |               |                      |
1111 (15)    |       |               |                      |
- Direct mapped cache with 4 instructions per block
When the cache has more than one instruction per block, the cache row
address is now divided into a row number and word offset as follows:
4 instructions per row, 2 bit word offset = address[3:2]
row number is 2 bits = address[5:4]
tag is 26 bits = address[31:6]
 31 ... 6|5 4|3 2|1 0
----------------------
|  Tag   |Row|   |0 0| address
----------------------
              /|\
               |
          word offset
If there is a cache miss on one instruction in a row, all 4 instructions
for the row are pulled into the cache (i.e. all instructions that have
the same upper 28 bits as the instruction that caused the cache miss).
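The field split and the block fill described above can be sketched in
Python as follows. The helper names split_four_word_block and
block_addresses are assumptions made for this example, not part of the
assignment.

```python
def split_four_word_block(addr: int) -> tuple[int, int, int]:
    """Return (tag, row, word_offset) for the 4-instruction-per-block layout.

    Hypothetical helper for illustration only.
    """
    word_offset = (addr >> 2) & 0b11  # address[3:2]
    row = (addr >> 4) & 0b11          # address[5:4]
    tag = addr >> 6                   # address[31:6]
    return tag, row, word_offset


def block_addresses(addr: int) -> list[int]:
    """List the 4 instruction addresses sharing addr's upper 28 bits."""
    base = addr & ~0xF                # clear address[3:0]
    return [base + 4 * i for i in range(4)]


print(split_four_word_block(4000))  # prints: (62, 2, 0)
print(block_addresses(4004))        # prints: [4000, 4004, 4008, 4012]
```

A miss on the instruction at 4004 would therefore bring in the
instructions at 4000 through 4012 together.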
Fill in the cache and state how many cache misses this design has,
assuming the code starts executing at the Loop: label and that it
executes for two iterations.
Direct Mapped Cache - 4 instructions per block
Row (2 bits) | Valid | Tag (26 bits) | Data (4 instructions, 2 bit word offset)  |
             |       |               | Word 00  | Word 01  | Word 10  | Word 11  |
----------------------------------------------------------------------------------
00 (0)       |       |               |          |          |          |          |
01 (1)       |       |               |          |          |          |          |
10 (2)       |       |               |          |          |          |          |
11 (3)       |       |               |          |          |          |          |
- 2-way set associative cache with 4 instructions per block
The main difference between the 2-way set associative cache with 4
instructions per block and the direct mapped cache with 4 instructions
per block is that each row of the 2-way set associative cache contains
2 blocks instead of 1 block. The blocks are unrelated except for the
fact that they map to the same cache row address. If the cache is to
hold the same number of instructions as the direct mapped cache, then
the number of rows must be divided by 2 (since it is 2-way associative).
The other alternative is to double the number of instructions the cache
can hold. We'll use this alternative for this assignment and let this
cache hold 32 instructions instead of just 16.
Since the cache can now hold 32 instructions, the cache row address,
word offset and tag calculations are exactly the same as for the direct
mapped cache with 4 instructions per block. As with that design, if
there is a cache miss on one instruction in a row, all 4 instructions
for the block are pulled into the cache.
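To illustrate how one row can hold two unrelated blocks, here is a
minimal Python sketch of a single 2-way row. The assignment does not
state a replacement policy, so the least-recently-used (LRU) eviction
used here is an assumption, as are the class name TwoWaySet and the tag
values in the usage lines.

```python
class TwoWaySet:
    """One cache row holding up to two blocks, identified by their tags.

    Hypothetical sketch; LRU replacement is an assumption, since the
    assignment text does not specify a policy.
    """

    def __init__(self) -> None:
        self.tags: list[int] = []  # least recently used first, at most 2

    def access(self, tag: int) -> bool:
        """Return True on a hit, False on a miss (filling the block)."""
        if tag in self.tags:
            self.tags.remove(tag)  # move the hit tag to the
            self.tags.append(tag)  # most-recently-used position
            return True
        if len(self.tags) == 2:
            self.tags.pop(0)       # evict the least recently used block
        self.tags.append(tag)
        return False


row = TwoWaySet()
print(row.access(5))  # prints: False (miss, block with tag 5 filled)
print(row.access(9))  # prints: False (miss, second block filled)
print(row.access(5))  # prints: True  (both tags coexist in the row)
```

In a direct mapped cache the second access would have replaced the
first block, so the third access would have missed.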
Fill in the cache and state how many cache misses this design has,
assuming the code starts executing at the Loop: label and that it
executes for two iterations.
2-way Associative Cache - 4 instructions per block
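Row (2 bits) | Valid | Tag (26 bits) | Data (4 instructions, 2 bit word offset)  |
             |       |               | Word 00  | Word 01  | Word 10  | Word 11  |
----------------------------------------------------------------------------------
00 (0)       |       |               |          |          |          |          |
             |       |               |          |          |          |          |
----------------------------------------------------------------------------------
01 (1)       |       |               |          |          |          |          |
             |       |               |          |          |          |          |
----------------------------------------------------------------------------------
10 (2)       |       |               |          |          |          |          |
             |       |               |          |          |          |          |
----------------------------------------------------------------------------------
11 (3)       |       |               |          |          |          |          |
             |       |               |          |          |          |          |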