You are evaluating three possible cache designs: direct mapped with 1 word per row, direct mapped with 4 words per row, and 2-way set associative with 4 words per row. The cache will be an instruction cache, so the data field will contain the instruction. You are evaluating the performance of the candidate caches on the following loop from Lab 7:

    Memory Address  Label  Instruction
    ==============  =====  ====================
    4000            Loop:  beq $s0, $zero, Exit   # imm = 6, offset to Exit
    4004                   add $t0, $s0, $s2      # compute read address
    4008                   add $t1, $s0, $s3      # compute write address
    4012                   lw  $t2, 0($t0)        # read data
    4016                   sw  $t2, 0($t1)        # write data
    4020                   sub $s0, $s0, $s1      # subtract offset
    4024                   j Loop                 # imm = 1000 which is 4000/4
    4028            Exit:

Count how many cache misses there are for each cache on the first iteration of this loop.
The cache layouts are at the end of this lab. Print them to use as a worksheet to determine the cache misses for each cache. To determine where to store the instruction in the cache, convert the memory address to a 32-bit binary number. The two least significant bits are ignored in determining the row and tag, as they indicate a specific byte (8 bits) within the 32-bit instruction. The next x least significant bits are the row number (where x is indicated in the cache diagrams below). The remainder of the address is the tag. Here's a picture of the breakdown:

    row number is 5 bits = address[6:2]
    tag is (32 - 2 - 5) = 25 bits = address[31:7]

     31 ... 7|6 ... 2|1 0
    ----------------------
    |  Tag   |  Row  |   |   address
    ----------------------
                      /|\
                       |
                       byte offset, ignore

When the cache has more than one word per row, the row number is now divided into a row number and word offset as follows:

    4 words per row, 2 bit word offset = address[3:2]
    row number is 3 bits = address[6:4]
    tag is (32 - 2 - 2 - 3) = 25 bits = address[31:7]

     31 ... 7|6 5 4|3 2|1 0
    ------------------------
    |  Tag   | Row |   |   |   address
    ------------------------
                    /|\
                     |
                     word offset

When you load one word (instruction) into a row, you also load the instructions that share its tag and row number into the remaining 3 slots. You determine the instructions you need to load by replacing the word offset with all possible values (00, 01, 10, 11 for this lab). For example:

    instruction address: 0000 1101 0011 0010 0000 1010 1111 0100

    4 words per row, 2 bit word offset = address[3:2]
    row number is 3 bits = address[6:4]
    tag is (32 - 2 - 2 - 3) = 25 bits = address[31:7]

    0000 1101 0011 0010 0000 1010 1|111 |01|00    address
    tag                            |row |wd|

    byte offset: 00 (ignore)
    word offset: 01
    row number:  111
    tag:         0000 1101 0011 0010 0000 1010 1

    Other instructions to load have the word offsets 00, 10, 11, so they are:
    0000 1101 0011 0010 0000 1010 1111 0000
    0000 1101 0011 0010 0000 1010 1111 1000
    0000 1101 0011 0010 0000 1010 1111 1100

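The address breakdown described above can also be checked mechanically. The following is a minimal Python sketch, not part of the lab itself; the function names `split_address` and `block_addresses` and their parameters are hypothetical, chosen only for illustration. It splits an address into tag, row, word offset, and byte offset, and lists the addresses that share a row with it, using the example address from above.

```python
def split_address(addr, row_bits, word_bits=0):
    """Split a 32-bit byte address into (tag, row, word offset, byte offset).

    Hypothetical helper: row_bits and word_bits come from the cache layout.
    """
    byte_offset = addr & 0b11                              # address[1:0], ignored for placement
    word_offset = (addr >> 2) & ((1 << word_bits) - 1)     # 0 when there is 1 word per row
    row = (addr >> (2 + word_bits)) & ((1 << row_bits) - 1)
    tag = addr >> (2 + word_bits + row_bits)               # everything above the row bits
    return tag, row, word_offset, byte_offset


def block_addresses(addr, word_bits):
    """Addresses of every word that shares a row (same tag and row) with addr."""
    block_bytes = 4 << word_bits                           # 4 bytes per word
    base = addr & ~(block_bytes - 1)                       # clear word and byte offset bits
    return [base + 4 * i for i in range(1 << word_bits)]


# The example address from the text, with 4 words per row and a 3-bit row number.
example = 0b0000_1101_0011_0010_0000_1010_1111_0100
tag, row, word, byte = split_address(example, row_bits=3, word_bits=2)
print(f"tag={tag:025b} row={row:03b} word={word:02b} byte={byte:02b}")
for a in block_addresses(example, word_bits=2):
    print(f"{a:032b}")
```

For the 1-word-per-row cache, the same helper would be called with `row_bits=5` and the default `word_bits=0`, so the word offset is always 0.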
Your writeup should summarize how many cache misses there are for each cache for the first iteration of the loop. State which cache performed best and why it performed best. Also turn in the worksheet.
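If you want to double-check a hand-filled worksheet, a small simulator can help. The sketch below is an illustration under stated assumptions, not the lab's required method: the function name `count_misses` is hypothetical, the cache is assumed to start empty, and least-recently-used (LRU) replacement is assumed for the 2-way cache, since the lab does not name a replacement policy. The trace shown is a made-up sequence of byte addresses, not the Lab 7 loop.

```python
def count_misses(trace, row_bits, word_bits=0, ways=1):
    """Count misses for a list of byte addresses, starting from an empty cache.

    Assumption: LRU replacement within a row when ways > 1.
    """
    # Each row keeps up to `ways` tags, ordered least recently used first.
    rows = [[] for _ in range(1 << row_bits)]
    misses = 0
    for addr in trace:
        block = addr >> (2 + word_bits)          # drop byte offset and word offset
        row = block & ((1 << row_bits) - 1)
        tag = block >> row_bits
        tags = rows[row]
        if tag in tags:
            tags.remove(tag)                     # hit: re-appended below as most recent
        else:
            misses += 1
            if len(tags) == ways:
                tags.pop(0)                      # row full: evict least recently used
        tags.append(tag)
    return misses


# Hypothetical trace of instruction fetch addresses (not the lab's loop).
trace = [0, 4, 8, 12, 16, 128, 0]
print(count_misses(trace, row_bits=5))                       # direct mapped, 1 word per row
print(count_misses(trace, row_bits=3, word_bits=2))          # direct mapped, 4 words per row
print(count_misses(trace, row_bits=2, word_bits=2, ways=2))  # 2-way, 4 words per row
```

To compare against your worksheet, replace `trace` with the addresses fetched during the first iteration of the loop, in fetch order.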

Direct mapped, 1 word per row:

Row (5 bits) | Valid | Tag (25 bits) | Data (1 word, 2-bit byte offset) |
---|---|---|---|
00000 (0) | |||
00001 (1) | |||
00010 (2) | |||
00011 (3) | |||
00100 (4) | |||
00101 (5) | |||
00110 (6) | |||
00111 (7) | |||
01000 (8) | |||
01001 (9) | |||
01010 (10) | |||
01011 (11) | |||
01100 (12) | |||
01101 (13) | |||
01110 (14) | |||
01111 (15) | |||
10000 (16) | |||
10001 (17) | |||
10010 (18) | |||
10011 (19) | |||
10100 (20) | |||
10101 (21) | |||
10110 (22) | |||
10111 (23) | |||
11000 (24) | |||
11001 (25) | |||
11010 (26) | |||
11011 (27) | |||
11100 (28) | |||
11101 (29) | |||
11110 (30) | |||
11111 (31) |

Direct mapped, 4 words per row (2-bit word offset and 2-bit byte offset):

Row (3 bits) | Valid | Tag (25 bits) | Data: Word 00 | Word 01 | Word 10 | Word 11 |
---|---|---|---|---|---|---|
000 (0) | ||||||
001 (1) | ||||||
010 (2) | ||||||
011 (3) | ||||||
100 (4) | ||||||
101 (5) | ||||||
110 (6) | ||||||
111 (7) |

2-way set associative, 4 words per row (2-bit word offset and 2-bit byte offset):

Row (2 bits) | Valid | Tag (26 bits) | Data: Word 00 | Word 01 | Word 10 | Word 11 |
---|---|---|---|---|---|---|
00 (0) | ||||||
01 (1) | ||||||
10 (2) | ||||||
11 (3) | ||||||