CS 321 - Lab 10

Lab 10 - Caches

Purpose: To see how data and instructions are stored in a cache.

You are evaluating three possible cache designs: direct mapped with 1 word per row, direct mapped with 4 words per row, and 2-way set associative with 4 words per row. The cache will be an instruction cache, so the data field will contain the instruction. You are evaluating the performance of the candidate caches on the following loop from Lab 7:

Memory Address    Label   Instruction
==============    =====   ====================
4000              Loop:   beq $s0, $zero, Exit   # imm = 6, offset to Exit
4004                      add $t0, $s0, $s2      # compute read address
4008                      add $t1, $s0, $s3      # compute write address
4012                      lw $t2, 0($t0)         # read data
4016                      sw $t2, 0($t1)         # write data
4020                      sub $s0, $s0, $s1      # subtract offset
4024                      j Loop                 # imm = 1000 which is 4000/4
4028              Exit:
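
The two immediates in the comments above can be checked with a little arithmetic. The following is a minimal sketch, assuming the standard MIPS encodings (beq stores a word offset relative to the instruction after the branch, and j stores the target word address); the function names are hypothetical and chosen only for illustration.

```python
def beq_immediate(branch_addr, target_addr):
    """Word offset stored in a beq, measured from the instruction after the branch."""
    return (target_addr - (branch_addr + 4)) // 4


def jump_immediate(target_addr):
    """Word address stored in a j instruction (low 26 bits of target / 4)."""
    return (target_addr >> 2) & 0x3FFFFFF


print(beq_immediate(4000, 4028))  # 6
print(jump_immediate(4000))       # 1000
```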

Count how many cache misses there are for each cache on the first iteration of this loop.

The cache layouts are at the end of this lab. Print them to use as a worksheet to determine the cache misses for each cache. To determine where to store the instruction in the cache, convert the memory address to a 32 bit binary number. The two least significant bits are ignored in determining the row and tag, as they indicate a specific byte (8 bits) within the 32 bit instruction. The next x least significant bits are the row number (where x is indicated in the cache diagrams below). The remainder of the address is the tag. Here's a picture of the breakdown for the direct mapped cache with 1 word per row:

 row number is 5 bits = address[6:2]
 tag is (32 - 2 - 5) = 25 bits = address[31:7]

   31 ... 7|6 ... 2|1 0 
   --------------------
  |  Tag   |  Row  |   |  address
   --------------------
                    /|\
                     |
                 byte offset, ignore
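
The breakdown above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_address` and its `row_bits` parameter are hypothetical, and the address used is the example address that appears later in this lab, reinterpreted here for the 1 word per row layout.

```python
def split_address(addr, row_bits=5):
    """Split a 32 bit address into (tag, row, byte_offset) for 1 word per row."""
    byte_offset = addr & 0b11                    # address[1:0], ignored
    row = (addr >> 2) & ((1 << row_bits) - 1)    # address[6:2] when row_bits is 5
    tag = addr >> (2 + row_bits)                 # address[31:7] when row_bits is 5
    return tag, row, byte_offset


addr = 0x0D320AF4  # 0000 1101 0011 0010 0000 1010 1111 0100
tag, row, byte_offset = split_address(addr)
print(format(tag, "025b"))         # 0000110100110010000010101
print(format(row, "05b"))          # 11101
print(format(byte_offset, "02b"))  # 00
```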

When the cache has more than one word per row, the bits that previously formed the row number are divided into a row number and a word offset as follows:

 4 words per row, 2 bit word offset = address[3:2]
 row number is 3 bits = address[6:4]
 tag is (32 - 2 - 2 - 3) = 25 bits = address[31:7]

   31 ... 7|6 5 4|3 2|1 0
   ----------------------
  |  Tag   | Row |   |   |  address
   ----------------------
                  /|\
                   |
               word offset
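
As a sketch of this second breakdown, the same shifting and masking can be extended with a word offset field. The names `split_block_address`, `row_bits`, and `word_bits` are assumptions made for this illustration, not part of the lab.

```python
def split_block_address(addr, row_bits=3, word_bits=2):
    """Split a 32 bit address into (tag, row, word_offset, byte_offset)."""
    byte_offset = addr & 0b11                                  # address[1:0]
    word_offset = (addr >> 2) & ((1 << word_bits) - 1)         # address[3:2]
    row = (addr >> (2 + word_bits)) & ((1 << row_bits) - 1)    # address[6:4]
    tag = addr >> (2 + word_bits + row_bits)                   # address[31:7]
    return tag, row, word_offset, byte_offset


tag, row, word_offset, byte_offset = split_block_address(0x0D320AF4)
print(format(row, "03b"), format(word_offset, "02b"), format(byte_offset, "02b"))
# 111 01 00
```

Calling the function with `row_bits=2` gives the split for the 2-way cache in this lab, which has a 2 bit row number and therefore a 26 bit tag.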

When you load one word (instruction) into a row, you also load the other instructions that share its tag and row number into the remaining 3 slots. You determine the instructions you need to load by replacing the word offset with all possible values (00, 01, 10, 11 for this lab). For example:

 instruction address: 0000 1101 0011 0010 0000 1010 1111 0100
 4 words per row, 2 bit word offset = address[3:2]
 row number is 3 bits = address[6:4]
 tag is (32 - 2 - 2 - 3) = 25 bits = address[31:7]


 0000 1101 0011 0010 0000 1010 1|111 |01|00   address
              tag               |row |wd|

 byte offset: 00 (ignore)
 word offset: 01
 row number: 111
 tag: 0000 1101 0011 0010 0000 1010 1

 Other instructions to load have the word offsets 00, 10, 11, so they are:
   0000 1101 0011 0010 0000 1010 1111 0000
   0000 1101 0011 0010 0000 1010 1111 1000
   0000 1101 0011 0010 0000 1010 1111 1100
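
The list of addresses that share a row can also be generated mechanically. The following is a minimal sketch, assuming 4 byte words; `block_addresses` and `word_bits` are hypothetical names used only here.

```python
def block_addresses(addr, word_bits=2):
    """Return the addresses of every word in the row that contains addr."""
    block_size = 4 << word_bits          # bytes per row: 16 for 4 words
    base = addr & ~(block_size - 1)      # clear the word offset and byte offset
    return [base + 4 * w for w in range(1 << word_bits)]


for a in block_addresses(0x0D320AF4):
    print(format(a, "032b"))
# 00001101001100100000101011110000
# 00001101001100100000101011110100
# 00001101001100100000101011111000
# 00001101001100100000101011111100
```

The second line of output is the original instruction address; the other three match the addresses listed in the example above.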

Your writeup should summarize how many cache misses there are for each cache for the first iteration of the loop. State which cache performed best and why it performed best. Also turn in the worksheet.
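
To check a hand-filled worksheet, a small miss counter can be sketched as below. It is an illustration only: the name `count_misses`, its parameters, and the address trace are hypothetical, and the least recently used replacement policy for the 2-way cache is an assumption, since this lab does not specify one.

```python
def count_misses(addresses, row_bits, word_bits=0, ways=1):
    """Count misses for a cache with 2**row_bits rows and 2**word_bits words per row."""
    # One list of tags per row, ordered from least to most recently used.
    rows = [[] for _ in range(1 << row_bits)]
    misses = 0
    for addr in addresses:
        block = addr >> (2 + word_bits)          # drop byte and word offsets
        row = block & ((1 << row_bits) - 1)
        tag = block >> row_bits
        tags = rows[row]
        if tag in tags:
            tags.remove(tag)                     # hit: re-appended below as most recent
        else:
            misses += 1
            if len(tags) == ways:
                tags.pop(0)                      # evict the least recently used tag (assumed policy)
        tags.append(tag)
    return misses


# Hypothetical trace (not the lab's loop): 0x100 and 0x180 map to the same row.
trace = [0x100, 0x104, 0x180, 0x100, 0x104, 0x180]
print(count_misses(trace, row_bits=5))                       # 5
print(count_misses(trace, row_bits=3, word_bits=2))          # 4
print(count_misses(trace, row_bits=2, word_bits=2, ways=2))  # 2
```

The three calls correspond to the three cache shapes in this lab; feeding in the instruction addresses in execution order would give counts to compare against the worksheet.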

Direct Mapped Cache - 1 word per row

Row (5 bits)   Valid   Tag (25 bits)   Data (1 word, 2 bit byte offset)
============   =====   =============   ================================
00000 (0)
00001 (1)
00010 (2)
00011 (3)
00100 (4)
00101 (5)
00110 (6)
00111 (7)
01000 (8)
01001 (9)
01010 (10)
01011 (11)
01100 (12)
01101 (13)
01110 (14)
01111 (15)
10000 (16)
10001 (17)
10010 (18)
10011 (19)
10100 (20)
10101 (21)
10110 (22)
10111 (23)
11000 (24)
11001 (25)
11010 (26)
11011 (27)
11100 (28)
11101 (29)
11110 (30)
11111 (31)

Direct Mapped Cache - 4 words per row

                                       Data (4 words, 2 bit word offset and 2 bit byte offset)
Row (3 bits)   Valid   Tag (25 bits)   Word 00   Word 01   Word 10   Word 11
============   =====   =============   =======   =======   =======   =======
000 (0)
001 (1)
010 (2)
011 (3)
100 (4)
101 (5)
110 (6)
111 (7)

2-way Associative Cache - 4 words per row

                                       Data (4 words, 2 bit word offset and 2 bit byte offset)
Row (2 bits)   Valid   Tag (26 bits)   Word 00   Word 01   Word 10   Word 11
============   =====   =============   =======   =======   =======   =======
00 (0)
00 (0)
01 (1)
01 (1)
10 (2)
10 (2)
11 (3)
11 (3)