Comparison of 80x86 and MIPS Architectures Learning assembly demands some understanding of computer architecture, in particular the Instruction Set Architecture (ISA) of the processor. The ISA is the hardware/software interface and this is where assembly language lives. This guide covers Intel 80x86, one of the oldest and most widely-used family of processors, and MIPS (developed in the 80s for embedded systems and used in Sony PlayStation2, DEC machines, CISCO routers... ). ISA design includes registers, instruction operands, memory and addressing modes, branches, procedure calls and instruction formats. Intel 80x86(x86) is a family of processors with backward compatibility to x86: 1971: 4004 introduced; first microprocessor; 4-bit CPU 1978: x86 introduced; cutting-edge 16-bit processor 1981: IBM uses the 8088 in first IBM PC project 1989: 80486 introduced; floating-point unit on main processor chip; RISC-based pipelining to increase performance 1997: Pentium 2; superscalar, multiprocessing, special instructions for multimedia applications 2002: Pentium 4; high clock rates (3.06 GHz); more multimedia instructions; large on-chip cache ************* * REGISTERS * ************* Registers are memory locations on the processor where work is performed. How many, how big and what type dictate instruction format. Fewer general-purpose registers means more data must be stored in memory, and more memory accesses will be needed. MIPS32 registers o 32 (0-31) 32-bit wide registers o $t0-$t9, $s0-$s7 are for programmer use - all others are reserved $zero(0) - contains a constant 0 $at(1) - used by assembler to converts pseudo-instructions $v0-$v1(2-3) - contain return values from procedures $a0-$a3(4-7) - store arguments for procedure calls $t0-$t7(8-15) - caller-saved registers; values not saved across calls $s0-$s7(16-23) callee-saved registers; values are preserved across calls $t8-$t9(24,25) - caller-saved registers; values not preserved across calls $k0(26), $k1(27) - reserved for assembler and OS $gp(28) - pointer to global area $sp(29) - pointer to top of current stack $fp(30) - pointer to start of current frame $ra(31) - holds the return address from a procedure call x86-32 registers o four general-purpose 32-bit registers: EAX EBX ECX EDX o four 32-bit registers generally used to address memory: ESP EBP ESI EDI o several 16-bit registers for the segmented memory model: CS SS DS ES FS GS o 32-bit register for instruction pointer/program counter: EIP o 32-bit register EFLAGS contains condition codes for branch instructions *********************** * Instruction Formats * *********************** MIPS32 uses 32-bit, three-address, register-to-register instructions: operation 3 operands --------- --------- add a, b, c / \ \ destination 2 sources Meaning: a = b + c, where a and b are registers, c is a register or constant. o three instruction formats: R-type, I-type and J-type o formats are very uniform, leading to simpler hardware. o each instruction is 32 bits, so easy to compute instruction addresses for branch and jump targets o fields are located in the same relative positions when possible o sufficient for most operations - if not, implement as multiple instructions x86 uses a two-address, register-to-memory instructions: operation 2 operands --------- ---------- add a, b / \ dest & src1 src2 Meaning: a = a + b, where a can be a register or a memory address, b can be a register, a memory reference, or a constant, a and b cannot both be memory addresses. x86 also has one-address instructions, where dest and src1 are implicit. o instruction formats range in size from 1 to 17 bytes, mostly due to complex and multiple addressing modes o x86 assembly more difficult for programmer, assembler and hardware o instruction decoding is very complex o harder to compute the address of an arbitrary instruction o some instructions appear in two formats - simpler but shorter one, and a more general but longer one o multiple ways to encode some instructions o non-orthogonality is confusing for programmers ********** * MEMORY * ********** MIPS Memory o byte-addressable o each address stores an 8-bit value o addresses can be up to 32 bits long, resulting in up to 4 GB of memory o one addressing mode: indexed addressing lw $t0, 20($a0) # $t0 = M[$a0 + 20] sw $t0, 20($a0) # M[$a0 + 20] = $t0 o The lw/sw instructions access one word, or 32 bits of data, at a time o a word is four *contiguous* bytes in memory o words must be aligned, starting at addresses divisible by four o attempting to read non-aligned data is a bus error x86 Memory o Memory is byte-addressable o original x86 had 20-bit address bus that could address just 1MB of RAM o modern x86 CPUs can access 64GB of main memory, using 36-bit addresses o x86 is a 16-bit processor so an x86 word is 16 bits o 32-bit quantity is a double word o data does not have to be aligned - programs can access data at any address x86 Segments o two 16-bit registers are needed to produce a 20-bit memory address o a segment register specifies the upper 16 bits of the address o another register specifies the lower 16 bits of the address o these two registers are added together in a funky, messy way: 4 bits 16 bits ----- 16-bit segment register + ---- ---- ---- ---- 16-bit offset register --------------------------- = 20-bit address o modern x86 processors support a flat 32-bit address space plus segments x86 addressing modes o immediate mode is similar to MIPS mov eax, 4000000 # eax = 4000000 o displacement mode accesses a constant address mov eax, [4000000] # eax = M[4000000] o register indirect mode uses the address in a register mov eax, [ebp] # eax = M[ebp] o indexed addressing is similar to MIPS mov eax, [ebp+40] # eax = M[ebp+40] o scaled indexed addressing does multiplication for you mov eax, [ebx+esi*4] # eax = M[ebx+esi*4] MIPS array accesses o scaled addressing will step through arrays with multi-byte elements o access word $t1 of an array at $t0 takes 3 instructions: mul $t2, $t1, 4 # $t2 is byte offset of element $t1 add $t2, $t2, $t0 # $t2 is address of element $t1 lw $a0, 0($t2) # $a0 contains the element x86 array accesses o accessing double word esi of an array at ebx takes 1 instruction: mov eax, [ebx+esi*4] # eax gets element esi MIPS branches and jumps o four basic instructions for branching and jumping bne beq j jr o other ways to branches are split into two separate instructions slt $at,$a0,$a1 # $at=1if$a0<$a1 bne $at, $0, Label # branch if $at != 0 o slt uses a temporary register to store a boolean value that is then tested by a bne/beq instruction. o branches and jumps implement conditional statements, loops, and procedure calls and returns **************** * FLOW CONTROL * **************** x86 branches and jumps o x86 chips contain a special register of status flags, EFLAGS o the bits in EFLAGS are set as a result of arithmetic and test instructions: S = 1 if the ALU result is negative O = 1 if the operation caused a signed overflow Z = 1 if the result was zero C = 1 if the operation resulted in a carry out o x86 ISA provides instructions to branch (called jump) if any of the flags are set or not set js/jns jo/jno jz/jnz jc/jnc ******************* * PROCEDURE CALLS * ******************* MIPS procedure calls o the jal instruction saves the address of the next instruction in $ra before transferring control to a procedure. o register conventions for passing arguments ($a0-$a3), returning values ($v0-$v1), and preserving caller-saved and callee-saved registers. o the stack is a special area of memory used to support procedures o procedures can allocate a private stack frame for local variables and register preservation o stack modifications are explicit by modifying $sp and using load/store instructions with $sp as the base register x86 procedure calls o control flow for x86 procedure calls involves two aspects o the CALL instruction is similar to jal in MIPS, but the return address is placed on the stack instead of in a register o RET pops the return address on the stack and jumps to it o arguments and return values can be passed either in registers or on stack o procedures are expected to preserve the original values of any registers they modify; i.e., all registers are callee-saved o the x86 also relies upon a stack for local storage o the stack can be manipulated explicitly through the esp register o the CPU also includes special PUSH and POP instructions, which can manage the stack pointer automatically ************************ * CISC v. RISC DESIGNS * ************************ A complex instruction set computer (CISC) is a computer design where single instructions can execute several low-level operations (such as a load from memory, an arithmetic operation, and a memory store) and/or are capable of multi-step operations or addressing modes within single instructions. The term CISC was retroactively applied to non-RISC architectures. Examples of CISC instruction set architectures are IBM System/360 through z/Architecture, PDP-11, VAX, Motorola 68k, and Intel x86. A reduced instruction set computer (RISC) instruction set architecture (ISA) is based on simpler instructions that execute more quickly, resulting in higher performance. The RISC family includes DEC Alpha, AMD 29k, ARC, ARM, Atmel AVR, Blackfin, PA-RISC, Power (including PowerPC), SuperH, SPARC and MIPS (Microprocessor without Interlocking Pipeline Stages). MIPS instruction set is much more simple than 80x86 and assembly is easier. RISC and CISC architectures share common features: o general-purpose registers o simple branch and jump instructions for control flow o stacks and special instructions implement procedure calls CISC (Complex Instruction Set Computer) o x86-based processors are examples of CISC architecture o in the 1970s memory was expensive and slow o keeping encodings of common instructions short helped in two ways + made programs shorter and saved memory space + shorter instructions can also be fetched faster o more complex, longer instructions were still available when needed o assembly programmers like more powerful instructions to make coding easier o compilers had to balance compilation and execution speed. RISC (Reduced Instruction Set Computer) o a processor design in response to problems with CISC o simpler instructions and formats a radical idea in the 1980s o RISC-based programs need more instructions -- harder to write than CISC o more instructions mean that RISC programs use more memory o but now memory is faster and cheaper o compilers generate code instead of assembly programmers o simpler hardware made advanced techniques like pipelining easier to implement Intel Pentium Processor Family o original IBM PC was 8088 - x86 with 8-bit data bus instead of a 16-bit one o continually adding to x86 instruction set o 8088 cheaper to design, and maintained compatibility with existing 8-bit memories, chipsets and hardware o 8088 registers were still 16 bits, requiring two cycles to transfer data between a register and memory o improved floating-point unit o backward compatibility with x86 a drawback: + hard to implement pipelining and superscalar architectures until recently + overall performance suffered o Pentiums now use RISC-based ideas: + complex x86 instructions are translated to simpler RISC-like instructions + deep pipelining is possible, improving performance + slower, inefficient instructions in the ISA (for x86 compatibility) can be avoided o Pentiums now have MMX, SSE and SSE2 instructions for parallel computing o Intel 80386SX used 16-bit data bus whereas 80386 had a 32-bit bus o less expensive processors (Intel Celeron and AMD Duron) have smaller caches and/or slower buses than more expensive cousins o price v. performance - cheaper processors often mean reduced performance o low-power processors: Intel Mobile Pentiums - AMD has Mobile Athlons; Transmeta Crusoe and IBM/Motorola PowerPC o Pentium backward compatibility is a strength but also prevents enhancements to CPU design o Intel Itanium is a completely new designed 64-bit processor did not succeed o AMD is a 64-bit, backward compatible extension to the x86 architecture