Assignment 5 Solutions

addem.s

Here's addem.s, a MIPS assembly language program to add up five numbers:

# addem.s

#   Add five numbers
#   C. Vickery
#   CS-343

#   Entry point
        .globl  main

#   Five data values to sum
        .data
alpha:  .word   5, -5, 10, -10, -123

#   main()
#   ------------------------------------------
        .text
main:   li      $t0, 0          # i   = 0
        li      $a0, 0          # sum = 0
        li      $t1, 0x14       # lim = 20

loop:   lw      $a1, alpha($t0) # x = alpha[i]
        add     $a0, $a0, $a1   # sum += x
        addi    $t0, $t0, 4     # i++
        blt     $t0, $t1, loop  # i < lim

        li      $v0, 1          # print_int()
        syscall
        li      $v0, 10         # exit()
        syscall

        .end
  

Answers to Questions

To begin, here is what the SPIM assembler generated for addem.s:

[0x00400024]	0x34080000  ori $8, $0, 0                   ; 14: li      $t0, 0          # i   = 0
[0x00400028]	0x34040000  ori $4, $0, 0                   ; 15: li      $a0, 0          # sum = 0
[0x0040002c]	0x34090014  ori $9, $0, 20                  ; 16: li      $t1, 0x14       # lim = 20
[0x00400030]	0x3c011001  lui $1, 4097                    ; 18: lw      $a1, alpha($t0) # x = alpha[i]
[0x00400034]	0x00280821  addu $1, $1, $8
[0x00400038]	0x8c250000  lw $5, 0($1)
[0x0040003c]	0x00852020  add $4, $4, $5                  ; 19: add     $a0, $a0, $a1   # sum += x
[0x00400040]	0x21080004  addi $8, $8, 4                  ; 20: addi    $t0, $t0, 4     # i++
[0x00400044]	0x0109082a  slt $1, $8, $9                  ; 21: blt     $t0, $t1, loop  # i < lim
[0x00400048]	0x1420fffa  bne $1, $0, -24 [loop-0x00400048]
[0x0040004c]	0x34020001  ori $2, $0, 1                   ; 23: li      $v0, 1          # print_int()
[0x00400050]	0x0000000c  syscall                         ; 24: syscall
[0x00400054]	0x3402000a  ori $2, $0, 10                  ; 25: li      $v0, 10         # exit()
[0x00400058]	0x0000000c  syscall                         ; 26: syscall
  

Question 1. The lw instruction specifies alpha($t0) as the effective address, but alpha was assigned to memory address 0x10010000, which does not fit in the 16-bit Address field of a lw instruction. So the assembler generated three instructions to compute the effective address in register $1 (also known as $at, the "assembler temporary" register). The lui instruction has the leftmost 16 bits of the address 0x10010000 as its immediate operand (the rightmost 16 bits of the lui instruction); this 16-bit value gets loaded into the leftmost 16 bits of $at, and the rightmost 16 bits of $at are set to zero. Since the address that alpha represents ends with 0x0000, the complete address of alpha has been loaded into $at by the lui instruction. If the rightmost 16 bits of the address alpha represents were not all zeros, the assembler would have generated an ori instruction to put the correct value into $at bits 0:15.

The second instruction is an addu (add unsigned) to add the contents of $8 (which is $t0, the register I used as the index into the alpha array) to $at. The assembler generated an unsigned add because that's what lw instructions do: the value in the Address field of the instruction (bits 0:15 of the instruction) is a signed value, but the contents of rs ($t0 in this case) are unsigned. What's signed and what's unsigned is backwards from what you might intuit from an expression like "alpha($t0)", where you would think the array address (alpha) would be unsigned and the subscript (register $t0) would be signed. But it makes sense when you think of the "Address" field of the instruction as a (signed) offset rather than the base address of the array.

Finally, the assembler generated a conventional lw instruction using the complete effective address in $at for the rs register and zero for the offset.
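The lui / addu / lw sequence can be traced with a short Python sketch. The function names here are illustrative only (they are not part of SPIM); the sketch simply models the register arithmetic described above, assuming the lw offset field is zero as it is in the listing.

```python
MASK32 = 0xFFFFFFFF

def split_address(addr):
    """Return (upper 16 bits, lower 16 bits) of a 32-bit address."""
    return (addr >> 16) & 0xFFFF, addr & 0xFFFF

def lui(imm16):
    """Model lui: imm16 lands in bits 16:31 and bits 0:15 become zero."""
    return (imm16 << 16) & MASK32

def effective_address(upper16, index, offset16=0):
    """Model 'lui $1, upper16; addu $1, $1, index; lw $5, offset16($1)'."""
    at = lui(upper16)              # lui  $1, upper16
    at = (at + index) & MASK32     # addu $1, $1, $8 (wraps, never traps)
    return (at + offset16) & MASK32  # address used by the final lw

ALPHA = 0x10010000
upper, lower = split_address(ALPHA)
print(upper, lower)  # 4097 0, matching "lui $1, 4097" in the listing
for i in range(0, 20, 4):
    print(hex(effective_address(upper, i)))
```

Because the lower half of the address is zero, the lui alone leaves the full address of alpha in $at.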

Question 2. The li instructions were turned into ori instructions with $zero (register $0, the pseudo-register that can't be changed and which always provides the value 0x00000000) as the rs register. The assembler could have generated addi instructions, also with $zero for the rs register, with the same effect.
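To check the machine code in the listing, here is a minimal Python sketch of I-format encoding. The helper name encode_i_type is made up for this illustration; the field layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate) is the standard MIPS I-format.

```python
ORI = 0x0D   # opcode 001101
ADDI = 0x08  # opcode 001000

def encode_i_type(opcode, rs, rt, imm):
    """Pack an I-format instruction: opcode | rs | rt | 16-bit immediate."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

# li $t0, 0 as the assembler emitted it: ori $8, $0, 0
print(hex(encode_i_type(ORI, 0, 8, 0)))    # 0x34080000
# li $t1, 0x14 as emitted: ori $9, $0, 20
print(hex(encode_i_type(ORI, 0, 9, 20)))   # 0x34090014
# The addi alternative for li $t0, 0: addi $8, $0, 0
print(hex(encode_i_type(ADDI, 0, 8, 0)))   # 0x20080000
```

In each case register $0 occupies the rs field (bits 21:25) and the destination register occupies the rt field (bits 16:20).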

Question 3. The blt instruction generated a slt instruction followed by a bne. That is, the assembler used an slt instruction to compare the two registers, leaving the value 0 (false) or 1 (true) in register $at to indicate the result of the comparison. Then the bne conditionally branched back to the top of the loop if the comparison was true.

The SPIM assembler and simulator calculated the value for the branch's address field by subtracting the address of the bne instruction (0x00400048) from the address of the target instruction (0x00400030), a difference of negative 18 in hexadecimal or -24 in decimal. But the difference between the addresses of two instructions will always end with two binary zeros, so the architecture specifies that the target address field of the branch instructions drops those two bits and leaves the difference in word addresses rather than the difference in byte addresses in the instruction. Thus, the hexadecimal code for the instruction is 0x1420fffa; its rightmost sixteen bits are 0xFFFA, and the decimal value of this is -6, which is -24 divided by 4.
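The arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation as SPIM does it (measuring from the address of the branch itself); the function names are made up for the illustration.

```python
def branch_offset_field(branch_addr, target_addr):
    """Return the 16-bit address field: the word distance from branch to target."""
    byte_diff = target_addr - branch_addr
    if byte_diff % 4 != 0:
        raise ValueError("instruction addresses must be word aligned")
    return (byte_diff // 4) & 0xFFFF   # keep 16 bits, two's complement

def sign_extend16(value):
    """Interpret a 16-bit field as a signed integer."""
    return value - 0x10000 if value & 0x8000 else value

field = branch_offset_field(0x00400048, 0x00400030)
print(hex(field), sign_extend16(field))  # 0xfffa -6
```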

As discussed in class, there is a discrepancy between how the SPIM software handles branches and how they are done in Chapter 5. Since the address of an instruction is provided by the PC register, the hardware calculates the branch target address by adding the address field of the instruction (sign extended and shifted left two places) to the PC. The datapath in Chapter 5 of the book calculates the branch target address using the value of the PC after it has been incremented by 4 and thus points to the instruction after the branch instruction, but the simulator and assembler use the address of the branch instruction itself. In class I used the terms "PC+4" and "PC" to talk about this difference.
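The two conventions can be compared in a short Python sketch. The function name and the pc_plus_4 flag are hypothetical, introduced only to show how the same address field decodes to different targets depending on the base used.

```python
def branch_target(branch_addr, field16, pc_plus_4):
    """Compute a branch target from the 16-bit address field.

    pc_plus_4=False models SPIM here (base is the branch's own address);
    pc_plus_4=True models the Chapter 5 datapath (base is PC + 4).
    """
    offset = field16 - 0x10000 if field16 & 0x8000 else field16  # sign extend
    base = branch_addr + 4 if pc_plus_4 else branch_addr
    return (base + offset * 4) & 0xFFFFFFFF  # shift left two places, then add

BNE_ADDR = 0x00400048
print(hex(branch_target(BNE_ADDR, 0xFFFA, pc_plus_4=False)))  # 0x400030, the loop label
print(hex(branch_target(BNE_ADDR, 0xFFFA, pc_plus_4=True)))   # 0x400034, one instruction too far
print(hex(branch_target(BNE_ADDR, 0xFFF9, pc_plus_4=True)))   # 0x400030
```

Under the "PC+4" convention the field would have to hold -7 (0xFFF9) rather than -6 to reach the same loop label.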

Question 4. 38 clock cycles. Each instruction takes exactly one clock cycle to execute. There are three li instructions before the loop, and seven instructions inside the loop (three for the lw, an add, an addi, and two for the blt). The instructions inside the loop get executed 5 times each, so the total is 3 + (7 x 5) = 38.
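As a sanity check on the tally, the following Python sketch walks the loop the way addem.s does and counts one cycle per machine instruction. It counts the same instructions as the answer above (three before the loop, seven per iteration); the function name and parameters are made up for the illustration.

```python
def run_addem(values, setup=3, per_iteration=7):
    """Mimic the addem.s loop; return (sum, cycles) at one cycle per instruction."""
    i, total, lim = 0, 0, 4 * len(values)  # byte index, sum, byte limit
    cycles = setup                          # the three li instructions
    while True:
        total += values[i // 4]             # lw (3 instructions) + add
        i += 4                              # addi
        cycles += per_iteration             # plus slt and bne for the blt
        if not i < lim:                     # blt falls through when i >= lim
            break
    return total, cycles

print(run_addem([5, -5, 10, -10, -123]))  # (-123, 38)
```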

Question 5. Assuming the ALU control logic is extended so that an ALUOp input value of 11 (binary) causes the ALU function code to be based on the opcode field of the instruction instead of the func field, as discussed in class, the following will work:

Input or output   Signal name   addi   andi   ori   slti
Inputs            Op5           0      0      0     0
                  Op4           0      0      0     0
                  Op3           1      1      1     1
                  Op2           0      1      1     0
                  Op1           0      0      0     1
                  Op0           0      0      1     0
Outputs           RegDst        0      0      0     0
                  ALUSrc        1      1      1     1
                  MemtoReg      0      0      0     0
                  RegWrite      1      1      1     1
                  MemRead       0      0      0     0
                  MemWrite      0      0      0     0
                  Branch        0      0      0     0
                  ALUOp1        1      1      1     1
                  ALUOp0        1      1      1     1
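Reading the Op5 through Op0 rows from top to bottom gives each instruction's 6-bit opcode. A small Python sketch makes the correspondence explicit; the dictionary and function names are illustrative only.

```python
# Op5..Op0 for each instruction, read down the columns of the table
OPCODE_BITS = {
    "addi": "001000",
    "andi": "001100",
    "ori":  "001101",
    "slti": "001010",
}

def opcode_value(bits):
    """Convert a string of Op5..Op0 bits to the numeric opcode."""
    return int(bits, 2)

for name, bits in OPCODE_BITS.items():
    print(name, bits, hex(opcode_value(bits)))
```

All four instructions share the same control outputs; only the opcode inputs distinguish them, which is why the ALU control must look at the opcode when ALUOp is 11.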

Question 6.

  Function generate_controlword receives instruction, returns control word
    Decode instruction, giving opcode
    case opcode in
      R-format: return RegDst | RegWrite | ALUOp1
      lw:       return ALUSrc | MemtoReg | RegWrite | MemRead
      sw:       return ALUSrc | MemWrite
      beq:      return Branch | ALUOp0
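The pseudocode above could be realized along the following lines. This is a Python sketch, not the ControlWord.java listing: the bit positions assigned to each signal are arbitrary assumptions made for the illustration, and the opcodes are the standard MIPS values for R-format (0x00), lw (0x23), sw (0x2B), and beq (0x04).

```python
from enum import IntFlag

class Ctl(IntFlag):
    """Control signals; the bit assignments are arbitrary (an assumption)."""
    REG_DST = 0x100
    ALU_SRC = 0x080
    MEM_TO_REG = 0x040
    REG_WRITE = 0x020
    MEM_READ = 0x010
    MEM_WRITE = 0x008
    BRANCH = 0x004
    ALU_OP1 = 0x002
    ALU_OP0 = 0x001

CONTROL_WORDS = {
    0x00: Ctl.REG_DST | Ctl.REG_WRITE | Ctl.ALU_OP1,                   # R-format
    0x23: Ctl.ALU_SRC | Ctl.MEM_TO_REG | Ctl.REG_WRITE | Ctl.MEM_READ,  # lw
    0x2B: Ctl.ALU_SRC | Ctl.MEM_WRITE,                                 # sw
    0x04: Ctl.BRANCH | Ctl.ALU_OP0,                                    # beq
}

def generate_controlword(instruction):
    """Decode the opcode (bits 26:31) and return the matching control word."""
    opcode = (instruction >> 26) & 0x3F
    if opcode not in CONTROL_WORDS:
        raise ValueError(f"unsupported opcode 0x{opcode:02x}")
    return CONTROL_WORDS[opcode]

# Two instructions taken from the SPIM listing for addem.s
print(generate_controlword(0x8C250000))  # lw $5, 0($1)
print(generate_controlword(0x00852020))  # add $4, $4, $5
```

A lookup table keeps the mapping in one place; an unsupported opcode raises an error rather than silently returning an all-zero control word.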
  

Question 7. There are two listings: ControlWord.java, which includes the function to generate the control words, and ControlUnit.java, which provides a wrapper for reading the file containing the instructions, passing them to the control word generator, and showing the results.