Introduction To MIPS Assembly Language Programming

Please contact the author at ckann@gettysburg.edu if you wish to adopt this book for a course - thank you. This book is available at The Cupola: Scholarship at Gettysburg College: https://cupola.gettysburg.edu/oer/2.

Introduction

1 Binary Numbers

1 Values for Binary Numbers

They often feel that the way the binary values were presented in their math classes (as true/false) is the "correct" way to represent them. They will all be used in this text, but most of the time the binary values 0 and 1 will be used.

2 Binary Whole Numbers

If a flip-flop has a positive voltage it is called high or on (true), and if it has 0 volts it is low or off (false). To work with the base number 2, it is necessary to know the value of powers of 2.

2 Converting Binary, Decimal, and Hex Numbers

1 Converting Binary to Decimal

2 Converting Decimal to Binary using Binary Powers

3 Converting Decimal to Binary using Division

4 Converting between binary and hexadecimal

The hexadecimal numbers can then be arranged in groups of 4 (or 32 bits) to facilitate translation from a 32-bit computer. Because I need 4 binary digits to print the hexadecimal representation of a binary number.

3 Character Representation

This is often the case where specific bits of a 32 bit number need to be set. To convert a numeric sign digit to a number, it is only necessary to subtract the sign value of 0.

4 Adding Binary Whole Numbers

Also note that all ASCII uppercase letters start with the binary string and are offset 1 for each new character. Therefore, all uppercase letters differ from their lowercase counterpart by a 1 in the digit 0100.

5 Integer Numbers (2's Complement)

This relationship between lowercase and uppercase letters will be used to illustrate logical operations later in this chapter.

1 What is an Integer

For example -3 is 3 units (is the magnitude) away from zero in the negative direction (is the sign). For now, just realize that it's incredibly complex to do arithmetic using signed-size numbers.

2 2's complement operation and 2's complement format

When learning math, negative numbers are represented using a signed-magnitude format, where a number has a sign (positive or negative) and a magnitude (or value).

3 The 2's Complement Operation

If for no other reason, this results in a 0 in the sign bit, and the negative number is made positive. The leftmost (or high) bit in the number is usually referred to as a sign bit, a convention that this text will continue.

4 The 2's Complement (or Integer) Type

Note that in the equation above, the bits in the number have simply been reversed, with 0 becoming 1 and 1 becoming 0. While positive numbers will start with a leftmost 0 and negative numbers with a leftmost 1, these are not just sign bits in the same sense as a signed magnitude number, but rather part of the representation of the number.

6 Integer Arithmetic

Because the range of integer values is limited to the 2n values (where n is the size of the integer) that can be represented by 2's complement, about half of which are positive and half of which are negative, approximately 2n-1 size values possible. Finally, as indicated in the previous section, just as zeros can be added to the left of a positive number without affecting its value, in 2's complement ones can be added to the left of a negative number without affecting its value.

1 Integer Addition

This happens when the two numbers being added are large positive or negative values, and combining the values results in numbers that are too large to store in the integer.

2 Overflow of Integer Addition

In Figure 1.5, there is a transfer of 1 and a transfer of 0, so there is an overflow. Here the input is 0 and the output is 1, so there is an overflow again.

3 Integer multiplication using bit shift operations

In Figure 1.3, the carry and carry for the last bit are both 0, so there is no overflow. Likewise, in Figure 1.4 the carry and carry are both 1, so there was no overflow.

4 Integer division using bit shift operations

7 Boolean Logical and Bitwise Operators

1 Boolean Operators

This text will deal only with the most common Boolean operators, the unary NOT (or inverse) operator, and the binary5 operators AND, OR, NAND, NOR, and XOR. These operators are usually characterized by their truth tables, and two truth tables are given below for these operators.

2 Logical and Bitwise Boolean Operators

Therefore | operator does not short-circuit, it will process each bit, regardless of whether a previous bit would cause the operation to fail or not. In most cases, the single-character operator is a bit-wise operator (eg &, |, etc.), and the double-character operator is the logical operator (&&, ||, etc.).

8 Context

In this case, the OR operator, |, must control each bit in the variables. For example, in the previous if statement, if a bitwise operator was not used short.

9 Summary

It is possible to use bitwise operators instead of logical operators, but this is usually incorrect.

10 Exercises

Note that the input parameter string for both functions can contain uppercase and lowercase letters, so you should not try to do this using addition. Would you consider using the bitwise implementation of multiplication and division for two variables, or would you always use the mul or div operators in MIPS assembly?.

First Programs in MIPS assembly

1 The MARS IDE

2 MIPS and memory

1 Types of memory

2 Overview of a MIPS CPU

Data from memory, user, or disk drives must first be loaded into a register before the CPU can use it. To use data from memory, the address and data to be read/written are placed on the system bus using a load/store command and transferred to/from memory to the CPU.

Figure 2-2: 3-address store/load computer architecture

3 Registers

As with many other things in this chapter, the meaning of each of the registers will become clear later in the text. Also note that in MARS only the register name in lower case is valid (e.g. $t0 is valid, $T0 is not).

4 Types of memory

3 First program in MIPS assembly

Once the program has been entered into the editing window, it must be assembled. If you made any mistakes, the errors will appear in a box at the bottom of the screen.

1 Program 2-1 Commentary

The main: tag does not need to be included, as MARS assumes that the program starts on the first line in the compound program. A list of all MIPS operators used in this text can be found in the MIPS Greensheet or by using the help option in MARS.

4 Program to prompt and read an integer from a user

1 Program 2-2 Commentary

5 Program to prompt and read a string from a user

1 Program 2-3 Commentary

The first is a reference to the memory to use to store the string (stored in $a0) and the second is the maximum size of the string to read (stored in $a1). Because the reference is passed, the actual value of the string can be changed in memory in the function.

Figure 2-6: Memory before entering a string

6 Summary

7 Java program for call by value and reference

So what is stored in b is the pointer to the array, or simply the address of b in memory. If the reference is copied, but the reference cannot be changed, the referenced value can.

8 Exercises

MIPS arithmetic and Logical Operators

1 3-Address machines

The second entry is specified by the operator type, two types of which will be discussed in this chapter. These will be discussed later, and the text and the reason for the format will be obvious.

2 Addition in MIPS assembly

1 Addition operators

When considering this operator, it is important to remember that in the real format I instruction, the immediate value can contain only 16 bits. The first step is to load the upper 16 bits of the number into a register using the Load Upper Immediate (lui) 9 operator, and then load the lower 16 bits using an Immediate Or operator (ori).

2 Addition Examples

Next, notice that in Figure 3-2, the first line of the program is highlighted in yellow. This means that the program is ready to be executed on the first line in the program.

3 Introduction to pseudo code

In addition, there will be a volatile modifier that is used later, which will mean that the programmer must use a memory variable. They tell the programmer that for the input a prompt must be written to the console, and a value from the user must be read and stored.

4 Assembly language addition program

Using the register modifier on the int declaration tells the programmer to use a storage register (s0.s7) if possible to hold these values. But the programmer can implement them any way they want, as macros, as subprograms, or directly in code.

5 Assembly language addition program commentary

3 Subtraction in MIPS assembly

In addition to the real operators, there are a number of pseudo sub-operators that use 32-bit direct values.

4 Multiplication in MIPS assembly

So a simple check for overflow when multiplying two positive numbers to see if the hi. So to show overflow in a, the result contained in the hi register must match all 0's or all 1's, and must match the high order (sign) bit of the lo register.

5 Division in MIPS Assembly

The operator divides Rs by Rt and stores the result in a pair of registers [hi,lo] with the quotient in lo and the remainder in hi. The specifics of the check will not be discussed at this point, as it involves the bne and break instructions.

1 Remainder operator, even/odd number checker

2 Remainder operator, even/odd number checker

6 Solving arithmetic expressions in MIPS assembly

7 Division and accuracy of an equation

In general, however, the simple rule to follow is that when there is a mixture of operations, including division by integers, code the expression to do the multiplication, addition, and subtraction before doing any division.

8 Logical operators

The operator performs a bitwise AND of Rs and the immediate value and stores the result in the Rt register. The operator performs a bitwise OR of Rs and the immediate value and stores the result in the Rt register.

9 Using logical operators

1 Storing immediate values in registers

2 Converting a character from upper case to lower case

3 Reversible operations with XOR

10 Shift Operations

The operator shifts the value in Rt bits to the left with the number in Rs, and replaces the shifted bits with 0s. The operator shifts the value in Rt bits to the right with the number in Rs, and replaces the shifted bits with 0s.

1 Program illustrating shift operations

11 Summary

12 Exercises

Prompt the user for all variables in the . expression and print the results in a meaningful way. Implement the idea gas law program where the user is prompted to enter values for n, T, and P and V is calculated and printed. Deploy an accurate version of this program.

Translating Assembly Language into Machine Code

1 Instruction formats

The operation to be performed by the ALU is contained in the function, which is the last field in the R instruction. It is the second input to the ALU when there is an rd specified in the instruction (as in ADD) and the destination register when rd is not specified in the instruction (as in ADDI).

2 Machine code for the add instruction

3 Machine code for the sub instruction

4 Machine code for the addi instruction

5 Machine code for the sll instruction

6 Exercises

Simple MIPS subprograms

1 Exit Subprogram

1 Commentary on Exit subprogram

An exit subroutine stops the program so that there is no return statement when it is finished. Once the program is written, it is very difficult to remember what these values represent, and if we do not document them, the subroutine is very difficult to use and maintain.

2 PrintNewLine subprogram

This means that in this case a jump operator (j) could have been used to transfer control to the Exit subroutine. The programmer must state the program name, author, purpose, inputs, outputs, and side effects of program execution.

1 Commentary on Exit subprogram

A tag containing the variable address for the new line was required in the PrintNewLine subroutine. The $a0 and $v0 registers have been modified, but this is not indicated as a side effect of the program.

3 The Program Counter ($pc) register and calling a subprogram

Part of the registers definition states that the storage registers ($s0.$s9) must contain the same value when the subroutine returns as they did when the subroutine was called. So while changing them is a side effect of the function call, it doesn't have to be sorted as implied in the .

Figure 5-1: $pc when program execution starts

4 Returning from a subprogram and the $ra register

The jal operator also sets the $ra register to the next consecutive address after the subprogram call. When the first subroutine is called, the value for the return is stored in the $ra.

5 Input parameter with PrintString subprogram

When a subroutine is called, the jal operator updates the $pc register to point to the address of the subroutine label. When the "jr $ra" instruction is called, the subroutine returns by transferring control to the address in the $ra register.

6 Multiple input parameters with PrintInt subprogram

7 Return values with PromptInt subprogram

8 Create a utils.asm file

9 Final program to prompt, read, and print an integer

Note that by using subprogram abstraction, how much easier this program is to understand than the one at the beginning of the chapter.

10 Summary

11 Exercises

When the program runs, it never ends and behaves like it's stuck in an infinite loop. Think of a mechanism that shows that this program is indeed in an infinite loop.

MIPS memory - the data segment

1 Flat memory model

So if you want to load a word of memory, you cannot specify the address 0x15fc8232 because it is not on a word boundary (but it is a byte boundary and a half word boundary). When talking about data, a word of memory is 4 bytes (32 bits) in size, but it also sits on a word boundary.

2 Static data

This image shows the contents of the memory data segment for the previous code snippet. The 34;ASCII" checkbox allows data in memory to be displayed as ASCII characters when appropriate.

3 Accessing memory

This is very powerful, but it is difficult to keep track of and very prone to error.

4 Methods of accessing memory

1 Addressing by label

2 Register direct access

3 Register indirect access

4 Register offset access

These variables are called pointer variables and are found in the compiled executable files for all programming languages. Having warned against the use of pointer variables in an HLL, there is no way to avoid their use in assembly.

5 Exercises

The basic control structures covered are: .. a) if statements b) if-else statements c) if-elseif-else statements d) control control loops e) counter control loops.

Assembly language program control structures

The reason for this is that although direct implementation of programs in assembly allows more flexibility, in assembly language programming programmers often implement these programs in a very unstructured way, which often results in bad programs and can develop into poor understanding and practices. The reader is strongly advised to follow the principles described in this chapter and not to attempt to develop programs directly in assembly language.

1 Use of goto statements

To re-emphasize the point, it is the author's experience that although many new assembly language programmers often try to escape the structured programming paradigm and think with an assembly language program, the results are rarely satisfactory.

2 Simple if statements

1 Simple if statements in pseudo code

2 Simple if statement translated to assembly

If the boolean is true, the code block is entered (the branch is not taken). This tag will be accessible from the branch statement that specifies whether or not to enter a block of code.

3 Simple if statement with complex logical conditions

If the test is false, branch to the end of the code block and so the code block is not entered. Thus, the simple if code fragment above is translated by: .. a) calculating the boolean value to check the insertion into the code block of the if statement.

3 if-else statements

So the complex if statement above would be translated into the equivalent of the following code fragment: You now have the basic structure of the if statement, and your code should have the following assembly code fragment. lw $t0, count.

4 if-elseif-else statements

5 Loops

1 Sentinel control loop

How to translate each of these looping structures from pseudocode to assembly language will be covered in the next two sections. 6) The necessary structure for the sentinel control loop is now in place. The logic to be executed in the code block can be included and any other code needed to complete.

2 Counter control loop

6 Nested code blocks

This complete code fragment is then placed into the program, resulting in the following complete program.

7 A full assembly language program

Although somewhat lengthy, this assembly code is straightforward to produce and relatively easy to follow, especially if the documentation at the beginning of the program includes pseudocode for the algorithm that was used. However, the code is much more extensive and the ability to use abstraction is greatly reduced.

8 How to calculate branch amounts in machine code

1 Instruction Addresses

2 Value in the $pc register

However, because the $pc is incremented to point to the next instruction, -3 words, or -12 bytes, must be subtracted from the $pc in the branch instruction. The following MARS screenshot shows that these are indeed the branch offsets for each of the branch instructions.

3 How the word boundary effects branching

4 Translating branch instructions to machine code

The beq instruction is an I-type instruction with an op code of 0x4, so the machine code translation of this instruction is 0x1100000f. This means we can subtract 18-13 (since the $pc has been updated), and the branch offset is 5.

5 PC relative addressing

The machine code translation of this instruction is 0x0100ffee. etc.) are implemented using branch declarations.

9 Exercises

You should print only the prime factors, not how many times they occur (for example, in the number 60 2 occurs twice). Press the least number of coins. quarters, pennies, nickels, and pennies) which will produce n.

Reentrant Subprograms

1 Stacks

1 Stack data structure: definition

2 Another stack implementation

Note that in the StringStack class, the size of the string is stored with the string on the stack, and the push and pop operations use the size to determine the amount of memory to be. The following illustrates this stack after the strings "ab" and "cde" have been pushed onto the stack.

2 The program stack

1 The non-reentrant subprogram problem

In the BadSubprogram subprogram example above, the subprogram makes a call to another subprogram, PrintString. From this MARS screenshot, notice that before the call to PrintString, the value of the $ra register is the address 0x00400004.

2 Making subprograms re-entrant

When the subroutine is entered, the stack pointer ($sp) register points to the current end of the stack. The stack frame for this subroutine is then popped by adding the amount of used memory back onto the stack.

3 Recursion

If the program does not need to share any data or transfer control outside the subprogram, the allocation of the stack frame can be avoided. Most of the problems at the end of the chapter will also fall into this category.

1 Recursive multiply in a HLL

It is difficult to present a true use of recursion without obfuscating the details of how recursion is implemented, or implementing more complex data structures that would require a full and separate treatment to explain. So prudence dictates that it be saved, and the only storage place to save it is on the stack.

4 Exercises

For example, if the user enters 5, your program will add 1+2+3+4+5 and print the answer 15. 7) Write a recursive program to calculate Fibonacci numbers.

Arrays

1 Heap dynamic memory

1 What is heap memory

Because heap memory is allocated when the user requests it at runtime, and the amount of memory can be determined at runtime. Heap memory is much more complicated to manage than static and stack memory, and whole books have been written about heap management.

2 Allocating heap memory example – PromptString subprogram

For example, in C/C++, memory is allocated using the free() function call or the delete operator. The biggest problem with heap memory is that since it is allocated and allocated into chunks of non-fixed size, once the memory is deallocated, it leaves multiple holes in the memory that are of different sizes.

3 Commentary on PromptString Subprogram

In the MARS screenshot below, notice that the heap memory is now displayed. The two input strings ("This is a first test" and "This is a second test") appear in heap memory, each taking 80 bytes.

2 Array Definition and creation in Assembly

The definition of an array becomes clear when the mechanics of accessing elements in an array are explained. The minimum data needed to define an array consists of a variable that contains the address of the beginning of the array, the size of each element, and the space to store the elements.

1 Allocating arrays in memory

Addressing arrays is not complicated, but it does require the programmer to remember what an address is in relation to a value, and to know how to calculate an array offset. Finally, to allocate an array on the heap, the number of elements to be allocated is multiplied by the size of each element to obtain the amount of memory to be allocated.

3 Printing an Array

4 Bubble Sort

1 Bubble Sort in MIPS assembly

Repeating this outer loop for all elements results in the array being sorted in ascending order.

5 Summary

6 Exercises

Hello World program

A label is just a tag in the code that can be used in other statements, since the la $a0 greeting instruction says to load the text at the label greeting into the $a0 register. The services are: service 4 prints a string starting at the address of the memory contained in the $a0; and service 10 stops, or exits the program.

Program to read an integer from the user

Program to read a string from the user

The parameters passed to the method are the string reference in $a0, and the maximum size of the string in $a1. Note that in the case of the string in $a0, the value for the string is contained in memory, and only the reference is passed to the function.

Addition Examples

Print tells the program to print the string and add the summation result. But the problem is that the lower 4 bits represent a positive number, so 1111 means that the lowest 1 (the underlined one) is really part of the multiplication result and not the sign extension.

Figure 3-3: Addition Example after running 1 step.

Even/odd number checking program

Program to show order of operations matters

In this case, the 3rd operator is immediate value, so it's just a shorthand for implementing the andi. The operator does a bit-wise OR of Rs and Rt and stores the result Rd register.

Letter case conversion in assembly

The operator shifts the value in the shift sum Rt (shamt) bits to the left, replacing the shifted bits with 0s and storing the results in Rd. The operator shifts the value in the shift sum Rt (shamt) to the right, replacing the shifted bits with 0 and storing the results in Rd.

Program illustrating shift operations

It is the second input to the ALU when an rd is specified in the instruction (as in ADD) and the destination register when rd is not specified in the instruction (as in ADDI).

Implementing and calling Exit subprogram

This is a special case, and all other subroutines in this text will execute a callback using jr. This text will use the camel layer conventions of lowercase for variables and labels in a program (eg inputLabel) and uppercase camel for subroutines (eg

Implementing and calling the PrintNewLine subprogram

This subroutine will eventually become part of a utility package that will be used by many programs. We will explore this problem in more detail in the exercises at the end of the chapter.

Input parameter with the PrintString subprogram

Multiple input parameters with PrintInt subprogram

File utils.asm

Final program to prompt for, read, and print an integer

All these details about how the memory is implemented are hidden by the operating system (OS) and not visible to the programmer. Note the drop-down menu at the bottom of the screen that says the memory being viewed is it.

Quadratic program pseudo code

Accessing memory variables by label

Register Direct Access

Register Indirect Access

Register offset access

The memory at the start of the .data segment (0x10010000) contains an address (or reference22) to the actual storage location for the constants a, b, and c. These variables are then accessed by loading the memory address into a register and using that address to locate the constants.

A simple program to show an if statement

One of the reasons why programs became complex before structured programming became mainstream is that programmers would try to solve these kinds of complex logic conditions using programming logic. The easy way to solve this problem is to realize that in an HLL, the compiler is going to reduce the complex logical condition to a single equation.

Assembly logic for ((x > 0) && ((x%2) == 0))

Assembly language logic for ((x > 0) && ((x%2) == 0) && (x < 10))

The purpose here is to show how to translate an if-else statement from pseudocode to assembler. You now have the basic structure of the if statement and your code should include the following assembly code snippet.

Assembly code fragement for an if-else statement

Program to implement if-elseif-else statement in assembly

How to translate each of these looping structures from pseudocode to assembly language will be covered in the next two sections.

Sentinel control loop program

Counter control loop program

Program illustrating nested blocks

The following variable can be stored in a storage register. register int inputGrade # user input grade. For the following program, translate all branch statements (b, beqz, bnez, etc.) into machine code, making sure you have calculated the relative branch values correctly. with $a0, big print PrintString EndIf:.

Figure 7-1: Instruction addresses for a simple program

Stack class definition

String class stack definition

When this register was overwritten, the BadSubprogram subroutine lost its connection back to the main subroutine. In this code $ra is returned to the value it contained when the subroutine was called, so the subroutine can return to the appropriate line in the calling program.

Recursive multiplication

Heap memory is generally dealt with when the programmer no longer needs it. Due to the complexity of handling heap memory deallocation management, this text will only deal with memory allocation.

PromptString subprogram showing heap allocation

An array is a multi-valued variable stored in a contiguous area of memory that contains elements that are all the same size. These HLL arrays always hide some abstraction, and knowing what a field actually is can help in understanding how HLL manipulates the array.

AllocateArray subprogram

Printing an array of integers

Bubble Sort