
Lecture 9

LEARNING GOALS:

1. Learn what the Motorola 68000 user registers are.

2. Learn how the 68000 memory space is organized.

3. Familiarize yourself with the fundamentals of 68000 assembly language programming.

TABLE OF CONTENTS

9.1 THE 68000's SET OF REGISTERS

9.2 THE MEMORY SPACE OF THE 68000 PROCESSOR

9.3 ELEMENTARY PROGRAMMING IN 68000 LANGUAGE

9.3.1 Some basic 68000 instructions

9.3.2 The size of an operation

9.3.3 Some Assembler Directives

9.3.4 A complete 68000 program

9.1 THE 68000's SET OF REGISTERS


The Motorola 68000 processor has 18 registers that are directly accessible by the user. Those are:

8 general-purpose data registers numbered from D0 to D7


8 address registers numbered from A0 to A7
1 Program Counter (the PC)
1 Status Register (the SR), which is divided into 2 parts: the System Byte, which we'll consider in later lectures, and the Condition Code Register (the CCR).

Figure 9.1
The 68000 User Registers

Here are the important points you should know about the 68000 user registers:

All the data registers, all the address registers and the Program Counter are 32-bits (4 bytes) wide.
The Status Register, or SR, is 16 bits (2 bytes) wide. Only the low-order byte of the SR, which is called the CCR, or Condition Code Register, can be
modified by a user program. The high-order byte of the SR, the so-called System Byte, can be modified only in supervisor mode, i.e. by the Operating System,
for instance while it handles the special events called interrupts and exceptions, which will be discussed in the second part of the course. Until then, we can
forget about the System Byte and work only with the CCR.

The data registers are used to store any data. They are general-purpose registers, because they haven't been reserved for any specific task by the 68000
chip designers, and they are interchangeable, in the sense that whatever you can do with register Di you can also do with register Dj. There are some rare
instructions that require a specific register as an operand, but those are really special cases and we won't see many of them.

The address registers are used to store addresses of locations in main memory. In other words, address registers are pointers to locations in memory.
Registers A0 to A6 are general-purpose and interchangeable, just like their cousins the data registers. Register A7, also referred to as SP, is more special:
it is the processor's stack pointer. The system uses it to maintain a stack of subroutine return addresses (to be discussed later). You are free to access
and modify the contents of register A7, just like any other address register, but you must always bear in mind what A7 is used for. Note that any
address register can be used to maintain a stack, so it is possible to handle several stacks at the same time. Note also that although you can store the
address of a memory location in a data register, you cannot use a data register as a pointer to that location. The differences between address and data
registers will become clearer when we start looking at the instructions that use those registers.
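
As a small preview of that difference, the fragment below sketches an address register being used as a pointer. It assumes that a word variable named VALUE (a hypothetical name) has been defined elsewhere, and it uses the LEA instruction and the (A0) operand form, neither of which has been introduced yet in this lecture.

```asm
* Sketch only: assumes VALUE is a word variable defined in the data area.
        LEA     VALUE,A0        A0 <- address of VALUE (A0 now points to VALUE)
        MOVE.W  (A0),D0         D0(0:15) <- word stored at the address held in A0
        MOVE.W  D0,D1           Legal: D0 is used as plain data
*       MOVE.W  (D0),D1         Not accepted: a data register cannot be a pointer
```

The last line is commented out on purpose: an assembler would reject it, which is the restriction described above.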

As we've already mentioned, the Program Counter is a register used to store the address in main memory of the next instruction to be executed. It is 32
bits wide, because memory addresses in the 68000 system are 32-bit numbers. The contents of the PC are automatically updated as each instruction is
fetched and executed. The contents of the PC are always an even number, because an instruction can begin only at an even address. The user has more
restricted access to the PC than to a data or address register. We'll see later how the PC can be put to use in a program.

Bytes, words, longwords and bit numbering

As we said, most of the 68000 user registers are 32 bits wide, i.e. they can accommodate 32 bit numbers. In order to be able to reference a particular bit in the
register, each bit is given a number. The convention is to number the bits from 0 to 31, bit 0 being the least significant, i.e. the rightmost bit, and bit 31 being the
most significant, i.e. the leftmost bit. In other words, the bit numbering starts at 0 at the right end and goes from right to left in increasing order (see figure 9.2).

This numbering system is sometimes called little-endian bit numbering, because the least significant ("little") end of the number receives the smallest bit
number. The opposite system, called big-endian bit numbering, starts the numbering at 0 at the most significant (i.e. leftmost) bit and goes in increasing
order to the right, until the rightmost bit receives number 31. There is nothing wrong with that numbering system, but the Motorola chip designers decided
to go with the little-endian system, which has the convenient property that bit n always has the weight 2^n.

Registers can accommodate 32 bit numbers (one longword), but they don't restrict the user to operations on 32 bit numbers. In fact,
the programmer can work with either

the entire register size, i.e. a longword (bits 0 through 31), or


only the least significant word (bits 0 through 15), or
only the least significant byte (bits 0 through 7), in the case of data registers only.
Figure 9.2
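
The following fragment is a minimal sketch of these three operand sizes applied to one data register. It anticipates the .B, .W and .L size suffixes that are introduced in section 9.3.2, and the values are arbitrary ones chosen for the illustration.

```asm
* Illustration only: the constants are arbitrary.
        MOVE.L  #$12345678,D0   Longword: all 32 bits are written, D0 = $12345678
        MOVE.W  #$ABCD,D0       Word: only bits 0 to 15 change,    D0 = $1234ABCD
        MOVE.B  #$EF,D0         Byte: only bits 0 to 7 change,     D0 = $1234ABEF
```
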

The CCR

Let's now look in more detail at the CCR. The CCR is a very important piece of hardware because it allows conditional behavior (i.e. high level constructs of the
kind if X then Y; else Z;) to be implemented, and the Control Unit often bases its decisions on the contents of the CCR.

Here is the basic setup of the CCR:

Figure 9.3
The CCR

The CCR is an 8 bit register, but only bits 0 to 4 are actually used. Bits 5, 6 and 7 are always ignored, and their value doesn't matter. It may be assumed that they
are always set to 0. Bits 0 to 4 are flags, each of which is affected in specific ways by the various operations performed by the CPU. Almost every instruction
that is executed by the CPU forces an update of the value of one or more CCR bits.

Here is the function of each of the five CCR flags, or bits:


Bit 0, known as the C bit, is the carry bit. It is set (to 1) whenever the result of an operation generates a carry coming from the most significant bit of the
result, and cleared (set to 0) otherwise.
Bit 1, known as the V bit, is the overflow bit. It is set if an operation results in arithmetic overflow in terms of two's complement arithmetic, i.e. when the
result of an operation is too large or too small to be handled by the destination operand.
Bit 2, known as the Z bit, is the zero bit. It is set when the result of an operation is zero (i.e. all the bits of the result are 0), and is cleared otherwise.
Bit 3, known as the N bit, is the negative bit. It is set when the result of an operation is negative, and cleared otherwise. In other words,
the N bit holds the value of the most significant bit of the result of the operation (remember two's complement numbers).
Bit 4, known as the X bit, is the extend bit. It often has the same value as the C bit. It is used in multiple-precision arithmetic, i.e. arithmetic involving
numbers greater than 32 bits in size, to hold the carry from one part of the number to the next.
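
To make the flag definitions more concrete, here is a short sketch of three word-sized operations with the flag values they should leave behind. The register choices and constants are arbitrary, and the .W size suffix is explained in section 9.3.2.

```asm
* Illustration only: three 16 bit operations and the resulting CCR flags.
        MOVE.W  #$7FFF,D0       D0(0:15) = $7FFF, the largest positive 16 bit number
        ADD.W   #1,D0           Result $8000: N=1, Z=0, V=1 (signed overflow), C=0, X=0
*
        MOVE.W  #5,D1
        SUB.W   #5,D1           Result $0000: N=0, Z=1, V=0, C=0, X=0
*
        MOVE.W  #0,D2
        SUB.W   #1,D2           Result $FFFF: N=1, Z=0, V=0, C=1 (borrow), X=1
```

In the first case the carry bit stays clear even though the signed result is wrong, and in the third case the carry bit is set even though the signed result (-1) is right: C and V answer two different questions.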

We'll come back to the CCR when we start examining the 68000 assembler instructions. But before attacking the actual 68000 language, let's examine the
organization of the 68000 memory space.


9.2 THE MEMORY SPACE OF THE 68000 PROCESSOR

The memory space of the 68000 processor is one big linear array of memory locations, each of them being able to store one byte. The memory is said to be
byte-addressable, i.e. each byte within the memory has its own unique address and can be accessed directly. Note that the memory of the 68000 is not bit-
addressable, which means that you cannot access data in memory bit by bit. That also means that you cannot start reading memory in the middle of a memory
location, only at the beginning. Later versions of the 68000 processor, like the 68020, allow you to overcome those access restrictions.

By convention, the memory locations are numbered in a big-endian order. If the memory space is represented vertically, then the memory locations with a small
address number are found at the top, and those with a big address number are found at the bottom.

Figure 9.4
The linear memory of the 68000.
Each memory location is one byte in size. The 32 bit addresses are given as 8 hexadecimal digits.

As you know, the 68000 has a 32 bit Program Counter and 32 bit address registers. This is so because addresses of locations in memory are 32 bit numbers, and
consequently you can address up to 2^32 locations, i.e. 2^32 bytes, or 4 gigabytes (each memory location is one byte). In order to do so, you need a 32 bit address
bus to carry a 32 bit address number from the CPU to memory. The 68000 and its successors the 68020, the 68030, etc. are indeed designed around 32 bit
addresses. However, the 68000 processor as such is a special case, because due to space restrictions only the first 24 lines of the address bus (address lines 0 to
23) actually leave the chip and connect it to memory. Address lines 24 to 31 are simply not brought off-chip. Since one address line transports one bit, that
means that only bits 0 to 23 within a 32 bit address are used to specify a memory location. For example, if you write a program to access locations $80123456
and $45123456 (the $ prefix denotes a hexadecimal number), you will access the same physical location, $123456. The state of bits 24 to 31, represented by
the two leftmost hexadecimal digits, simply doesn't matter. In other words, the 68000 behaves as if its addresses are 24 bit quantities, not 32 bit quantities.
That means that the addressable memory space of the 68000 is in practice only 2^24 bytes, or 16 megabytes. Note that addresses in the 68000 are still
represented and stored as 32 bit numbers, even if only the first 24 bits of those numbers are actually used.
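
The effect of the missing address lines can be imitated in software. The sketch below masks off bits 24 to 31 of the two example addresses and compares what is left. It uses the AND instruction, one of the logic instructions that this lecture does not cover, and the mask $00FFFFFF is simply the 24 low-order bits set to 1.

```asm
* Illustration only: keep the 24 address bits that actually reach memory.
        MOVE.L  #$80123456,D0   First 32 bit address
        AND.L   #$00FFFFFF,D0   Keep bits 0 to 23 only: D0 = $00123456
        MOVE.L  #$45123456,D1   Second 32 bit address
        AND.L   #$00FFFFFF,D1   Keep bits 0 to 23 only: D1 = $00123456
        CMP.L   D0,D1           Z = 1: both addresses select the same location
```
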

This limitation does not exist with the newer members of the 68000 family. The 68020, 68030 and 68040 have a fully connected 32 bit address bus and a true
address space of 4 gigabytes.
Figure 9.5
The address bus of the 68000


9.3 ELEMENTARY PROGRAMMING IN 68000 LANGUAGE

9.3.1 Some basic 68000 instructions


The 68000 assembly language, like any other assembly language, is composed of two types of statements: the assembler directive and the executable
instruction. An executable instruction is one of the processor's valid instructions which is translated by the Assembler into machine language and actually
executed by the CPU. An assembler directive, on the other hand, is just an indication to the Assembler about the program and its environment. Assembler
directives are not translated into machine language.

Executable instructions can be divided into several categories:

data movement instructions


integer arithmetic instructions
logic instructions
shifts and rotations
bit operations
comparisons
program control instructions
etc.

The most common 68000 instructions and assembler directives, as well as the most common addressing modes are described in detail in the NeXT textbook
written by David Cloutier. We will therefore simply introduce the most important topics covered in the NeXT textbook, and we'll concentrate on giving some
examples of programs, Motorola syntax, and interfacing assembly language with C.

Let's introduce some instructions, and then we'll put them to use in a simple program.

Instruction     RTL representation              Description in words

MOVE #N,D1      [D1] <- N                       Register D1 is loaded with the number N.
MOVE D1,L       [M(L)] <- [D1]                  The contents of register D1 are copied to memory location L.
MOVE D1,D2      [D2] <- [D1]                    The contents of register D1 are copied to register D2.
ADD D3,D7       [D7] <- [D3] + [D7]             The contents of register D3 are added to the contents of register D7, and the result is stored in register D7.
ADD #N,D0       [D0] <- [D0] + N                The number N is added to the contents of register D0 and the result is stored in D0.
SUB #N,D0       [D0] <- [D0] - N                The number N is subtracted from the contents of register D0 and the result is stored in D0.
SUB D1,D5       [D5] <- [D5] - [D1]             The contents of register D1 are subtracted from the contents of register D5, and the result is stored in register D5.
CMP #N,D2       [D2] - N                        Subtract the number N from the contents of register D2. The result is discarded and the CCR is set up.
CMP D1,D2       [D2] - [D1]                     Subtract the contents of D1 from the contents of D2. The result is discarded, and the CCR is set up.
BEQ X           IF CCR(Z) = 1 THEN [PC] <- X    Branch to location X if the Z bit of the CCR is set, i.e. if the previous operation yielded zero as result.
BNE X           IF CCR(Z) = 0 THEN [PC] <- X    Branch to location X if the Z bit of the CCR is cleared, i.e. if the previous operation didn't yield zero as result.

Here is how some of the above instructions can be used in an assembly language program. Consider the following C code fragment:

x = 0;
y = Q;
if (y == 5)
    x = x + y;
y = y - 6;
x = y;

Here is an assembly language program that executes the above code:

MOTOROLA SYNTAX
         MOVE   #0,D0       x = 0; loads D0 with the value 0;
*                           we use D0 to represent x
         MOVE   Q,D1        y = Q; loads D1 with Q;
*                           we use D1 to represent y; Q is a reference
*                           to a memory location
         CMP    #5,D1       Compare the number 5 with D1 (y)
         BNE    EXIT_IF     If not equal, then branch to (go to) label EXIT_IF
         ADD    D1,D0       x = x + y; this statement is executed
*                           only if y == 5
EXIT_IF  SUB    #6,D1       y = y - 6; subtracts 6 from D1
         MOVE   D1,D0       x = y; moves the value of y (D1) into x (D0)

MILO SYNTAX
         move   #0,d0       |x = 0; loads D0 with the value 0;
#                            we use D0 to represent x
         move   q,d1        |y = Q; loads D1 with q;
#                            we use D1 to represent y; q is a reference
#                            to a memory location
         cmp    #5,d1       |Compare the number 5 with D1 (y)
         bne    exit_if     |If not equal, then branch to (go to) label exit_if
         add    d1,d0       |x = x + y; this statement is executed
#                            only if y == 5
exit_if: sub    #6,d1       |y = y - 6; subtracts 6 from D1
         move   d1,d0       |x = y; moves the value of y (D1) into x (D0)

You certainly recognize the 4 fields in the layout of this program: the label field, the instruction field, the operands field, and the comments field. Here are some
points to note about the above example:

The instructions are executed in sequence from top to bottom; only branch and jump instructions, such as Bcc LABEL (Branch on Condition Code, of which
BEQ and BNE are two examples), BRA LABEL or JMP LABEL (see the NeXT textbook for more details), can force a non-sequential execution of the instructions by
loading the PC with the address of the instruction bearing the label LABEL. This is how conditional statements and loops are implemented in assembly language.

You may be starting to appreciate the presence of 8 general purpose data registers, which provide you with a lot of on-chip working space. Note that the
above operations could have been implemented by using memory locations instead of data registers, but the program would have been much slower.

Note how the if statement is constructed. First we execute a CMP instruction. Its effect is to subtract 5 from the contents of D1 without storing the result
in D1, i.e. without affecting the value stored in D1. This instruction is used only to set up the CCR bits, so that a branch instruction can be executed next.
We then execute the BNE (Branch on Not Equal) instruction, which means that if 5 is not equal to the contents of D1 we will take a branch to the instruction
labeled EXIT_IF. How does the CPU know whether 5 is equal to the contents of D1? Well, if the CMP instruction yields 0 as its result, then the Z bit is
set and the branch is not taken. If [D1] - 5 is not equal to 0, then the Z bit is cleared (set to 0) and the branch is taken. If the branch is
taken, the PC is loaded with the address of the instruction labeled EXIT_IF; if the branch is not taken, the instruction directly following the branch
instruction is executed.

Other conditional constructs found in if statements, loops etc. are executed in a very similar manner, by using conditional branch instructions that test the
value of one or more CCR bits.
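
As an illustration of a loop built the same way, here is a sketch that adds up the numbers 5, 4, 3, 2 and 1. The register assignment (D0 for the sum, D1 for the counter) and the label LOOP are choices made for this example only, and the .W size suffix is explained in section 9.3.2.

```asm
* Illustration only: sum = 5 + 4 + 3 + 2 + 1, counting down to zero.
        MOVE.W  #0,D0           sum = 0
        MOVE.W  #5,D1           i = 5
LOOP    ADD.W   D1,D0           sum = sum + i
        SUB.W   #1,D1           i = i - 1; sets the Z bit when i reaches 0
        BNE     LOOP            Repeat while i is not 0
*                               On exit D0(0:15) = 15 ($000F) and D1(0:15) = 0
```

The SUB instruction does double duty here: it updates the counter and sets up the CCR, so no separate CMP is needed before the branch.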


9.3.2 The size of an operation


There is nothing wrong with the algorithm of the above program; however, the program written as such would probably not run on a real machine. This is
because it suffers from one major oversimplification: the size of the operands is not indicated anywhere.

The 68000 allows you to work with operands of 3 different sizes: bytes, words, and longwords; registers can be used to store bytes, words, or longwords. For
example, when you work with character data, you may want to work with bytes; if you work with integers you'll probably work with words or longwords. The
Assembler, which translates your program into machine code, has no way of knowing when you want to perform a longword operation and when a word operation
unless you explicitly specify the size.

Operations on 32 bit numbers are specified by appending the suffix .L to the instruction mnemonic, operations on 16 bit numbers by appending a .W suffix,
and operations on 8 bit numbers by appending a .B suffix.

Assembly language instruction    RTL form

MOVE.B D0,D1                     [D1(0:7)] <- [D0(0:7)]
MOVE.W D0,D1                     [D1(0:15)] <- [D0(0:15)]
MOVE.L D0,D1                     [D1(0:31)] <- [D0(0:31)]

It is very important to keep in mind that only the bits specified by the size of the operation are affected by the operation. For example, if register D4 contains the
number $FFFFFFFF (the $ prefix denotes a hexadecimal number) and you perform the operation
ADD.W #1,D4
the result (which will be stored in D4 and overwrite the previously held value) will not be $00000000, as might be expected at first sight, but rather
$FFFF0000.

Furthermore, the value of the CCR bits calculated after an operation is also determined by the size of the operation. Thus, in the above case:

the Z bit will be set, because the operation yielded a zero result, even if D4 as a whole does not contain 0.
the C bit will be set since a carry was generated from bit 15. The X bit will be set as well, since ADD gives X the same value as C.
the N bit will be cleared since bit 15 of the result is 0.
the V bit will be cleared since there is no arithmetic overflow in two's complement terms: read as signed 16 bit numbers, the operands are $FFFF = -1 and
$0001 = +1, and their sum, 0, fits in a 16 bit word. Overflow can occur in an addition only when both operands have the same sign and the sign of the result
is different. Read as unsigned numbers, on the other hand, $FFFF + $0001 = $10000 does not fit in 16 bits, and that is what the C bit reports.

Note that the CPU doesn't know whether you perform signed or unsigned arithmetic, or any other kind of operation; it updates its CCR bits blindly, and
it's up to you to decide what use to make of the CCR bits.
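
For comparison with the ADD.W case above, the following sketch performs the same addition as a byte operation and as a longword operation. The use of D5 for the second case is an arbitrary choice.

```asm
* Illustration only: the same addition with two other operand sizes.
        MOVE.L  #$FFFFFFFF,D4
        ADD.B   #1,D4           Only bits 0 to 7 change: D4 = $FFFFFF00, Z=1, C=1
        MOVE.L  #$FFFFFFFF,D5
        ADD.L   #1,D5           All 32 bits change:      D5 = $00000000, Z=1, C=1
```

In both cases Z and C describe only the part of the register selected by the size suffix.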

The size of an operation is specified in a slightly different way in the Milo syntax. The Motorola syntax uses a dot to separate the instruction from the size,
while the Milo syntax does not use a dot: it simply appends the letter of the size to the end of the instruction.

Motorola syntax Milo syntax


MOVE.W movew

Let's now look at another simple program which specifies its operand sizes. Consider the following fragment of C code:

/*
We assume we have the following declarations:
char C = 'A';
int X = 0x100;
long int Y = 0x2000A111;

We assume chars are 1 byte in size,
ints are 2 bytes in size,
and long ints are 4 bytes in size;
*/

X++ ;
if (C != 'B')
    X -= 0x5;
Y += 0x9001;

Here is the equivalent fragment of assembly language code:

MOTOROLA SYNTAX
* First, fetch the data from memory
*
         MOVE.W  X,D1        Fetch X and place it in D1. Note: X is 2 bytes!
         MOVE.L  Y,D2        Fetch Y and place it in D2. Note: Y is 4 bytes!
         MOVE.B  C,D3        Fetch C and place it in D3. Note: C is 1 byte!
*
         ADD.W   #1,D1       Executes X++
         CMP.B   #$42,D3     Compares the ASCII code for 'B' (0x42) to C
         BEQ     EXIT_IF     Go to label EXIT_IF, thus skipping the next instruction,
*                            if C == 'B'
         SUB.W   #$5,D1      Executes X -= 0x5
EXIT_IF  ADD.L   #$9001,D2   Executes Y += 0x9001

MILO SYNTAX
# First, fetch the data from memory
#
         movew   X,d1        |Fetch X and place it in d1. Note: X is 2 bytes!
         movel   Y,d2        |Fetch Y and place it in d2. Note: Y is 4 bytes!
         moveb   C,d3        |Fetch C and place it in d3. Note: C is 1 byte!
#
         addw    #1,d1       |Executes X++
         cmpb    #0x42,d3    |Compares the ASCII code for 'B' (0x42) to C
         beq     exit_if     |Go to label exit_if, thus skipping the next
#                             instruction, if C == 'B'
         subw    #0x5,d1     |Executes X -= 0x5
exit_if: addl    #0x9001,d2  |Executes Y += 0x9001. Note Milo indicates
#                             hex numbers with 0x, just like in C

This program is intended to show you the importance of operand sizes. Let's assume that C, X, and Y are originally stored in memory as shown in the following
diagram:

Figure 9.6
The above diagram represents an area of memory where our 3 variables are stored. Note that all numbers are in hexadecimal. Note also the big endian ordering
of memory. Variable C, which holds the ASCII code for A (41 hexadecimal), is stored in location 1001. C takes up only one location, because char type
variables are one byte in size. Variable X is an int, it is 2 bytes in size, therefore it takes the next two locations 1002 and 1003. Y is a long int, it is stored in 4
consecutive locations from 1004 to 1007.

When the instructions

MOVE.W X,D1
MOVE.L Y,D2
MOVE.B C,D3

are executed, here is what happens:

Figure 9.7
Note very carefully the order in which bytes are transferred from memory to registers and vice versa: the most significant byte is stored at the smallest
address in memory, and the least significant byte is stored at the greatest address. A word at location N occupies byte addresses N and N+1. A longword at
location N occupies byte addresses N, N+1, N+2 and N+3.
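
The byte order can be checked with a few MOVE instructions. The sketch below assumes a longword constant named DATA (a hypothetical name, defined with the DC directive described in section 9.3.3) and reads individual parts of it back.

```asm
* Illustration only: DATA is a hypothetical longword placed at an even address.
DATA    DC.L    $12345678       Stored as bytes $12, $34, $56, $78 at DATA to DATA+3
*
        MOVE.B  DATA,D0         D0(0:7)  = $12 (most significant byte, smallest address)
        MOVE.B  DATA+3,D1       D1(0:7)  = $78 (least significant byte, greatest address)
        MOVE.W  DATA+2,D2       D2(0:15) = $5678 (low-order word)
```
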

What would have happened if we had done, let's say, MOVE.W Y,D2? There is nothing illegal about such an instruction, but it would be incorrect in our case. The
least significant word of D2, i.e. bits 0 to 15, would hold $2000 (the word stored at addresses 1004 and 1005), and bits 16 to 31 would stay untouched.

It is important to remain consistent with the operand sizes throughout the entire program. When you know that your D2 holds a 32 bit number, keep on
performing 32 bit operations on that register until you start using the register for something else. What would have happened if you had performed ADD.W
#$9001,D2 instead of ADD.L #$9001,D2? After the ADD.W #$9001,D2 instruction, register D2 will hold the value $20003112. Your program will not crash
because of that, but it will give you an incorrect result. The correct result should be $20013112, but you get $20003112 because a word operation affects only
bits 0 to 15 and leaves bits 16 to 31 unaffected. Thus, the carry from bit 15, which should normally be added to bit 16, goes instead to the C bit of the CCR.
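
The two cases can be put side by side in a short fragment. This is only a sketch: it loads the starting value as an immediate number instead of fetching Y from memory, and it uses D4 for the second case so that the two results can be compared.

```asm
* Illustration only: the same addition as a word and as a longword operation.
        MOVE.L  #$2000A111,D2
        ADD.W   #$9001,D2       Word add:     D2 = $20003112, carry from bit 15 goes to C (C=1)
        MOVE.L  #$2000A111,D4
        ADD.L   #$9001,D4       Longword add: D4 = $20013112, C=0
```
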

In the above example, D3 and D1 are not used to their full capacity. In fact, it is possible to use bits 8 to 31 in D3 and bits 16 to 31 in D1 to store other useful
information not related to this program. Such "multiple-purpose" use of data registers is perfectly legal, but highly discouraged because it is confusing and may
lead to errors.


9.3.3 Some Assembler Directives


The above example raises several questions:

1) How does the computer know exactly where in memory variables C, X, and Y are actually stored?
2) And how do you declare variables in assembly language? What is the assembly language equivalent of char C = 'A'; ?
3) How do you declare constants?
4) How can the programmer participate in the management of memory space? How do you indicate the start and the end of a program?

In this section, we'll answer the above questions.

Question 1:

The letters C, X, and Y are identifiers that refer to addresses of locations in memory; in the above example C refers to 1001, X refers to 1002 and Y refers to 1004.
In general, programmers don't have to worry about the actual numerical address, as the computer automatically takes care of that, as we'll see later. However, if
for some reason you want to store one byte of information in a specific location, e.g. $1001, then you can use the assembler directive EQU to equate the name
C to the value $1001. The syntax of the EQU directive is:

identifier EQU expression

Thus, you can write


C EQU $1001

This way, you give a name to the number $1001, and you can use the name and the actual numerical value interchangeably in calculations and other
expressions.

C       EQU     $1001
        ....
        MOVE.B  C,D1        Copies the contents of location $1001 to D1
*                           and is equivalent to MOVE.B $1001,D1

In the above example, we use EQU to equate the name C to the absolute address $1001. The EQU directive can be used in other circumstances as well. For
example you can have

LENGTH EQU 35
WIDTH EQU 15
AREA EQU LENGTH*WIDTH

Later in your program, you can use the identifiers instead of the actual numerical values they represent, and the assembler will take care of replacing the
identifiers with the numerical values. The use of EQU is very similar to the use of #define in C. It makes the program clearer and more readable. Just beware of
illegal forward references:

LENGTH  EQU     35              This code won't work
AREA    EQU     LENGTH*WIDTH    because at this stage
WIDTH   EQU     15              WIDTH hasn't been declared yet

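Note that an identifier equated to a value through the EQU directive is simply a number. When that number is meant as a literal (immediate) operand rather than as the address of a memory location, it must still be prefixed by the # sign, as in MOVE.W #LENGTH,D4.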
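
The difference the # sign makes can be seen in the following sketch. The name BUFFER and the value $2000 are hypothetical, chosen so that the address is even and a word access to it is legal.

```asm
* Illustration only: the same EQU name used as a literal and as an address.
BUFFER  EQU     $2000           BUFFER is just another name for the number $2000
*
        MOVE.W  #BUFFER,D4      Immediate: D4(0:15) <- $2000 (the number itself)
        MOVE.W  BUFFER,D5       Absolute:  D5(0:15) <- word stored at address $2000
```
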

The equivalent of EQU in the Milo syntax is the .set directive. Its syntax is:
.set identifier,expression

This directive tells the Assembler to replace all the occurrences of identifier with expression. Consult the NeXT textbook for details.

Question 2:

The use of the EQU directive is not the method of choice for declaring variables and constants. There is a specific assembler directive used for purposes
equivalent to variable declarations in high-level languages. This directive is DS and is qualified by .B, .W or .L. Its syntax:

NAME DS.S <amount of storage space> where S stands for size, i.e. B, W, or L.

DS means "define storage" and it reserves, or allocates, storage locations in memory. Here is how it works:

C       DS.B    1       Reserves one byte of memory and calls it C
X       DS.W    1       Reserves one word of memory and calls it X
Y       DS.L    1       Reserves one longword of memory and calls it Y
ARRAY   DS.W    25      Reserves 25 words of memory and calls this space ARRAY
TABLE   DS.B    $80     Reserves 128 bytes of memory and calls them TABLE
ADRS    DS.L    5       Reserves 5 longwords for variable ADRS

When you use the DS directive, you no longer care about the absolute numerical address of the location(s) where the variables are stored; everything is
taken care of by the Assembler. Just as in a high level language, you don't need to know where a variable is stored, only that space is being allocated
somewhere in memory for that variable.
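
Here is a minimal sketch of a variable reserved with DS and then used by name only. COUNT is a hypothetical variable introduced for this example. Note that DS does not give the variable an initial value, so the fragment stores one before using it.

```asm
* Illustration only: COUNT is a hypothetical word variable.
COUNT   DS.W    1               Reserve one word for COUNT (contents undefined at first)
*
        MOVE.W  #10,COUNT       COUNT = 10
        ADD.W   #1,COUNT        COUNT = 11; the memory location is updated directly
        MOVE.W  COUNT,D0        D0(0:15) = 11 ($000B)
```
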

The Milo syntax for the DS directive is quite different. In Milo, if you want to reserve one word for a variable called milovar and initialize it to the
value 16 you will write

milovar: .word 16

Don't be confused by the number 16: here you don't reserve 16 words for the variable milovar, you reserve one word and give it the initial value 16. For more
details about Milo directives, consult the NeXT textbook.

Question 3:

Constants are declared by the directive DC, which means "define constant". DC is qualified by .B, .W or .L, depending on the storage space that the constant is
meant to occupy. Its syntax:

NAME DC.S constant where S stands for size, i.e. B, W, or L. The constant is stored in memory, and NAME becomes a label for the address at which it is stored (unlike EQU, DC does not replace the occurrences of NAME with the constant itself).

Here is how it is used:

MYPAY   DC.B    12          The value $0C is stored in one byte of memory
*                           and is given the name MYPAY.
        DC.W    3,$3B2      The values $0003 and $03B2 are stored
*                           in consecutive locations, each of them taking up 2 bytes
        DC.L    $DEADBEEF   The value $DEADBEEF (4 bytes) is stored in memory
        DC.L    $3B2        The value $000003B2 is stored in 4 bytes of memory
        DC.B    'Arnie'     The ASCII characters are stored as 5 bytes

The constant that is being stored with the DC directive can be a decimal number, a hex number prefixed with $, a binary number prefixed with %, or an ASCII
string enclosed in single quotes. You can store constants in consecutive locations by separating them with a comma. We'll show you shortly how they are
actually stored in memory.
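
Since the name in front of a DC directive labels the address of the first stored item, the following items can be reached by adding an offset to that name. The sketch below assumes a hypothetical label TABLE in front of two word constants.

```asm
* Illustration only: TABLE is a hypothetical label for two word constants.
TABLE   DC.W    3,$3B2          Two consecutive words: $0003, then $03B2
*
        MOVE.W  TABLE,D0        D0(0:15) = $0003 (first word)
        MOVE.W  TABLE+2,D1      D1(0:15) = $03B2 (second word, 2 bytes further on)
```
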

You can combine the DC directive with other directives, for example

START   EQU     $20C8           Equate START to $20C8
OFFSET  EQU     32              Equate OFFSET to 32 ($0020)
        DC.L    4*START+OFFSET  Store the longword 4*$20C8+$20 = $00008340 in memory

To declare constants with Milo, you could use the same directives used for variable declaration, i.e. .byte, .word and .long. Consult with the NeXT textbook
for details.

Question 4:

There is a directive called ORG, the origin directive, whose operand specifies the absolute address of the beginning of the area of memory where a program
and its associated data are located. Its syntax:
ORG <address>

An ORG directive can be located at any point in the program, it simply resets the value of the location counter that keeps track of where the next item is to be
located in the processor's memory.

Consider the following lines of assembler directives:

        ORG     $001200     Sets the origin of the data area at address $001200
        DC.B    12          The value $0C is stored in one byte of memory
        DC.W    3,$3B2      The values $0003 and $03B2 are stored
*                           in consecutive locations, each of them taking up 2 bytes
        DC.L    $DEADBEEF   The value $DEADBEEF (4 bytes) is stored in memory
        DC.L    $3B2        The value $000003B2 is stored in 4 bytes of memory
        DC.B    'Arnie'     The ASCII characters are stored as 5 bytes

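Here is what the memory of the processor would look like (note: all numbers are in base 16). Words and longwords must start at an even address on the 68000, so after the single byte at 001200 most assemblers skip one byte before placing the first word, just as the Teesside listing in section 9.3.4 places X at 1002 after the one-byte variable C at 1000:

Address Contents
001200 0C
001201 (skipped for alignment)
001202 00
001203 03
001204 03
001205 B2
001206 DE
001207 AD
001208 BE
001209 EF
00120A 00
00120B 00
00120C 03
00120D B2
00120E 41
00120F 72
001210 6E
001211 69
001212 65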

A similar directive exists in Milo; its syntax is .org <address>, and its use is essentially the same.

The ORG directive is thus used to specify the beginning of a program in memory. One program can have multiple origins, e.g. one for the data region and one
for the instructions region. Many assemblers, however, don't require the use of the ORG directive, and they locate the program in memory wherever there is
space for it. In fact, the use of the ORG directive to specify an absolute location for the program in memory can be very dangerous, since it can overwrite
another program or data region already present at that address. It is therefore recommended not to use ORG unless you are certain there is no other program
running on your computer at the same time, which happens virtually never on modern systems.

In order to know where the source code of a program ends, some Assemblers require the presence of the END directive at the last line of the assembly language
program. This directive simply tells the Assembler that there are no more instructions or directives to be assembled. Most of the Assemblers that employ this
directive use it without parameters, but if you use the University of Teesside cross-assembler then you need to supply a single parameter: the address in memory
where the code is located, i.e. the point where it is to start executing. This address is in general the same as the one supplied by the ORG directive.
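
The sketch below shows how ORG and END might fit together in a small program with separate data and instruction regions. The addresses and the names VALUE, RESULT and START are hypothetical. The final STOP #$2700 is a common way of halting a program on a 68000 simulator and assumes that the program runs in supervisor mode, and giving END a label as its parameter follows the Teesside convention described above.

```asm
* Sketch only: addresses and names are hypothetical.
        ORG     $002000         Data region starts at $002000
VALUE   DC.W    $1234           One initialized word
RESULT  DS.W    1               One word reserved for the result
*
        ORG     $001000         Instruction region starts at $001000
START   MOVE.W  VALUE,D0        D0(0:15) = $1234
        ADD.W   #1,D0           D0(0:15) = $1235
        MOVE.W  D0,RESULT       Store the result back in memory
        STOP    #$2700          Halt (assumption: simulator running in supervisor mode)
        END     START           Execution is to begin at START ($001000)
```

Using a label such as START as the END parameter keeps the entry point correct even if the ORG address is changed later.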


9.3.4 A complete 68000 program


Now we return to the example we saw earlier in section 9.3.2 and we'll give the full assembly language program that corresponds to the following C code:

/*
We assume chars are 1 byte in size,
ints are 2 bytes in size,
and long ints are 4 bytes in size;
*/

char C = 'A';
int X = 0x100;
long int Y = 0x2000A111;

X++ ;
if (C != 'B')
    X -= 0x5;
Y += 0x9001;

Here is the assembly language program:


         ORG     $001000         Starting address in memory
A        EQU     $41             Equate A to the ASCII code for 'A'
B        EQU     $42             Equate B to the ASCII code for 'B'
C        DS.B    1               Reserve one byte for variable C
X        DS.W    1               Reserve one word for variable X
Y        DS.L    1               Reserve one longword for variable Y
*
         MOVE.B  #A,C            Initializes C to 'A'; C = 'A';
         MOVE.W  #$100,X         Initializes X to 0x100; X = 0x100;
         MOVE.L  #$2000A111,Y    Initializes Y to 0x2000A111
*
         MOVE.W  X,D1            Fetch X and place it in D1.
         MOVE.L  Y,D2            Fetch Y and place it in D2.
         MOVE.B  C,D3            Fetch C and place it in D3.
*
         ADD.W   #1,D1           Executes X++
         CMP.B   #B,D3           Compares the ASCII code for 'B' (0x42) to C
         BEQ     EXIT_IF         Go to label EXIT_IF, thus skipping the next
*                                instruction if C == 'B'
         SUB.W   #$5,D1          Executes X -= 0x5
EXIT_IF  ADD.L   #$9001,D2       Executes Y += 0x9001
         END     $001000

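Here is the listing file produced by the Teesside cross-assembler. For the DS lines and the instructions, the first hexadecimal number on each line is the address at which the item is placed: the three variables occupy $1000 to $1007, and the first instruction is therefore located at $1008.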

Source file: EXL8.X68
Assembled on: 98-06-16 at: 14:40:38
          by: X68K PC-2.1 Copyright (c) University of Teesside 1989,93
Defaults: ORG $0/FORMAT/OPT A,BRL,CEX,CL,FRL,MC,MD,NOMEX,NOPCO

 1 00001000                        ORG     $001000        ;Starting address in memory
 2 00000041               A:       EQU     $41            ;Equate A to the ASCII code for 'A'
 3 00000042               B:       EQU     $42            ;Equate B to the ASCII code for 'B'
 4 00001000 00000001      C:       DS.B    1              ;Reserve one byte for variable C
 5 00001002 00000002      X:       DS.W    1              ;Reserve one word for variable X
 6 00001004 00000004      Y:       DS.L    1              ;Reserve one longword for variable Y
 7                        *
 8 00001008 11FC00411000           MOVE.B  #A,C           ;Initializes C to 'A'; C = 'A';
 9 0000100E 31FC01001002           MOVE.W  #$100,X        ;Initializes X to 0x100; X = 0x100;
10 00001014 21FC2000A111           MOVE.L  #$2000A111,Y   ;Initializes Y to 0x2000A111
            1004
11                        *
12 0000101C 32381002               MOVE.W  X,D1           ;Fetch X and place it in D1.
13 00001020 24381004               MOVE.L  Y,D2           ;Fetch Y and place it in D2.
14 00001024 16381000               MOVE.B  C,D3           ;Fetch C and place it in D3.
15                        *
16 00001028 5241                   ADD.W   #1,D1          ;Executes X++
17 0000102A 0C030042               CMP.B   #B,D3          ;Compares the ASCII code for 'B' (0x42) to C
18 0000102E 67000004               BEQ     EXIT_IF        ;Go to label EXIT_IF,thus skipping the next
19                        *                                instruction if C == 'B'
20 00001032 5B41                   SUB.W   #$5,D1         ;Executes X -= 0x5
21 00001034 068200009001  EXIT_IF: ADD.L   #$9001,D2      ;Executes Y += 0x9001
22 00001000                        END     $001000

Lines: 22, Errors: 0, Warnings: 0.
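
The program above leaves the updated values only in registers D1 and D2, whereas in the C code X and Y are variables held in memory. As a possible extension, not part of the assembled listing, the two lines below could be placed after the instruction labeled EXIT_IF and before the END directive to write the results back.

```asm
* Possible extension (not in the listing above): store the results back in memory.
        MOVE.W  D1,X            X <- $00FC      (0x100 + 1 - 5, since C is 'A', not 'B')
        MOVE.L  D2,Y            Y <- $20013112  (0x2000A111 + 0x9001)
```
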

Finally, here is the Milo version of the same program. You have probably noted that the most important difference between the Motorola and the Milo syntax is
found in the way assembler directives are written and used. Once again, you are encouraged to refer to the NeXT textbook for information about the Milo
syntax.

         .data

#Reserve one byte for variable C and initialize it to the ascii value of A
C:       .ascii  "A"

         .even
#Reserve one word for variable X and initialize it to 0x0100
X:       .word   0x100

#Reserve one longword for variable Y and initialize it to 0x2000a111
Y:       .long   0x2000a111

         .text

         .set    B, 0x42     |Same as B EQU $42

         movew   X,d1        |Fetch X and place it in d1.
         movel   Y,d2        |Fetch Y and place it in d2.
         moveb   C,d3        |Fetch C and place it in d3.

         addw    #1,d1       |Executes X++
         cmpb    #B,d3       |Compares the ASCII code for 'B' (0x42) to C
         beq     exit_if     |Go to label exit_if, thus skipping the next
#                             instruction, if C == 'B'
         subw    #0x5,d1     |Executes X -= 0x5
exit_if: addl    #0x9001,d2  |Executes Y += 0x9001.


Copyright © McGill University, 1998. All rights reserved.


Reproduction of all or part of this work is permitted for educational or research purposes provided that this copyright notice is included in any copy.
