Harun Šiljak
Preface
This short text is meant to serve as a quick reference for Nios II assembly
language. It does not attempt to replace the official Nios II processor reference
guide, which remains the most authoritative reference and provides comprehensive
information and all necessary details. This booklet, on the other hand, gives the basic
information needed for drafting relatively simple code for Nios II, troubleshooting
assembly language and hex files, and getting a brief overview of the Nios II
architecture and assembly syntax from a practical perspective.
Technical details on how to use Altera software for both hardware implementation of Nios II and its programming are not provided, as they are not within the
scope of this text.
Trademarks for Harry Potter references belong to J.K. Rowling, her publishers
and Warner Bros. Nios II is a registered trademark of the Altera Corporation. This
text is not an official publication and no copyright infringement is intended.
Marginal note: Marginal notes [...] exceptions [...] is infringed, confundo.
Contents
Preface
Chapter 1. Marauder's Map of Nios II
CHAPTER 1
Figure 1.0.3. General purpose pins and their Data register numeration
Marginal note: You can use UART to send an owl to the home computer and vice versa.
CHAPTER 2
Marginal note: If division and/or multiplication are not supported by the processor, an exception Unimplemented instruction is generated.
numbers in two registers. If you need the 32 high-order bits and you are multiplying
two signed integers in registers, use mulxss RC, RA, RB. If the integers are to be
taken as unsigned, use mulxuu RC, RA, RB. Finally, the instruction mulxsu RC,
RA, RB treats the contents of RA as a signed and the contents of RB as an unsigned integer.
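The difference between the three high-word multiplications is easiest to see on concrete bit patterns. The following is a minimal C sketch of what the instructions compute, not a description of the hardware; the function names and the helper to_signed64 are made up for this illustration.

```c
#include <stdint.h>

/* Interpret a 32-bit register pattern as a signed value (two's complement). */
int64_t to_signed64(uint32_t r)
{
    return (int64_t)r - ((r & 0x80000000u) ? 4294967296LL : 0);
}

/* Model of mulxss: high 32 bits of signed RA times signed RB. */
uint32_t model_mulxss(uint32_t ra, uint32_t rb)
{
    int64_t p = to_signed64(ra) * to_signed64(rb);
    return (uint32_t)((uint64_t)p >> 32);
}

/* Model of mulxuu: high 32 bits of unsigned RA times unsigned RB. */
uint32_t model_mulxuu(uint32_t ra, uint32_t rb)
{
    uint64_t p = (uint64_t)ra * (uint64_t)rb;
    return (uint32_t)(p >> 32);
}

/* Model of mulxsu: high 32 bits of signed RA times unsigned RB. */
uint32_t model_mulxsu(uint32_t ra, uint32_t rb)
{
    int64_t p = to_signed64(ra) * (int64_t)rb;
    return (uint32_t)((uint64_t)p >> 32);
}
```

With both registers holding 0xFFFFFFFF, the signed product is (-1)·(-1) = 1, so the high word is 0, while the unsigned high word is 0xFFFFFFFE and the mixed one is 0xFFFFFFFF.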
As an example, we will calculate the division remainder for numbers in two
registers.
    div     r6, r4, r5
    mul     r7, r6, r5
    sub     r8, r4, r7
Note that in this case we know that the multiplication cannot produce a number
larger in magnitude than the one stored in r4, so the product is known to fit
entirely in the 32 low-order bits.
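The same three steps can be checked on a desktop machine with a small C model. This is only a sketch: it assumes that div keeps the integer part of the quotient (truncation toward zero, as C does), and the function name is invented for the illustration.

```c
#include <stdint.h>

/* Mirrors the listing: div r6, r4, r5 ; mul r7, r6, r5 ; sub r8, r4, r7.
 * The caller must not pass r5 == 0 or (INT32_MIN, -1); those are the
 * cases in which the processor raises a division error instead. */
int32_t remainder_via_div_mul_sub(int32_t r4, int32_t r5)
{
    int32_t r6 = r4 / r5;   /* div: quotient, truncated toward zero */
    int32_t r7 = r6 * r5;   /* mul: largest multiple of r5 not exceeding |r4| */
    int32_t r8 = r4 - r7;   /* sub: what is left over is the remainder */
    return r8;
}
```

Under that assumption the remainder takes the sign of the dividend: 17 and 5 give 2, while -17 and 5 give -2.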
2.2.2. Basic bitwise logical operations. Logical bitwise operations directly
implemented in Nios II assembly are AND, OR, XOR and partially NOR.
Logical bitwise conjunction (AND) of the contents of two registers is performed by the R-type
instruction and RC, RA, RB. A 16-bit constant can be ANDed
with the 16 low-order bits of a register (i.e. the constant is padded with 16 zeros to the left before
conjunction) using the I-type instruction andi RB, RA, IMM16. If the conjunction of
the constant should be performed with the 16 high-order bits of a register (i.e.
the constant is padded with 16 zeros to the right first), the instruction andhi RB, RA, IMM16 is
used. Note that there is no sign extension; padding is always done with zeros. This
is called logical padding.
Logical bitwise disjunction (OR) is performed in the same manner: for numbers
stored in two registers, the R-type instruction or RC, RA, RB is used. If an immediate
16-bit constant is used with the 16 low-order bits of a register, the instruction is
ori RB, RA, IMM16, while disjunction with the 16 high-order bits is performed
with orhi RB, RA, IMM16.
In the same manner, xor RC, RA, RB is used for exclusive disjunction of two
registers. If an immediate 16-bit constant is used with the 16 low-order bits of a
register, the instruction is xori RB, RA, IMM16, while exclusive disjunction with
the 16 high-order bits is performed with xorhi RB, RA, IMM16.
Bitwise logical NOR operation only exists in R-type instruction form for two
registers as nor RC, RA, RB.
As an example, we will set the first and the last bit of a register (bits 0 and 31).
    ori     r4, r4, 0x0001
    orhi    r4, r4, 0x8000
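The zero padding of the 16-bit constant can be made explicit in a few lines of C. This is a sketch of the behaviour described above, with invented function names; it is not taken from any Altera source.

```c
#include <stdint.h>

/* Low-half forms: the constant is padded with 16 zeros on the left. */
uint32_t model_andi(uint32_t ra, uint16_t imm16) { return ra & (uint32_t)imm16; }
uint32_t model_ori(uint32_t ra, uint16_t imm16)  { return ra | (uint32_t)imm16; }
uint32_t model_xori(uint32_t ra, uint16_t imm16) { return ra ^ (uint32_t)imm16; }

/* High-half forms: the constant is padded with 16 zeros on the right. */
uint32_t model_andhi(uint32_t ra, uint16_t imm16) { return ra & ((uint32_t)imm16 << 16); }
uint32_t model_orhi(uint32_t ra, uint16_t imm16)  { return ra | ((uint32_t)imm16 << 16); }
uint32_t model_xorhi(uint32_t ra, uint16_t imm16) { return ra ^ ((uint32_t)imm16 << 16); }
```

Applying model_ori with 0x0001 and then model_orhi with 0x8000 to a zero register yields 0x80000001, which matches the listing. Note that the AND forms also clear the half of the register that faces the zero padding.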
Marginal note: Another exception, Division error, detects divide instructions that produce a quotient that can't be represented: division by zero and division of the largest negative number by -1.
the right n positions, where n is the number represented with the 5 least significant bits
of the third register. However, in the default implementation of the Nios II processor,
there is no rori, but its behaviour can be achieved by using roli with 32 − n as
the immediate value.
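That a right rotation by n equals a left rotation by 32 − n can be verified with a short C model. The names rol32 and ror32_via_rol are hypothetical; only the 5 low bits of the rotation amount are used, as described above.

```c
#include <stdint.h>

/* Rotate left by the 5 least significant bits of n. */
uint32_t rol32(uint32_t x, uint32_t n)
{
    n &= 31u;
    /* A shift by 32 is undefined in C, so rotation by 0 is handled apart. */
    return n ? (x << n) | (x >> (32u - n)) : x;
}

/* Rotate right by n, expressed as a left rotation by 32 - n. */
uint32_t ror32_via_rol(uint32_t x, uint32_t n)
{
    return rol32(x, (32u - (n & 31u)) & 31u);
}
```

For instance, rotating 0x12345678 right by 8 positions this way gives 0x78123456.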
If the shift is supposed to be non-circular, then it is assumed that free places in
a register whose contents are being shifted are filled with zeros. Left logical shift is
done either with R-type instruction sll RC, RA, RB or another R-type instruction
slli RC, RA, IMM5. The logic is the same as in the case of rotation.
Logical shift to the right is similarly done with srl RC, RA, RB or with srli
RC, RA, IMM5.
Another type of shift to the right is the arithmetic shift, done with sra RC, RA,
RB or with srai RC, RA, IMM5. Unlike the logical shift, the newly freed places in
the register are in this case filled with copies of the sign bit, just like in the arithmetic
operations covered in the beginning of this chapter. Notice that the arithmetic shift
is not implemented for left shift: in a left shift the empty places appear
at the least significant bits, which makes padding with the sign bit nonsensical.
As an example, we will perform multiplication by 4, signed division by 4 and
unsigned division by 4, respectively, using shifts.
    slli    r4, r4, 0x02
    srai    r4, r4, 0x02
    srli    r4, r4, 0x02
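The three shifts can be modelled in C on raw 32-bit patterns. The sketch below uses invented names and builds the sign fill by hand, because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* Logical shifts: vacated positions are filled with zeros. */
uint32_t model_slli(uint32_t x, unsigned n) { return x << (n & 31u); }
uint32_t model_srli(uint32_t x, unsigned n) { return x >> (n & 31u); }

/* Arithmetic right shift: vacated positions are filled with the sign bit. */
uint32_t model_srai(uint32_t x, unsigned n)
{
    n &= 31u;
    uint32_t logical = x >> n;
    /* n ones at the top if the sign bit was set, otherwise nothing. */
    uint32_t fill = ((x & 0x80000000u) && n) ? ~(0xFFFFFFFFu >> n) : 0u;
    return logical | fill;
}
```

One caveat that follows from the bit patterns: the arithmetic shift rounds toward minus infinity, so -7 shifted right by 2 gives -2, whereas a truncating division of -7 by 4 would give -1. The two agree for non-negative values and for exact multiples of 4.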
cmple RC, RA, RB places 1 in RC if RA ≤ RB, 0 otherwise, and it is implemented as cmpge with swapped parameters. Its immediate equivalent cmplei RB,
RA, IMM16 has no second register to swap with, so it is implemented as cmplti with the immediate value increased by one. The unsigned
equivalents cmpleu RC, RA, RB and cmpleui RB, RA, IMM16 are implemented as
cmpgeu with swapped parameters and cmpltui with the immediate value increased by one, respectively.
2.4. Branching instructions
Nios II assembly offers several branching instructions. An I-type instruction
beq RA, RB, label moves the execution of the program to the address PC+4+IMM16 denoted by the label if the contents of the two registers are the same. Otherwise, it
continues with the next instruction. The I-type instruction bge RA, RB, label
does the same under the condition RA ≥ RB for signed values in registers. An I-type
instruction bgeu RA, RB, label does the same with unsigned values.
Signed comparison RA < RB branching is performed with an I-type instruction
blt RA, RB, label, while the unsigned version of it is bltu RA, RB, label.
Signed comparison RA > RB branching is performed with a pseudo-instruction
bgt RA, RB, label, which is interpreted as blt with swapped parameters. The unsigned version of it, bgtu RA, RB, label, is interpreted as bltu with swapped
parameters.
Signed comparison RA ≤ RB branching is performed with a pseudo-instruction
ble RA, RB, label, which is interpreted as bge with swapped parameters. The unsigned version of it, bleu RA, RB, label, is interpreted as bgeu with swapped
parameters.
Branching if the two registers are not equal is done with an I-type instruction
bne RA, RB, label.
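Two points from the text above lend themselves to a quick check in C: how the branch target follows from PC and IMM16, and why the signed and unsigned comparisons can disagree. This is a sketch with hypothetical function names; it assumes that IMM16 is sign-extended before being added, which is what allows backward branches.

```c
#include <stdint.h>

/* Target of a taken branch located at address pc: PC + 4 + IMM16,
 * with IMM16 treated as a signed 16-bit offset (assumption stated above). */
uint32_t branch_target(uint32_t pc, uint16_t imm16)
{
    uint32_t offset = (imm16 & 0x8000u) ? (0xFFFF0000u | imm16) : imm16;
    return pc + 4u + offset;   /* wraps modulo 2^32 */
}

/* bge condition: flipping the sign bit of both operands turns a signed
 * comparison into an unsigned one without any signed casts. */
int bge_taken(uint32_t ra, uint32_t rb)
{
    return (ra ^ 0x80000000u) >= (rb ^ 0x80000000u);
}

/* bgeu condition: plain unsigned comparison. */
int bgeu_taken(uint32_t ra, uint32_t rb)
{
    return ra >= rb;
}
```

With RA = 0xFFFFFFFF and RB = 1, bge is not taken (RA is -1 as a signed number) but bgeu is taken. An offset of 0xFFFC (that is, -4) makes an instruction branch to itself.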
An unconditional branch (a GOTO) is performed by an I-type instruction
br label. If the address where the program should continue is not a constant (an
immediate value) but a calculated value in a register, then an R-type instruction
jmp RA is used.
Finally, if the full address in the 256 MB range of PC has to be provided, the J-type
instruction jmpi label, where label is an IMM26, is used. The jump is performed to
PC[31..28] : IMM26 × 4.
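The address composition for jmpi can be written out in C as follows. This is a sketch with made-up names, taking the formula above at face value: the top four bits come from PC and the remaining 28 bits are IMM26 multiplied by 4.

```c
#include <stdint.h>

/* Destination of jmpi: PC[31..28] concatenated with IMM26 * 4. */
uint32_t jmpi_target(uint32_t pc, uint32_t imm26)
{
    return (pc & 0xF0000000u) | ((imm26 & 0x03FFFFFFu) << 2);
}

/* IMM26 field that reaches a given word-aligned address. */
uint32_t jmpi_imm26_for(uint32_t address)
{
    return (address >> 2) & 0x03FFFFFFu;
}
```

Since 26 bits of word index cover 2^28 bytes, every target shares its top four address bits with PC, which is where the 256 MB range comes from.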
Marginal note: [...] happen to be not divisible by 4 (i.e. [...] except [...] destination address" exception.
directly (actually, PC+4 again) is an R-type instruction nextpc RC. The content of
PC incremented by four is saved in the specified register.
Return from a subroutine simply returns the content of ra register to PC and
it is performed by an R-type instruction without parameters ret.
Return from an exception is done with an R-type instruction eret. The content
of ea register moves to PC and content of estatus moves to status.
An R-type instruction trap is used either as trap or trap IMM5 to save the
address of the next instruction in ea register, contents of status to estatus,
disable interrupts and start the exception handler. IMM5 is used only for debugging
purposes.
Registers like status, estatus etc. are called control registers. They are
read with an R-type instruction rdctl RC, N, which
copies the contents of the Nth control register to register RC, and written with the instruction
wrctl N, RA, which writes the contents of register RA into the Nth control register.
An I-type instruction rdprs RB, RA, IMM16 reads from register RA in the previous register set, adds sign-extended value IMM16 to its value and places it in RB.
This only functions if the version of Nios II used allows shadow register sets.
Writing to the previous register set is done via the R-type instruction wrprs RC,
RA. It copies the value of register RA in the current register set to register RC in
the previous register set. Note that to write to an arbitrary register set, software can
insert the desired register set number in status.PRS prior to executing wrprs.
2.6. Miscellaneous instructions
The most powerful magical unforgivable curse of Nios II language is an R-type
instruction named custom. It enables the introduction of 256 different custom
user-designed instructions to Nios II assembly. You design a custom hardware structure
adjacent to the Nios II ALU using a hardware description language; it can use two registers
as inputs and one as an output (but it doesn't have to, it can use its own custom
registers). The syntax is custom N, xresult, xone, xtwo where x can stand
either for R, a general purpose Nios II register, or C, a custom register. The part about
machine code of custom instructions will provide more explanations.
Most assembly languages include an instruction which does nothing, and it is
usually called nop. In Nios II assembly language nop is implemented as a pseudo-instruction; the instruction behind it is add r0, r0, r0. It is used to lose
one instruction cycle for timing purposes.
Debuggers place breakpoints using special R-type instructions.
Such instructions are used exclusively by debuggers and hence should not appear in exception handling routines, user programs and operating systems. The syntax
of the breakpoint placement instruction is either break or break IMM5, where the 5-bit immediate constant can be used by the debugger as the descriptor of the breakpoint
type. The effect of a breakpoint is
    bstatus ← status
    PIE ← 0
    U ← 0
    ba ← PC + 4
    PC ← break handler address
On the other hand, bret instruction returns from the break by performing the
following:
    status ← bstatus
    PC ← ba
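The two register-transfer listings can be traced with a toy C model of the registers involved. Everything here is a sketch: the struct and function names are invented, and the bit positions of PIE (bit 0) and U (bit 1) within status are an assumption of this illustration.

```c
#include <stdint.h>

#define STATUS_PIE 0x1u   /* assumed position of the PIE bit */
#define STATUS_U   0x2u   /* assumed position of the U bit */

struct cpu_state {
    uint32_t pc;
    uint32_t status;
    uint32_t bstatus;
    uint32_t ba;
};

/* Effect of break: save status, clear PIE and U, save the return
 * address in ba and continue at the break handler. */
void model_break(struct cpu_state *c, uint32_t break_handler_address)
{
    c->bstatus = c->status;
    c->status &= ~(STATUS_PIE | STATUS_U);
    c->ba = c->pc + 4u;
    c->pc = break_handler_address;
}

/* Effect of bret: restore status and return to the saved address. */
void model_bret(struct cpu_state *c)
{
    c->status = c->bstatus;
    c->pc = c->ba;
}
```

A break followed by a bret leaves status unchanged and resumes at the instruction after the break, provided the handler does not modify bstatus or ba in between.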
Marginal note: Since the content of the ra or ea register may [...]
Marginal note: Otherwise, it throws the illegal operation exception.
Marginal note: Manipulation of control registers, register sets, traps and eret can throw a supervisor-only instruction exception.
Marginal note: Off to Azkaban
Marginal note: It is possible to have a misaligned address in ba register. Then [...] destination address" exception.
Marginal note: If it is accessed in user mode, and not in supervisor mode, it throws the "supervisor-only instruction" exception.
    [...]   RB, %hi(value)
    [...]   RB, RB, %lo(value)
or
    movhi   RB, %hiadj(value)
    addi    RB, RB, %lo(value)
Marginal note: [...] data address, TLB permission violation, fast [...] or [...]
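The reason for having both %hi and %hiadj becomes clear once the arithmetic is written down. The C sketch below models the three macros as they are commonly defined by the GNU assembler for Nios II. It assumes that the first sequence combines the halves with a zero-extending OR (the usual movhi and ori pairing, an assumption here since the mnemonics are missing from the listing), while the second uses addi, which sign-extends its immediate. All function names are invented for the illustration.

```c
#include <stdint.h>

/* Model of %hi(v): bits 31..16 of the value. */
uint32_t macro_hi(uint32_t v) { return (v >> 16) & 0xFFFFu; }

/* Model of %lo(v): bits 15..0 of the value. */
uint32_t macro_lo(uint32_t v) { return v & 0xFFFFu; }

/* Model of %hiadj(v): bits 31..16 plus bit 15, to compensate for the
 * sign extension that addi applies to the low half. */
uint32_t macro_hiadj(uint32_t v)
{
    return ((v >> 16) + ((v >> 15) & 1u)) & 0xFFFFu;
}

/* High half from %hi, then OR in the zero-extended low half. */
uint32_t load_with_or(uint32_t v)
{
    return (macro_hi(v) << 16) | macro_lo(v);
}

/* High half from %hiadj, then add the sign-extended low half. */
uint32_t load_with_addi(uint32_t v)
{
    uint32_t low = macro_lo(v);
    uint32_t sign_extended = (low & 0x8000u) ? (low | 0xFFFF0000u) : low;
    return (macro_hiadj(v) << 16) + sign_extended;   /* wraps modulo 2^32 */
}
```

For value = 0x12348765 the low half has bit 15 set, so addi effectively subtracts 0x10000 from the upper half; %hiadj therefore yields 0x1235 rather than 0x1234, and both sequences end up with the same 32-bit result.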
CHAPTER 3
Instruction   Instruction fields
add           A(5) B(5) C(5) 0x31(6) 0x0(5) 0x3A(6)
addi          A(5) B(5) IMM16(16) 0x04(6)
and           A(5) B(5) C(5) 0x0E(6) 0x0(5) 0x3A(6)
andhi         A(5) B(5) IMM16(16) 0x2C(6)
andi          A(5) B(5) IMM16(16) 0x0C(6)
beq           A(5) B(5) IMM16(16) 0x26(6)
bge           A(5) B(5) IMM16(16) 0x0E(6)
bgeu          A(5) B(5) IMM16(16) 0x2E(6)
blt           A(5) B(5) IMM16(16) 0x16(6)
bltu          A(5) B(5) IMM16(16) 0x36(6)
bne           A(5) B(5) IMM16(16) 0x1E(6)
br            0x0(5) 0x0(5) IMM16(16) 0x06(6)
break         0x0(5) 0x0(5) 0x1E(5) 0x34(6) IMM5(5) 0x3A(6)
bret          0x1E(5) 0x0(5) 0x1E(5) 0x09(6) 0x0(5) 0x3A(6)
call          IMM26(26) 0x0(6)
callr         A(5) 0x0(5) 0x1F(5) 0x1D(6) 0x0(5) 0x3A(6)
cmpeq         A(5) B(5) C(5) 0x20(6) 0x0(5) 0x3A(6)
cmpeqi        A(5) B(5) IMM16(16) 0x20(6)
cmpge         A(5) B(5) C(5) 0x08(6) 0x0(5) 0x3A(6)
cmpgei        A(5) B(5) IMM16(16) 0x08(6)
cmpgeu        A(5) B(5) C(5) 0x28(6) 0x0(5) 0x3A(6)
cmpgeui       A(5) B(5) IMM16(16) 0x28(6)
cmplt         A(5) B(5) C(5) 0x10(6) 0x0(5) 0x3A(6)
cmplti        A(5) B(5) IMM16(16) 0x10(6)
cmpltu        A(5) B(5) C(5) 0x30(6) 0x0(5) 0x3A(6)
cmpltui       A(5) B(5) IMM16(16) 0x30(6)
cmpne         A(5) B(5) C(5) 0x18(6) 0x0(5) 0x3A(6)
cmpnei        A(5) B(5) IMM16(16) 0x18(6)
custom        A(5) B(5) C(5) reada(1) readb(1) readc(1) N(8) 0x32(6)
div           A(5) B(5) C(5) 0x25(6) 0x0(5) 0x3A(6)
divu          A(5) B(5) C(5) 0x24(6) 0x0(5) 0x3A(6)
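Reading the field lists above from left to right as most significant to least significant bit (an assumption of this sketch), the 32-bit instruction words can be assembled by hand. The encoder below is a minimal C illustration with invented function names; it covers the three layouts that appear in the table: six fields ending in 0x3A (R-type), two registers plus IMM16 (I-type) and IMM26 plus opcode (J-type).

```c
#include <stdint.h>

/* I-type: A(5) B(5) IMM16(16) OP(6) */
uint32_t encode_i_type(unsigned a, unsigned b, uint16_t imm16, unsigned op)
{
    return ((uint32_t)(a & 31u) << 27) | ((uint32_t)(b & 31u) << 22)
         | ((uint32_t)imm16 << 6) | (uint32_t)(op & 63u);
}

/* R-type: A(5) B(5) C(5) OPX(6) IMM5(5) 0x3A(6) */
uint32_t encode_r_type(unsigned a, unsigned b, unsigned c,
                       unsigned opx, unsigned imm5)
{
    return ((uint32_t)(a & 31u) << 27) | ((uint32_t)(b & 31u) << 22)
         | ((uint32_t)(c & 31u) << 17) | ((uint32_t)(opx & 63u) << 11)
         | ((uint32_t)(imm5 & 31u) << 6) | 0x3Au;
}

/* J-type: IMM26(26) OP(6) */
uint32_t encode_j_type(uint32_t imm26, unsigned op)
{
    return ((imm26 & 0x03FFFFFFu) << 6) | (uint32_t)(op & 63u);
}
```

As a usage example, the first row (OPX 0x31, which is add in the Nios II instruction set) with all three registers set to r0 gives encode_r_type(0, 0, 0, 0x31, 0) = 0x0001883A, the machine code behind nop. The second row (opcode 0x04, addi) with A = B = 4 and IMM16 = 1 gives 0x21000044.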
eret
flushd
flushda
flushi
flushp
initd
initda
initi
jmp
jmpi
ldb
ldbio
ldbu
ldbuio
ldh
ldhio
ldhu
ldhuio
ldw
ldwio
mul
muli
mulxss
mulxsu
mulxuu
nextpc
nor
or
orhi
ori
rdctl
rdprs
ret
rol
roli
ror
sll
slli
sra
srai
srl
srli
stb
stbio
sth
sthio
stw
stwio
sub
sync
trap
wrctl
wrprs
xor
xorhi
xori