Reference Manual
Revision 1.0
PTSC
10989 Via Frontera
San Diego, CA 92127
1 (858) 674 5000 voice
1 (858) 674 5005 fax
www.ptsc.com
IGNITE™ IP Reference Manual
For company and product information, access www.ptsc.com. Patriot Scientific Corporation is publicly traded over the
counter, symbol PTSC.
ShBoom and IGNITE are trademarks of Patriot Scientific Corporation. Any other brands and products used within this
document are trademarks or registered trademarks of their respective owners.
The technology discussed in this document may be covered by one or more of the following US patents:
5,440,749; 5,530,890; 5,604,915; 5,659,703; 5,784,584; 5,809,336. Other US and Foreign patents pending.
IMPORTANT NOTICE
Disclaimer
Patriot Scientific Corporation (PTSC) reserves the right to make changes to its products or specifications at any time, or
to discontinue any product, without notice. PTSC advises its customers to obtain the latest product information available
before designing-in or purchasing its products. PTSC assumes no responsibility for the use of any circuitry described
other than the circuitry embodied in a PTSC product. PTSC makes no representations that the circuitry described herein
is free from patent infringement or other rights of third parties, which may result from its use. No license is granted by
implication or otherwise under any patent, patent rights or other rights, of PTSC. PTSC assumes no liability for any
product designs, customer designs, design assistance, or use of its products.
Information within this document is subject to change without notice, but was believed to be accurate at the time of
publication. No warranty of any kind, including but not limited to implied warranties of merchantability or fitness for a
particular application, are stated or implied. PTSC and the author assume no responsibility for any errors or omissions,
and disclaim responsibility for any consequences resulting from the use of the information included herein.
Contents
IMPORTANT NOTICE
Disclaimer
Critical Applications Policy
Figures
Tables
Microprocessor Unit
Address Space
Registers and Stacks
Programming Model
Instruction Set Overview
ALU Operations
Branches, Skips, and Loops
Literals
Data Movement
Loads and Stores
Stack Data Management
Stack Cache Management
Byte and Word Operations
Floating-Point Math
Debugging Features
On-Chip Resources
Miscellaneous
Stacks and Stack Caches
Stack-Page Exceptions
Stack Initialization
Stack Depth
Stack Flush and Restore
Exceptions and Trapping
Floating-Point Math Support
Data Formats
Status and Control Bits
GRS Extension Bits
Rounding
Exceptions
Hardware Debugging Support
Breakpoint
Single-Step
Register mode
MPU Reset
Interrupts
Bit Inputs
Bit Outputs
Instruction Pre-fetch
Posted-Write
On-Chip Resources
ret
rev
rnd
scache
sdepth
sexb
sexw
shift_
shl_
shr_
skip_
split
st
step
sto
sub
subb
subexp
testb
testexp
xcg
xor
Interrupt Controller
Resources
Operation
Interrupt Request Servicing
Recognizing Interrupts
ISR Processing
Bit Inputs
Resources
Input Sampling
Interrupt Usage
General-Purpose Bits
Bit Outputs
Resources
On-Chip Resource Registers
Usage
Bus Interface
Posted Writes
Memory Fault
Timing Information
Figures
Figure 1 CPU Block Diagram
Figure 2 CPU Registers
Figure 3 CPU Memory Map
Figure 4 Byte Order
Figure 5 Add Execution Example
Figure 6 CPU Instruction Format
Figure 7 Stack Exception Region
Figure 8 Floating-Point Number Formats
Figure 9 Register Mode
Figure 10 Bit Input Block Diagram
Figure 11 Bit Input Register
Figure 12 Interrupt Pending Register
Figure 13 Interrupt Under Service Register
Figure 14 Bit Output Register
Figure 15 Interrupt Enable Register
Figure 16 Memory Fault Address Register
Figure 17 Memory Fault Data Register
Figure 18 Miscellaneous C Register
Tables
Table 1 Instruction Bandwidth Comparison
Table 2 CPU Instruction Set
Table 3 ALU Instructions
Table 4 Code example: Rotate
Table 5 CPU Branch Ranges
Table 6 Branch, Loop and Skip Instructions
Table 7 Literal Instructions
Table 8 Data Movement Instructions
Table 9 Load and Store Instructions
Table 10 Code Example: Complex Addressing Mode
Table 11 Code Example: Memory Move and Fill
Table 12 Stack Data Management Instruction
Table 13 Stack Cache Management Instruction
Table 14 Byte and Word Operation Instructions
Table 15 Code Example: Byte Store
Table 16 Code Example: Null-Terminated String Move
Table 17 Code Example: Null Character Search
Table 18 Code Example: Byte Search
Table 19 Floating Point Math Instruction
Table 20 Miscellaneous Instructions
Table 21 Debugging Instruction
Table 22 On-Chip Resources Instruction
Table 23 Code Example: Stack Initialization
Table 24 Code Example: Stack Depth
Table 25 Code Example: Save Context
Table 26 Code Example: Restore Context
Table 27 Traps Dependent on System State
Table 28 Trap Priorities
Table 29 Traps Independent of System State
Table 30 GRS Extension Bit Manipulation Instructions
Table 31 Rounding Mode Action
Table 32 Code Example: Floating-Point Multiply
Table 33 Code example: Memory Fault Service Routine
Table 34 Instructions that Hold-off Pre-fetch
Table 35 CPU Mnemonics and Opcodes (Mnemonic Order)
Table 36 CPU Mnemonics and Opcodes (Opcode Order)
Table 37 Code Example: ISR Vectors
Table 38 Code Example: Bit Input Without Zero-Persistence
Table 39 Code Example: CPU Usage of Bit Inputs
Table 40 Resource Register Reset Values
Table 41 Signal Descriptions
Table 42 CPU Read Timing Parameters
Table 43 CPU Write Timing Parameters
Table 44 Memory Fault Operation Timing Parameters
[Figure 1 CPU Block Diagram. Labels recoverable from the diagram: instruction latch and decode/execute
control; operand stack (s0–s4 shown) and local-register stack (r0–r3 shown) with their stack addressing
logic; ALU with shifters; global registers g0–g15; registers x, ct, mode, PC, sa, la, sdepth, and ldepth;
trap logic and interrupt controller (INTC); On-Chip Resource Registers ioin, ioip, ioius, ioout, ioie,
mfltaddr, mfltdata, and miscc; address bus and data bus.]
Rather than consume the chip area for a barrel shifter, the counted bit-shift operation is “smart”: it
shifts first by bytes, and then by bits, to minimize the cycles required. The shift operations can also
shift double cells (64 bits), allowing bit-rotate instructions to be easily synthesized.

Although floating-point math is useful, and sometimes required, it is not heavily used in embedded
applications. Rather than consume the chip area for a floating-point unit, the CPU supplies instructions
that efficiently perform the most time-consuming aspects of basic IEEE floating-point math operations,
in both single and double precision. The operations use the “smart” shifter to reduce the cycles required.

Byte read and write operations are available, but cycling through individual bytes is slow when scanning
for byte values. These types of operations are made more efficient by instructions that operate on all of
the bytes within a cell at once.

Address Space

The CPU fully supports a linear four-gigabyte address space for all program and data operations.

[Figure 4: big-endian byte order. Within a cell, bits 31–24 hold byte 0, bits 23–16 hold byte 1,
bits 15–8 hold byte 2, and bits 7–0 hold byte 3.]
Figure 4 Byte Order

Several instructions or operations expect addresses aligned on four-byte (cell) boundaries. These
addresses are referred to as cell-aligned. Only the upper 30 bits of the address are used to locate the
data; the two least-significant address bits are ignored but appear externally. Within a cell, the
high-order byte is located at the low byte address. The next lower-order byte is at the next higher
address, and so on. For example, the value 0x12345678 would exist at byte addresses in memory, from low
to high address, as 12 34 56 78. See Figure 4.

Registers and Stacks

The register set contains 52 general-purpose registers, a mode/status register, and two stack pointers.
See Figure 2. It also contains 7 local address-mapped on-chip resource registers used for I/O,
configuration, and status.

The operand stack contains eighteen registers and operates as a push-down stack, with direct access to
the top three registers (s0–s2). These registers and the remaining registers (s3–s17) operate together
as a stack cache. Arithmetic, logical, and data-movement operations, as well as intermediate result
processing, are performed on the operand stack. Parameters are passed to procedures and results are
returned from procedures on the stack, without the requirement of building a stack frame or necessarily
moving data between other registers and the frame. As a true stack, registers are allocated only as
required, resulting in efficient use of available storage. The external operand stack is addressed by
register sa.

The local-register stack contains sixteen registers and operates as a push-down stack with direct access
to the first fifteen registers (r0–r14). These registers and the remaining register (r15) operate
together as a stack cache. As a stack, they are used to hold subroutine return addresses and
automatically nest local-register data. The external local-register stack is addressed by register la.

Both cached stacks automatically spill to memory and refill from memory, and can be arbitrarily deep.
Additionally, s0 and r0 can be used for memory access. See Stacks and Stack Caches.

The use of stack-cached operand and local registers improves performance by eliminating the overhead
required to save and restore context (when compared to processors with only global registers available).
This allows for very efficient interrupt and subroutine processing.

In addition to the stacks are sixteen global registers and three other registers. The global registers
(g0–g15) are used for data storage, and as operand storage for the CPU multiply and divide instructions
(g0). Remaining are mode, which contains mode and status bits; x, which is an index register (in addition
to s0 and r0); and ct, which is a loop counter and also participates in floating-point operations.
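
The automatic spill and refill of a cached stack can be pictured with a small host-side model. The Python sketch below is an illustration only and does not describe the hardware: the class name CachedStack, its methods, and the policy of moving one cell at a time on demand are all assumptions made for the example. The actual behavior is described in Stacks and Stack Caches.

```python
class CachedStack:
    """Toy model of a register stack cache backed by memory (assumed policy)."""

    def __init__(self, cache_size):
        self.cache_size = cache_size  # e.g. 18 for the operand stack
        self.cache = []               # on-chip registers; last item is the top
        self.memory = []              # spilled cells; last item is the newest spill

    def push(self, value):
        if len(self.cache) == self.cache_size:
            # Cache full: spill the deepest cached cell to memory (assumption).
            self.memory.append(self.cache.pop(0))
        self.cache.append(value)

    def pop(self):
        if not self.cache:
            if not self.memory:
                raise IndexError("stack underflow")
            # Cache empty: refill one cell from memory (assumption).
            self.cache.append(self.memory.pop())
        return self.cache.pop()

    def depth(self):
        # Total depth counts cached and spilled cells together.
        return len(self.cache) + len(self.memory)


operand = CachedStack(cache_size=18)
for n in range(20):
    operand.push(n)
spilled = list(operand.memory)               # the two deepest cells were spilled
popped = [operand.pop() for _ in range(20)]  # values return in last-in, first-out order
```

In this model the program sees a single stack of any depth; only the two cells that did not fit in the eighteen cached registers ever reach memory.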
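
The big-endian cell layout and the cell alignment described under Address Space can be checked with a short host-side sketch. The helper names cell_bytes and cell_address are hypothetical and exist only for this illustration; the code uses Python's standard struct module and models nothing beyond the byte order and the ignored low address bits.

```python
import struct


def cell_bytes(value):
    """Return the four bytes of a 32-bit cell, from low to high byte address."""
    return struct.pack(">I", value)  # ">I" = big-endian unsigned 32-bit


def cell_address(byte_address):
    """Return the cell-aligned address: the two low address bits are ignored."""
    return byte_address & 0xFFFFFFFC


layout = cell_bytes(0x12345678)  # bytes 12 34 56 78, high-order byte first
aligned = cell_address(0x1003)   # falls within the cell at 0x1000
```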
Programming Mode
For those familiar with the Java Virtual Machine, American National Standard Forth (ANS Forth), Postscript, or Hewlett-Packard calculators that use postfix notation, commonly known as Reverse Polish Notation (RPN), programming the IGNITE CPU will in many ways be very familiar.
A CPU architecture can be classified as to the number of operands specified within its instruction format. Typical 16-bit and 32-bit CISC and RISC CPUs are usually two- or three-operand architectures, whereas smaller microcontrollers are often one-operand architectures. In each instruction, two- and three-operand architectures specify a source and destination, or two sources and a destination, whereas one-operand architectures specify only one source and have an implicit destination, typically the accumulator. Architectures are also usually not pure. For example, one-operand architectures often have two-operand instructions to specify both a source and destination for data movement between registers.
The IGNITE CPU is a zero-operand architecture, known as a stack computer. Operand sources and destinations are assumed to be on the top of the operand stack, which is also the accumulator. An operation such as add uses both source operands from the top of the operand stack, adds them, and returns the result to the top of the operand stack, thus causing a net reduction of one in the operand stack depth. See Figure 5.
Most ALU operations behave similarly, using two source operands and returning one result operand to the operand stack. A few ALU operations use one source operand and return one result operand to the operand stack. Some ALU and other operations also require a non-stack register, and a very few do not use the operand stack at all.
Non-ALU operations are also similar. Loads (memory reads) either use an address on the operand stack or in a specified register, and place the retrieved data on the operand stack. Stores (memory writes) use either an address on the operand stack or in a register, and use data from the operand stack. Data movement operations push data from a register onto the operand stack, or pop data from the stack into a register.

Operand Stack
Register   Before add   After add
s5         f
s4         e            f
s3         d            e
s2         c            d
s1         b            c
s0         a            a+b
Figure 5 Add Execution Example

Once data is on the operand stack it can be used for any instruction that expects data there. The result of an add, for instance, can be left on the stack indefinitely, until used by a subsequent instruction. See Table 1. Instructions are also available to reorder the data in the top few cells of the operand stack so that prior results can be accessed when required. Data can also be removed from the operand stack and placed in local or global registers to minimize or eliminate later reordering of stack elements. Data can even be popped from the operand stack and restacked by pushing it onto the local-register stack.
Computations are usually most efficiently performed by executing the most deeply nested computations first, leaving the intermediate results on the operand stack, and then combining the intermediate results as the computation unnests. If the nesting of the computation is complex, or if the intermediate results are to be used some time later after other data would have been added to the operand stack, the intermediate results can be removed from the operand stack and stored in global or local registers.
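The zero-operand evaluation order described in this section can be sketched in Python. This is an illustrative model only, not IGNITE code: the OperandStack class and its method names are hypothetical, and only the stack effect of add (two operands consumed from s1 and s0, one result left in s0) is taken from the text.

```python
MASK32 = 0xFFFFFFFF  # cells are 32 bits wide


class OperandStack:
    """Toy model of an operand stack (hypothetical helper, not a PTSC API)."""

    def __init__(self):
        self.cells = []  # cells[-1] models s0, cells[-2] models s1, and so on

    def push(self, value):
        self.cells.append(value & MASK32)

    def add(self):
        # Pop both source operands and push one result: net depth change of -1.
        n2 = self.cells.pop()  # taken from s0
        n1 = self.cells.pop()  # taken from s1
        self.cells.append((n1 + n2) & MASK32)


# Evaluate (1 + 2) + (3 + 4): the nested sums are computed first and their
# results stay on the stack until the final add combines them.
stack = OperandStack()
stack.push(1)
stack.push(2)
stack.add()          # stack now holds: 3
stack.push(3)
stack.push(4)
stack.add()          # stack now holds: 3 7
depth_before_final_add = len(stack.cells)
stack.add()          # stack now holds: 10
result = stack.cells[-1]
```

The intermediate sum 3 never leaves the stack while the second sum is computed, which is the pattern the text recommends for nested computations.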
Global registers are used directly and maintain their data indefinitely. Local registers are registers within the local-register stack cache and, as a stack, must first be allocated. Allocation can be performed by popping data from the operand stack and pushing it onto the local-register stack one cell at a time. It can also be performed by allocating a block of uninitialized stack registers at one time; the uninitialized registers are then initialized by popping data, one cell at a time, into the registers in any order. The allocated local registers can be deallocated by popping data off the local-register stack and pushing it onto the operand stack one cell at a time, and then discarding from the operand stack the data that is not required. Alternatively, the allocated local registers can be deallocated by first saving any data required from the registers, and then deallocating a block of registers at one time. The method selected depends on the number of registers required and whether the data on the operand stack is in the required order.
Registers on both stacks are referenced relative to the tops of the stacks and are thus local in scope. What was accessible in r0, for example, is accessible as r1 after one cell has been pushed onto the local-register stack; the newly pushed value is accessible as r0.
Parameters are passed to and returned from subroutines on the operand stack. An unlimited number of parameters can be passed and returned in this manner. An unlimited number of local-register allocations can also be made. Parameters and allocated local registers thus conveniently nest and unnest across subroutines and program basic blocks.
Subroutine return addresses are pushed onto the local-register stack and thus appear as r0 on entry to the subroutine, with the previous r0 accessible as r1, and so on. As data is pushed onto the stacks and the available register space fills, registers are spilled to memory when required. Similarly, as data is removed from the stacks and the register space empties, the registers are refilled from memory as required. Thus from the program's perspective, the stack registers are always available.

Instruction Set Overview

Table 2 lists the CPU instructions; Table 35, page 66, and Table 36, page 67, list the mnemonics and opcodes. All instructions consist of eight bits, except for those that require immediate data. This allows up to four instructions (an instruction group) to be obtained on each instruction fetch, thus reducing memory-bandwidth requirements compared to typical RISC machines with 32-bit instructions. This characteristic also allows looping on an instruction group (a micro-loop) without additional instruction fetches from memory, further increasing efficiency. Instruction formats are depicted in Figure 6.
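As a rough illustration of how four 8-bit instructions share one 32-bit instruction fetch, the following Python sketch packs and unpacks an instruction group. The function names are hypothetical. The placement of the first opcode in the most significant byte is an assumption based on the big-endian byte order used elsewhere in this manual; only the value 0xC0 for add comes from the instruction reference.

```python
ADD_OPCODE = 0xC0  # opcode listed for add in this manual's instruction reference


def pack_group(opcodes):
    """Pack exactly four 8-bit opcodes into one 32-bit cell.

    Assumption: the first opcode occupies the most significant byte.
    """
    if len(opcodes) != 4:
        raise ValueError("this sketch packs exactly four opcodes")
    cell = 0
    for op in opcodes:
        if not 0 <= op <= 0xFF:
            raise ValueError("opcodes are 8-bit values")
        cell = (cell << 8) | op
    return cell


def unpack_group(cell):
    """Split a 32-bit cell back into its four opcode bytes."""
    return [(cell >> shift) & 0xFF for shift in (24, 16, 8, 0)]


def fetches_needed(instruction_count, instructions_per_fetch=4):
    """Instruction fetches required, ignoring immediate data and branches."""
    return -(-instruction_count // instructions_per_fetch)  # ceiling division


group = pack_group([ADD_OPCODE] * 4)
```

Under these assumptions, ten 8-bit instructions need three fetches, where a machine with one 32-bit instruction per fetch would need ten.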
Branches
  opcode  opcode  opcode  branch       3-bit offset
  opcode  opcode  branch  offset       11-bit offset
  opcode  branch  offset               19-bit offset
  branch  offset                       27-bit offset
Literals
  opcode  opcode  push.n  opcode       push nibble (any positions)
  opcode  opcode  push.b  value        push byte
  opcode  push.l  opcode  opcode       push long (any positions)
  data for first push.l
  data for second push.l (if present)
  data for third push.l (if present)
  data for fourth push.l (if present)
  opcode  opcode  opcode  opcode
All
  opcode  opcode  opcode  opcode
Figure 6 CPU Instruction Format

Offset Bits   Offset Range in Bytes
3             -16/+12
11            -4096/+4092
19            -1048576/+1048572
27            -268435456/+268435452
Note: Encoded offset is in cells. Offset is added to the address of the beginning of the cell containing the branch to compute the destination address.
Table 5 CPU Branch Ranges
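The byte ranges in Table 5 follow from treating each offset field as a two's-complement count of 4-byte cells. The Python sketch below reproduces the table and the destination calculation from the note. The function names are hypothetical, and wrapping the destination to 32 bits is an assumption.

```python
CELL_BYTES = 4


def branch_range_bytes(offset_bits):
    """Return (lowest, highest) byte offset for a two's-complement cell offset."""
    min_cells = -(1 << (offset_bits - 1))
    max_cells = (1 << (offset_bits - 1)) - 1
    return min_cells * CELL_BYTES, max_cells * CELL_BYTES


def sign_extend(field, offset_bits):
    """Interpret the low offset_bits of field as a signed cell offset."""
    sign_bit = 1 << (offset_bits - 1)
    return (field & (sign_bit - 1)) - (field & sign_bit)


def branch_destination(branch_address, field, offset_bits):
    """Add the cell offset to the start of the cell containing the branch."""
    cell_start = branch_address & ~(CELL_BYTES - 1)
    offset_bytes = sign_extend(field, offset_bits) * CELL_BYTES
    return (cell_start + offset_bytes) & 0xFFFFFFFF  # 32-bit wrap is assumed


ranges = {bits: branch_range_bytes(bits) for bits in (3, 11, 19, 27)}
```

Because the offset is relative to the start of the cell, a branch opcode in any byte position of that cell reaches the same destination for a given offset field.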
If more than a few stack data management instructions are required to access a given operand stack cell, performance usually improves by placing data in a local or global register. However, there is a finite supply of global registers, and local registers, at some point, spill to memory. Data should be maintained on the operand stack only while it is efficient to do so. In general, if the program requires frequent access to data in the operand stack deeper than s2, that data, or other more accessible data, should be placed in directly addressable registers to simplify access.
To use the local-register stack, data can be popped from the operand stack and pushed onto the local-register stack, or data can be popped from the local-register stack and pushed onto the operand stack. This mechanism is convenient to move a few cells when the resulting operand stack order is acceptable. When moving more data, or when the data order on the operand stack is not as desired, lframe can be used to allocate or deallocate the required local registers, and then the registers can be written and read directly. Using lframe also has the advantage of making the required local-register stack space available by spilling the stack as a continuous sequence of bus transactions, which minimizes the number of RAS cycles required when writing to DRAM. The instruction sframe behaves similarly to lframe, and is primarily used to discard a number of cells from the operand stack.
All stack data management instruction opcodes are formatted as 8-bit values with no encoded fields.

Stack Cache Management
Other than initialization, and possibly monitoring of overflow and underflow via the related traps, the stack caches do not require active management. Several instructions exist to efficiently manipulate the caches for context switching, status checking, and spill and refill scheduling.
The _depth instructions can be used to determine the number of cells in the SRAM part of the stack caches. This value can be used to discard the values currently in the cache, to later restore the cache depth with _cache, or to compute the total on-chip and external stack depth.
The _cache instructions can be used to ensure either that data is in the cache or that space for data exists in the cache, so that spills and refills occur at preferential times. This allows more control over the caching process and thus a greater degree of determinism during the program execution process. Scheduling stack spills and refills in this way can also improve performance by minimizing the RAS cycles required due to stack memory accesses.
The _frame instructions can be used to allocate a block of uninitialized register space at the top of the SRAM part of a stack, or to discard such a block of register space when no longer required. They, like the _cache instructions, can be used to group stack spills and refills to improve performance by minimizing the RAS cycles required due to stack memory accesses.
See Stacks and Stack Caches on page 15 for more information.
All stack cache management instruction opcodes are formatted as 8-bit values with no encoded fields.
Table 13 Stack Cache Management Instruction
Table 14 Byte and Word Operation Instructions

Byte and Word Operations
Bytes can be addressed and read from memory directly and can be addressed and written to memory with the code depicted in Table 15. Words (16-bit values) are handled similarly.
Instructions are available for manipulating bytes within cells. A byte can be replicated across a cell, the bytes within a cell can be tested for zero, and a cell can be shifted left or right by one byte. Code examples depicting scanning for a specified byte, scanning for a null byte, and moving a null-terminated string in cell-sized units are given below.
All byte operation instruction opcodes are formatted as 8-bit values with no encoded fields.
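The byte-within-cell operations described under Byte and Word Operations (replicating a byte across a cell and testing the bytes of a cell for zero) combine into a cell-at-a-time byte search. The Python sketch below models that idea with ordinary integer arithmetic. It does not use or reproduce the IGNITE instructions themselves: the function names are hypothetical, and the zero-byte test is the widely known subtract-and-mask technique, chosen here only for illustration.

```python
MASK32 = 0xFFFFFFFF


def replicate_byte(b):
    """Copy one byte into all four byte positions of a 32-bit cell."""
    return (b & 0xFF) * 0x01010101


def has_zero_byte(cell):
    """True if any of the four bytes of the cell is zero."""
    return ((cell - 0x01010101) & ~cell & 0x80808080 & MASK32) != 0


def cell_contains_byte(cell, b):
    """XOR with the replicated byte turns a matching byte into a zero byte."""
    return has_zero_byte(cell ^ replicate_byte(b))


def find_byte(data, b):
    """Scan data four bytes at a time; return the index of b, or -1."""
    whole = len(data) - len(data) % 4
    for i in range(0, whole, 4):
        cell = int.from_bytes(data[i:i + 4], "big")  # big-endian cells
        if cell_contains_byte(cell, b):
            for j in range(4):
                if data[i + j] == b:
                    return i + j
    for i in range(whole, len(data)):  # leftover bytes after the last full cell
        if data[i] == b:
            return i
    return -1
```

Testing a whole cell first means the per-byte comparison runs only on the one cell known to contain a match, which is the benefit of working in cell-sized units.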
Debugging Features
Each of these instructions signals an exception and
traps to an application-supplied execution-monitoring
program to assist in the debugging of programs. See
Debugging Support.
Both debugging instruction opcodes are formatted as
8-bit values with no encoded fields.
On-Chip Resources
These instructions allow access to the on-chip
peripherals, status registers, and configuration registers.
All registers can be accessed with the ldo [] and sto []
instructions. The first six registers each contain eight bits,
which are also bit addressable with ldo.i [] and sto.i [].
See On-Chip Resource Registers.
All on-chip resource instruction opcodes are formatted as 8-bit values with no encoded fields.
Table 18 Code Example: Byte Search
The stack caches are designed to always allow the current operation to execute to completion before an implicit stack memory operation is required to occur. No instruction explicitly pushes or explicitly pops more than one cell from either stack (except for stack management instructions). Thus to allow execution to completion, the stack cache logic ensures that there is always one or more cells full and one or more cells empty in each stack cache (except immediately after reset, see Stack Initialization) before instruction execution. If, after the execution of an
Figure 7 Stack Exception Region
The stacks can be arbitrarily deep. When a stack spills, data is written at the address in the stack pointer and then the stack pointer is decremented by four (postdecremented stack pointer). Conversely, when a stack refills, the stack pointer is incremented by four, and then data is read from memory (preincremented stack pointer). The stack pointer thus points to the next location
to write and the stacks grow from higher to lower memory addresses. The stack pointer for the operand stack is sa, and the stack pointer for the local-register stack is la.
Since the stacks are dynamically allocated memory areas, some amount of planning or management is required to ensure the memory areas do not overflow or underflow. The simplest is to allocate a sufficiently large memory area so that overflow conditions won't occur. In this case, a correctly written program does not produce underflow. Alternatively, stack memory can be dynamically allocated or monitored through the use of stack-page exceptions.

Stack-Page Exceptions
Stack-page exceptions occur on any stack-cache memory access near the boundary of any 1024-byte memory page to allow overflow and underflow protection and stack memory management. To prevent thrashing stack-page exceptions near the margins of the page boundary areas, once a boundary area is accessed and the corresponding stack-page exception is signaled, the stack pointer must move to the middle region of the stack page before another stack-page exception can be signaled. See Figure 9.
Stack-page exceptions enable stack memory to be managed by allowing stack memory pages to be reallocated or relocated when the edges of the current stack page are approached. The boundary regions of the stack pages are located 32 cells from the ends of each page to allow even a _cache or _frame instruction to execute to completion and to allow for the corresponding stack cache to be emptied to memory. Using the stack-page exceptions requires that only 2 KB of addressable memory be allotted to each stack at any given time: the current stack page and the page near the most recently encroached boundary.
Each stack supports stack-page overflow and stack-page underflow exceptions. These exception conditions are tested against the memory address that is accessed when the corresponding stack spills or refills between the execution of instructions. mode contains bits that signal local-stack overflow, local-stack underflow, operand stack overflow and operand stack underflow, as well as the corresponding trap enable bits.
The stack-page exceptions have the highest priority of all of the traps. As this implies, it is important to consider carefully the stack effects of the stack trap handler code so that stack-page boundaries are not violated during its execution. Additionally, a memory fault must not occur during a stack page access. The stack page exceptions are intended to be used to ensure valid stack pages can always be accessed without memory faults.
Since stack-page exceptions can occur on any stack spill or refill, usage of certain stack-cache management instructions (_depth and _cache) must be modified to ensure the expected result. A stack-page exception can occur after the stack-cache management instruction and thus modify the cache state. To prevent this, the instruction must complete without a stack spill or refill that would cause a stack-page exception. This can be accomplished by either causing a similar stack effect prior to executing the instruction, or by executing the instruction twice in immediate sequence. See the supplied stack management code examples in this section.
Table 23 Code Example: Stack Initialization

Stack Initialization
After CPU reset both of the CPU stacks should be considered uninitialized until the corresponding stack pointers are loaded, and this should be one of the first operations performed by the CPU.
After a reset, the stacks are abnormally empty. That is, r0 and s2 have not been allocated, and are allocated on the first push operation to, or stack pointer initialization of, the corresponding stack. However, popping the pushed cell causes that stack to be empty and require a refill. The first pushed cell should therefore be left on that stack, or the corresponding stack pointer should be initialized, before the stack is used further. See Table 23.
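The spill and refill pointer behavior and the 32-cell boundary regions described above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not a description of the hardware: the class and function names are hypothetical, memory is modeled as a dictionary, and the boundary-flag hysteresis (os_boundary and ls_boundary) is not modeled. The mapping of the first 32 cells of a page to overflow and the last 32 cells to underflow follows the os_ovf_exc_sig and os_unf_exc_sig descriptions in this manual.

```python
PAGE_BYTES = 1024
CELL_BYTES = 4
BOUNDARY_BYTES = 32 * CELL_BYTES  # boundary regions are 32 cells deep


class StackMemoryModel:
    """Toy model of the memory side of one stack cache (hypothetical)."""

    def __init__(self, stack_pointer):
        self.sp = stack_pointer
        self.memory = {}

    def spill(self, value):
        """Write at the stack pointer, then decrement it (postdecrement)."""
        address = self.sp
        self.memory[address] = value
        self.sp -= CELL_BYTES
        return address

    def refill(self):
        """Increment the stack pointer, then read (preincrement)."""
        self.sp += CELL_BYTES
        return self.sp, self.memory[self.sp]


def in_overflow_region(address):
    """Spill address falls in the first 32 cells of its 1024-byte page."""
    return address % PAGE_BYTES < BOUNDARY_BYTES


def in_underflow_region(address):
    """Refill address falls in the last 32 cells of its 1024-byte page."""
    return address % PAGE_BYTES >= PAGE_BYTES - BOUNDARY_BYTES


model = StackMemoryModel(0x1000)
first_spill = model.spill(11)
second_spill = model.spill(22)
pointer_after_spills = model.sp
first_refill = model.refill()
second_refill = model.refill()
```

The cells come back in the reverse of the order they were spilled, and the pointer returns to its starting value, consistent with a stack that grows from higher to lower addresses.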
Stack Depth
The total number of cells on each stack can readily be
determined by adding the number of cells that have spilled
to memory and the number of cells in the on-chip caches.
See Table 24.
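A minimal Python sketch of this sum follows. It is not the code of Table 24: the function names are hypothetical, and it assumes the caller knows the stack pointer value that corresponds to no spilled cells and has obtained the cached cell count (for example from a _depth instruction).

```python
CELL_BYTES = 4


def cells_spilled(initial_pointer, current_pointer):
    """Cells in memory, for a stack that grows toward lower addresses."""
    return (initial_pointer - current_pointer) // CELL_BYTES


def total_stack_depth(initial_pointer, current_pointer, cached_cells):
    """Total depth = cells spilled to memory + cells in the on-chip cache."""
    return cells_spilled(initial_pointer, current_pointer) + cached_cells
```

For example, a pointer that has moved from 0x2000 down to 0x1FF0 accounts for four spilled cells; with five cells reported in the cache, the total depth is nine.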
Notes:
1. +n > 0, –n < 0
2. If the instruction reads or writes memory or if a posted write is in progress, a memory fault can also occur.
3. If the instruction is single-stepped, a single-step trap also occurs.
4. If any trap occurs, a local-register stack overflow could also occur.
Table 27 Traps Dependent on System State
Table 28 Trap Priorities
Table 29 Traps Independent of System State

Floating-Point Math Support
The CPU supports single-precision (32-bit) and double-precision (64-bit) IEEE floating-point math software. Rather than a floating-point unit and the silicon area it would require, the CPU contains instructions to perform most of the time-consuming operations required when programming basic floating-point math operations. Existing integer math operations are used to supply the core add, subtract, multiply, and divide functions, while special instructions are used to efficiently manipulate the exponents and detect exception conditions. Additionally, a three-bit extension to the top one or two stack cells (depending on the precision) is used to aid in rounding and to supply the required precision and exception signaling operations.

Single Precision
  bit 31        sign
  bits 30-23    exponent
  bits 22-0     significand (hidden bit implied to its left)
Double Precision
  cell holding the low-order significand bits:
  bits 31-0     significand
  cell holding the sign and exponent:
  bit 31        sign
  bits 30-20    exponent
  bits 19-0     significand (hidden bit implied to its left)

Data Formats
Though single- and double-precision IEEE formats are supported, from the perspective of the CPU, only 32-bit values are manipulated at any one time (except for double shifting). See Figure 8. The CPU instructions directly support the normalized data formats depicted. The related denormalized formats are detected by testexp and fully supportable in software.

Status and Control Bits
mode contains 13 bits that set floating-point precision, rounding mode, exception signals, and trap enables. See Figure 9.
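The single-precision field layout (sign in bit 31, exponent in bits 30-23, significand in bits 22-0 with an implied hidden bit) can be checked with a short Python sketch. The function names are hypothetical. The test for an all-ones or all-zeros exponent mirrors the condition this manual attributes to testexp, but the code is only a software model, not the instruction.

```python
import struct


def single_fields(value):
    """Split a float, encoded as IEEE single precision, into its fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction


def significand_with_hidden_bit(fraction):
    """Restore the implied leading one of a normalized significand."""
    return (1 << 23) | fraction


def exponent_is_special(exponent):
    """All-zeros or all-ones exponent: zero, denormal, infinity, or NaN."""
    return exponent == 0 or exponent == 0xFF
```

A normalized value such as -2.5 decodes to sign 1, exponent 128 and fraction 0x200000; restoring the hidden bit gives the 24-bit significand 0xA00000, that is, 1.25 scaled by 2 to the 23rd power.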
Rounding
The GRS extension maintains three extra bits of precision while producing a floating-point result. These bits are used to decide how to round the result to fit the destination format. If one views the bits as if they were just to the right of the binary point, then guard_bit has a positional value of one-half, round_bit has a positional value of one-quarter, and sticky_bit has a positional value of one-eighth. The rounding operation selected by fp_round_mode uses the GRS extension bits and the sign bit of ct to determine how rounding occurs. If guard_bit is zero, the value of the GRS extension is below one-half. If guard_bit is one, the value of the GRS extension is one-half or greater. Since the GRS extension bits are not part of the destination format, they are discarded when the operation is complete. This information is the basis for the operation of the instruction rnd.
1 x x x do nothing
Round toward zero
x x x x do nothing
Table 31 Rounding Mode Action
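How guard, round and sticky bits typically drive a rounding decision can be sketched in Python. This follows the conventional IEEE 754 rules and is not a transcription of Table 31 or of the rnd instruction. The function names are hypothetical, the mode numbering assumes the fp_round_mode encoding 0=nearest, 1=toward negative infinity, 2=toward positive infinity, 3=toward zero, and resolving exact ties to an even result in nearest mode is an assumption.

```python
NEAREST, TOWARD_NEG_INF, TOWARD_POS_INF, TOWARD_ZERO = 0, 1, 2, 3


def grs_value(guard, round_bit, sticky):
    """Value of the GRS bits viewed just right of the binary point."""
    return guard / 2 + round_bit / 4 + sticky / 8


def should_increment(mode, negative, lsb, guard, round_bit, sticky):
    """Decide whether the significand magnitude is incremented when rounding.

    negative is the sign of the value; lsb is the least significant bit of
    the significand before rounding.
    """
    inexact = bool(guard or round_bit or sticky)
    if mode == NEAREST:
        # Above one-half always rounds up; exactly one-half rounds to even.
        return bool(guard and (round_bit or sticky or lsb))
    if mode == TOWARD_NEG_INF:
        return bool(inexact and negative)
    if mode == TOWARD_POS_INF:
        return bool(inexact and not negative)
    return False  # toward zero: the extension bits are simply discarded
```

With guard_bit set and the other two bits clear, the extension is exactly one-half, so nearest mode increments only when the least significant bit is already one.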
Exceptions
To speed processing, exception conditions detected
by the floating-point instructions set exception signaling
bits in mode and, if enabled, trap. The following traps are
supported:
Breakpoint
The instruction bkpt performs an operation similar to
a call subroutine to address 0x134, except that the return
address is the address of the bkpt opcode. This behavior is
required because, due to the instruction push.l, the address
of a call subroutine cannot always be determined from its
return address.
Commonly, bkpt is used to temporarily replace an
instruction in an application at a point of interest for
debugging. The trap handler for bkpt typically restores the
original instruction, displays information for the user, and
waits for a command. Or, the trap handler could be
implemented as a conditional breakpoint to check for a
termination condition (such as a register value or the
number of executions of this particular breakpoint),
continuing execution of the application until the condition
is met. The advantage of bkpt over step is that the
application executes at full speed between breakpoints.
Single-Step
The instruction step is used to execute an application
program one instruction at a time. It acts much like a
return from subroutine, except that after executing one
instruction at the return address, a trap to address 0x138
occurs. The return address from the trap is the address of
the next instruction. The trap handler for step typically
displays information for the user, and waits for a
command. Or, the trap handler could instead check for a
termination condition (such as a register value or the
number of executions of this particular location),
continuing execution of the application until the condition
is met.
os_unf_trap_en
If set, enables an operand stack underflow trap to occur after an operand stack underflow exception is signaled.

os_unf_exc_sig
Set if an operand stack refill occurs, os_boundary is clear, and the accessed memory address is in the last thirty-two cells of a 1024-byte memory page.
Local-Register Stack
Mnemonic Description
ls_boundary boundary area entered
ls_unf_trap_en underflow trap enable
ls_unf_exc_sig underflow exception signal
ls_ovf_trap_en overflow trap enable
ls_ovf_exc_sig overflow exception signal
Operand Stack
Mnemonic Description
os_boundary boundary area entered
os_unf_trap_en underflow trap enable
os_unf_exc_sig underflow exception signal
os_ovf_trap_en overflow trap enable
os_ovf_exc_sig overflow exception signal
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Mnemonic Description
carry carry flag
power_fail power fail occurred
interrupt_en global interrupt enable
Memory Fault
Mnemonic Description
mflt_exc_sig exception signal
mflt_trap_en trap enable
mflt_write fault was a write
Floating Point
Mnemonic Description
sticky_bit rounding sticky bit
round_bit rounding round bit
guard_bit rounding guard bit
fp_rnd_exc_sig round exception signal
fp_rnd_trap_en round trap enable
fp_nrm_exc_sig normalize exception signal
fp_nrm_trap_en normalize trap enable
fp_ovf_exc_sig overflow exception signal
fp_ovf_trap_en overflow trap enable
fp_unf_exc_sig underflow exception signal
fp_unf_trap_en underflow trap enable
fp_exp_exc_sig exponent exception signal
fp_exp_trap_en exponent trap enable
fp_round_mode rounding mode (0=nearest, 1=-infinity, 2=+infinity, 3=zero)
fp_precision precision (0=single, 1=double)
os_ovf_trap_en
If set, enables an operand stack overflow trap to occur after an operand stack overflow exception is signaled.

os_ovf_exc_sig
Set if an operand stack spill occurs, os_boundary is clear, and the accessed memory address is in the first thirty-two cells of a 1024-byte memory page.

carry
Contains the carry bit from the accumulator. Saving and restoring mode can be used to save and restore carry.

power_fail
Set during power-up to indicate that a power failure has occurred. Cleared by any write to mode. Otherwise, not writable.

interrupt_en
If set, interrupts are globally enabled. Set by the instruction ei, cleared by di.

fp_rnd_exc_sig
If set, a previous execution of rnd caused a change in the least significant bit of s0 (s1, if fp_precision is set).

fp_rnd_trap_en
If set, enables a floating-point round trap to occur after a floating-point round exception is signaled.

fp_nrm_exc_sig
If set, one or more of the guard_bit, round_bit and sticky_bit were set after a previous execution of denorm, norml or normr.

fp_nrm_trap_en
If set, enables a floating-point normalize trap to occur after a floating-point normalize exception is signaled.

fp_ovf_exc_sig
If set, a previous execution of normr, addexp or subexp caused the exponent field to increase to or beyond all ones.

fp_ovf_trap_en
If set, enables a floating-point overflow trap to occur after a floating-point overflow exception is signaled.

fp_unf_exc_sig
If set, a previous execution of norml, addexp or subexp caused the exponent field to decrease to or beyond all zeros.

fp_unf_trap_en
If set, enables a floating-point underflow trap to occur after a floating-point underflow exception is signaled.

fp_exp_exc_sig
If set, a previous execution of testexp detected an exponent field containing all ones or all zeros.

fp_exp_trap_en
If set, enables a floating-point exponent trap to occur after a floating-point exponent exception is signaled.

fp_round_mode
Contains the type of rounding to be performed by the CPU instruction rnd.

fp_precision
If clear, the floating-point instructions operate on stack values in IEEE single-precision (32-bit) format. If set, the floating-point instructions operate on stack values in IEEE double-precision (64-bit) format.

CPU Reset

The CPU begins executing at address 0x80000008 with the mode register set to all zeros.

Interrupts

The CPU contains an on-chip prioritized interrupt controller that supports up to eight different interrupt levels. Interrupts can be received through the bit inputs or can be forced in software by writing to ioin. For complete details of interrupts and their servicing, see Interrupt Controller.

Bit Inputs

The CPU contains eight general-purpose bit inputs that are shared with the INTC as requests for those services. The bits are taken from IN[7:0]. See Bit Inputs.
Bit Outputs

The CPU contains eight general-purpose bit outputs which can be written by the CPU. The bits are output on OUT[7:0]. See Bit Outputs.

Table 34 Instructions that Hold-off Pre-fetch

The CPU issues bus requests ordered to optimize execution. To keep executing instructions as much as possible, the next group of instructions is fetched while the current group executes. This is referred to as instruction pre-fetch. Instruction pre-fetch begins as soon as an instruction group begins to execute unless it is held off. Pre-fetch is held off if the executing instruction group contains one of the instructions in Table 34. ld and st only hold off pre-fetch if they occur as the first instruction in the executing instruction group. Knowing which instructions hold off pre-fetch is useful when programming bus configuration information.

Posted-Write

The CPU supports a one-level posted write. This allows CPU execution to continue unimpeded after the write is posted. To maintain memory coherency, posted writes have the highest priority of all CPU bus requests. This guarantees that memory reads following a posted write will always retrieve the most up-to-date data.

On-Chip Resources

The non-CPU hardware features of the CPU are generally accessed by the CPU through a set of 8 registers located in their own address space. Using a separate address space simplifies implementation, preserves opcodes, and prevents cluttering the normal memory address space with peripherals. Collectively known as the On-Chip Resources, these registers allow access to the bit inputs, bit outputs, INTC and system configuration. These registers and their functions are referenced throughout this manual and are described in detail in On-Chip Resource Registers.

Instruction Reference

As a stack-based CPU architecture, the IGNITE PROCESSOR CPU instructions have documentation requirements similar to other stack-based systems, such as the Java Virtual Machine (JVM) and American National Standard Forth (ANS Forth). Not surprisingly, many of the JVM and ANS Forth operations are instructions on the IGNITE CPU. As a result, the JVM and ANS Forth stack notation used for language documentation is useful for describing IGNITE CPU instructions. The basic notation adapted for the IGNITE CPU is:
( input_operands -- output_operands )
( L: input_operands -- output_operands )
where “--” indicates the execution of the instruction. “Input_operands” and “output_operands” are lists of values on the operand stack (the default) or local register stack (preceded by “L:”). These are similar, though not always identical, to the source and destination operands that can be represented within instruction mnemonics. The value held in the top-of-stack register (s0 or r0) is always on the right of the operand list with the values held in the higher ordinal registers appearing to the left (e.g., s2 s1 s0). The only items in the operand lists are those that are pertinent to the instruction; other values may exist under these on the stacks. All of the input_operands are considered to be popped off the stack, the operation performed, and the output_operands pushed on the stack.
For example, a notational expression of:
n1 n2 -- n3
represents two input operands, n1 and n2, and one output operand, n3. For the instruction add, n1 (taken from s1) is added to n2 (taken from s0), and the result is n3 (left in s0). If the name of a value on the left of either diagram is the same as the name of a value on the right, then the value was required, but unchanged. The name represents the operand type. Numeric suffixes are added to indicate different or changed operands of the same type. The values may be bytes, integers, floating-point numbers, addresses, or any other type of value that can be placed in a single 32-bit cell.
addr          address
byte          character or byte (upper 24 bits zero)
n             integer or 32 arbitrary bits
other text    integer or 32 arbitrary bits
ANS Forth defines other operand types and operands that occupy more than one stack cell; those are not used here.
Note that typically all stack action is described by the notation and is not explicitly described in the text. If there are multiple possible outcomes then the outcome options are on separate lines and are to be considered as individual cases. If other registers or memory variables are modified, then that effect is documented in the text.
Also on the stack diagram line is an indication of the effect on carry, if any, as well as the opcode and execution time at the right margin.
A timing with an “M” indicates the specified number of bus requests and bus transactions (memory cycles) for the instruction to complete. The value used for “M” includes both the bus request and bus transaction times and depends on the memory interface implemented.
Timings do not include implied memory cycles such as stack spills and refills required to maintain the state of the stack caches. Any operation that pushes or pops a stack, or references a local register could cause a memory cycle. Operations that wait on the completion of instruction pre-fetch are labeled “Mprefetch.” These are distinct in that pre-fetch occurs in parallel with execution so the wait time is probably not a full memory cycle.

ANS Forth Word Equivalents
Those IGNITE CPU instructions that are exact equivalents of ANS Forth words are indicated in the body text for the instruction. Many additional ANS Forth words simply require a short instruction sequence, but these are not indicated.

Java Byte Code Equivalents
Those IGNITE CPU instructions that are exact equivalents of Java byte codes are indicated in the body text for the IGNITE CPU instruction. Many additional Java byte codes simply require a short instruction sequence, though the most complex byte codes require a subroutine call. For detailed information contact PTSC.
add
add ( n1 n2 -- n3 ) carry± 1100 0000
0xC0
1 CPU-clock
Add n1 and n2 giving the sum n3. carry is set if there is a carry out of bit 31 of the sum and cleared otherwise.
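The stack effect and carry behavior of add can be modeled with plain 32-bit arithmetic. In this Python sketch the function name add32 is hypothetical; the rule that carry is set on a carry out of bit 31 and cleared otherwise comes from the description above.

```python
MASK32 = 0xFFFFFFFF


def add32(n1, n2):
    """Model of ( n1 n2 -- n3 ): return (n3, carry) for 32-bit operands."""
    total = (n1 & MASK32) + (n2 & MASK32)
    n3 = total & MASK32   # result left in s0
    carry = total >> 32   # 1 if there was a carry out of bit 31, else 0
    return n3, carry
```

For example, adding 1 to 0xFFFFFFFF leaves 0 as the sum with carry set, while adding 1 and 2 leaves 3 with carry cleared.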
adda
Add Address
addc
Add with Carry
addexp
Add Exponents
Compute as described above. Clear the exponent field bits and sign bit and set the hidden bit of n1 and n2, giving n3 and n4, respectively. n5 is the result of the computation. After completion, if the exponent-field calculation result equaled or exceeded the maximum value of the exponent field (exponent field result ≥ 255 for single, exponent field result ≥ 2047 for double) an overflow exception is signaled. If the exponent-field calculation result is less than or equal to zero an underflow exception is signaled. When an exception is signaled, the exponent field of n5 contains as many low-order bits of the computed exponent as it will hold.
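The exception conditions stated for the exponent-field result can be expressed as a small Python check. This sketch takes the already computed exponent result as its input, because the calculation itself is not reproduced in this section. The function name is hypothetical, and the thresholds read the limits in the text as "greater than or equal to 255" for single and "greater than or equal to 2047" for double precision.

```python
def classify_exponent_result(result, double=False):
    """Return (stored_field, overflow, underflow) for an exponent result."""
    field_bits = 11 if double else 8
    max_field = (1 << field_bits) - 1   # 255 for single, 2047 for double
    overflow = result >= max_field
    underflow = result <= 0
    stored_field = result & max_field   # low-order bits the field can hold
    return stored_field, overflow, underflow
```

A single-precision result of 300 overflows and leaves 44 in the field, and a result of -3 underflows and leaves 253, in both cases the low-order eight bits of the computed value.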
and
Bitwise AND
bkpt
Breakpoint
b
Branch if Condition
The instruction adds the two's-complement cell offset encoded within and following the br opcode to pc, and
transfers execution to the resulting cell-aligned address.
Equivalent to the run-time for the ANS Forth words AGAIN, AHEAD, ELSE.
If n is zero the instruction adds the two's-complement cell offset encoded within and following the bz opcode to pc,
and transfers execution to the resulting cell-aligned address. If n is non-zero execution continues with the next
instruction group.
Equivalent to the run-time for the ANS Forth words IF, UNTIL, WHILE.
The instruction decrements ct by one. If the resulting ct is non-zero the instruction then adds the two's-complement
cell offset encoded within and following the dbr opcode to pc, and transfers execution to the resulting cell-aligned
address. If the resulting ct is zero execution continues with the next instruction group.
cache
Fill/Empty Stack Cache
The cache instructions are used to optimize program execution, or to make program execution more deterministic. Stack
cache spills and refills can be caused to occur at preferential times, and to occur in bursts to optimize memory access.
Executing the instruction with both n and n-14 (n>0) ensures that an exact number of items are in the stack cache.
Pushing dummy values onto the stack (one value for the local-register stack, three values for the operand stack) and then
executing the instruction with n = -14 causes all previously held data to be spilled to memory. Note that if stack-page
exceptions are enabled, a trap might occur and change the state of the stacks from that set by the cache instruction. See
Stack-Page Exceptions on page ?.
If n < 0 (two's complement), ensure that at least |n| cells can be added to the local-register stack without causing
local-register stack cache spills. Cells are spilled from the stack cache to memory if required. (-14 ≤ n ≤ -1).
If n < 0 (two's complement), ensure that at least |n| cells can be added to the operand stack without causing
operand stack cache spills. Cells are spilled from the stack cache to memory if required. (-14 ≤ n ≤ -1).
call
Call Subroutine
Transfer execution to offset cells from the beginning of the current instruction group. addr is the cell-aligned address
of the next instruction group.
The instruction pushes addr on the local-register stack and then adds the two's-complement cell offset encoded within
and following the call opcode to pc, and transfers execution to the resulting cell-aligned address. The offset is in
the same form and follows the same rules as those for branches.
Replace the value in pc with addr1 to transfer execution there. addr2 is the byte-aligned address of the next
instruction following call []. Note that addr1 is an absolute address and not an offset.
cmp
Compare
copyb
Copy Byte Across Cell
dec
Decrement
denorm
Denormalize
Shifting is performed by bytes or bits to minimize CPU-clock cycles required. If the count in the exponent bits of ct
is larger than the width in bits of the significand field + 3 (for the guard_bit, round_bit and the hidden bit), the
sticky_bit is set and the other bits are cleared, and execution requires one CPU-clock cycle.
depth
Depth of Stack
Note that if stack-page exceptions are enabled, a trap might occur and change the state of the stacks from that returned.
See Stack-Page Exceptions on page ?.
di
Disable Interrupts
di ( -- ) 1011 0111
0xB7
1 CPU-clock
Globally disable interrupts, clearing interrupt_en. The ioie bits are not changed.
divu
Divide Unsigned
ei
Enable Interrupts
ei ( -- ) 1011 0110
0xB6
1 CPU-clock
Globally enable interrupts, setting interrupt_en. The ioie bits are not changed.
eqz
Equal Zero
expdif
Exponent Difference
extexp
Extract Exponent
extsig
Extract Significand
frame
Allocate On-Chip Stack Frame
If n < 0, discard |n| cells, x|n| ... x1, from the top of the local-register stack cache. This causes r0 through r(|n|-1) to
be discarded, r|n| to become r0, r(|n|+1) to become r1, etc. (-15 ≤ n ≤ -1). Each cell discarded that is not in the
stack cache requires one CPU-clock cycle.
If n < 0, discard |n| cells, x|n| ... x1, from within the operand stack cache after s0 and s1. This causes s2 through
s(|n|+1) to be discarded, s(|n|+2) to become s2, s(|n|+3) to become s3, etc. (-15 ≤ n ≤ -1). Each cell
discarded that is not in the stack cache requires one CPU-clock cycle.
iand
Bitwise Invert then AND
inc
Increment
ld
Load Indirect from Memory
ldo
Load Indirect from On-Chip Resource
mloop_
Micro Loop on Condition
An mloop re-executes the current instruction group, beginning with the first instruction in the group, up to the mloop_
instruction, until a specified condition is not met or until ct is decremented to zero. When either termination condition
occurs, execution continues with the instruction following the mloop_ opcode.
mloopn
mloopnp ( n -- n ) 0011 1010
Micro Loop if Negative/Not Positive 0x3A
1 CPU-clock
Decrement ct by one. If ct is non-zero and n is negative (neither positive nor zero) transfer execution to the
beginning of the current instruction group. If ct is zero or n is not negative (either positive or zero) continue
execution with the instruction following mloopn or mloopnp.
mloopnn
mloopp ( n -- n ) 0011 1110
Micro Loop if Not Negative/Positive 0x3E
1 CPU-clock
Decrement ct by one. If ct is non-zero and n is not negative (either positive or zero) transfer execution to the
beginning of the current instruction group. If ct is zero or n is negative (neither positive nor zero) continue execution
with the instruction following mloopnn or mloopp.
mulfs
Multiply Fast Signed
The program must supply n1 in bit-order-reversed form (e.g., the binary value for decimal 13 is 01101 and bit-order
reversed is 10110; note that the original high-order bit is zero as a sign bit and must be included.) The program must
also load ct with the bit count and push a zero for n2. For the example number above, the count would be 5. n3 is
typically discarded.
n2 could be non-zero but its use in this form is questionable. The effect of n2 on the result is that the value of n2
shifted left by the bit count value in ct is added to the result, n4. n3 contains the low cell of the value remaining after
n2n1 is shifted right by the number of bits in ct. Instruction execution time is limited to 65 CPU-clock cycles by the
instruction expiration counter.
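The operand preparation described above for mulfs can be sketched in Python. This is a host-side illustration only, restricted to non-negative multipliers because the text gives no example for negative values; the helper name bit_reversed_multiplier is hypothetical.

```python
def bit_reversed_multiplier(n):
    """Return (n1, count): n in bit-order-reversed form and the bit count for ct.

    The count includes one leading zero bit that serves as the sign bit.
    """
    if n < 0:
        raise ValueError("this sketch only handles non-negative multipliers")
    count = n.bit_length() + 1           # value bits plus the zero sign bit
    reversed_value = 0
    for i in range(count):
        if (n >> i) & 1:
            # bit i of the original becomes bit (count - 1 - i) of the result
            reversed_value |= 1 << (count - 1 - i)
    return reversed_value, count
```

For decimal 13 (binary 01101 with its sign bit) this returns the reversed pattern 10110 and a count of 5, matching the example in the text.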
muls
Multiply Signed
mulu
Multiply Unsigned
mxm
Maximum
neg
Two's-Complement Negation
nop
No Operation
norml
Normalize Left
In both steps, bits shifted into bit zero of n1 come from the GRS extension.
When the operation is complete, if shifting was required and the decremented field in ct reached or passed all zero
bits during the processing, an underflow exception is signaled. If no shifting is required an underflow exception is
not signaled. Then, if any bit in the GRS extension is set, a normalize exception is signaled. The location of the
exponent field depends on fp_precision. If both traps are processed, the underflow trap has higher priority.
Instruction execution time is limited to 65 CPU-clock cycles by the instruction expiration counter.
normr
Normalize Right
When the operation is complete, if shifting was required and the incremented field in ct reached or passed all one
bits during the processing, an overflow exception is signaled. If no shifting is required an overflow exception is not
signaled. Then, if the GRS extension is set, a normalization exception is signaled. The locations of the exponent field
and hidden bit depend on fp_precision. If both traps are processed, the overflow trap has higher priority.
notc
Complement Carry
or
Bitwise OR
pop
pop ( n -- ) 1011 0011
0xB3
1 CPU-clock
Discard n.
1 CPU-clock
Remove n from the operand stack and push it onto the local-register stack (into r0). The previous contents of r0 are
placed in r1, the previous contents of r1 are placed in r2, and so on.
If ri is in the local-register stack cache (i ≤ ldepth) the value in ri is replaced with n. If ri is not currently in the
local-register stack cache (i > ldepth), cells starting at r(ldepth+1) are read from memory sequentially to fill the cache until
ri is reached. ri is then replaced with the value n.
Equivalent to Java byte codes astore_0, astore_1, astore_2, astore_3, fstore_0, fstore_1, fstore_2, fstore_3, istore_0,
istore_1, istore_2, istore_3.
Equivalent when executed twice to Java byte codes dstore_0, dstore_1, dstore_2, dstore_3, lstore_0, lstore_1,
lstore_2, lstore_3.
Equivalent for indexes up to fourteen (almost all actual cases) to Java byte codes astore (vindex), fstore (vindex),
istore (vindex).
Equivalent when executed twice for indexes up to thirteen (almost all actual cases) to Java byte codes dstore
(vindex), lstore (vindex).
push
push ( n -- n n ) 1001 0010
0x92
1 CPU-clock
Duplicate n.
If ri is in the local-register stack cache (i ≤ ldepth) the value in ri is pushed onto the operand stack. If ri is not
currently in the local-register stack cache (i > ldepth), cells starting at r(ldepth+1) are read from memory sequentially
until ri is reached. The value in ri is then pushed onto the operand stack.
Equivalent to Java byte codes aload_0, aload_1, aload_2, aload_3, fload_0, fload_1, fload_2, fload_3, iload_0,
iload_1, iload_2, iload_3.
Equivalent when executed twice to Java byte codes lload_0, lload_1, lload_2, lload_3, dload_0, dload_1, dload_2,
dload_3.
Equivalent for indexes up to fourteen (almost all actual cases) to Java byte codes aload (vindex), fload (vindex),
iload (vindex).
Equivalent when executed twice for indexes up to thirteen (almost all actual cases) to Java byte codes dload
(vindex), lload (vindex).
Equivalent to Java byte codes aconst_null, fconst_0, iconst_m1, iconst_0, iconst_1, iconst_2, iconst_3, iconst_4,
iconst_5.
Equivalent for some values to Java byte code bipush.
Equivalent when executed twice to Java byte codes dconst_0, lconst_0, lconst_1.
replb
Replace Byte
replw
Replace Word
replexp
Replace Exponent
ret
Return
Pop addr from the local-register stack into pc to transfer execution to addr.
Pop addr from the local-register stack into pc to transfer execution to addr. Clear the current interrupt under-service
bit.
rev
Revolve Operand Stack
Equivalent to the run-time for the ANS Forth words FROT, ROT.
rnd
Round
If the value of n2 is different from n1, a rounded exception is signaled. The exception is detected as a change in the
value of bit zero.
sexb
Sign-extend byte
sexw
Sign-extend word
shift_
The number of CPU-clock cycles required to shift the specified number of bits depends on the number of bits requested.
While the count is eight or greater, the value (single or double) is shifted eight bits each CPU-clock cycle. When the count becomes
less than eight the shifting is finished at one bit per CPU-clock cycle. For instance, the worst-case useful shift is 31 bits
(either left or right) and takes eleven CPU-clock cycles—three 8-bit shifts and seven 1-bit shifts plus one CPU-clock
cycle for setup. A 32-bit shift would take five CPU-clock cycles. The counts are modulo 64 in sign-magnitude
representation using only the six least-significant bits for the magnitude and bit 31 for the sign. A zero in the six least-
significant bits represents zero. (Sign-magnitude representation here is a positive integer count in the six least-significant
bits, the middle bits ignored, and bit 31 indicating the sign, zero is positive, one is negative).
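The timing rule and count encoding above can be checked with a short Python sketch. It assumes the one-cycle setup always applies (the text states it only for the 31-bit example), and the function names are hypothetical.

```python
def decode_shift_count(cell):
    """Decode a sign-magnitude shift count: six low bits plus sign in bit 31."""
    magnitude = cell & 0x3F              # middle bits are ignored
    negative = bool((cell >> 31) & 1)    # zero is positive, one is negative
    return magnitude, negative

def shift_cycles(cell):
    """Estimate CPU-clock cycles for a shift with the given count cell."""
    magnitude, _ = decode_shift_count(cell)
    eight_bit_steps = magnitude // 8     # eight bits per cycle while count >= 8
    one_bit_steps = magnitude % 8        # remainder at one bit per cycle
    return 1 + eight_bit_steps + one_bit_steps   # plus one cycle for setup
```

This reproduces the figures in the text: a 31-bit shift takes eleven cycles and a 32-bit shift takes five.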
shl_
Shift Left
shr_
Shift Right
skip
Skip if Condition
skip conditionally or unconditionally skips execution of the remainder of the instruction group. If the condition is
true, skip the remainder of the instruction group and continue execution with the following instruction group. If
condition is false, continue execution with the next instruction.
WARNING: Do not skip a push.l #. Since the CPU will not have executed the push.l # opcode, the corresponding
literal cell is not skipped. The result will be the CPU executing the literal cell.
skipn
skipnp ( n -- ) 0011 0010
Skip if Negative/Not Positive 0x32
1 (not neg) Mprefetch (neg) CPU-clocks
If n is negative (neither positive nor zero), skip the remainder of the instruction group and continue execution with
the next instruction group; otherwise, continue execution with the next instruction.
skipnn
skipp ( n -- ) 0011 0110
Skip if Not Negative/Positive 0x36
1 (neg) Mprefetch (not neg) CPU-clocks
If n is not negative (either positive or zero), skip the remainder of the instruction group and continue execution with
the next instruction group; otherwise, continue execution with the next instruction.
If n is not zero, skip the remainder of the instruction group and continue execution with the next instruction group;
otherwise, continue execution with the next instruction.
If n is zero, skip the remainder of the instruction group and continue execution with the next instruction group;
otherwise, continue execution with the next instruction.
split
Split Cell
st
Store Indirect to Memory
step
Single-Step Processor
sto
Store Indirect to On-Chip Resource
sub
Subtract
subb
Subtract with Borrow
subexp
Subtract Exponents
Compute as described above. Clear the exponent-field bits and sign bit and set the hidden bit of n1 and n2, giving n3
and n4, respectively. n5 is the result of the computation. After completion, if the exponent-field calculation result
equaled or exceeded the maximum value of the exponent field (exponent result ≥ 255 for single, exponent result
≥ 2047 for double) an overflow exception is signaled. If the exponent-field calculation result is less than or equal to
zero an underflow exception is signaled. When an exception is signaled, the exponent field of n5 contains as many
low-order bits of the result as it will hold.
testb
Test Bytes for Zero
testexp
Test Exponent
xcg
Exchange
xor
Bitwise Exclusive OR
ISR Processing
sampling are disabled the inputs are read in the same manner and behave conventionally.
Table 38 Code Example: Bit Input Without Zero-Persistence
CPU Usage
Bits in ioin are read and written by the CPU as a group with ldo [ioin] and sto [ioin], or are read and
written individually with ldo.i [ioXin_i] and sto.i [ioXin_i]. Writing zero bits to ioin has the same effect as
though the external bit inputs had transitioned low for one sampling cycle, except that there is no sampling delay.
This allows software to simulate events such as external interrupt requests. Writing one bits to ioin, unlike data
from external inputs when the bits are zero-persistent, releases persisting zeros to accept the current sample. The
written data is available immediately after the write completes. The CPU can read ioin at any time, without
regard to the designations of the ioin bits, and with no effect on the state of the bits. The CPU does not consume
the state of ioin bits during reads. See the code examples in Table 39.
Interrupt Usage
An ioin bit is configured as an interrupt request
source when the corresponding ioie bit is set. While an
interrupt request is being processed, until its ISR
terminates by executing reti, the corresponding ioin bit is
not zero-persistent and follows the sampled level of the
external input. Specifically, for a given interrupt request,
while its ioie bit is set, and its ioip bit or ioius bit is set, its
ioin bit is not zero-persistent. This effect can be used to
disable zero-persistent behavior on non-interrupting bits
(see below).
General-Purpose Bits
If an ioin bit is not configured for interrupt requests
then it is a zero-persistent general-purpose ioin bit.
Alternatively, by using an effect of the INTC, general-
purpose ioin bits can be configured without zero-
persistence. Any bits so configured should be the lowest-
priority ioin bits to prevent blocking a lower-priority
interrupt. They are configured by setting their ioie and
ioius bits. The ioius bit prevents the ioin bit from zero-
persisting and from being prioritized and causing an
interrupt request. See the code example in Table 38.
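The zero-persistent behavior of a single ioin bit can be modeled with a small Python sketch. This is a behavioral illustration only: the class name IoinBit, the initial value of one, and the omission of INTC consumption are all assumptions made for the example.

```python
class IoinBit:
    """Behavioral model of one ioin bit (hypothetical, for illustration)."""

    def __init__(self, zero_persistent=True):
        self.zero_persistent = zero_persistent
        self.value = 1                   # assumption: bit starts high

    def sample(self, level):
        """Apply one sampled external input level (0 or 1)."""
        if self.zero_persistent and self.value == 0:
            return                       # a persisting zero is held
        self.value = level               # otherwise follow the sampled level

    def cpu_write(self, bit):
        """CPU write: 0 simulates an external low, 1 releases a persisting zero."""
        self.value = bit

    def cpu_read(self):
        """CPU read: returns the bit without consuming its state."""
        return self.value
```

In this model a low pulse on a zero-persistent bit remains visible to the CPU until the CPU writes a one, while a bit configured without zero-persistence simply tracks the sampled input.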
Bit Outputs
Eight general-purpose bit outputs can be set high or low by the CPU. The bits are available in the bit output
register, ioout.
Resources
The bit outputs consist of a register and pins. These resources include:
• Bit output register, ioout: bits that were last written by the CPU. See Figure 15.
• Bit outputs, out[7:0]
The bits are read and written by the CPU as a group with ldo [ioout] and sto [ioout], or are read and
written individually with ldo.i [ioXout_i] and sto.i [ioXout_i]. When written, the new values are available
immediately after the write completes.
On-Chip Resource Registers
The on-chip resource registers comprise portions of various functional areas on the CPU including the CPU,
INTC, and bit inputs. The registers are addressed from the CPU in their own address space using the
instructions ldo and sto at the register level, or ldo.i and sto.i at the bit level (for those registers that have bit
addresses). On other processors, resources of this type are often either memory-mapped or opcode-mapped. By
using a separate address space for these resources, the normal address space remains uncluttered, and opcodes
are preserved. Except as noted, all registers are readable and writable. Areas marked “Reserved Zeros” contain
no programmable bits and always return zero. Areas marked “Reserved” contain unused programmable bits.
Both areas might contain functional programmable bits in the future.
The first several registers are bit addressable in addition to being register addressable. This allows the
CPU to modify individual bits without corrupting other bits that might be changed concurrently by
INTC logic.
Contains sampled data from inputs[7:0]. ioin is the source of inputs for all consumers of bit inputs. Bits are zero-
persistent: once a bit is zero in ioin it stays zero until consumed by the INTC, or written by the CPU with a one.
Under certain conditions bits become not zero-persistent. See Bit Inputs. The bits can be individually read, set and
cleared to prevent race conditions between the CPU and the interrupt controller logic.
Contains interrupt requests that are waiting to be serviced. Interrupts are serviced in order of priority (0 = highest, 7
= lowest). An interrupt request from an I/O-channel transfer or from int occurs by the corresponding pending bit being
set. Bits can be set or cleared to submit or withdraw interrupt requests. When an ioip bit and corresponding ioie bit are
set, the corresponding ioin bit is not zero-persistent. See Interrupt Controller. The bits can be individually read, set and
cleared to prevent race conditions between the CPU and the interrupt controller logic.
Contains the current interrupt service request and those that have been temporarily suspended to service a higher-
priority request. When an ISR executable-code vector for an interrupt request is executed, the ioius bit for that interrupt
request is set and the corresponding ioip bit is cleared. When an ISR executes reti, the highest-priority interrupt under-
service bit is cleared. The bits are used to prevent interrupts from interrupting higher-priority ISRs. When an ioius bit and
corresponding ioie bit are set, the corresponding ioin bit is not zero-persistent. See Interrupt Controller.
The bits can be individually read, set and cleared to prevent race conditions between the CPU and INTC logic.
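The interplay of the pending (ioip) and under-service (ioius) bits can be sketched in Python. The sketch assumes that bit i carries priority i (0 = highest), that a request at the same level as a running ISR does not preempt it, and that interrupts are globally enabled; the function names are hypothetical.

```python
def next_interrupt(ioip, ioius):
    """Return the interrupt level to service next, or None if none qualifies."""
    for level in range(8):               # scan from highest priority (0) down
        mask = 1 << level
        if ioius & mask:
            return None                  # an ISR at this or higher priority is running
        if ioip & mask:
            return level                 # highest-priority pending request
    return None

def enter_isr(ioip, ioius, level):
    """On ISR entry: clear the pending bit and set the under-service bit."""
    mask = 1 << level
    return ioip & ~mask, ioius | mask

def reti(ioius):
    """On reti: clear the highest-priority (lowest-numbered) under-service bit."""
    return ioius & (ioius - 1)
```

With requests 2 and 3 pending and nothing under service, level 2 is chosen; if level 1 is already under service, neither request is taken until its ISR executes reti.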
Contains the bits from CPU bit-output operations. Bits appear on OUT[7:0] immediately after writing.
The bits can be individually read, set and cleared.
Allows a corresponding zero bit in ioin to request the corresponding interrupt service. When an enabled interrupt
request is recognized, the corresponding ioip bit is set and the corresponding ioin bit is no longer zero-persistent. See
Interrupt Controller, page 79. The bits can be individually read, set and cleared. Bit addressability for this register is
an artifact of its position in the address space, and does not imply any race conditions on this register can exist.
Register is read-only.
When a memory page-fault exception occurs during a memory read or write, mfltaddr contains the address that
caused the exception. The contents of mfltaddr and mfltdata are latched until the first read of mfltaddr after the fault.
After reading mfltaddr, the data in mfltaddr and mfltdata are no longer valid.
Register is read-only.
When a memory page-fault exception occurs during a memory write, mfltdata contains the data to be stored
at mfltaddr. The contents of mfltaddr and mfltdata are latched until the first read of mfltaddr after the fault.
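Because reading mfltaddr releases both latched values, a fault handler would read mfltdata first and mfltaddr last. The following Python sketch models that latch behavior. The class name FaultLatch, the use of None for invalid contents, and the choice to ignore a second fault while the latch is held are assumptions made for illustration.

```python
class FaultLatch:
    """Behavioral model of the mfltaddr/mfltdata latch (hypothetical)."""

    def __init__(self):
        self.addr = None
        self.data = None
        self.valid = False

    def fault(self, addr, data=None):
        """Record a page fault; data is only meaningful for a write fault."""
        if not self.valid:               # assumption: held values are not overwritten
            self.addr, self.data, self.valid = addr, data, True

    def read_mfltdata(self):
        """Reading mfltdata does not release the latch."""
        return self.data if self.valid else None

    def read_mfltaddr(self):
        """The first read of mfltaddr after a fault releases both registers."""
        addr = self.addr if self.valid else None
        self.valid = False
        return addr
```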
Mnemonic Description
If set, enables a one-level CPU posted-write buffer, which allows the CPU to continue executing after a write to
memory occurs. A posted write has precedence over subsequent CPU reads to maintain memory coherency. If clear, the
CPU must wait for writes to complete before continuing.
This section provides the information a designer needs to design the logic that interfaces memory and other peripheral
devices with the Ignite CPU processor core, which is embodied as a netlist in EDIF file format.
Bus Interface
The bus interface of the Ignite CPU is relatively simple. There are no special requirements other than those depicted in the
timing diagrams.
Posted Writes
The Ignite CPU supports a one-deep posted write to allow it to continue execution while the write to the external device
is in progress. Typically CPU execution will subsequently stall waiting for the next bus operation to start.
When asserted active (low), completely initializes the CPU. When de-asserted, CPU execution begins at the address
0x80000008. This signal is internally synchronized with the CPU clock.
The *Reset signal must stay active for at least 4 clock cycles for the processor to reach its quiescent state.
There is no phase-locked loop built into the Ignite IP; therefore all operations within the Ignite IP run off this clock input.
Barring a few exceptions, all instructions execute in a single clock cycle, as mentioned in the Ignite Reference Manual.
The address bus provides the non-multiplexed address for the current CPU bus access. The rising edge of the request signal
indicates the start of a bus read/write transfer cycle and also indicates a valid address on the bus.
The address remains valid until the end of the rising edge of the CPU clock that follows the data valid input dval going active.
The two least-significant bits of the address are ignored when fetching or writing cell-wide data. The first valid address
after a reset has been active is the CPU reset address.
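External address-decoding logic can treat the two least-significant address bits as don't-care for cell-wide accesses. A minimal Python sketch of that masking follows; it assumes a 32-bit address, and the names cell_address and RESET_ADDRESS are hypothetical (the reset value is the 0x80000008 given for the reset signal).

```python
RESET_ADDRESS = 0x80000008   # execution start address after reset, per the text

def cell_address(addr):
    """Return the cell-aligned address used for a cell-wide access."""
    return addr & 0xFFFFFFFC   # ignore the two least-significant bits
```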
Provides 32-bit data input when write is inactive. Provides 32-bit data output when write is active.
The rising edge of the Request signal indicates valid write data.
The write data remains valid until the end of the rising edge of the CPU clock following the data valid input dval going
active. For read operations, the read data must meet the setup and hold times with respect to the rising edge of the CPU clock
after the data valid signal dval goes active.
The interface to the ignite_ip EDIF file logic consists of a 32-bit data in bus mdi<31:0> and a 32-bit data out bus
mdo<31:0>. The bi-directional pin driver of the FPGA combines these to form MDR <31:0>.
Bit inputs can be used for general-purpose inputs or as interrupt requests. These inputs are accessible by the CPU through the
ioin register. These inputs must be synchronized with the CPU clock before being presented to the Ignite IP FPGA device.
Bit outputs for general-purpose use. These bits are accessible by the CPU through the ioout register.
When active, indicates that the current bus cycle is a write cycle. When inactive, indicates the current bus cycle is a read
cycle. This signal is active concurrent with the REQ signal that signifies the start of a bus transfer cycle. This signal goes
active at the rising edge of the CPU clock.
This signal goes active at the rising edge of the CPU clock indicating the beginning of a bus transfer cycle.
This signal, generated by external logic, tells the Ignite CPU when to complete the current bus
transfer cycle. This active-high signal is sampled on the rising edge of the CPU clock. If there is a pending bus cycle,
then the CPU will immediately start the next transfer on the rising edge of the CPU clock.
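The request/data-valid handshake can be approximated with a cycle-counting sketch in Python. It assumes dval is first sampled on the rising clock edge after the one that asserts the request, and that back-to-back transfers incur no idle cycle between them; both function names are hypothetical, and the timing diagrams remain the authoritative reference.

```python
def transfer_cycles(dval_samples):
    """Return the number of clock edges until dval is sampled active.

    dval_samples lists the dval level seen at each rising edge after the
    request is asserted; a result of 1 corresponds to a 1-cycle access.
    """
    for cycle, dval in enumerate(dval_samples, start=1):
        if dval:
            return cycle
    raise ValueError("dval never went active")

def burst_cycles(wait_states):
    """Total cycles for back-to-back transfers with the given wait states."""
    return sum(1 + w for w in wait_states)
```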
If the pin *faultb is asserted (active low) following a request at the beginning of a bus transfer cycle, and memory
fault traps are enabled, then the CPU will immediately transfer execution to the memory fault trap location to handle
the memory fault. This signal is provided by external logic implementing a memory manager function. Memory
fault traps are enabled by bit 27 of the mode register. The address and write data that caused the memory fault are saved
in internal registers (mfltaddr and mfltdata) and can be retrieved to allow memory fault recovery. The *faultb signal has a
required setup time when going active and should be driven inactive after the invalid memory cycle completes. The memory
manager generating the *faultb signal must also generate dval to complete the current cycle.
If *faultb is asserted, and memory fault traps are not enabled, operation will be unaffected, provided that
*faultb is removed in a timely manner.
The *faultb signal might be generated by external logic because of either memory errors detected by parity
circuitry or memory non-availability caused by memory page swapping.
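The trap condition described above reduces to a simple predicate, sketched here in Python. The constant and function names are hypothetical; only the active-low polarity of *faultb and the use of mode register bit 27 come from the text.

```python
MFLT_TRAP_ENABLE = 1 << 27   # mode register bit 27 enables memory fault traps

def takes_memory_fault_trap(faultb_pin, mode):
    """Return True if the CPU would vector to the memory fault trap.

    faultb_pin is the electrical level of *faultb (0 = asserted).
    """
    return faultb_pin == 0 and bool(mode & MFLT_TRAP_ENABLE)
```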
Bus Interface
The bus interface for the Ignite CPU employs a very simple request/acknowledge protocol that has been the traditional
mechanism for most embedded processors.
There are two modes of bus transaction, intended for single and multiple accesses respectively.
The Ignite processor IP is a completely synchronous design. All timing information will be stated with respect to the
clock edge, period or duty cycle of the clock that it is operated from.
Timing Information
The timing specifications for the part, as given in the IP data sheet, were derived post-synthesis using the TSMC library
of parts for 0.18-micron technology, and will be different for other technologies.
All inputs have a setup time with respect to the clock input of the device. All outputs have a clock to output time delay
referenced to the clock input of the device.
[Timing diagram: Ignite CPU Read. Signals: CPU Clk, Request, Write, Data Valid. CPU State: Bus Idle, Read Xfer, Bus Idle,
Read Xfer, Bus Idle. Timing parameters 1 through 5, 7, and 8 are marked.]
Notes:
Note 1
T_clkperiod refers to the clock period of the CPU clock. This is an absolutely critical parameter to meet for 1-cycle
memory access.
Note 2
These parameters in this row are defined by the Foundry provided library for a specific semiconductor geometry
and process
Note 3
This is the delay as specified by the component library for clock High to output High
Note 4
This is the delay as specified by the component library for clock High to output Low
Note 5
This is the Setup time before the clock active signal as specified by component library
Note 6
This is the Hold time after the clock active signal as specified by component library
[Timing diagram. Signals: CPU Clk, Request, Write, Data Valid. CPU State: Bus Idle, Write Xfer, Bus Idle, Write Xfer,
Bus Idle. Timing parameters 7, 8, and 10 through 13 are marked.]
Notes:
Note 1
T_clkperiod refers to the clock period of the CPU clock. This is an absolutely critical parameter to meet for 1-cycle
memory access.
Note 2
These parameters in this row are defined by the Foundry provided library for a specific semiconductor geometry
and process
Note 3
This is the delay as specified by the component library for clock High to output High
Note 4
This is the delay as specified by the component library for clock High to output Low
Note 5
This is the Setup time before the clock active signal as specified by component library
Note 6
This is the Hold time after the clock active signal as specified by component library
Note 7
This is the input to high-impedance delay as specified by component library
[Timing diagram. Signals: CPU Clk, Request, Write, Data Valid. CPU State: Idle, Read Xfer, Read Xfer, Read Xfer, Idle.]
[Timing diagram. Signals: CPU Clk, Request, Write, Data Valid. CPU State: Idle, Write Xfer, Write Xfer, Write Xfer, Bus Idle.]
[Timing diagram. Signals: CPU Clk, Request, Data Valid, FAULTB*. CPU State: Bus Idle, Read/Write Xfer, Bus Idle,
Read Xfer for FAULTB Vector, Bus Idle. Timing parameter 8 is marked on Data Valid and FAULTB*.]
Notes:
Note 1
T_clkperiod refers to the clock period of the CPU clock. This is an absolutely critical parameter to meet for 1-cycle
memory access.
Note 6
This is the Hold time after the clock active signal as specified by component library