BlackfinReferenceSheet 14by8 12nov2015 UPDATED

Blackfin (ADSP-BFXXX) Reference
V2.3.3 12 November 2015, smithmr@ucalgary.ca

PROGRAMMING MODEL
R0 to R7        Data registers
P0 to P5        Pointer registers
FP              Frame pointer
SP              Stack pointer
A0, A1          Accumulator registers
LC0, LC1        Loop counters
R0, R1, R2, R3 volatile (ENCM369 temporary)
P0, P1, P2 volatile (ENCM369 temporary)

DSP REGISTERS ** NOT SURE WHICH ARE VOLATILE **
I0 to I3        Index registers (Ireg)
M0 to M3        Modify registers (Mreg)
B0 to B3        Base registers (Breg)
L0 to L3        Length registers (Lreg)
Breg is the start of a circular buffer of length Lreg, accessed with index register Ireg
and post-increment (modify) register Mreg.
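
The circular-buffer relationship between Breg, Lreg, Ireg and Mreg can be modelled on a host PC. The sketch below is an illustration, not vendor code: the function name is hypothetical, and it assumes a non-zero length, an index that already lies inside the buffer, and a step no larger than the buffer length.

```cpp
#include <cstdint>

// Hypothetical host-side model of one circular post-modify step (Ireg += Mreg).
//   base   ~ Breg : address of the first byte of the circular buffer
//   length ~ Lreg : buffer length in bytes (assumed > 0 in this sketch)
//   index  ~ Ireg : current address, assumed to lie inside the buffer
//   modify ~ Mreg : signed step, assumed to satisfy |modify| <= length
constexpr std::uint32_t circularPostModify(std::uint32_t base, std::uint32_t length,
                                           std::uint32_t index, std::int32_t modify) {
    // Work in 64-bit signed arithmetic so a negative step cannot wrap around.
    std::int64_t next = static_cast<std::int64_t>(index) + modify;
    const std::int64_t first = base;
    const std::int64_t end = first + length;    // one past the last byte
    if (next >= end)  next -= length;           // stepped off the top: wrap to the start
    if (next < first) next += length;           // stepped off the bottom: wrap to the end
    return static_cast<std::uint32_t>(next);
}
```

With a 16-byte buffer at 0x1000, stepping by +4 from 0x100C returns to 0x1000, and stepping by -4 from 0x1000 lands on 0x100C.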
ASM INSTRUCTION NOTATION CONVENTION

imm       signed immediate              uimm      unsigned immediate
imm3      -4 to +3                      uimm3     0 to 7
reg       Any register R0 to R7, P0 to P5
dreg      Any data register R0 to R7
Preg      Any pointer register P0 to P5
statbit   AZ, AN, AC0, AC1, V, VS, AV0, AV0S, AV1, AV1S, AQ
reg_lo    low part of register (R0.L)
reg_hi    high part of register (P0.H)
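
The immN / uimmN ranges follow the usual two's-complement rule, which gives the -4 to +3 and 0 to 7 ranges quoted for imm3 and uimm3. A minimal sketch, assuming hypothetical helper names and field widths between 1 and 16 bits:

```cpp
// Hypothetical helpers: does a constant fit in an N-bit immediate field?
// Assumes 1 <= bits <= 16, which covers imm3, uimm3, uimm4, imm7, imm16 and uimm16.
constexpr bool fitsSignedImm(long value, unsigned bits) {
    return value >= -(1L << (bits - 1)) && value <= (1L << (bits - 1)) - 1;
}

constexpr bool fitsUnsignedImm(long value, unsigned bits) {
    return value >= 0 && value <= (1L << bits) - 1;
}
```

For example, fitsSignedImm(-4, 3) holds but fitsSignedImm(4, 3) does not, so a compare against the constant 4 needs the unsigned (uimm3) form or a register operand.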
PARAMETER PASSING -- Method used in labs shown later.

// IN STUFF relative to FP
#define INPAR4_ON_STACK            20   // NOT IN R3
#define INPAR3_SPACE_ON_STACK      16   // inPar3 originally in R2
//      offset 12: space reserved for inPar2 (passed in R1)
#define INPAR1_SPACE_ON_STACK       8   // inPar1 originally in R0
#define RETS_LOCATION_ON_STACK      4
#define OLD_FP_LOCATION_ON_STACK    0
// This is relative to SP
#define SAVED_P3                   20
#define SAVED_P4                   16
#define OUTPAR4_ON_STACK           12   // Passed on STACK NOT IN R3
#define OUTPAR3_SPACE_ON_STACK      8   // outPar3 overwrites R2
//      offsets 4 and 0: space reserved for outPar2 (R1) and outPar1 (R0)
.extern _Somewhere;
.extern _Subroutine;
.section program;
.global _Foo;
// void Foo(INPAR1, INPAR2, INPAR3, INPAR4)
_Foo:   LINK 24;                             // 16 spaces for new stack + 2 saved registers
        [SP + SAVED_P4] = P4;                // Save non-volatile registers on the stack
        P4.L = lo(_Somewhere);               // Point to memory location _Somewhere
        P4.H = hi(_Somewhere);               // Reference resolved by linker since .extern
        [FP + INPAR1_SPACE_ON_STACK] = R0;   // Save for later
        [FP + INPAR3_SPACE_ON_STACK] = R2;   // Save for later
        R0 = [FP + INPAR4_ON_STACK];         // OUTPAR4 = INPAR4
        [SP + OUTPAR4_ON_STACK] = R0;
        R2 = -6 (X);                         // Sign extend: OUTPAR3 value = 0xFFFFFFFA
        R1 = R1;                             // OUTPAR2 = INPAR2
        R0 = -6 (Z);                         // Zero extend: OUTPAR1 value = 0x0000FFFA
        CALL _Subroutine;                    // Subroutine(0x0000FFFA, INPAR2, 0xFFFFFFFA, INPAR4)
        W[P4] = R0;                          // Store return value as 16-bit
        P4.L = lo(FIO_FLAG_D); P4.H = hi(FIO_FLAG_D);   // Constant from <defsBF533.h>
                                             // requires hi/lo macros
        P4 = [SP + SAVED_P4];                // Restore non-volatile register
        UNLINK;                              // Also see:  P0 = [FP + 4];   // Get RETS
        RTS;                                 //            UNLINK;
                                             //            JUMP (P0);       // Faster?
PROGRAM FLOW INSTRUCTIONS
JUMP User_Label              PC replaced by address of User_Label
JUMP (Preg)                  PC replaced by value in P-register
IF CC JUMP User_Label        Jump if CC = 1; PC replaced by address of User_Label
IF !CC JUMP User_Label       Jump if CC = 0; PC replaced by address of User_Label
IF CC JUMP User_Label (bp) and IF !CC JUMP User_Label (bp) are versions where the branch
is predicted to be taken. Correctly predicting branches improves pipeline performance.
CALL User_Label              PC replaced by address of User_Label; address of next instruction saved in RETS
CALL (Preg)                  PC replaced by value in P-register; address of next instruction saved in RETS
RTS  return from subroutine (RETS) (ENCM369 LR)     RTI  return from interrupt (RETI)
RTX  return from exception (RETX)                   RTN  return from NMI (RETN)
RTE  return from emulation (RETE)                   Name of return register used is in brackets
Loop loop_name loopcounter;  HARDWARE LOOP COUNTER
Loop_begin loop_name; 1st instruction     Loop_end loop_name; last instruction
Lsetup(Label_1stinstruction, Label_last) loopcounter;
Can use Loopcounter, Loopcounter = Preg or Loopcounter = Preg >> 1
LTn, LBn, LCn (Loop_Top, Loop_Bottom, Loop_Counter) can be set directly
LOAD / STORE INSTRUCTIONS

reg_lo = uimm16; reg_hi = uimm16;    half-word loads
reg = uimm16 (Z);                    zero extended to 32 bits
reg = imm16 (X);                     sign extended to 32 bits (also imm7 version)
Loading 32-bit values
reg.L = uimm32 & 0xFFFF; reg.H = (uimm32 >> 16) & 0xFFFF;
BUT .IMPORT value; reg.L = value; reg.H = value; (correct half-word is used)
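
The half-word split and the (Z) / (X) extensions can be checked on a host PC. This is a sketch with hypothetical function names; it models only the arithmetic, not the assembler's hi() and lo() macros.

```cpp
#include <cstdint>

// Split a 32-bit constant into the two 16-bit halves loaded into reg.L and reg.H.
constexpr std::uint16_t lowHalf(std::uint32_t value) {
    return static_cast<std::uint16_t>(value & 0xFFFFu);
}
constexpr std::uint16_t highHalf(std::uint32_t value) {
    return static_cast<std::uint16_t>((value >> 16) & 0xFFFFu);
}

// Rebuild the 32-bit value from its halves.
constexpr std::uint32_t joinHalves(std::uint16_t high, std::uint16_t low) {
    return (static_cast<std::uint32_t>(high) << 16) | low;
}

// (Z): the upper 16 bits become zero.
constexpr std::uint32_t zeroExtend16(std::uint16_t half) {
    return half;
}

// (X): the upper 16 bits become copies of bit 15.
constexpr std::uint32_t signExtend16(std::uint16_t half) {
    return (half & 0x8000u) ? (0xFFFF0000u | half) : half;
}
```

The 16-bit pattern for -6 is 0xFFFA, so signExtend16(0xFFFA) gives 0xFFFFFFFA and zeroExtend16(0xFFFA) gives 0x0000FFFA.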
Preg = [indirect_address]; [indirect_address] = Preg;
where indirect address is Preg, Preg++, Preg--, Preg + offset, Preg - offset, FP - offset.
Offsets are a multiple of 4.
Dreg = [indirect_address]; [indirect_address] = Dreg;
where indirect address is Preg, Preg++, Preg--, Preg + small / large offset, Preg - large offset,
FP - offset, Preg ++ Preg, Ireg, Ireg++, Ireg--, Ireg ++ Mreg
Dreg = W[indirect_address] (Z);     zero-extend half-word fetch
Dreg = W[indirect_address] (X);     sign-extend half-word fetch
Dreg = B[indirect_address] (Z); Dreg = B[indirect_address] (X);
where indirect address is Preg, Preg++, Preg--, Preg + offset, Preg - offset.
Word access only: Preg ++ Preg; offset factor of 2.
Dreg_lo = W[indirect_address]; Dreg_hi = W[indirect_address];
W[indirect_address] = Dreg_lo; W[indirect_address] = Dreg_hi;
where indirect address is Ireg, Ireg++, Ireg--, Preg, Preg ++ Preg
COMPARE INSTRUCTIONS
CC = Operand_1 == Operand_2;
CC = Operand_1 <= Operand_2;         signed compare
CC = Operand_1 <= Operand_2 (IU);    unsigned compare
CC = Operand_1 < Operand_2;          signed compare
CC = Operand_1 < Operand_2 (IU);     unsigned compare
Compare Data Registers -- Not parallel (16-bit)
Operand_1   Dreg
Operand_2   Dreg or small constant, where small constant is imm3 (-4 to +3) or uimm3 (0 to 7)
Compare Pointer Registers -- Not parallel (16-bit)
Operand_1   Preg
Operand_2   Preg or small constant, where small constant is imm3 or uimm3
Compare Accumulator Registers -- Not parallel (16-bit)
Operand_1   A0
Operand_2   A1
Always signed compares
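
The difference between the signed and unsigned forms only shows up when the top bit of an operand is set. A minimal host-side sketch, with hypothetical function names, that treats register contents as raw 32-bit patterns:

```cpp
#include <cstdint>

// Unsigned "less than": compare the raw 32-bit patterns.
constexpr bool lessThanUnsigned(std::uint32_t a, std::uint32_t b) {
    return a < b;
}

// Signed "less than" on the same patterns. Flipping the sign bit maps the
// two's-complement order onto the unsigned order, so no signed casts are needed.
constexpr bool lessThanSigned(std::uint32_t a, std::uint32_t b) {
    return (a ^ 0x80000000u) < (b ^ 0x80000000u);
}
```

The pattern 0xFFFFFFFA is -6 when read as signed, so lessThanSigned(0xFFFFFFFA, 1) is true while lessThanUnsigned(0xFFFFFFFA, 1) is false.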
MOVE CC INSTRUCTIONS
Dest OP CC;      Dest is Dreg or statbit
CC OP Source;    Source is Dreg or statbit
OP is =, |=, &=, ^=      e.g. R0 |= CC;
Note: CC = Dreg sets CC = 1 if Dreg != 0
IF CC DPreg = DPreg ;     /* move if CC = 1 */     DPreg is Dreg, Preg, SP, FP
IF ! CC DPreg = DPreg ;   /* move if CC = 0 */     DPreg is Dreg, Preg, SP, FP
NEGATE CC INSTRUCTIONS
CC = ! CC;
MOVE REGISTER TO REGISTER INSTRUCTIONS

genreg = genreg ;
genreg = dagreg ;
dagreg = genreg ;
dagreg = dagreg ;
genreg = USP ;
USP = genreg ;
Dreg = sysreg ; /* sysreg to 32-bit D-register */
sysreg = Dreg ; /* 32-bit D-register to sysreg */
sysreg = Preg ; /* 32-bit P-register to sysreg */
sysreg = USP;
A0 = A1 ;        /* move 40-bit Accumulator value */
A1 = A0 ;        /* move 40-bit Accumulator value */
A0 = Dreg ;      /* 32-bit D-register to 40-bit A0, sign extended */
A1 = Dreg ;      /* 32-bit D-register to 40-bit A1, sign extended */
Accumulator to D-register Move:
Dreg_even = A0 (opt_mode) ;                  /* move 32-bit A0.W to even Dreg */
Dreg_odd = A1 (opt_mode) ;                   /* move 32-bit A1.W to odd Dreg */
Dreg_even = A0, Dreg_odd = A1 (opt_mode) ;   /* move both Accumulators to a register pair */
Dreg_odd = A1, Dreg_even = A0 (opt_mode) ;   /* move both Accumulators to a register pair */
Dreg = Dreg_lo (Z) ;     Dreg = Dreg_lo (X) ;
Dreg = Dreg.B (Z) ;      Dreg = Dreg.B (X) ;      lowest 8 bits
Acc.X = Dreg_lo ;        Least significant 8 bits moved
Dreg_lo = Acc.X ;        8 bits moved, sign extended
Acc.L = Dreg_lo ;        Least significant 16 bits moved
Dreg_lo = Acc.L ;        16 bits moved
Acc.H = Dreg_hi ;        Most significant 16 bits moved
Dreg_hi = Acc.H ;        16 bits moved
Accumulator to Half D-register Move supports the following options

Signed fraction format (default).
Unsigned fraction format (saturated) (FU).
Signed and unsigned integer formats (IS) (IU).
Signed fraction with truncation (T),
Signed fraction with scaling and rounding (S2RND),
Signed integer with scaling (ISS2),
Signed integer with high word extract (IH) MORE INFO TO BE ADDED
STACK INSTRUCTIONS -- SP points to the last used location

[--SP] = allreg;     (Suggest before LINK)
allreg = [SP++];     (Suggest after UNLINK)
[--SP] = (R7 : Dreglimit, P5 : Preglimit); or Dreg and Preg ranges on their own
LINK uimm (Manual says minimum value is 8, but LINK 0 and LINK 4 seem OK)
Saves RETS and FP on stack, copies SP into FP and then decrements SP by uimm
UNLINK causes FP -> SP, then Mem[SP++] -> FP, Mem[SP++] -> RETS
Use when calling C++ or ASM routines from inside ASM code
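
The address arithmetic behind LINK and UNLINK can be traced with a small host-side model. This is a sketch only: the struct and function names are hypothetical, and it tracks addresses rather than memory contents.

```cpp
#include <cstdint>

// Addresses that result from LINK frameBytes, given SP just before the LINK.
struct LinkedFrame {
    std::uint32_t fp;        // new FP: address where the old FP was saved
    std::uint32_t sp;        // new SP: lowest address of the new frame
    std::uint32_t retsAddr;  // address where RETS was saved (FP + 4)
};

// LINK: push RETS, push FP, copy SP into FP, then decrement SP by frameBytes.
constexpr LinkedFrame modelLink(std::uint32_t spBefore, std::uint32_t frameBytes) {
    return LinkedFrame{spBefore - 8u, spBefore - 8u - frameBytes, spBefore - 4u};
}

// UNLINK: FP -> SP, then two pops (old FP, then RETS), so SP ends 8 above FP.
constexpr std::uint32_t modelUnlinkSp(const LinkedFrame& frame) {
    return frame.fp + 8u;
}
```

With SP at 0x1000 before LINK 24, the model gives FP = 0x0FF8 and SP = 0x0FE0, with RETS at FP + 4, and UNLINK returns SP to 0x1000. The caller's fourth outgoing argument at its SP + 12 then sits at FP + 20 in the callee, which matches the INPAR4_ON_STACK offset of 20 used in the first example.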
SHIFT / ROTATE INSTRUCTIONS

dest_pntr = (dest_pntr + src_reg) << 1;       Down shift not allowed
dest_pntr = (dest_pntr + src_reg) << 2;
dest_reg = (dest_reg + src_reg) << 1;
dest_reg = (dest_reg + src_reg) << 2;
dest_pntr = adder_pntr + ( src_pntr << 1 );
dest_pntr = adder_pntr + ( src_pntr << 2 );
ARITHMETIC SHIFT     ASHIFT or >>>
dest_reg >>>= shift_magnitude;
dest_reg = src_reg >>> shift_magnitude (opt_sat);
dest_reg = src_reg << shift_magnitude (S);
accumulator = accumulator >>> shift_magnitude;
dest_reg = ASHIFT src_reg BY shift_magnitude (opt_sat);
accumulator = ASHIFT accumulator BY shift_magnitude;
NOTE: ASHIFT using registers has to use register.L, e.g. R0 = ASHIFT R1 BY R2.L;
LOGICAL SHIFT        LSHIFT or >>
dest_pntr = src_pntr >> 1;
dest_pntr = src_pntr << 1;
dest_pntr = src_pntr >> 2;
dest_pntr = src_pntr << 2;
dest_reg >>= shift_magnitude; dest_reg <<= shift_magnitude;
dest_reg = src_reg >> shift_magnitude;
dest_reg = src_reg << shift_magnitude;
dest_reg = LSHIFT src_reg BY shift_magnitude;
LOGICAL INSTRUCTIONS
Dreg = Dreg1 LOGICAL_OP Dreg2;      LOGICAL_OP is &, |, ^
Dreg = ~Dreg1;                      complement
Also BXOR and BXORSHIFT -- more later
BIT INSTRUCTIONS
BitInstruction(Dreg, bit_position) where bit_position is 0 to 31
BitInstruction is BITCLR (clear), BITSET (set), BITTGL (toggle)
CC = BITTST (Dreg, bit_position);       Bit test
CC = !BITTST (Dreg, bit_position);      Inverted bit test
R0 = R1.B(X); R0 = R1.B(Z);     // Extract and sign- or zero-extend a byte value
                                // CAN'T DO MATH ON A BYTE VALUE DIRECTLY
Dreg = DEPOSIT ( backgroundDreg, foregroundDreg ) ;
Dreg = DEPOSIT ( Dreg, Dreg ) (X) ;     /* sign-extended */
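
The difference between the arithmetic (>>>) and logical (>>) right shifts, and the effect of the bit instructions, can be modelled on a host PC. The sketch uses hypothetical function names and assumes a shift count or bit position from 0 to 31.

```cpp
#include <cstdint>

// Logical right shift (>>): vacated high bits are filled with zeros.
constexpr std::uint32_t logicalShiftRight(std::uint32_t value, unsigned count) {
    return value >> count;
}

// Arithmetic right shift (>>>): vacated high bits are copies of the sign bit.
// Written on unsigned patterns so the result does not depend on the compiler.
constexpr std::uint32_t arithmeticShiftRight(std::uint32_t value, unsigned count) {
    return (value & 0x80000000u) ? ~(~value >> count) : (value >> count);
}

// Models of BITSET, BITCLR, BITTGL and CC = BITTST on a 32-bit value.
constexpr std::uint32_t bitSet(std::uint32_t value, unsigned bit) {
    return value | (std::uint32_t{1} << bit);
}
constexpr std::uint32_t bitClear(std::uint32_t value, unsigned bit) {
    return value & ~(std::uint32_t{1} << bit);
}
constexpr std::uint32_t bitToggle(std::uint32_t value, unsigned bit) {
    return value ^ (std::uint32_t{1} << bit);
}
constexpr bool bitTest(std::uint32_t value, unsigned bit) {
    return ((value >> bit) & 1u) != 0;
}
```

The pattern 0xFFFFFFF0 is -16 as a signed value. Shifting it right by 2 arithmetically gives 0xFFFFFFFC (-4), while the logical shift gives 0x3FFFFFFC.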
ARITHMETIC INSTRUCTIONS
dest_reg = ABS src_reg;
dest_reg = src_reg_1 + src_reg_2;
NOTE: dest_reg.LorH = src_reg1.LorH + src_reg2.LorH (mode); mode = (NS) or (S)
// Arithmetic is saturating (S) or non-saturating (NS); normal math is NS
NOTE: dest_reg = src_reg_1 +|- src_reg_2; high-half and low-half operations both done
// Can also do +|+, +|-, -|+, -|-
Dreg_lo_hi = Dreg + Dreg (RND20) ;    STEP 1: Downshift by 4 and then
Dreg_lo_hi = Dreg - Dreg (RND20) ;    STEP 2: perform operation, round top 16 bits
                                      STEP 3: and use top 16 bits fractional number
Dreg_lo_hi = Dreg + Dreg (RND12) ;    STEP 1: Upshift by 4 and then
Dreg_lo_hi = Dreg - Dreg (RND12) ;    STEP 2: perform operation,
                                      STEP 3: round and use top 16 bits
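
The (S) and (NS) modes of the 16-bit add differ only when the true sum leaves the range -32768 to 32767. A host-side sketch with hypothetical function names:

```cpp
#include <cstdint>

// (S) mode: clamp the true sum to the 16-bit signed range.
constexpr std::int16_t addSaturating16(std::int16_t a, std::int16_t b) {
    return (a + b > 32767)  ? 32767
         : (a + b < -32768) ? -32768
                            : (a + b);
}

// (NS) mode: keep the low 16 bits of the true sum, so an overflow wraps around.
constexpr std::int16_t addWrapping16(std::int16_t a, std::int16_t b) {
    return (a + b > 32767)  ? (a + b - 65536)
         : (a + b < -32768) ? (a + b + 65536)
                            : (a + b);
}
```

For 30000 + 10000 the saturating form returns 32767, while the non-saturating form wraps to -25536.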
Dreg = MAX ( Dreg , Dreg ) ;
Dreg = MIN ( Dreg , Dreg ) ;
Preg -= Preg ;
Ireg -= Mreg ;
Preg += Preg (BREV) ;
Ireg += Mreg (opt_brev) ;
dest_reg = src_reg_0 * src_reg_1 (opt_mode)      (16-bit mult)
Dreg *= Dreg ;                                   (32-bit mult)
accumulator = src_reg_0 * src_reg_1 (opt_mode)
accumulator += src_reg_0 * src_reg_1 (opt_mode)
accumulator -= src_reg_0 * src_reg_1 (opt_mode)
dest_reg_half = (accumulator = src_reg_0 * src_reg_1) (opt_mode)
dest_reg_half = (accumulator += src_reg_0 * src_reg_1) (opt_mode)
dest_reg_half = (accumulator -= src_reg_0 * src_reg_1) (opt_mode)
dest_reg = (accumulator = src_reg_0 * src_reg_1) (opt_mode)
dest_reg = (accumulator += src_reg_0 * src_reg_1) (opt_mode)
dest_reg = (accumulator -= src_reg_0 * src_reg_1) (opt_mode)
dest_reg = src_reg;
dest_accumulator = src_accumulator
dest_reg = src_reg (RND) (32 bit to 16 bit round and saturate)
accumulator = accumulator (S)
dest_reg = SIGNBITS sample_register
dest_reg = src_reg_1 - src_reg_2;
Ireg -= 2 ;
Ireg -= 4 ;
ROTATE
dest_reg = ROT src_reg BY rotate_magnitude;
accumulator_new = ROT accumulator_old BY rotate_magnitude;
PARALLEL OPERATION EXAMPLES

32-bit ALU/MAC instruction || 16-bit instruction || 16-bit instruction ;
saa (r1:0, r3:2) || r0=[i0++] || r2=[i1++] ;
mnop || r1 = [i0++] || r3 = [i1++] ;
r7.h=r7.l=sign(r2.h)*r3.h + sign(r2.l)*r3.l || i0+=m3 || r0=[i0] ;
NOTE: If two parallel memory operations, only one can involve a Preg
NOTE: If two parallel memory operations, then only one can be a write
EXTERNAL EVENT MANAGEMENT

NOP 16-bit NOP
MNOP 32-bit NOP e.g. MNOP || NOP || NOP ;
IDLE;
CSYNC; (core sync),
SSYNC; (system sync) will force a memory write,
CLI Dreg (clear interrupts, and save old interrupt mask to Dreg),
STI Dreg (set interrupts from Dreg),
RAISE uimm4 (force interrupt -- effectively software interrupt of any interrupt)
EXCPT uimm4 (force exception -- effectively software interrupt of any exception)
TESTSET (Preg) -- The Test and Set Byte (Atomic) instruction loads an indirectly addressed
memory byte, tests whether it is zero, then sets the most significant bit of the memory byte
without affecting any other bits. If the byte is originally zero, the instruction sets the CC bit.
If the byte is originally nonzero, the instruction clears the CC bit.
The sequence of this memory transaction is atomic, meaning it can't be blocked by interrupts
as would the sequence
Read memory into R0, test R0, if CC zero then set R0 = 1, Store R0 back to memory.
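
The data effect of TESTSET can be written as a pure function. This sketch, with hypothetical names, models only what happens to the byte and to CC. It does not model the atomicity, which is the reason the instruction exists.

```cpp
#include <cstdint>

// Result of the modelled TESTSET: the byte left in memory and the CC flag.
struct TestSetResult {
    std::uint8_t byteAfter;  // memory byte with its most significant bit set
    bool cc;                 // true if the byte was zero before the instruction
};

// Test the byte, then set bit 7 without changing any other bit.
constexpr TestSetResult modelTestSet(std::uint8_t byteBefore) {
    return TestSetResult{static_cast<std::uint8_t>(byteBefore | 0x80u), byteBefore == 0};
}
```

A free lock byte (0x00) returns CC = 1 and becomes 0x80. A second TESTSET on the same byte then returns CC = 0, which is how a caller can tell that the lock is already taken.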
VIDEO PIXEL INSTRUCTIONS

ALIGN8, ALIGN16, ALIGN24, DISALGNEXCPT, BYTEOP3P (Dual 16-Bit Add / Clip),
Dual 16-Bit Accumulator Extraction with Addition, BYTEOP16P (Quad 8-Bit Add),
BYTEOP1P (Quad 8-Bit Average Byte), BYTEOP2P (Quad 8-Bit Average Half-Word),
BYTEPACK (Quad 8-Bit Pack), BYTEOP16M (Quad 8-Bit Subtract), SAA (Quad 8-Bit
Subtract-Absolute-Accumulate), BYTEUNPACK (Quad 8-Bit Unpack)
VECTOR INSTRUCTIONS basically 2 16 bit ops

Add on Sign, VIT_MAX (Compare-Select), Vector Arithmetic Shift, Vector Logical Shift,
Vector MIN, Vector Multiply, Vector Multiply and Multiply-Accumulate, Vector Negate
(Twos Complement), Vector PACK, Vector SEARCH
Example Vector Add / Subtract
dest = src_reg_0 +|+ src_reg_1;
Example Vector MAX
dest_reg = MAX ( src_reg_0, src_reg_1 ) (V)
Example Vector ABS dest_reg = ABS source_reg (V)
TEST DRIVEN DEVELOPMENT SYNTAX EXAMPLES
// This function is developed:
//      bool WaitForAWhileASM(short int time_wanted);
// Assume this function is already written:
//      long int CalculateTwiceAccuracy(short int time1, short int time2);
// This function returns the timing accuracy 100 * (2 * time1 - time2) / time2
TEST_CONTROL(TEST_GROUP_NAME);
TEST(Q2_TESTS) {
    // If the parameter time is less than 0 return false (because it can't be done)
    CHECK( WaitForAWhileASM(-1) == false );
    CHECK( WaitForAWhileASM(1000) == true );
    // When you make the parameter time bigger, then the subroutine takes longer to return
    // Needs to be checked for both a long and a short time
    // 2 * Time for WaitForAWhileASM(X) = time for WaitForAWhileASM(2X);
    // The accuracy of timing should be better than 2% or 2 parts in 100.
    unsigned long int time1, time2;
    time1 = MEASURE_EXECUTION_TIME(WaitForAWhileASM(400));
    time2 = MEASURE_EXECUTION_TIME(WaitForAWhileASM(800));   // 2X, so time2 is set before the check
    CHECK( CalculateTwiceAccuracy(time1, time2) < 2);   // 1 is 1%, 2 is 2%
    // Repeat the two measurements for a longer time before this second check
    CHECK( CalculateTwiceAccuracy(time1, time2) < 2);   // 1 is 1%, 2 is 2%
}
TEST(Q4_TEST) {
    StopCoreTimer( );                               // Stop the timer
    InitializeCoreTimer(0x200000, 0x200000, 1);     // Set some sensible values into the
                                                    // core timer registers
    StartCoreTimer(3 );                             // Start the timer
    long int time1 = ReadCoreTimerAndResetASM(0x2000, 0x2000);
    // Assert statement #1 -- check that the core timer value has got smaller since started
    CHECK(time1 < 0x200000);
    // Assert statement #2 -- From the code we expect 2 writes and 1 read (in total) to occur
    // when we run the ReadCoreTimerAndResetASM( ) function
    WatchDataClass<unsigned long int> coretimer_access( 2,
            (unsigned long int *) pTCOUNT, (unsigned long int *) pTPERIOD);
    // NOTE that the 2 in this line means we are specifying 2 memory locations for the
    // WatchDataClass to keep track of.
    // NOTE that the WatchDataClass and WATCH_MEMORY_RANGE keep track of ALL
    // memory read and write operations that occur in the memory locations between the
    // addresses for TCOUNT and TPERIOD
    WATCH_MEMORY_RANGE(coretimer_access,
            ReadCoreTimerAndResetASM (0x3000, 0x3000));
    CHECK(coretimer_access.getReadsWrites() == 3);
    // Assert statement #3 -- get the final value of the core-timer registers
    // getFinalValue(0) would get the final value of the TCOUNT register
    // getFinalValue(1) would get the final value of the TPERIOD register
    // It is the order in the WatchDataClass line that counts, not the parameter order in
    // ReadCoreTimerAndResetASM( )
    CHECK(coretimer_access.getFinalValue(1) == 0x3000);
}
TEST_FILE_RUN_NOTIFICATION(TEST_GROUP_NAME);
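
The accuracy formula quoted in the comments, 100 * (2 * time1 - time2) / time2, is easy to try on a host PC. The function below is a hypothetical stand-in, not the course's CalculateTwiceAccuracy. It assumes integer arithmetic and a non-zero time2.

```cpp
// Hypothetical stand-in for the accuracy calculation described in the comments:
// percentage by which 2 * time1 differs from time2, using integer arithmetic.
// Assumes time2 is not zero.
constexpr long twiceAccuracyPercent(long time1, long time2) {
    return 100L * (2L * time1 - time2) / time2;
}
```

With time1 = 400 and time2 = 800 the result is 0. With time1 = 408 and time2 = 800 the result is 2, which would fail a "< 2" check. The result is negative when 2 * time1 is smaller than time2, so a check of the form "< 2" does not by itself bound an error in that direction.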
E-Unit functions and constants

Header information
#define EMBEDDEDBEDUNIT_LITE
#include <EmbeddedUnit/EmbeddedUnit.h>
Basic EmbeddedUnit
TEST_CONTROL(TEST_GROUP_NAME);
TEST(TEST_NAME) ;
TEST_FILE_RUN_NOTIFICATION(TEST_GROUP_NAME);
CHECK(expression)       e.g. CHECK(a == b);
XF_CHECK(expression)
CHECK_EQUAL(expected, actual)
XF_CHECK_EQUAL(expected, actual)
CHECK_CLOSE(expected, actual, tolerance)
CHECK_ARRAY_EQUAL(expected, actual, count)
CHECK_ARRAY_CLOSE(expected, actual, count, tolerance)
CHECK_ARRAY2D_CLOSE(expected, actual, rows, columns, tolerance)
CHECK_THROW(expression, ExpectedExceptionType)
CHECK_ASSERT(expression)
REPORT(msg)
TIME_CONSTRAINT(ms), TIME_CONSTRAINT_US
TIME_CONSTRAINT_EXEMPT()
MEASURE_EXECUTION_TIME(time)
HARD_TIME_CONSTRAINT_TRY(ms),
HARD_TIME_CONSTRAINT_CATCH()
HARD_TIME_CONSTRAINT_END()
MEMORY_CONSTRAINT(maxChange)
MEMORY_CONSTRAINT_EXEMPT( )
void CodeCoverageStartLogging(int loopCompress = 2)
void CodeCoverageStopLogging(void)
Time Triggered Co-operative Scheduler functions

void uTTCOS_Dispatch_Tasks(void) -- This is the 'dispatcher' function. When a task (function)
is due to run, uTTCOS_Dispatch_Tasks() will run it. This function must be called (repeatedly)
from the main loop.
unsigned char uTTCOS_AddThread(Do_X, delay, period) -- Causes a thread (function) to be
executed at regular intervals after a fixed delay.
unsigned char uTTCOS_DeleteThread(const unsigned char TASK_INDEX) -- Removes a thread from
the scheduler. Note that this does *not* delete the associated function from memory: it simply
means that it is no longer called by the scheduler.
void uTTCOS_Init_Scheduler(void) -- Scheduler initialisation function. Prepares scheduler data
structures and sets up timer interrupts at required rate. You must call this function before
using the scheduler.
void uTTCOS_Start_Scheduler(void) -- Starts the scheduler, by enabling interrupts. NOTE: Usually
called after all regular threads are added, to keep the threads synchronised. NOTE: ONLY THE
SCHEDULER INTERRUPT SHOULD BE ENABLED IN uTTCOS!!!
void uTTCOS_Update(void) -- This is the scheduler ISR callback. It is called at a rate
determined by the timer settings for the interrupt.
From -- <MockDevice/CoffeePotFunctions.h>
COFFEEPOT DETAILS
#define INITandSTAYPOWEREDON_BIT    (1 << 0)
#define LEDPOWERON_BIT              (1 << 1)
#define WATERPOWERON_BIT            (1 << 2)
#define HEATERPOWERON_BIT           (1 << 3)
#define DEVICE_READY_BIT_RO         (1 << 4)    // 0x10 -- RO
typedef struct COFFEEPOT_DEVICE_REGISTER_STRUCTURE {
    unsigned short int controlRegister;     // A BIT pattern -- OFFSET 0 bytes
    unsigned short int TIMER_upper;         // Value -- OFFSET 2 bytes
    unsigned short int TIMER_lower;         // Value -- OFFSET 4 bytes
    // unsigned int TIMER = (TIMER_upper << 16) + TIMER_lower;
    unsigned char heaterRegister;           // Value -- OFFSET 6 bytes
    unsigned char waterInFlowRegister;      // Value -- OFFSET 7 bytes
    unsigned char heaterBoostRegister;      // Value -- OFFSET 8 bytes
} COFFEEPOT_DEVICE;
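
The byte offsets in the comments and the TIMER combination can be checked at compile time on a host PC. The sketch below uses a hypothetical mirror of the structure with fixed-width types. It is not the MockDevice header and makes no claim about that library's API.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the register layout, using fixed-width types.
struct CoffeePotRegisters {
    std::uint16_t controlRegister;      // bit pattern, offset 0
    std::uint16_t TIMER_upper;          // offset 2
    std::uint16_t TIMER_lower;          // offset 4
    std::uint8_t  heaterRegister;       // offset 6
    std::uint8_t  waterInFlowRegister;  // offset 7
    std::uint8_t  heaterBoostRegister;  // offset 8
};

static_assert(offsetof(CoffeePotRegisters, TIMER_upper) == 2, "TIMER_upper offset");
static_assert(offsetof(CoffeePotRegisters, TIMER_lower) == 4, "TIMER_lower offset");
static_assert(offsetof(CoffeePotRegisters, heaterRegister) == 6, "heaterRegister offset");

// Combine the two 16-bit halves. The cast to a 32-bit unsigned type comes first
// so that the left shift cannot overflow a signed int.
constexpr std::uint32_t combineTimer(std::uint16_t upper, std::uint16_t lower) {
    return (static_cast<std::uint32_t>(upper) << 16) + lower;
}

// Test the read-only DEVICE_READY bit (1 << 4) in a control register value.
constexpr bool deviceReady(std::uint16_t control) {
    return (control & (1u << 4)) != 0;
}
```

One trailing padding byte makes the whole structure 10 bytes, because the 16-bit members require 2-byte alignment.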
PARAMETER PASSING EXAMPLE 2 -- See earlier for STACK passed example

// IN STUFF relative to FP
#define INPAR4_ON_STACK            28   // NOT IN R3
#define INPAR3_SPACE_ON_STACK      24
//      offset 20: space reserved for inPar2 (passed in R1)
#define INPAR1_SPACE_ON_STACK      16
#define SAVED_P3                   12
#define SAVED_P4                    8
#define RETS_LOCATION_ON_STACK      4
#define OLD_FP_LOCATION_ON_STACK    0
// This is relative to SP
#define ARRAY_2START               28
#define ARRAY_1START               16
#define OUTPAR4_ON_STACK           12   // Passed on STACK NOT IN R3
//      offsets 8, 4 and 0: space reserved for outPar3 (R2), outPar2 (R1) and outPar1 (R0)
.section _L1_data;
.byte4 _staticVariable = 0;             // This is how you declare a function's static variable
                                        // static int staticVariable = 0;
.byte2 _globalArray[4] = {0, 0, 0, 0};  // unsigned short int globalArray[4] = {0, 0, 0, 0};
.extern _Somewhere;
.extern _Subroutine_NM;
.section program;
.global _Foo;
// void Foo(INPAR1, INPAR2, INPAR3, INPAR4)
_Foo:   [--SP] = P3;
        [--SP] = P4;                         // Save non-volatile registers
        // int array1[3];  int array2[5];
        LINK 48;                             // 16 spaces for new stack + 32 bytes for the
                                             // 2 arrays (int = 4 bytes); see above
        P4.L = lo(_Somewhere);               // Point to memory location _Somewhere
        P4.H = hi(_Somewhere);               // Reference resolved by linker since .extern
        [FP + INPAR1_SPACE_ON_STACK] = R0;   // Save inpar1_R0 for later
        [FP + INPAR3_SPACE_ON_STACK] = R2;   // Save inpar3_R2 for later
        R0 = [FP + INPAR4_ON_STACK];         // OUTPAR4 = INPAR4
        [SP + OUTPAR4_ON_STACK] = R0;
        [SP + ARRAY_2START] = R0;            // array2[0] = INPAR4 (local storage)
        R2 = -6 (X);                         // Sign extend: OUTPAR3 value = 0xFFFFFFFA
        R1 = R1;                             // OUTPAR2 = INPAR2
        R0 = -6 (Z);                         // Zero extend: OUTPAR1 value = 0x0000FFFA
        CALL _Subroutine_NM;                 // Subroutine(0x0000FFFA, INPAR2, 0xFFFFFFFA, INPAR4)
        P4.L = lo(FIO_FLAG_D); P4.H = hi(FIO_FLAG_D);   // Constant from <defsBF533.h>
                                             // requires hi/lo macros
        W[P4] = R0;                          // Store return value as 16-bit (output to GPIO)
        ssync;                               // Force Blackfin to write
        UNLINK;                              // Also see:  P0 = [FP + 4];   // Get RETS
        P4 = [SP++]; P3 = [SP++];            //            UNLINK;
        RTS;                                 //            P4 = [SP++]; P3 = [SP++];
                                             //            JUMP (P0);       // Faster?
PROGRAMMABLE FLAGS (PF) REGISTERS
NOTE: FIO_FLAG_D BITS ARE SET DURING EDGE-TRIGGERED INTERRUPTS AND MUST BE CLEARED INSIDE
THE INTERRUPT SERVICE ROUTINE
There are also other FIO Mask registers with a similar format
FIO_MASKA_C -- Write 1 to clear
FIO_MASKA_T -- Write 1 to toggle
FIO_MASKB_D -- Write 1 to enable, write 0 to clear
FIO_MASKB_S -- Write 1 to set
FIO_MASKB_C -- Write 1 to clear
FIO_MASKB_T -- Write 1 to toggle
TO ENABLE A PF INTERRUPT, SET THE CORRESPONDING FLAG MASK BIT
TO FORCE AN INTERRUPT (AFTER ENABLE) SET THE CORRESPONDING INTERRUPT SET BIT
NOTE: THE FOLLOWING INTERRUPT REGISTERS HAVE A SIMILAR FORMAT:
FIO_MASKA_C (CLEAR W1C), FIO_MASKA_T (TOGGLE W1T)
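
The write-1-to-set (W1S), write-1-to-clear (W1C) and write-1-to-toggle (W1T) behaviour can be modelled as pure functions on a 16-bit register value. This is a host-side sketch with hypothetical names. It does not touch the real FIO registers.

```cpp
#include <cstdint>

// W1S: every 1 bit in the written value sets that bit; 0 bits leave it alone.
constexpr std::uint16_t writeOneToSet(std::uint16_t reg, std::uint16_t written) {
    return static_cast<std::uint16_t>(reg | written);
}

// W1C: every 1 bit in the written value clears that bit; 0 bits leave it alone.
constexpr std::uint16_t writeOneToClear(std::uint16_t reg, std::uint16_t written) {
    return static_cast<std::uint16_t>(reg & ~written);
}

// W1T: every 1 bit in the written value inverts that bit; 0 bits leave it alone.
constexpr std::uint16_t writeOneToToggle(std::uint16_t reg, std::uint16_t written) {
    return static_cast<std::uint16_t>(reg ^ written);
}
```

Writing 0 through any of these forms changes nothing, so a single bit can be updated without a read-modify-write sequence on the register.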
WATCH-DOG TIMER
WATCH-DOG STATUS REGISTER WDOG_STAT

The 32-bit Watchdog Status register (WDOG_STAT) contains the current count value. Reads of
WDOG_STAT return the current count value. When the watchdog is enabled, WDOG_STAT decrements
by 1 on each SCLK cycle. When it reaches 0, the timer stops counting and the event selected
in WDOG_CTL is generated. When the timer is disabled, writing WDOG_CNT pre-loads WDOG_STAT.
When the timer is enabled, a write to the WDOG_STAT register loads it with WDOG_CNT.
INTERRUPT CONTROL
IPEND HAS SAME FORMAT AS ILAT BUT IS READ ONLY
SIC_ISR SYSTEM INTERRUPT STATUS REGISTER HAS THE SAME FORMAT EXCEPT THAT 0 MEANS DEASSERTED
(NO INTERRUPT), 1 MEANS ASSERTED. ALL BITS READ ONLY
EXAMPLE 68K CODE
Conditional operations
SUBROUTINE AND PARAMETER PASSING
MOTOROLA BRANCHES AFTER CMP D0, D1
BEQ (D0 == D1)       BNE (D0 != D1)
BGE (D1 >= D0)       BGT (D1 > D0)       BLE (D1 <= D0)
BHS (D1 >= D0 UI)    BHI (D1 > D0 UI)
BLS (D1 <= D0 UI)    BLO (D1 < D0 UI)
CORE TIMER
Also Core Timer TPERIOD
GENERAL PURPOSE TIMER0, TIMER1, TIMER2
ENABLE BITS ARE W1S
DISABLE BITS ARE W1C
Also TIMERx_WIDTH and TIMERx_PERIOD, which are 32-bit registers
All three GP timers have equivalent registers,
e.g. TIMERx_WIDTH MEANS REPLACE x BY 0, 1, 2: TIMER0_WIDTH, TIMER1_WIDTH, TIMER2_WIDTH
TIMERx_COUNTER, x = 0, 1, 2
TIMERx_PERIOD, x = 0, 1, 2
TIMERx_WIDTH, x = 0, 1, 2
All have following format
Relationship between PERIOD, WIDTH and COUNT
DYNAMIC POWER MANAGEMENT
SPI INTERFACE
SPI HARDWARE
THERE IS ALSO READ ONLY SPI_RDBR (RECEIVE)
AND SPI_SHADOW (COPY OF SPI_RDBR FOR TEST)
C++ STACK FRAME

Build the stack using LINK X;
where X = sum of the number of bytes of outgoing arguments (16)
          + 4 * number of registers saved
          + 4 * number of long local variables
          + 2 * number of short local variables
          + 1 * number of char variables
and X is rounded UP to the nearest multiple of 4.
NOTE: only the space for outgoing arguments has a special location on the stack. The order
of local variables and save space is the choice of the programmer.
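
The LINK X rule can be written as a small calculator. The sketch below uses hypothetical function names and implements only the formula stated above.

```cpp
// Round a byte count up to the next multiple of 4.
constexpr unsigned roundUpTo4(unsigned bytes) {
    return (bytes + 3u) & ~3u;
}

// Frame size X for LINK X: 16 bytes of outgoing-argument space, plus saved
// registers and locals, rounded up to a multiple of 4.
constexpr unsigned linkFrameBytes(unsigned savedRegisters, unsigned longLocals,
                                  unsigned shortLocals, unsigned charLocals) {
    return roundUpTo4(16u + 4u * savedRegisters + 4u * longLocals
                      + 2u * shortLocals + 1u * charLocals);
}
```

linkFrameBytes(2, 0, 0, 0) gives 24, which matches the LINK 24 used with two saved registers in the first parameter-passing example. Treating the eight array elements of the second example as 4-byte locals, linkFrameBytes(0, 8, 0, 0) gives 48, which matches its LINK 48.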
BLACKFIN PIPELINE DURING EX_INTERRUPT_HANDLER ( )
(Note: ALL other interrupts blocked -- not good)

If you save RETI to the stack ([--SP] = RETI) as part of your interrupt handler then the
Blackfin hardware automatically allows higher interrupts to interrupt lower interrupts.
From C++ EX_REENTRANT_HANDLER( )
