MCA
UNIT 8 NOTES
INTRODUCTION TO 16/32 BIT PROCESSORS

8.1 FEATURES
A 32 bit processor performs a 32 bit operation in a single cycle.
A 32 bit processor that also has a subset of instructions for 16 bit coding is called a 16/32 bit processor.
These processors are required in sophisticated and precision applications.
Normally a superscalar architecture is implemented. It means that multiple instructions can be executed in a single cycle using pipelining techniques.
In pipelining, instruction fetch, instruction decode and instruction execution are performed simultaneously, in the same cycle, for different instructions.
RISC architecture is implemented. RISC has the following characteristics (features):
o Fixed length instructions
o Standard execution time for instructions (single cycle)
o Fewer instructions
o Fewer addressing modes and instruction formats
o Large register set
o Hardware implementation
o Easier implementation of pipeline
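The overlap of fetch, decode and execute described under pipelining can be sketched with a small model. This is only an illustration: it assumes a three stage pipeline, one cycle per stage and no stalls, and the names STAGES and pipeline_schedule are hypothetical.

```python
# Illustrative three stage pipeline: fetch, decode, execute.
# Assumption: every stage takes one cycle and there are no stalls.
STAGES = ["fetch", "decode", "execute"]


def pipeline_schedule(instructions):
    """Return one dict per cycle mapping stage name -> instruction (or None)."""
    if not instructions:
        return []
    total_cycles = len(instructions) + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            # Instruction number 'index' entered the pipeline 'depth' cycles ago.
            index = cycle - depth
            in_range = 0 <= index < len(instructions)
            row[stage] = instructions[index] if in_range else None
        schedule.append(row)
    return schedule


for number, row in enumerate(pipeline_schedule(["I1", "I2", "I3", "I4"]), start=1):
    print("cycle", number, row)
```

Once the pipeline is full (cycle 3 in this model), one instruction completes in every cycle even though each instruction still needs three cycles from fetch to execute.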
8.2 RISC architecture implements the following design techniques
o Register windows: the register file is divided into groups called windows, which are assigned to procedures
o Pipelining
o Delayed branch, by rearranging instructions
o Scoreboarding, to overcome data dependency problems
o Dual cache
o Instruction level parallelism (ILP): N instructions are issued per cycle, so N results may be obtained per CPU cycle

8.3 ARM Architecture and Organization
ARM stands for Advanced RISC Machines.
ARM designed a family of RISC superscalar processor architectures based on VLSI.
Faculty : S V Altaf Page 1
ARM has various architectures like ARM2, ARM3, ARM6, ARM8 and ARM9.
It has many variants like V1, V2, V3, V3M, V4, V4T, V5 etc.
From V3 onwards, full 32 bit addressing for both data and code is implemented.
M variants have long multiply and multiply and accumulate facility.
T variants have the Thumb instruction set.
E variants have enhanced DSP instructions.
HCMOS (High performance Complementary Metal Oxide Semiconductor) technology
Feature size of 0.25 µm or less
Low power consumption
High performance: 300 MIPS
32 bit data bus
32 bit address bus, hence 4 GB memory capacity
Princeton (von Neumann) architecture
8 bit / 16 bit / 32 bit data types
32 bit ARM instruction set and 16 bit Thumb instruction subset
Sixteen 32 bit registers
32 bit RALU
High performance multipliers
Coprocessor interface
On chip JTAG interface
All RISC features
Fast high priority interrupt request (FIQ) and normal interrupt request (IRQ)
8.6 ARM based MCUs
Several companies have designed MCUs based on ARM architecture:
o STR710 or STR720 by STMicroelectronics
o LPC2114/24 by Philips
o S3C2410X01 by Samsung
The above microcontrollers have a CPU based on ARM architecture plus various other facilities.
STR710 has fast flash, high speed SRAM (16/64 KB), five SPI + UART + I2C, USB IF, CAN IF and HMC IF.
LPC2114 has up to 256 KB flash, 16 KB high speed SRAM, 8 KB cache for instructions and data, MMU, four SPI + UART + I2C, USB IF and CAN IF.
S3C2410X01 has flash boot loader, 16 KB cache for instructions, 16 KB cache for data, PLL + RTC + WDT, two SPI + 3 channel UART, 2 port USB host / 1 port USB device, four DMA channels and 8 channel 10 bit ADC.

8.7 ARM / THUMB Programming Model
Implements both Princeton architecture (ARM7) and Harvard architecture (ARM9).
Byte ordering within a word can be big endian or little endian.
Sixteen 32 bit general purpose registers (r0 to r15).
A GPR can be used for data computation or as an index register.
The RALU performs its operations through registers only.
CPSR is the current program status register. It is in addition to the 16 GPRs. It contains flags N, Z, C and V in bits 31, 30, 29 and 28.
R13 is used as implicit stack pointer.
R14 is used as return address pointer.
R15 is used as program counter.
Thumb programming model implements the following:
o Eight registers r0 to r7 are used for most of the instructions
o Instructions in memory are half word aligned
o 16 bit instruction format is used
o T bit of CPSR is used to switch from ARM to Thumb and vice versa
o It does not have the condition field (4 bits) in the instruction format
o It does not have the shift/rotate field (12 bits) in the instruction format
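The CPSR fields mentioned in this section (flags N, Z, C and V in bits 31 to 28, and the T bit) can be decoded as in the following sketch. The notes do not give the position of the T bit; bit 5 is assumed here, which is its position on T variants. The function names decode_cpsr and flags_after_add are hypothetical, and flags_after_add only models how a flag setting 32 bit addition would update the flags.

```python
# Flag positions N, Z, C, V are as stated in the notes.
# T_BIT = 5 is an assumption added here (not given in the notes).
N_BIT, Z_BIT, C_BIT, V_BIT, T_BIT = 31, 30, 29, 28, 5
MASK32 = 0xFFFFFFFF


def decode_cpsr(cpsr):
    """Return the condition flags and the Thumb state bit of a CPSR value."""
    def bit(position):
        return (cpsr >> position) & 1
    return {"N": bit(N_BIT), "Z": bit(Z_BIT), "C": bit(C_BIT),
            "V": bit(V_BIT), "T": bit(T_BIT)}


def flags_after_add(a, b):
    """Return (32 bit result, flags) for a flag setting addition of a and b."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    flags = {
        "N": result >> 31,            # bit 31 of the result
        "Z": int(result == 0),        # result is zero
        "C": int(full > MASK32),      # carry out of bit 31
        # Signed overflow: both operands differ in sign from the result.
        "V": ((a ^ result) & (b ^ result)) >> 31,
    }
    return result, flags


print(decode_cpsr(0x60000020))
print(flags_after_add(0x7FFFFFFF, 1))
```

A CPSR value with the T bit set indicates that the processor is executing Thumb instructions; with the T bit clear it is executing ARM instructions.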
8.8 UNITS of ARM Architecture
o RALU
o General purpose registers: 16 nos. r13 is used as SP, r14 contains the return address, r15 is used as PC
o Current program status register (CPSR): it contains flags Negative (N), Zero (Z), Carry (C) and Overflow (V)
o Saved program status register (SPSR): it preserves the value of CPSR when an exception occurs (including a supervisor call made through SWI)
o Instruction register (IR)
o Instruction decoder (ID)
o Condition test and branch logic: it is associated with the RALU and decides the program flow
o 32 x 32 multiply and 32 x 32 + 64 multiply and accumulate unit (MAC): it improves performance for multiply and multiply and accumulate operations, which are necessary for DSP and control applications
8.9 ARM addressing modes
Addressing modes for data processing operands (direct and absolute addressing not available)
o Immediate -> Shifter operand is specified as an immediate value. First source operand may or may not be present
Examples:
MOV R1, #0 ; 0 -> R1
ADD R1, R3, #8 ; R3 + 8 -> R1
o Register -> Shifter operand is specified as a register. First source operand may or may not be present
Example:
MOV R0, R2 ; R2 -> R0
ADD R4, R2, R3 ; R2 + R3 -> R4
In these addressing modes, the shifter operand is a register whose contents are shifted by an immediate value or by a value specified in another register.
Types of shifts / rotates are as follows:
o Logical shift left (LSL)
o Logical shift right (LSR)
o Arithmetic shift right (ASR)
o Rotate right (ROR)
Examples:
ADD R9, R4, R4, LSL #2 ; R9 = R4 + R4 x 4
SUB R8, R6, R5, LSR #4 ; R8 = R6 - R5/16
MOV R12, R4, ROR R2 ; R12 = R4 rotated right by value of R2
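The four shift and rotate types, and the arithmetic in the examples above, can be checked with a small model of 32 bit registers. This is a sketch only: the function names are hypothetical, and shift amounts are assumed to lie between 0 and 31 (the special cases that the processor defines for other amounts are not modelled).

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left: zeros enter from the right."""
    return (value << amount) & MASK32


def lsr(value, amount):
    """Logical shift right: zeros enter from the left."""
    return (value & MASK32) >> amount


def asr(value, amount):
    """Arithmetic shift right: the sign bit (bit 31) is replicated."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret the word as a negative number
    return (value >> amount) & MASK32


def ror(value, amount):
    """Rotate right: bits leaving on the right re-enter on the left."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


# ADD R9, R4, R4, LSL #2  with R4 = 5  ->  R9 = 5 + 5 x 4
r4 = 5
r9 = (r4 + lsl(r4, 2)) & MASK32

# SUB R8, R6, R5, LSR #4  with R6 = 100, R5 = 64  ->  R8 = 100 - 64/16
r6, r5 = 100, 64
r8 = (r6 - lsr(r5, 4)) & MASK32

print(r9, r8, hex(ror(0x0000000F, 4)), hex(asr(0x80000000, 4)))
```

The shifted operand costs no extra instruction, which is why a multiplication by 5 can be written as a single ADD.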
Addressing modes for load / store word or unsigned byte (direct and absolute addressing not provided)
o In these modes the address of the memory location is to be calculated
o A base register (any one of the GPRs) is used and its contents are modified in various ways
o OFFSET
1. Immediate Offset
Address of memory location is calculated by adding or subtracting the value of an immediate 12 bit offset to/from the value of base register Rn.
Examples:
LDR R1, [R2] ; load R1 from address pointed to by R2
LDR R3, [R4, #8] ; load R3 from address [R4 + 8]
2. Register Offset
Address of memory location is calculated by adding or subtracting the value of index register Rm to/from the value of base register Rn.
Example:
LDR R2, [R3, -R4] ; R2 <- [R3 - R4]
3. Scaled Register Offset
Address of memory location is calculated by adding or subtracting the shifted or rotated value of index register Rm to/from the value of base register Rn.
Example:
LDR R1, [R2, -R3, LSL #3] ; R1 <- [R2 - 8 x R3]
Pre Indexed or Pre auto Indexed
Address calculation is the same as in 1, 2, 3 above. If the condition field specified in the instruction matches the condition code status, the calculated address is written back to base register Rn.
Example:
LDR R3, [R4, #8]! ; the ! mark is added to distinguish it from the OFFSET type
Post Indexed or Post auto Indexed
Here, the address of the memory location is the content of base register Rn. However, Rn is modified after instruction execution if the condition specified in the instruction matches the condition code status. The modification follows the three rules 1, 2, 3 of OFFSET described above.
Example:
LDR R3, [R4], #8 ; R3 <- [R4] and then R4 + 8 -> R4 if condition is satisfied
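The difference between the offset, pre indexed and post indexed forms is which address is used for the access and what the base register holds afterwards. A minimal sketch follows; the names effective_address and scaled are hypothetical, the condition is assumed to be satisfied, and the offset is passed in already computed (immediate, register value, or scaled register value, negative for subtraction).

```python
MASK32 = 0xFFFFFFFF


def scaled(rm, shift):
    """Index register value shifted left, as in a scaled register offset."""
    return (rm << shift) & MASK32


def effective_address(base, offset, mode):
    """Return (address accessed, value of base register Rn afterwards).

    mode "offset"       models [Rn, offset]   : base unchanged
    mode "pre_indexed"  models [Rn, offset]!  : base updated, new value used
    mode "post_indexed" models [Rn], offset   : old base used, then updated
    """
    updated = (base + offset) & MASK32
    if mode == "offset":
        return updated, base
    if mode == "pre_indexed":
        return updated, updated
    if mode == "post_indexed":
        return base, updated
    raise ValueError("unknown addressing mode: " + mode)


r4 = 0x1000
print(effective_address(r4, 8, "offset"))        # LDR R3, [R4, #8]
print(effective_address(r4, 8, "pre_indexed"))   # LDR R3, [R4, #8]!
print(effective_address(r4, 8, "post_indexed"))  # LDR R3, [R4], #8

# LDR R1, [R2, -R3, LSL #3] with R2 = 0x2000 and R3 = 4
print(effective_address(0x2000, -scaled(4, 3), "offset"))
```

The auto indexed forms are convenient for stepping through arrays, because the pointer update needs no separate ADD instruction.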
8.10 ARM/THUMB instruction set
Data transfer instructions
o MOV
It moves the shifter operand to destination register Rd.
Examples:
MOV R0, #5 ; 5 -> R0
MOV R1, R2 ; R2 -> R1
MOV R1, R2, LSL #4 ; 16 x R2 -> R1
MOV R1, R2, ROR R4 ; R2 rotated by value in R4 -> R1
o MVN (Move not)
It moves the complement of the shifter operand to the destination register.
o LDR (Load register)
It loads a word, from the memory address specified by the addressing mode, into destination register Rd.
Examples:
LDR R1, [R0] ; [R0] -> R1
LDR R4, [R2, #5] ; [R2 + 5] -> R4
[R2 x 16] -> R5
LDRH ; load half word
LDRSH ; load signed half word
LDRB ; load byte
LDRSB ; load signed byte
Signed half word means that the sign bit (0 or 1) at bit 15 is extended into the upper bits of the register.
o STR (Store register)
It stores a word from a register into a memory location. It has no signed byte or signed half word forms.
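o POP <reg-list {, R}>
It retrieves the registers in the register list from the stack. R13 is SP.
o PUSH <reg-list {, R}>
It saves the registers in the register list on the stack. R13 is SP.
reg-list -> low registers R0 to R7, specified by eight bits (one per register)
R = 1 in PUSH -> R14 is included in the operation
R = 1 in POP -> R15 is included in the operation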
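The sign extension performed by LDRSH and LDRSB, as opposed to the zero extension of LDRH and LDRB, can be sketched as follows. The helper names sign_extend and zero_extend are hypothetical; the destination register is assumed to be 32 bits wide.

```python
MASK32 = 0xFFFFFFFF


def zero_extend(value, bits):
    """Keep the low 'bits' bits; upper bits of the register become 0."""
    return value & ((1 << bits) - 1)


def sign_extend(value, bits):
    """Copy the top bit of a 'bits' wide value into all upper register bits."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    # XOR then subtract turns a set sign bit into a negative offset.
    return ((value ^ sign_bit) - sign_bit) & MASK32


half_word = 0x8001                       # bit 15 is 1
print(hex(zero_extend(half_word, 16)))   # as loaded by LDRH
print(hex(sign_extend(half_word, 16)))   # as loaded by LDRSH
print(hex(sign_extend(0x80, 8)))         # as loaded by LDRSB
```

When bit 15 of the half word (or bit 7 of the byte) is 0, both forms give the same register contents.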
Arithmetic Operations
Examples:
MUL R1, R2, R3 ; R2 x R3 -> R1
MLA R6, R5, R3, R4 ; R6 = R5 x R3 + R4
SMULL R1, R2, R3, R4 ; R1 = bits 0 to 31 of R3 x R4, R2 = bits 32 to 63 of R3 x R4
UMLAL R1, R2, R3, R4 ; R2, R1 = R3 x R4 + R2, R1
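The results of the four multiply examples can be modelled with ordinary integer arithmetic. This is an illustrative sketch with hypothetical function names; registers are assumed to be 32 bits wide, and the long forms return the pair (low word, high word).

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret a 32 bit register value as a signed number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mul(rm, rs):
    """MUL: low 32 bits of the product."""
    return (rm * rs) & MASK32


def mla(rm, rs, rn):
    """MLA: low 32 bits of product plus accumulate value."""
    return (rm * rs + rn) & MASK32


def smull(rm, rs):
    """SMULL: signed 64 bit product, returned as (low word, high word)."""
    product = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return product & MASK32, product >> 32


def umlal(rd_lo, rd_hi, rm, rs):
    """UMLAL: unsigned product added to the 64 bit value held in rd_hi:rd_lo."""
    accumulator = ((rd_hi & MASK32) << 32) | (rd_lo & MASK32)
    total = (accumulator + (rm & MASK32) * (rs & MASK32)) & MASK64
    return total & MASK32, total >> 32


print(mul(3, 4), mla(3, 4, 5))
print([hex(word) for word in smull(0xFFFFFFFF, 2)])   # -1 x 2
print([hex(word) for word in umlal(0xFFFFFFFF, 0, 1, 1)])
```

The long forms need two destination registers because the product of two 32 bit values can be up to 64 bits wide.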
Logical Operations
ORR   Logical OR of two operands in registers. Result goes to Rd
AND   Logical AND of two operands in registers. Result goes to Rd
EOR   Logical Exclusive OR of two operands in registers. Result goes to Rd
There are no separate instructions for shift and rotate operations. Data processing instructions (data transfer, arithmetic, logical) can be used to perform shift and rotate operations on a register operand.
Example:
MOV R1, R1, LSL #2 ; R1 x 4 -> R1
Various operations possible are:
o ASR
o LSL
o LSR
o ROR
o RRX -> Rotate right with extend (extended through the carry bit)
Comparison and test instructions
o CMP -> Compare. It performs a subtraction operation on two source operands. Both operands remain intact; only the flags of CPSR are affected
o CMN -> Compare negated. It compares two source operands, the second one after negation
o TST -> Logical AND operation. Operands remain intact; only flags are affected
o TEQ -> Logical XOR operation. Operands remain intact; only flags are affected
Program Flow (Branch) instructions
B <cond> label
Conditional or unconditional jump. Branch to label conditionally, or unconditionally if the condition field is absent.
BL <cond> label
Conditional or unconditional call. Branch to label conditionally or unconditionally and store the return address in the link register (R14).
BX <cond> <Rm>
Conditional or unconditional jump and exchange. Branch to the address held in Rm, conditionally or unconditionally. Bit 0 of Rm selects whether the target is an ARM instruction or a Thumb instruction.
BLX <cond> <Rm>
Conditional or unconditional call and exchange. Branch to the address held in register Rm conditionally or unconditionally. Store the return address in link register R14. Bit 0 of Rm selects whether the target is an ARM or Thumb instruction.
BLX label
Unconditional call and exchange. Branch to the address specified as label. Store the return address in R14. Always execute Thumb instructions at the target.
Return
There is no return instruction as such. Return can be achieved by using the MOV PC, LR instruction, i.e. MOV R15, R14. Another instruction can be BX R14.
Calculation of label or target address
o Sign extend the 24 bit signed immediate to 32 bits
o Shift the result left by two bits (because ARM instructions are always 4 bytes apart)
o Add this to the contents of PC, which contains the address of the branch instruction plus 8
Thus a branch of +/- 32 MB takes place. In other words, a branch backwards or forwards by up to 32 MB with respect to PC is achieved. It is 32 MB because a 24 bit signed value gives a range of +/- 8 M; multiply it by 4 to get +/- 32 MB.
Examples:
B label ; Branch unconditionally
BCC label ; Branch if carry flag is clear
BEQ label ; Branch if Zero flag is set
BL func ; Subroutine call to func

8.11 Thumb Instruction Subset
o It is a re-encoded subset of the ARM instruction set
o It improves performance where a 16 bit data bus is used
o Compact code is generated by the compiler
o T variants incorporate both ARM and Thumb instructions
o Thumb does not alter the programmer's model. It retains the advantages of a 32 bit ARM processor
o All Thumb data processing instructions operate on full 32 bit values, and full 32 bit addresses are produced
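The three step target address calculation given in 8.10 for the ARM branch instructions (sign extend the 24 bit immediate, shift left by two, add to PC, where PC is the branch address plus 8) can be sketched as follows. The function names branch_target and encode_offset are hypothetical, and the reverse calculation in encode_offset is an addition for illustration.

```python
MASK32 = 0xFFFFFFFF


def branch_target(instruction_address, imm24):
    """Target address of an ARM B or BL located at instruction_address."""
    imm24 &= 0xFFFFFF
    # Step 1: sign extend the 24 bit immediate.
    offset = imm24 - (1 << 24) if imm24 & 0x800000 else imm24
    # Step 2: shift left by two bits (multiply by 4).
    offset *= 4
    # Step 3: add to PC, which holds the branch address plus 8.
    return (instruction_address + 8 + offset) & MASK32


def encode_offset(instruction_address, target):
    """Return the 24 bit immediate that makes a branch reach target."""
    delta = target - (instruction_address + 8)
    if delta % 4 != 0:
        raise ValueError("target must be word aligned relative to PC")
    words = delta // 4
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target is out of the +/- 32 MB branch range")
    return words & 0xFFFFFF


print(hex(branch_target(0x8000, 0x000000)))   # zero offset lands at address + 8
print(hex(branch_target(0x8000, 0xFFFFFE)))   # offset of -2 words: branch to itself
print(hex(encode_offset(0x1000, 0x2000)))
```

The largest forward immediate, 0x7FFFFF, gives an offset of 32 MB minus 4 bytes, and the most negative immediate, 0x800000, gives an offset of exactly minus 32 MB.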
Encoding in 16 bits is achieved due to the following:
o The four bit condition field is not used
o Only 8 general purpose registers, R0 to R7, are used by the majority of instructions
o Shifter operand is not used
o No access to CPSR or SPSR
Thumb mode is entered using the ARM BX instruction. It can also be initiated by setting the T bit in SPSR.
Examples:
LDR Rd, [Rn, #offset5]
ADD/SUB Rd, Rn, Rm
ADD/SUB Rd, Rn, #imm3
CMP Rn, Rm
AND Rd, Rm

8.12 Exception handling
Exceptions occur while a program is running. These are different from interrupts. Interrupts occur due to execution of a SW instruction or due to a request from hardware devices, internal or external. Exceptions occur due to run time conditions, for example detection of an illegal opcode, access to protected memory, division by zero, array out of bounds or URL not found while connecting to the Web.
Like interrupt vectors, exception vectors are available, as per the CPU architecture, for servicing them. These vectors point to exception handling routines.
Exceptions have different priorities, like interrupts.
The detection of an exception and interrupting the CPU is called catching the exception.
Catching followed by vectoring to the exception handler is called throwing the exception.
Processor status just before handling the exception should be preserved.
Multiple exceptions can occur simultaneously.
ARM provides for interrupts as well as exceptions. There are a total of seven exceptions and interrupts, given below:

1. Reset
High vector address: FFFF0000H
Use/Function: CPU enters supervisor mode. Fast and normal interrupts are disabled. PC is set to FFFF0000H if high vectors are configured.

2. Undefined Instruction
High vector address: FFFF0004H
Use/Function: CPU saves PC and CPSR. It disables normal interrupts. PC is set to FFFF0004H. If a coprocessor instruction is encountered and there is no HW for it, SW evaluation is done on its occurrence.

3. Software Interrupt (SWI)
High vector address: FFFF0008H
Use/Function: CPU enters supervisor mode. PC and CPSR are saved. Normal interrupts are disabled. A particular function is executed on its occurrence.

4. Prefetch Abort
High vector address: FFFF000CH
Use/Function: CPU saves PC and CPSR and disables normal interrupts. The fetched instruction is marked as invalid, and prefetch abort is flagged when this instruction reaches the execution stage. After fixing the reason for the abort, return from the handler.

5. Data Abort (data access memory abort)
High vector address: FFFF0010H
Use/Function: CPU saves PC and CPSR and disables normal interrupts. Activating it in response to a data access marks the data invalid. It occurs before any following instruction has altered the CPU state. After fixing the reason, return from the handler.

6. IRQ (normal interrupt request)
High vector address: FFFF0018H
Use/Function: CPU saves PC and CPSR. Generated externally by asserting the IRQ pin at the CPU. Disables further IRQs. Executes the ISR and returns. IRQs / FIQs can be re-enabled if needed.

7. FIQ (fast interrupt request)
High vector address: FFFF001CH
Use/Function: CPU saves PC and CPSR. Disables FIQs and IRQs. Generated externally by asserting the FIQ pin at the CPU. FIQ has sufficient private registers, hence there is no need to save registers, so context switch overhead is minimized. The FIQ handler can re-enable IRQs / FIQs.
Features of exception handling
o Handles an exception after completely executing the current instruction
o CPSR and PC are saved to SPSR and R14
o To return after handling the exception, SPSR is moved to CPSR and R14 to PC through a suitable instruction, e.g. SUBS PC, R14, #4
o Multiple exceptions are allowed at the same time
o Exception handler code should not cause a further exception
o An SWI interrupt handler can call another SWI interrupt handler
Exception / Interrupt Priorities
Exception / Interrupt     Priority
Reset                     Highest
Data Abort                |
FIQ                       |
IRQ                       |
Prefetch Abort            |
Illegal Inst / SWI        Lowest
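The priority order above, combined with the high vector addresses of 8.12, decides which handler runs first when several exceptions are pending together. A minimal sketch follows; the names HIGH_VECTORS, PRIORITY and next_exception are hypothetical. The IRQ address FFFF0018H is the architectural high vector for IRQ and is stated here as an assumption where the notes do not show it. Undefined instruction and SWI share the lowest priority, so their relative order in the list is arbitrary.

```python
# High vector addresses of the seven exceptions and interrupts.
HIGH_VECTORS = {
    "reset": 0xFFFF0000,
    "undefined": 0xFFFF0004,
    "swi": 0xFFFF0008,
    "prefetch_abort": 0xFFFF000C,
    "data_abort": 0xFFFF0010,
    "irq": 0xFFFF0018,      # assumption: architectural IRQ high vector
    "fiq": 0xFFFF001C,
}

# Highest priority first. "undefined" and "swi" share the lowest level.
PRIORITY = ["reset", "data_abort", "fiq", "irq",
            "prefetch_abort", "undefined", "swi"]


def next_exception(pending):
    """Return (name, vector address) of the pending exception served first."""
    for name in PRIORITY:
        if name in pending:
            return name, HIGH_VECTORS[name]
    return None


print(next_exception({"irq", "fiq"}))
print(next_exception({"prefetch_abort", "data_abort", "irq"}))
```

With both IRQ and FIQ pending, the model selects FIQ, which matches its role as the fast, high priority interrupt.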
8.13 Development tools for ARM products
Various development tools have been designed by various vendors to support ARM based products. ARM based MCUs developed by STMicroelectronics, Philips, Atmel and NetSilicon are supported. Important tools are listed below:
o RTOS (Real time operating system) with C compiler: it is a multitasking RTOS. Hex file is generated using linker/locator
o Assembler: hex file is generated for assembly level programs using macros. It uses linker/locator
o Assembler and C compiler: both are available and generate hex file
o Assembler with IDE: it adds the facility of project manager, editor, make features and online help for automatic detection and correction of errors
o IDE with debugger: it interfaces the debugger with a target and facilitates HW level testing and debugging. It allows modular testing of programs
o Complete set of development tools
o In circuit emulator: a circuit which emulates the target system, IO, timers, serial port and device functions
KEIL SW INC manufactures the KEIL ARM tools. It has an assembler, C compiler, debugger and simulator. It supports most popular ARM derivatives.
Atmel offers development boards (hardware) and device drivers. Its development tools include compiler, debugger, JTAG, ICE and RTOS.
STMicroelectronics development tools include compiler, targeted GUI debugger, ICE and JTAG.
The popular developer suite is called ARM RealView.
JTAG is an interface which allows high speed access to MCU internal units like registers and control unit.