What is signal processing?
Input signal (analog or digital) -> operation, transformation, processing -> output signal (analog or digital)
Examples of signals:
  Analog: voice, music, photos, video, radar, sonar, ...
  In the discrete/digital domain: digitized voice, music, images, video, radar, ...
Operations and transformations on digital signals (using a computer or some other device specialized in handling digital signals)
And the signals...
Analog signal -> A/D -> digital processing -> D/A
Examples
Why digital?
Typical application
  Step 1: An analog sensor captures the signal (e.g., a microphone)
  Step 2: A/D conversion
  Step 3: The DSP processes the digital information (e.g., compression, noise suppression)
  Step 4: D/A conversion to return to an analog signal
Requires handling or transforming the signal as quickly as possible, in order to stay synchronized with the input events.
Example:
A processor running at 120 MHz can execute 120 MIPS.
  Sampling frequency fs = 48 kHz (Digital Audio Tape, DAT): instructions per sample = (120 x 10^6)/(48 x 10^3) = 2500.
  fs = 8 kHz (voice band, telephony): instructions per sample = 15000.
  fs = 75 MHz (CIF 360x288 video at 30 frames per second): instructions per sample = 1.6.
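The per-sample budget in this example is the instruction rate divided by the sampling rate. A minimal C sketch of that arithmetic follows; the function name `instructions_per_sample` is an assumption made for illustration and does not come from the slides.

```c
/* Instructions available between two consecutive samples.
   instr_per_sec: instructions executed per second (120e6 for 120 MIPS).
   sample_rate_hz: sampling frequency in hertz.
   The function name is hypothetical, chosen for this sketch. */
double instructions_per_sample(double instr_per_sec, double sample_rate_hz)
{
    return instr_per_sec / sample_rate_hz;
}
```

Calling it with 120e6 and 48e3 reproduces the 2500 instructions per sample quoted for DAT audio.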
Técnicas Digitales III
Challenge:
  Code: compact enough to run in real time, with a good number of instructions available between samples.
What is DSP?
  DSP = digital signal processing, or DSP = digital signal processor. DSP is used for both.
  The meaning is inferred from the context in which the term DSP is used.
Repeatability
  Same performance from one unit to the next. Performance does not change with temperature or aging.
Audio processing
  Compression
  3-D playback
Image processing
Processing
  Compression
  Pattern recognition
  Ghost cancellation
  Noise reduction
  Object tracking
  Image fusion
Cellular telephony
  Voice compression
  Software radio
DSP: Pager
[Block diagram: RF receiver, ADC, DAC, microcontroller chip, and pager peripherals, controlled by the Power Management Unit]
http://www.motorola.com/FLEX
DSP: Pager
[Block diagram: RF codec]
[Pie charts: shares by vendor (Texas Instruments, Motorola, Agere, Analog Devices, Other)]
Portable PCs
  26 M units/year in 2002
  Grew 14% from 1999 to 2002
Cellular
  Climbed to 500 M units/year in 2002
[Figure: performance, power, and cost-effectiveness versus time, from 1999 to 2003]
$$y[n] = \sum_{k} h[k]\, x[n-k]$$
Only multiply and accumulate (MAC)
In C
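One way the multiply-and-accumulate sum might be written in C is sketched here. The function name `mac_sum` and its signature are illustrative assumptions, not taken from the slides.

```c
#include <stddef.h>

/* Sum of products a[i] * b[i] over n elements.
   Each loop iteration performs one multiply-accumulate (MAC).
   The name mac_sum is hypothetical, chosen for this sketch. */
long mac_sum(const int *a, const int *b, size_t n)
{
    long acc = 0;                      /* accumulator, cleared before the loop */
    for (size_t i = 0; i < n; i++) {
        acc += (long)a[i] * b[i];      /* multiply, then accumulate */
    }
    return acc;
}
```

With the operands 11, 12, 3 and 1, 2, 3 used in the assembly example of these slides, the products are 11, 24, 9 and the accumulated result is 44.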
Operands pointed to by R0: 11, 12, 3
Operands pointed to by R1: 1, 2, 3
Products: 11, 24, 9
Result stored at R2: 44

      Clr  A            ; Clear accumulator A
      Clr  B            ; Clear accumulator B
Loop  Mov  *R0, Y0      ; Move data from memory location 1 to register Y0
      Mov  *R1, X0      ; Move data from memory location 2 to register X0
      Mpy  X0, Y0, A    ; X0*Y0 -> A
      Add  A, B         ; A + B -> B
      Inc  R0           ; R0 + 1 -> R0
      Inc  R1           ; R1 + 1 -> R1
      Dec  N            ; Decrement N (initially equal to 3)
      Tst  N            ; Test the value
      Jnz  Loop         ; If different from zero, loop again
      Mov  B, *R2       ; Move result to memory
Operands pointed to by R1: 1, 2, 3
Products: 11, 24, 9
Result stored at R2: 44

      Clr  A                    ; Clear accumulator A
      Rep  N                    ; Repeat the next instruction N times
      MAC  *(R0)+, *(R1)+, A    ; Fetch the two memory locations pointed to by R0 and R1, multiply them together and add the result to A; the final result is stored back in A
      Mov  A, *R2               ; Move result to memory
Disadvantages of a general-purpose processor (GPP)
[Diagram: memory, register 1, register 2, ALU]
[Diagram: ALU with accumulator]
Memory structure
DSP vs. GPP
RISC vs. CISC
RISC
  Emphasis on software
  Single-clock, reduced instructions only
  Large code size
  Spends more transistors on memory registers
CISC
  Emphasis on hardware
  Includes multi-clock complex instructions
  Transistors used for storing complex instructions
Memory architecture
  Harvard vs. von Neumann memory architecture
  Concurrent data transfers
MAC
  Multiply-ACcumulate instruction
DSP vs. GPP
Multiple units in parallel
  Multiply and accumulate (possibly several units)
  Address calculations in parallel with processing
  Circular buffer
  Special ALU for address calculation
  Bit-reversed addressing
  Circular addressing
  Software looping: written in assembly code to improve branching
  Hardware looping: dedicated hardware using loops with counter registers
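Bit-reversed addressing, listed among the DSP features, is typically used to reorder data for FFTs. On a DSP the address generator does this in hardware; the following C sketch only illustrates the index mapping in software. The function name `bit_reverse` is an assumption for illustration.

```c
/* Reverse the lowest `bits` bits of `index`.
   For an 8-point transform (bits = 3), index 1 (001) maps to 4 (100).
   The name bit_reverse is hypothetical, chosen for this sketch. */
unsigned bit_reverse(unsigned index, unsigned bits)
{
    unsigned reversed = 0;
    for (unsigned i = 0; i < bits; i++) {
        reversed = (reversed << 1) | (index & 1u);  /* shift in the lowest bit */
        index >>= 1;                                /* move to the next bit */
    }
    return reversed;
}
```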
Memory accesses
[Diagram: core connected to data memory (DM) and program memory (PM)]
Greater parallelism
  Increasing the number of operations that can be performed in each instruction
  Adding more execution units (e.g., multipliers)
  Increasing the number of instructions that can be issued and executed in each cycle
Why consider a DSP as a design alternative?
  Wireless systems require very high performance and high bandwidth.
Performance vs. bit rate
  Generation   Performance      Bit rate
  2G           ~100 MIPS        8-13 kbps
  2.5G         ~10,000 MIPS     64-384 kbps
  3G           ~100,000 MIPS    384-2000 kbps
Arithmetic-logic operations
  Multiply and add enables fast execution of iterative operations
Application-Specific Integrated Circuit (ASIC)
Field Programmable Gate Array (FPGA)
ASIC - Advantages
  Speed
  Low power consumption
  Cost/performance
  Design flexibility
ASIC - Disadvantages
FPGA
What is an FPGA?
  It is an array of configurable hardware with reconfigurable interconnections, controlled by switching a control matrix.
  Historically used for prototyping
  Recent devices include DSP features
  Largest DSP + FPGA companies: Altera (e.g., Stratix) and Xilinx (e.g., Virtex-II)
FPGA - Advantages
  More flexible than an ASIC
  High performance in some applications
  Hardware reusable across different applications
FPGA - Disadvantages
  Long development cycle
  Expensive compared with a DSP
  Higher power consumption compared with a DSP
Types of DSP
Low-end fixed point
  TMS320C2XX, ADSP21XX, DSP56XXX
Floating point
  TMS320C3X, C67XX, ADSP210XX, DSP96000, DSP32XX
It is harder to write efficient C code for a fixed-point device than for a floating-point one.
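Part of the extra effort on a fixed-point device is that the programmer handles scaling, rounding, and saturation explicitly, where floating-point code would simply write `a * b`. The sketch below assumes the common Q15 format (1 sign bit, 15 fractional bits), which the slides do not name; the function names are hypothetical.

```c
#include <stdint.h>

/* Convert a real value in [-1, 1) to Q15, saturating out-of-range inputs.
   Assumed format: value = integer / 32768. Name is hypothetical. */
int16_t float_to_q15(double x)
{
    double scaled = x * 32768.0;
    if (scaled > 32767.0)  scaled = 32767.0;    /* saturate at the positive limit */
    if (scaled < -32768.0) scaled = -32768.0;   /* saturate at the negative limit */
    return (int16_t)scaled;                     /* truncates toward zero */
}

/* Q15 x Q15 -> Q15 multiply with rounding and saturation.
   Assumes >> on a negative int32_t is an arithmetic shift,
   which is implementation-defined in C but usual in practice. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;   /* product in Q30 */
    p += 1 << 14;                          /* add half an LSB to round */
    p >>= 15;                              /* rescale from Q30 to Q15 */
    if (p > 32767) p = 32767;              /* only (-1) * (-1) overflows */
    return (int16_t)p;
}
```

For example, 0.5 is 16384 in Q15, and `q15_mul(16384, 16384)` gives 8192, which is 0.25.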
56800
  DSP56F801, DSP56F802, DSP56F803, DSP56F805, DSP56F807, DSP56F826, DSP56F827
56800E
  DSP56852, DSP56853, DSP56854, DSP56855, DSP56857, DSP56858
  MC56F8322, MC56F8323, MC56F8345, MC56F8346, MC56F8356, MC56F8357
56300
  DSP56301, DSP56303, XC56309, XC56L307, DSP56311, DSP56321, DSPB56362, DSPB56364, DSPB56366, DSPA56367, DSPA56371
MSC8100
  MSC8101, MSC8103
Features
  Single-instruction-cycle 16-bit x 16-bit parallel multiply-accumulator
  Two 36-bit accumulators including extension bits
  Single-instruction 16-bit barrel shifter
  Parallel instruction set with unique DSP addressing modes
  Low-power wait and stop modes
  Operating frequency down to DC
  16-bit Timer Module
  Synchronous serial interface module (SSI)
  Serial peripheral interface (SPI)
  Programmable general-purpose I/O
Applications
  Motion control
  Smart appliances
  Environmental controls
  Instrumentation
  Industrial
  Uninterruptible power supplies
  Noise cancellation/suppression
  Temperature control
  HVAC
  Inverters and AC-to-DC conversion
  Lighting
  Automation
  Transportation
Features
  40K x 16-bit Program SRAM
  24K x 16-bit Data SRAM
  1K x 16-bit Boot ROM
  Access up to 2M words of program memory or 8M data memory
  Six (6) independent channels of DMA
  Two (2) Enhanced Synchronous Serial Interfaces (ESSI)
  Two (2) Serial Communication Interfaces (SCI)
  Serial Port Interface (SPI)
  8-bit Parallel Host Interface
  General Purpose 16-bit Quad Timer
  JTAG/Enhanced On-Chip Emulation (OnCE) for unobtrusive, real-time debugging
  Computer Operating Properly (COP)/Watchdog Timer
  Time-of-Day (TOD)
  Up to 47 GPIO
Applications
  Telephony
  Telco interface
  Codecs
  LCD and keypad support
  Client-side IP phone
  Internet audio
  Internet audio decoding
  Internet audio standalone player
  Voice processing
Also includes the MC56F8300 series, which contains on-chip Flash memory
Features
  Object code compatible with the DSP56000 core, with a highly parallel instruction set
  Data Arithmetic Logic Unit (Data ALU) with fully pipelined 24 x 24-bit parallel Multiplier-Accumulator (MAC)
  Direct Memory Access (DMA) with six DMA channels supporting internal and external accesses
  Digital Phase Lock Loop (DPLL) allows change of low-power Divide Factor (DF) without loss of lock
  Hardware debugging support including On-Chip Emulation (OnCE) module, Joint Test Action Group (JTAG) Test Access Port (TAP)
  Two Enhanced Synchronous Serial Interfaces (ESSI0 and ESSI1)
  Serial Communications Interface (SCI)
  Triple timer module
  Up to 34 GPIO
Applications
  Multimedia
  Telecommunication
  Video conferencing
  Base transceiver stations
  Packet telephony
Features
  Four 250/275 MHz StarCore SC140 DSP extended cores
  16 ALUs on a chip deliver up to 4000/4400 MMACS
  Performance equivalent to a 1.0/1.1 GHz SC140 core
  Industry's largest on-chip SRAM memory
  1436 KB of internal memory
  Efficient multi-level memory hierarchy
  Dual external industry-standard 60x-compatible buses
  9.6 Gbps peak bus throughput
  Four independent Time-Division Multiplex (TDM) interfaces
  400 Mbps peak serial data throughput
  Accesses various external memories, including SDRAMs, SRAMs, SSRAMs, EPROMs, and Flash
Applications
  2.5G wireless systems
  3G wireless systems
  IP telephony
  Compression
  G.7xx speech coders
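The quoted 4000/4400 MMACS figures are consistent with every ALU completing one multiply-accumulate per clock cycle: 16 x 250 = 4000 and 16 x 275 = 4400. The C sketch below encodes that arithmetic; the one-MAC-per-ALU-per-cycle rate is an assumption inferred from the numbers, and the function name is hypothetical.

```c
/* Peak multiply-accumulate rate in millions of MACs per second (MMACS).
   Assumes each ALU completes one MAC per clock cycle (an inference
   from the quoted figures). The name peak_mmacs is hypothetical. */
unsigned peak_mmacs(unsigned alu_count, unsigned clock_mhz)
{
    return alu_count * clock_mhz;
}
```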
TI family tree
Ref: TI DSP Selection Guide, http://focus.ti.com/lit/ml/ssdv004m/ssdv004m.pdf
C2000
  C24x: F2407, F2406, F2403, F2402, F2401, C2406, C2404, C2402, C2401, F243, F241, C242, F240
  C28x: F2810, F2812
C3000
  C3x: C33, C32, C31, C30
C55x: C5510, C5509, C5502, C5501
C64x: C6416, C6415, C6414, C6412, C6411, DM640, DM641, DM642
C67x: C6713, C6712, C6711, C6701
Features
  375-ns (minimum conversion time) analog-to-digital (A/D) converter
  Dual 10-bit A/D converters
  Up to four 16-bit general-purpose timers
  Watchdog timer module
  Up to 16 PWM channels
  Up to 41 GPIO pins
  Five external interrupts
  Up to 32K words on-chip sectored Flash
  I/O modules
    Controller Area Network (CAN) interface module
    Serial communications interface (SCI)
    Serial peripheral interface (SPI)
  Boot ROM (LF240x and LF240xA devices)
Applications
  Appliances
  Compressors
  Industrial automation
  Uninterruptible power supply (UPS) systems
  Automotive braking and steering systems
  Electric metering
  Printers and copiers
  Hand-held power tools
  Electronic cooling
  Intelligent sensors
  Tunable lasers
  Consumer goods
  Fuel pumps
  Industrial frequency
  Remote monitoring
  ID tag readers
Features
  Ultra-fast 20-40 ns service time to any interrupt
  32-/64-bit saturation, single-cycle read-modify-write instructions, and 64/32 and 32/32 modulus division
  High-performance ADC
  32 x 32 single-cycle fixed-point MAC
  Dual 16 x 16 single-cycle fixed-point MACs
  On-chip flash memory
  I/O modules: SPI, SCI, CAN
Applications
  Lighting
  Optical networking (ONET)
  Power supplies
  Industrial automation
  Consumer goods
Features
  Parallel multiply and arithmetic/logical operations on integer or floating-point numbers in a single cycle
  Eight extended-precision registers
Applications
  Digital audio
  Laser printers, copiers, scanners
  Bar-code scanners
  Videoconferencing
  Industrial automation and robotics
  Voice/facsimile
  Servo and motor control
Features
  Integrated Viterbi accelerator
  40-bit adder and two 40-bit accumulators to support parallel instructions
  40-bit ALU with a dual 16-bit configuration capability for dual one-cycle operations
  17 x 17 multiplier allowing 16-bit signed or unsigned multiplication
  Four internal buses and dual address generators enable multiple program and data fetches and reduce memory bottleneck
  Single-cycle normalization and exponential encoding
  Eight auxiliary registers and a software stack enable advanced fixed-point DSP C compiler
  Power-down modes for battery-powered applications
Applications
  Digital cellular communications
  Personal communications systems (PCS)
  Pagers
  Personal digital assistants
  Digital cordless communications
  Wireless data communications
  Networking
  Computer telephony
  Voice over packet
  Portable Internet audio
  Modems
Features
  TMS320C54x DSP core subsystem
    100-MIPS operation
    72 kwords RAM
    Two multi-channel buffered serial ports (McBSPs)
    Direct memory access (DMA) controller
    Phase-locked loop
    External memory interface
    ARM port interface (API)
  ARM7TDMI RISC core subsystem
    47.5-MHz operation
    16-KByte zero-wait-state SRAM
    Memory interface (SDRAM, SRAM, ROM, Flash)
    Single-port 10/100 Base-T Ethernet interface (C5471 DSP only)
    36 general-purpose I/O (ARM I/O)
    Two UARTs (one IrDA)
    Serial peripheral interface (SPI)
    I2C interface
Applications
  Wireless data
  Smart pen pads
  Text-to-speech
  Voice recognition
  Command control
  Access point controller
  Networked security
  Industrial control and emergency radio
TMS320C55x DSP Generation, 16-bit Fixed Point: Most Power-Efficient DSP
Specifications
  C55x DSP core delivers 300 MHz for up to 600-MIPS performance
  1.6-volt core and 3.3-volt peripherals
Features
  Advanced automatic power management
  Configurable idle domains to extend your battery life
  Shortened debug for faster time-to-market
  144-MHz/200-MHz clock rate
  256-KB RAM, 64-KB ROM
  Three McBSPs, I2C, watchdog timer, general-purpose timers
  USB 2.0 full-speed (12 Mbps)
  10-bit ADC
  Real-time clock (RTC)
Applications
  Feature-rich, miniaturized personal and portable products
  2G, 2.5G and 3G cell phones and basestations
  Digital audio players
  Digital still cameras
  Electronic books
  Voice recognition
  GPS receivers
  Fingerprint/pattern recognition
  Wireless modems
  Headsets
  Biometrics
Features
  150-MHz TI-enhanced ARM925
    16 KB instruction cache and 8 KB data cache
    Data and instruction MMUs
    32-bit and 16-bit instruction sets
  150-MHz TMS320C55x DSP
    12 KW (24 KB) instruction cache
    80 KW (160 KB) SRAM
    16 KW (32 KB) ROM
  Two 16-bit memory interfaces for SDRAM and flash
  Nine-channel system DMA controller
  LCD controller
  USB 1.1 host and client
  MMC/SD card interface
  Seven serial ports plus three UARTs
  Nine timers
  Keyboard interface
  Less than 250 mW at 1.6 V
Applications
  Internet appliances
  Applications processing
  Enhanced gaming
  Webpad
  Point-of-sale
  Medical devices
  Industry-specific PDAs
  Telematics
  Digital media processing
  Military and government cellular
Features
  C6000 DSP Platform
  VelociTI advanced architecture
  Up to eight 32-bit instructions executed each cycle
  Eight independent, multi-purpose functional units
  Thirty-two 32-bit registers
  Industry's most advanced C compiler and Assembly Optimizer maximize efficiency and performance
Applications
  Pooled modems
  Digital Subscriber Line (xDSL)
  Wireless basestations
  Central office switches
  Private Branch Exchange (PBX)
  Digital imaging
  Call processing
  3D graphics
  Speech recognition
  Voice over packet
Features
  C6000 DSP Platform
  VelociTI advanced architecture
  Up to eight 32-bit instructions executed each cycle
  Eight independent, multi-purpose functional units
  Thirty-two 32-bit registers
  Industry's most advanced C compiler and Assembly Optimizer maximize efficiency and performance
  IEEE floating-point format
  Up to 1350 MFLOPS at 225 MHz
  Two new multi-channel serial ports (McASP) (C6713 DSP) can support up to stereo channels of I2S (Inter-IC Sound) and are compatible with the S/PDIF transmit protocol.
  Note: I2S is a protocol for transmitting two channels of digital audio over a single serial connection.
Applications
  Pooled modems
  Digital Subscriber Line (xDSL)
  Wireless basestations
  Central office switches
  Private Branch Exchange (PBX)
  Digital imaging
  Call processing
  3D graphics
  Speech recognition
  Voice over packet
Features
  C6000 DSP Platform
  VelociTI advanced architecture
  Up to eight 32-bit instructions executed each cycle
  Eight independent, multi-purpose functional units
  Thirty-two 32-bit registers
  Industry's most advanced C compiler and Assembly Optimizer maximize efficiency and performance
Applications
  DSL and pooled modems
  Basestation transceivers
  Wireless LAN
  Enterprise PBX
  Multimedia gateway
  Broadband video transcoders
  Streaming video servers and clients
  High-speed raster image processing (RIP)
TI Families Summary
  C24x and C28x families: low-performance 16-bit fixed point, used for control purposes
  C54x family: mid-range performance 16-bit fixed point
  C55x family: mid-range performance 16-bit fixed point with reduced power consumption and increased parallelism
  C5000 + RISC microprocessor: used for embedded applications such as cell phones and PDAs
  C62x: high-range performance 16-bit fixed point supporting VLIW architecture
  C64x: very high performance 16-bit fixed point with extension capabilities of C62x with higher clock frequency (>2500 MIPS)
  C3x: first-generation low-performance 32-bit floating point
  C67xx family: very high performance 32-bit floating point
Which chip do I choose?
Motorola DSP56858
  Family: DSP56800E
  Kit: DSP56858EVM
  Software: Metrowerks CodeWarrior
  Metrowerks is a Motorola company in charge of developing the software
  Applications
    Telephony
    Client-side IP phone
    Internet audio
    Voice processing
TI TMS320C5510
  Family: TMS320C55xx
  Kit: TMS320C5510DSK
  Software: TI Code Composer Studio v2.1
  Applications
Code
  Write the code in C
  Compile to produce assembly code
  Assemble the code to produce object code, and link
  Use the simulator to test the speed of the code
  If the code is not fast enough, rewrite the C code and test it again.
  If it is still not fast enough, write it in assembly language.
C code can be 3 to 30 times slower than the best possible assembly code, especially in the signal-processing part of the code. The problem is even greater for fixed-point DSPs.
Rewrite the C code to produce better assembly code
Profile the code to find out which part of the software takes the most CPU time.
Limit assembly code to subroutines:
  Those in which the program spends most of its time
  This way we benefit from the DSP's special features, such as MACs and parallel execution.
Avoid division and modulo operations. Use bitwise AND (&) and shifts where possible.
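For unsigned values and power-of-two divisors, the substitution can be sketched in C as follows. The function names are hypothetical, and the identities hold only under those two assumptions.

```c
/* x % n for unsigned x when n is a power of two (n = 2^k):
   keep only the low k bits. Name is hypothetical. */
unsigned mod_pow2(unsigned x, unsigned n)
{
    return x & (n - 1u);
}

/* x / 2^k for unsigned x: shift right by k bits. Name is hypothetical. */
unsigned div_pow2(unsigned x, unsigned k)
{
    return x >> k;
}
```

For instance, `mod_pow2(13, 8)` equals `13 % 8` and `div_pow2(13, 3)` equals `13 / 8`. Signed operands and divisors that are not powers of two need a different treatment.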
Use the 5%/80% rule
  The 5% of a project's lines of code that is written in assembly should account for 80% of the CPU load.
  Try to change the code so that it fits existing assembly routines.
DSP56858EVM Kit
  DSP56858 chip
  USB interface
  1 Mbit EEPROM/Data Flash
  FSRAM (256K)
  Parallel interface
  6 on-board debugging LEDs
  Boot mode selector
  RS232 interface
  Audio in/out (stereo)
Which chip do I choose?
DSP56800E family
  General-purpose 16-bit fixed point (six members).
  DSP56800E introduced in 2000 as an improved version of the DSP56800 family
    Lower power consumption
    Enhanced peripherals
    Higher MIPS
  Many peripherals:
    SCI to communicate with devices using RS232
    SPI to communicate with a CODEC or EEPROM (needs a clock).
    DMA to communicate between memory and an external device
Addressing modes
Immediate
  LD #31, A     ; load accumulator A with 31
Absolute
  LD *(X), A    ; load A with the number stored at address X
Direct
  Part of the address is given by the opcode (operation code), the rest by an internal register.
Addressing modes
Indirect addressing
  Used to access a sequence of numbers stored in consecutive locations of data memory.
Data memory starting at address X: 31, -14, 6, out

  X = 8000H
  STM #X, R1     ; load R1 with the address X
  LD  *R1+, A    ; load accumulator A with 31
  ADD *R1+, A    ; add -14 to A -> A
  ADD *R1+, A    ; 6 + A -> A
  STL A, *R1     ; store the output
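The same access pattern can be mimicked in C with a pointer that is post-incremented after each read. This is only a sketch: the function name and the four-element memory layout are assumptions that mirror the 31, -14, 6, out example.

```c
/* mem[0..2] hold the three inputs; mem[3] receives the sum.
   The pointer r1 plays the role of the address register R1.
   The name sum_three_indirect is hypothetical. */
int sum_three_indirect(int *mem)
{
    int *r1 = mem;      /* STM #X, R1  : R1 points to the first value */
    int a = *r1++;      /* LD  *R1+, A : load, then advance the pointer */
    a += *r1++;         /* ADD *R1+, A */
    a += *r1++;         /* ADD *R1+, A */
    *r1 = a;            /* STL A, *R1  : store the result in the next cell */
    return a;
}
```

With the values 31, -14 and 6, the stored result is 23.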
Circular buffer
It is a normal buffer except that, when it reaches the end, it wraps back to the starting point.
  Define the starting point
  Define the size of the buffer
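On a DSP the wrap-around is done by the address generator; in plain C it has to be written out. The following is a minimal sketch with an assumed size of 3 and hypothetical names (`circ_buf`, `circ_put`, `circ_get`).

```c
#define BUF_SIZE 3u   /* assumed size for this sketch */

typedef struct {
    double   data[BUF_SIZE];
    unsigned head;            /* index where the next sample is written */
} circ_buf;

/* Store a sample and advance the write index, wrapping at the end. */
void circ_put(circ_buf *cb, double value)
{
    cb->data[cb->head] = value;
    cb->head++;
    if (cb->head == BUF_SIZE) {
        cb->head = 0;         /* end reached: return to the starting point */
    }
}

/* Read a past sample: age 0 is the newest, age BUF_SIZE-1 the oldest.
   The caller must keep age below BUF_SIZE. */
double circ_get(const circ_buf *cb, unsigned age)
{
    unsigned idx = cb->head + BUF_SIZE - 1u - age;
    if (idx >= BUF_SIZE) {
        idx -= BUF_SIZE;      /* wrap the read index as well */
    }
    return cb->data[idx];
}
```

The wrap uses a comparison instead of the modulo operator, since the size here is not a power of two.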
$$y[n] = x[n] \ast h[n]$$
$$y[n] = \sum_{k} x[k]\, h[n-k]$$
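For finite sequences the convolution sum can be evaluated directly. The C sketch below assumes x has nx samples, h has nh samples, both start at index 0, and the output array has room for nx + nh - 1 values; the function name `convolve` is hypothetical.

```c
#include <stddef.h>

/* Direct evaluation of y[n] = sum over k of x[k] * h[n - k]
   for finite sequences x (nx samples) and h (nh samples).
   y must have room for nx + nh - 1 samples. Name is hypothetical. */
void convolve(const double *x, size_t nx,
              const double *h, size_t nh,
              double *y)
{
    for (size_t n = 0; n < nx + nh - 1; n++) {
        double acc = 0.0;
        for (size_t k = 0; k < nx; k++) {
            /* use only the terms where n - k is a valid index into h */
            if (n >= k && n - k < nh) {
                acc += x[k] * h[n - k];   /* one MAC per term */
            }
        }
        y[n] = acc;
    }
}
```

As a small check, convolving 1, 2, 3 with 1, 1 gives 1, 3, 5, 3.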
Convolution
[Table: convolution computed by shifting one sequence across the other, for shifts 0 to 5]
Circular buffer
For each value y[n0], the pointer for the coefficients x[-k] must point to the last entry (e.g., x[-2] = -3).
  x[-2] = -3, x[-1] = 2, x[0] = 0.5
  After each sample is computed, it goes back to x[-2].
Circular buffer size = 3
Assembly code
        .text                   ; Place the code in the text section located in memory
_main:                          ; Start of the main routine
        STM  #Inputs, AR5       ; Point to the input array
        STM  #Coeff, AR2        ; Point to the Coeff array
        STM  #Output, AR3       ; Point to the output (Output)
        STM  #3, BK             ; Define the circular buffer
        STM  #5, AR4            ; Define the counter variable
        STM  #1, AR0            ; Increment for the circular buffer
Assembly code
loop:   RPTZ A, #2              ; Clear accumulator A and repeat the next instruction 3 times
        MAC  *AR5+0%, *AR2+, A  ; Multiply input*coeff + A -> A; increment pointer AR2 by one and increment AR5 by AR0 using the circular buffer
        STL  A, *AR3+           ; Store the result in memory
        MAR  *AR2-
        MAR  *AR2-              ; Decrement AR2 twice
        BANZ loop, *AR4-        ; So that the loop computes each output, running six times
        RET