You are on page 1of 130

Chapter 2 TMS320C6000 Architectural Overview

Learning Objectives

Describe C6000 CPU architecture. Introduce some basic instructions. Describe the C6000 memory map. Provide an overview of the peripherals.

Chapter 2, Slide 2

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

General DSP System Block Diagram


Internal Memory
Internal Buses

External Memory

Central
Processing Unit

P E R I P H E R A L S

Chapter 2, Slide 3

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Implementation of Sum of Products (SOP)


It has been shown in Chapter 1 that SOP is the key element for most DSP algorithms. So lets write the code for this algorithm and at the same time discover the C6000 architecture.
N

Y =
n = 1

an * xn

= a1 * x1 + a2 * x2 +... + aN * xN

Two basic operations are required

for this algorithm.


(1) Multiplication (2) Addition

Therefore two basic


instructions are required

Chapter 2, Slide 4

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Implementation of Sum of Products (SOP)


So lets implement the SOP algorithm! The implementation in this module will be done in assembly.
N

Y =
n = 1

an * xn

= a1 * x1 + a2 * x2 +... + aN * xN

Two basic operations are required

for this algorithm.


(1) Multiplication (2) Addition

Therefore two basic


instructions are required

Chapter 2, Slide 5

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Multiply (MPY)
N

Y =
n = 1

an * xn

= a1 * x1 + a2 * x2 +... + aN * xN

The multiplication of a1 by x1 is done in assembly by the following instruction: MPY a1, x1, Y

This instruction is performed by a multiplier unit that is called .M

Chapter 2, Slide 6

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Multiply (.M unit)


40

Y =
n = 1

an * xn

.M
The . M unit performs multiplications in hardware MPY .M a1, x1, Y

Note: 16-bit by 16-bit multiplier provides a 32-bit result. 32-bit by 32-bit multiplier provides a 64-bit result.
Chapter 2, Slide 7 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Addition (.?)
40

Y =
n = 1

an * xn

.M

.?

MPY ADD

.M .?

a1, x1, prod Y, prod, Y

Chapter 2, Slide 8

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Add (.L unit)


40

Y =
n = 1

an * xn

.M

.L

MPY ADD

.M .L

a1, x1, prod Y, prod, Y

RISC processors such as the C6000 use registers to hold the operands, so lets change this code.
Chapter 2, Slide 9 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Register File - A
Register File A A0 A1 A2 A3 prod Y a1 x1
40

Y =
n = 1

an * xn

.M

. . .

.L

MPY ADD

.M .L

a1, x1, prod Y, prod, Y

A15

32-bits

Let us correct this by replacing a, x, prod and Y by the registers as shown above.
Chapter 2, Slide 10 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Specifying Register Names


Register File A A0 A1 A2 A3 prod Y a1 x1
40

Y =
n = 1

an * xn

.M

. . .

.L

MPY ADD

.M .L

A0, A1, A3 A4, A3, A4

A15

32-bits

The registers A0, A1, A3 and A4 contain the values to be used by the instructions.
Chapter 2, Slide 11 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Specifying Register Names


Register File A A0 A1 A2 A3 prod Y a1 x1
40

Y =
n = 1

an * xn

.M

. . .

.L

MPY ADD

.M .L

A0, A1, A3 A4, A3, A4

A15

32-bits

Register File A contains 16 registers (A0 -A15) which are 32-bits wide.
Chapter 2, Slide 12 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Data loading
Register File A A0 A1 A2 A3 prod Y a1 x1

Q: How do we load the operands into the registers? .M

. . .

.L

A15

32-bits

Chapter 2, Slide 13

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Load Unit .D
Register File A A0 A1 A2 A3 prod Y a1 x1

Q: How do we load the operands into the registers? .M

. . .

.L
.D

A: The operands are loaded into the registers by loading them from the memory using the .D unit.

A15

32-bits

Data Memory
Chapter 2, Slide 14 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Load Unit .D
Register File A A0 A1 A2 A3 prod Y a1 x1

It is worth noting at this stage that the only way to access memory is through the .D unit. .M

. . .

.L
.D

A15

32-bits

Data Memory
Chapter 2, Slide 15 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Load Instruction
Register File A A0 A1 A2 A3 prod Y a1 x1

Q: Which instruction(s) can be used for loading operands from the memory to the registers? .M

. . .

.L
.D

A15

32-bits

Data Memory
Chapter 2, Slide 16 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Load Instructions (LDB, LDH,LDW,LDDW)


Register File A A0 A1 A2 A3 prod Y a1 x1

Q: Which instruction(s) can be used for loading operands from the memory to the registers? .M A: The load instructions.

. . .

.L
.D

A15

32-bits

Data Memory
Chapter 2, Slide 17 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Using the Load Instructions


Before using the load unit you have to be aware that this processor is byte addressable, which means that each byte is represented by a unique address. Also the addresses are 32-bit wide. Data
address 00000000 00000002 00000004 00000006 00000008

FFFFFFFF 16-bits

Chapter 2, Slide 18

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Using the Load Instructions


The syntax for the load instruction is: LD *Rn,Rm Where: Rn is a register that contains the address of the operand to be loaded and Data
address 00000000 00000002 00000004 00000006 00000008

a1 x1
prod Y

Rm is the destination register.


16-bits

FFFFFFFF

Chapter 2, Slide 19

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Using the Load Instructions


The syntax for the load instruction is: LD *Rn,Rm The question now is how many bytes are going to be loaded into the destination register? Data
address 00000000 00000002 00000004 00000006 00000008

a1 x1
prod Y

FFFFFFFF 16-bits

Chapter 2, Slide 20

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Using the Load Instructions


The syntax for the load instruction is: LD *Rn,Rm The answer, is that it depends on the instruction you choose:
LDB: loads one byte (8-bit)
LDH: loads half word (16-bit) LDW: loads a word (32-bit) LDDW: loads a double word (64-bit)

Data

address 00000000 00000002 00000004 00000006 00000008

a1 x1
prod Y

Note: LD on its own does not exist.

FFFFFFFF 16-bits

Chapter 2, Slide 21

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Using the Load Instructions


The syntax for the load instruction is: LD *Rn,Rm
Example:
If we assume that A5 = 0x4 then: (1) LDB *A5, A7 ; gives A7 = 0x00000001 (2) LDH *A5,A7; gives A7 = 0x00000201 (3) LDW *A5,A7; gives A7 = 0x04030201 (4) LDDW *A5,A7:A6; gives A7:A6 = 0x0807060504030201

Data
1
0xA

0
0xB

address 00000000 00000002 00000004 00000006 00000008

0xC
0x2 0x4 0x6 0x8

0xD
0x1 0x3 0x5 0x7

FFFFFFFF 16-bits

Chapter 2, Slide 22

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Using the Load Instructions


The syntax for the load instruction is: LD *Rn,Rm Data
0xA 0xB

address
0xD
0x1 0x3 0x5 0x7

0xC
0x2 0x4 0x6 0x8

Question:
If data can only be accessed by the load instruction and the .D unit, how can we load the register pointer Rn in the first place?

00000000 00000002 00000004 00000006 00000008

FFFFFFFF 16-bits

Chapter 2, Slide 23

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Loading the Pointer Rn

The instruction MVKL will allow a move of a 16-bit constant into a register as shown below: MVKL
(a is a constant or label)

.?

a, A5

How many bits represent a full address?


32 bits So why does the instruction not allow a 32-bit move? All instructions are 32-bit wide (see instruction opcode).

Chapter 2, Slide 24

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Loading the Pointer Rn

To solve this problem another instruction is available:

MVKH
ah al x a A5

eg.

MVKH

.?

a, A5

(a is a constant or label)

ah

Finally, to move the 32-bit address to a register we can use:


MVKL MVKH a, A5 a, A5
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Chapter 2, Slide 25

Loading the Pointer Rn

Always use MVKL then MVKH, look at the following examples:

Example 1 A5 = 0x87654321 MVKL 0x1234FABC, A5 MVKH 0x1234FABC, A5

A5 = 0xFFFFFABC (sign extension)

A5 = 0x1234FABC ; OK

Example 2 MVKH A5 = 0x12344321 0x1234FABC, A5 MVKL 0x1234FABC, A5

A5 = 0xFFFFFABC ; Wrong

Chapter 2, Slide 26

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

LDH, MVKL and MVKH


Register File A
A0 A1

a x prod Y
. . .

MVKL MVKH

pt1, A5 pt1, A5 pt2, A6 pt2, A6

A2
A3 A4

.M .L .D

MVKL MVKH

LDH
LDH MPY ADD

.D
.D .M .L

*A5, A0
*A6, A1 A0, A1, A3 A4, A3, A4

A15

32-bits
pt1 and pt2 point to some locations

Data Memory
Chapter 2, Slide 27

in the data memory.


Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Creating a loop
So far we have only implemented the SOP for one tap only, i.e. Y= a1 * x1 So lets create a loop so that we can implement the SOP for N Taps.
MVKL MVKH MVKL MVKH LDH .D pt1, A5 pt1, A5 pt2, A6 pt2, A6 *A5, A0

LDH
MPY ADD

.D
.M .L

*A6, A1
A0, A1, A3 A4, A3, A4

Chapter 2, Slide 28

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Creating a loop
So far we have only implemented the SOP for one tap only, i.e. Y= a1 * x1 So lets create a loop so that we can implement the SOP for N Taps.
With the C6000 processors there are no dedicated instructions such as block repeat. The loop is created using the B instruction.

Chapter 2, Slide 29

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

What are the steps for creating a loop


1. Create a label to branch to.
2. Add a branch instruction, B. 3. Create a loop counter. 4. Add an instruction to decrement the loop counter. 5. Make the branch conditional based on the value in the loop counter.

Chapter 2, Slide 30

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

1. Create a label to branch to


MVKL MVKH MVKL MVKH pt1, A5 pt1, A5 pt2, A6 pt2, A6

loop

LDH LDH MPY ADD

.D .D .M .L

*A5, A0 *A6, A1 A0, A1, A3 A4, A3, A4

Chapter 2, Slide 31

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

2. Add a branch instruction, B.


MVKL MVKH MVKL MVKH pt1, A5 pt1, A5 pt2, A6 pt2, A6

loop

LDH LDH MPY ADD B

.D .D .M .L .?

*A5, A0 *A6, A1 A0, A1, A3 A4, A3, A4 loop

Chapter 2, Slide 32

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Which unit is used by the B instruction?


Register File A A0 A1 A2 A3 prod Y a x MVKL MVKH pt1, A5 pt1, A5 pt2, A6 pt2, A6

.S .M .M
loop

MVKL MVKH

LDH LDH MPY ADD B

.D .D .M .L .?

*A5, A0 *A6, A1 A0, A1, A3 A4, A3, A4 loop

. . .

.L .L
.D .D

A15

32-bits

Data Memory
Chapter 2, Slide 33 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Which unit is used by the B instruction?


Register File A A0 A1 A2 A3 prod Y a x MVKL .S MVKH .S pt1, A5 pt1, A5 pt2, A6 pt2, A6

.S .M .M
loop

MVKL .S MVKH .S

LDH LDH MPY ADD B

.D .D .M .L .S

*A5, A0 *A6, A1 A0, A1, A3 A4, A3, A4 loop

. . .

.L .L
.D .D

A15

32-bits

Data Memory
Chapter 2, Slide 34 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

3. Create a loop counter.


Register File A A0 A1 A2 A3 prod Y a x MVKL .S MVKH .S pt1, A5 pt1, A5 pt2, A6 pt2, A6 count, B0 *A5, A0 *A6, A1 A0, A1, A3 A4, A3, A4 loop

.S .M .M
loop

MVKL .S MVKH .S MVKL .S LDH LDH MPY ADD B .D .D .M .L .S

. . .

.L .L
.D .D

A15

32-bits

B registers will be introduced later

Data Memory
Chapter 2, Slide 35 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

4. Decrement the loop counter


Register File A A0 A1 A2 A3 prod Y a x MVKL .S MVKH .S pt1, A5 pt1, A5 pt2, A6 pt2, A6 count, B0 *A5, A0 *A6, A1 A0, A1, A3 A4, A3, A4 B0, 1, B0

.S .M .M
loop

MVKL .S MVKH .S MVKL .S LDH LDH MPY ADD SUB .D .D .M .L .S

. . .

.L .L
.D .D

A15

32-bits

.S

loop

Data Memory
Chapter 2, Slide 36 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Make the branch conditional based on the value in the loop counter

What is the syntax for making instruction conditional?

[condition]
e.g. [B1] B

Instruction
loop

Label

(1) The condition can be one of the following registers: A1, A2, B0, B1, B2. (2) Any instruction can be conditional.
Chapter 2, Slide 37 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Make the branch conditional based on the value in the loop counter

The condition can be inverted by adding the exclamation symbol ! as follows:

[!condition]
e.g. [!B0] [B0] B B

Instruction

Label

loop ;branch if B0 = 0 loop ;branch if B0 != 0

Chapter 2, Slide 38

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Make the branch conditional


Register File A A0 A1 A2 A3 prod Y a x MVKL .S2 pt1, A5 MVKH .S2 pt1, A5

.S .M .M
loop

MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0 LDH LDH MPY ADD SUB .D .D .M .L .S *A5, A0 *A6, A1 A0, A1, A3 A4, A3, A4 B0, 1, B0

. . .

.L .L
.D .D

A15

32-bits

[B0]

.S

loop

Data Memory
Chapter 2, Slide 39 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

More on the Branch Instruction (1)


With this processor all the instructions are encoded in a 32-bit.

Therefore the label must have a dynamic range of less than 32-bit as the instruction B has to be coded.
32-bit

21-bit relative address

Case 1:

B .S1

label

Relative branch. Label limited to +/- 220 offset.


Chapter 2, Slide 40 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

More on the Branch Instruction (2)


By specifying a register as an operand instead of a label, it is possible to have an absolute branch.

This will allow a dynamic range of 232.


32-bit

5-bit register code

Case 2:

B .S2

register

Absolute branch. Operates on .S2 ONLY!


Chapter 2, Slide 41 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Testing the code


MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0 loop LDH LDH MPY
However, we would like to perform: a0*x0 + a1*x1 + a2*x2 + + aN*xN

This code performs the following operations:

.D .D .M .L .S

*A5, A0 *A6, A1 A0, A1, A3 A4, A3, A4 B0, 1, B0

a0*x0 + a0*x0 + a0*x0 + + a0*x0

ADD SUB

[B0]

.S

loop

Chapter 2, Slide 42

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Modifying the pointers


MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0
The solution is to modify the pointers

loop

LDH LDH MPY ADD SUB

.D .D .M .L .S

*A5, A0 *A6, A1 A0, A1, A3 A4, A3, A4 B0, 1, B0

A5 and A6.

[B0]

.S

loop

Chapter 2, Slide 43

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Indexing Pointers
Syntax *R Description Pointer Pointer Modified No

In this case the pointers are used but not modified.

R can be any register


Chapter 2, Slide 44 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Indexing Pointers
Syntax *R *+R[disp] *-R[disp] Description Pointer + Pre-offset - Pre-offset

Pointer Modified No No No

In this case the pointers are modified BEFORE being used and RESTORED to their previous values.

[disp] specifies the number of elements size in DW (64-bit), W (32-bit), H (16-bit), or B (8-bit). disp = R or 5-bit constant. R can be any register.
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Chapter 2, Slide 45

Indexing Pointers
Syntax *R *+R[disp] *-R[disp] *++R[disp] *--R[disp] Description Pointer + Pre-offset - Pre-offset Pre-increment Pre-decrement Pointer Modified No No No Yes Yes

In this case the pointers are modified BEFORE being used and NOT RESTORED to their Previous Values.
Chapter 2, Slide 46 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Indexing Pointers
Syntax *R *+R[disp] *-R[disp] *++R[disp] *--R[disp] *R++[disp] *R--[disp] Description Pointer + Pre-offset - Pre-offset Pre-increment Pre-decrement Post-increment Post-decrement Pointer Modified No No No Yes Yes Yes Yes

In this case the pointers are modified AFTER being used and NOT RESTORED to their Previous Values.
Chapter 2, Slide 47 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Indexing Pointers
Syntax *R *+R[disp] *-R[disp] *++R[disp] *--R[disp] *R++[disp] *R--[disp]

Chapter 2, Slide 48

Description Pointer + Pre-offset - Pre-offset Pre-increment Pre-decrement Post-increment Post-decrement

Pointer Modified No No No Yes Yes Yes Yes

[disp] specifies # elements - size in DW, W, H, or B. disp = R or 5-bit constant. R can be any register.
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Modify and testing the code


MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0
This code now performs the following operations:

loop

LDH LDH MPY ADD SUB

.D .D .M .L .S

*A5++, A0 *A6++, A1 A0, A1, A3 A4, A3, A4 B0, 1, B0

a0*x0 + a1*x1 + a2*x2 + ... + aN*xN

[B0]

.S

loop

Chapter 2, Slide 49

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Store the final result


MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0
This code now performs the following operations:

loop

LDH LDH MPY ADD SUB

.D .D .M .L .S

*A5++, A0 *A6++, A1 A0, A1, A3 A4, A3, A4 B0, 1, B0

a0*x0 + a1*x1 + a2*x2 + ... + aN*xN

[B0]

B
STH

.S
.D

loop
A4, *A7

Chapter 2, Slide 50

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Store the final result


MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0 loop
The Pointer A7 has not been initialised.

LDH LDH MPY ADD SUB

.D .D .M .L .S

*A5++, A0 *A6++, A1 A0, A1, A3 A4, A3, A4 B0, 1, B0

[B0]

B
STH

.S
.D

loop
A4, *A7

Chapter 2, Slide 51

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Store the final result


MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 pt3, A7 MVKH .S2 pt3, A7 MVKL .S2 count, B0 loop LDH LDH MPY ADD SUB [B0] B STH
Chapter 2, Slide 52

The Pointer A7 is now initialised.

.D .D .M .L .S .S .D

*A5++, A0 *A6++, A1 A0, A1, A3 A4, A3, A4 B0, 1, B0 loop A4, *A7

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

What is the initial value of A4?


MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL MVKH MVKL ZERO LDH LDH MPY ADD SUB [B0] B STH
Chapter 2, Slide 53

A4 is used as an accumulator, so it needs to be reset to zero.

loop

.S2 .S2 .S2 .L .D .D .M .L .S .S .D

pt3, A7 pt3, A7 count, B0 A4 *A5++, A0 *A6++, A1 A0, A1, A3 A4, A3, A4 B0, 1, B0 loop A4, *A7

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Increasing the processing power!


Register File A A0 A1 A2 A3 A4 .S1 .M1

. . .
A15
32-bits

.L1
.D1

How can we add more processing power to this processor?

Data Memory
Chapter 2, Slide 54 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Increasing the processing power!


Register File A A0 A1 A2 A3 A4 .S1 .M1

(1) Increase the clock frequency.


(2) Increase the number of Processing units.

. . .
A15
32-bits

.L1
.D1

Data Memory
Chapter 2, Slide 55 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

To increase the Processing Power, this processor has two sides (A and B or 1 and 2)
Register File A A0 A1 A2 A3 A4 .S1 .M1 .S2 .M2 Register File B B0 B1 B2 B3 B4

. . .
A15
32-bits

.L1
.D1

.L2
.D2

. . .
B15
32-bits

Data Memory
Chapter 2, Slide 56 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Can the two sides exchange operands in order to increase performance?


Register File A A0 A1 A2 A3 A4 .S1 .M1 .S2 .M2 Register File B B0 B1 B2 B3 B4

. . .
A15
32-bits

.L1
.D1

.L2
.D2

. . .
B15
32-bits

Data Memory
Chapter 2, Slide 57 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

The answer is YES but there are limitations.

To exchange operands between the two sides, some cross paths or links are required.

What is a cross path?


A cross path links one side of the CPU to the other. There are two types of cross paths:

Data cross paths. Address cross paths.


Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Chapter 2, Slide 58

Data Cross Paths

Data cross paths can also be referred to as register file cross paths.

These cross paths allow operands from one side to be used by the other side.
There are only two cross paths:

one path which conveys data from side B to side A, 1X.


one path which conveys data from side A to side B, 2X.

Chapter 2, Slide 59

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

TMS320C67x Data-Path

Chapter 2, Slide 60

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Data Cross Paths

Data cross paths only apply to the .L, .S and .M units.

The data cross paths are very useful, however there are some limitations in their use.

Chapter 2, Slide 61

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Data Cross Path Limitations

<dst>

.L1 .M1 .S1

<src> <src>

A
2x

(1) The destination register must be on same side as unit. (2) Source registers - up to one cross path per execute packet per side.
Execute packet: group of instructions that execute simultaneously.
Chapter 2, Slide 62

1x

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Data Cross Path Limitations

<dst>

.L1 .M1 .S1

<src> <src>

A
2x

eg: ADD MPY SUB || ADD

.L1x .M1x .S1x .L1x

A0,A1,B2 A0,B6,A9 A8,B2,A8 A0,B0,A2

1x

B
(c) Texas Instruments 2004

|| Means that the SUB and ADD belong to the same fetch packet, therefore execute simultaneously. Dr. Naim Dahnoun, Bristol University, Chapter 2, Slide 63

Data Cross Path Limitations

<dst>

.L1 .M1 .S1

<src> <src>

A
2x

eg: ADD MPY SUB || ADD

.L1x .M1x .S1x .L1x

A0,A1,B2 A0,B6,A9 A8,B2,A8 A0,B0,A2

1x

B
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

NOT VALID!

Chapter 2, Slide 64

Data Cross Paths for both sides

<dst>

.L1 .M1 .S1

<src> <src>

A
2x

<dst>

.L2 .M2 .S2

<src> <src>

1x

Chapter 2, Slide 65

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Address cross paths


Data

Addr

.D1

(1) The pointer must be on the same side of the unit.

LDW.D1T1 *A0,A5 STW.D1T1 A5,*A0


Chapter 2, Slide 66 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Load or store to either side


Data1
DA1 = T1

A5

.D1

*A0

DA2 = T2

LDW.D1T1 *A0,A5 LDW.D1T2 *A0,B5 B5

Data2

Chapter 2, Slide 67

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Standard Parallel Loads


Data1
DA1 = T1

A5

.D1 .D2

*A0

DA2 = T2

*B0

B
LDW.D1T1 *A0,A5 || LDW.D2T2 *B0,B5
Chapter 2, Slide 68

B5

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Parallel Load/Store using address cross paths


Data1 DA1 = T1

A5

.D1 .D2

*A0

DA2 = T2

*B0

B
LDW.D1T2 *A0,B5 || STW.D2T1 A5,*B0
Chapter 2, Slide 69

B5

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Fill the blanks ... Does this work?


Data1
DA1 = T1

.D1 .D2

*A0

DA2 = T2

*B0

B
LDW.D1__ *A0,B5 || STW.D2__ B6,*B0
Chapter 2, Slide 70 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Not Allowed! Parallel accesses: both cross or neither cross


Data1

.D1
DA2 = T2

*A0

.D2

*B0
B5

LDW.D1T2 *A0,B5 B6 || STW.D2T2 B6,*B0


Chapter 2, Slide 71 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Conditions Dont Use Cross Paths

If a conditional register comes from the opposite side, it does NOT use a data or address cross-path. Examples: [B2] [A1] ADD LDW .L1 .D2 A2,A0,A4 *B0,B5

Chapter 2, Slide 72

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

C62x Data-Path Summary


CPU
Ref Guide

Full CPU Datapath (Pg 2-2)

Chapter 2, Slide 73

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

C67x Data-Path Summary

C67x

Chapter 2, Slide 74

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Cross Paths - Summary


Data

Destination register on same side as unit. Source registers - up to one cross path per execute packet per side. Use x to indicate cross-path.
Pointer must be on same side as unit. Data can be transferred to/from either side. Parallel accesses: both cross or neither cross.

Address

Chapter 2, Slide 75

Conditionals Dont Use Cross Paths.


Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Code Review (using side A only)


Y =
n = 1 40

an * xn ; A2 = 40, loop count ; A0 = a(n) ; A1 = x(n) ; A3 = a(n) * x(n) ; Y = Y + A3 ; decrement loop count ; if A2 0, branch ; *A7 = Y

MVK loop: LDH LDH MPY ADD SUB [A2] B STH

.S1 .D1 .D1 .M1 .L1 .L1 .S1 .D1

40, A2 *A5++, A0 *A6++, A1 A0, A1, A3 A3, A4, A4 A2, 1, A2 loop A4, *A7

Note: Assume that A4 was previously cleared and the pointers are initialised.
Chapter 2, Slide 76 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Let us have a look at the final details concerning the functional units. Consider first the case of the .L and .S units.

Chapter 2, Slide 77

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Operands - 32/40-bit Register, 5-bit Constant

Operands can be:

5-bit constants (or 16-bit for MVKL and MVKH). 32-bit registers. 40-bit Registers.

However, we have seen that registers are only 32-bit.

So where do the 40-bit registers come from?

Chapter 2, Slide 78

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Operands - 32/40-bit Register, 5-bit Constant

A 40-bit register can be obtained by concatenating two registers.

However, there are 3 conditions that need to be respected:


The registers must be from the same side. The first register must be even and the second odd. The registers must be consecutive.

Chapter 2, Slide 79

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Operands - 32/40-bit Register, 5-bit Constant

All combinations of 40-bit registers are shown below:


40-bit Reg
odd 8

40-bit Reg
odd 8

even
32

even
32

A1:A0
A3:A2 A5:A4 A7:A6

B1:B0
B3:B2 B5:B4 B7:B6

A9:A8
A11:A10 A13:A12 A15:A14
Chapter 2, Slide 80

B9:B8
B11:B10 B13:B12 B15:B14
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Operands - 32/40-bit Register, 5-bit Constant


instr .unit <src>, <src>, <dst> 32-bit Reg
< src >

32-bit Reg

5-bit Const
< src >

40-bit Reg

.L or .S
< dst >

32-bit Reg
Chapter 2, Slide 81

40-bit Reg
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Operands - 32/40-bit Register, 5-bit Constant


instr .unit <src>, <src>, <dst>

32-bit Reg

5-bit Const
< src >

32-bit Reg
< src >

40-bit Reg

.L or .S
< dst >

32-bit Reg
Chapter 2, Slide 82

40-bit Reg
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Operands - 32/40-bit Register, 5-bit Constant


instr .unit <src>, <src>, <dst>

32-bit Reg

5-bit Const
< src >

32-bit Reg
< src >

40-bit Reg

.L or .S
< dst >

OR.L1

A0, A1, A2

32-bit Reg
Chapter 2, Slide 83

40-bit Reg
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Operands - 32/40-bit Register, 5-bit Constant


instr .unit <src>, <src>, <dst>

32-bit Reg

5-bit Const
< src >

32-bit Reg
< src >

40-bit Reg

.L or .S
< dst >

OR.L1 ADD.L2

A0, A1, A2 -5, B3, B4

32-bit Reg
Chapter 2, Slide 84

40-bit Reg
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Operands - 32/40-bit Register, 5-bit Constant


instr .unit <src>, <src>, <dst>

32-bit Reg

5-bit Const
< src >

32-bit Reg
< src >

40-bit Reg

.L or .S
< dst >

OR.L1 ADD.L2 ADD.L1

A0, A1, A2 -5, B3, B4 A2, A3, A5:A4

32-bit Reg
Chapter 2, Slide 85

40-bit Reg
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Operands - 32/40-bit Register, 5-bit Constant


instr .unit <src>, <src>, <dst>

32-bit Reg

5-bit Const
< src >

32-bit Reg
< src >

40-bit Reg

.L or .S
< dst >

32-bit Reg
Chapter 2, Slide 86

40-bit Reg

OR.L1 ADD.L2 ADD.L1 SUB.L1

A0, -5, A2, A2,

A1, A2 B3, B4 A3, A5:A4 A5:A4, A5:A4

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Operands - 32/40-bit Register, 5-bit Constant


instr .unit <src>, <src>, <dst>

32-bit Reg

5-bit Const
< src >

32-bit Reg
< src >

40-bit Reg

.L or .S
< dst >

32-bit Reg
Chapter 2, Slide 87

40-bit Reg

OR.L1 ADD.L2 ADD.L1 SUB.L1 ADD.L2

A0, A1, A2 -5, B3, B4 A2, A3, A5:A4 A2, A5:A4, A5:A4 3, B9:B8, B9:B8

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Register to register data transfer

To move the content of a register (A or B) to another register (B or A) use the move MV Instruction, e.g.:
MV MV A0, B0 B6, B7

To move the content of a control register to another register (A or B) or vice-versa use the MVC instruction, e.g.:
MVC
MVC

IFR, A0
A0, IRP
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Chapter 2, Slide 88

TMS320C6000 Instruction Set

Chapter 2, Slide 89

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

'C62x Instruction Set (by category)


Arithmetic
ABS ADD ADDA ADDK ADD2 MPY MPYH NEG SMPY SMPYH SADD SAT SSUB SUB SUBA SUBC SUB2 ZERO

Logical
AND CMPEQ CMPGT CMPLT NOT OR SHL SHR SSHL XOR

Data Mgmt
LDB/H/W MV MVC MVK MVKL MVKH MVKLH STB/H/W

Program Ctrl
B IDLE NOP

Bit Mgmt
CLR EXT LMBD NORM SET

Note: Refer to the 'C6000 CPU Reference Guide for more details.
Chapter 2, Slide 90 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

'C62x Instruction Set (by unit)


.S Unit
ADD ADDK ADD2 AND B CLR EXT MV MVC MVK MVKL MVKH
MVKLH NEG NOT OR SET SHL SHR SSHL SUB SUB2 XOR ZERO

.L Unit
ABS ADD AND CMPEQ CMPGT CMPLT LMBD MV NEG NORM NOT OR SADD SAT SSUB SUB SUBC XOR ZERO

.D Unit
ADD ADDA LDB/H/W MV NEG STB/H/W SUB SUBA ZERO

.M Unit
MPY MPYH SMPY SMPYH

Other
NOP
Chapter 2, Slide 91

IDLE

Note: Refer to the 'C6000 CPU Reference Guide for more details.
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

' C6700: Superset of Fixed-Point (by unit)


.S Unit .L Unit
ABS ADD AND CMPEQ CMPGT CMPLT LMBD MV NEG NORM NOT OR SADD SAT SSUB SUB SUBC XOR ZERO ADDSP ADDDP SUBSP SUBDP INTSP INTDP SPINT DPINT SPRTUNC DPTRUNC DPSP ADD ADDK ADD2 AND B CLR EXT MV MVC MVK MVKL MVKH NEG NOT OR SET SHL SHR SSHL SUB SUB2 XOR ZERO ABSSP ABSDP CMPGTSP CMPEQSP CMPLTSP CMPGTDP CMPEQDP CMPLTDP RCPSP RCPDP RSQRSP RSQRDP SPDP

.S

.L
.D .M

C67x

.M Unit
MPY MPYH MPYLH MPYHL
NOP

.D Unit
ADD NEG ADDAB (B/H/W) STB (B/H/W) ADDAD SUB LDB (B/H/W) SUBAB (B/H/W) LDDW ZERO MV

SMPY SMPYH

MPYSP MPYDP MPYI MPYID


IDLE

No Unit Used
Note: Refer to the 'C6000 CPU Reference Guide for more details.
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Chapter 2, Slide 92

Superset of Fixed-Point
Instruction Fetch Instruction Dispatch Advanced Instruction Packing Instruction Decode Registers (A0 - A15) Registers (A16 - A31) L1 + + + + S1 + + + + M1 x x x x X X D1 + + Control Registers Emulation Advanced Emulation Interrupt Control

Registers (B0 - B15) Registers (B16 - B31) D2 + + X M2 X S2 L2 + + x + x + x + + x + +

C62x: Dual 32-Bit Load/Store C64x: Dual 64-Bit Load/Store C67x: Dual 64-Bit Load/32-Bit Store
Chapter 2, Slide 93 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

'C64x: Superset of C62x


.S
Dual/Quad Arith SADD2 SADDUS2 SADD4 Data Pack/Un PACK2 PACKH2 PACKLH2 PACKHL2 Bitwise Logical UNPKHU4 ANDN UNPKLU4 Shifts & Merge SWAP2 SPACK2 SHR2 SPACKU4 SHRU2 SHLMB SHRMB Dual Arithmetic Mem Access ADD2 LDDW SUB2 LDNW LDNDW Bitwise Logical STDW AND STNW ANDN STNDW OR XOR Load Constant MVK (5-bit) Address Calc. ADDAD Compares CMPEQ2 CMPEQ4 CMPGT2 CMPGT4 Branches/PC BDEC BPOS BNOP ADDKPC

.L

Dual/Quad Arith ABS2 ADD2 ADD4 MAX MIN SUB2 SUB4 SUBABS4 Bitwise Logical ANDN

Data Pack/Un PACK2 PACKH2 PACKLH2 PACKHL2 PACKH4 PACKL4 UNPKHU4 UNPKLU4 SWAP2/4

.D

.M
Average AVG2 AVG4 Shifts ROTL SSHVL SSHVR

Multiplies MPYHI Shift & Merge MPYLI SHLMB MPYHIR SHRMB MPYLIR Load Constant MPY2 MVK (5-bit) SMPY2 Bit Operations DOTP2 DOTPN2 BITC4 DOTPRSU2 BITR DOTPNRSU2 DEAL DOTPU4 SHFL DOTPSU4 Move GMPY4 MVD XPND2/4

Chapter 2, Slide 94

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

TMS320C6000 Memory

Chapter 2, Slide 95

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Memory size per device


Devices
C6201, C6204, C6202 C6701 C6205 P D P D

Internal
= = = = 64 kB 64 kB 256 kB 128 kB

EMIFA

EMIFB

52M Bytes (32-bits wide)

N/A

C6203 C6211 C6711 C6712

P D

= =

384 kB 512 kB
128M Bytes (32-bits wide) N/A 64M Bytes (16-bits wide) 128M Bytes (32-bits wide)

L1P L1D L2

= = =

4 kB 4 kB 64 kB

C6713

L1P L1D L2 L1P L1D L2

= = = = = =

4 kB 4 kB 256 kB 16 kB 16 kB 256 kB

N/A

C6411 DM642

128M Bytes (32-bits wide) 256M Bytes (64-bits wide)

N/A

C6414 C6415 C6416

L1P L1D L2

= = =

16 kB 16 kB 1 MB

64M Bytes (16-bits wide)

Chapter 2, Slide 96

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Internal Memory Summary


Devices
C6211 C6711 C6713 C6712 Devices C6414 C6415 C6416

Internal (L2)
64 kB 256 kB Internal (L2) 1 MB

External
512M (32-bit wide) 512M (16-bit wide) External A: 1GB (64-bit) B: 256kB (16-bit)

DM642
C6411

256 kB
256 kB

1GB (64-bit)
256MB (32-bit)

LINK: TMS320C6000 DSP Generation


Chapter 2, Slide 97 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

TMS320C6000 Peripherals

Chapter 2, Slide 98

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

'C6x System Block Diagram


Memory
External Memory
Regs (A0-A15) Internal Buses

.D1 .D2

.M1 .M2
.L1 .L2 .S1 .S2
Control Regs

P E R I P H E R A L S

Regs (B0-B15)

CPU
Chapter 2, Slide 99 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

C6x Internal Buses


Internal Memory
Program Addr Program Data Data Addr - T1 Data Data - T1 x32 x256 x32 x32/64

PC A
regs

A D
External Interface

Data Addr - T2
Data Data - T2 DMA Addr - Read

x32
x32/64 x32

A D A D
C67x
Chapter 2, Slide 100

B
regs

x32

Peripherals
x32

DMA Data - Read


DMA Addr - Write DMA Data - Write

x32
x32 x32

DMA

can perform 64-bit data loads.


Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

'C6x System Block Diagram


Memory
Internal Buses EMIF

.D1 .D2

.M1 .M2
.L1 .L2 .S1 .S2
Control Regs

Extl Memory

P E R I P H E R A L S

Regs (A0-A15)

Regs (B0-B15)

CPU
Chapter 2, Slide 101 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

'C6x System Block Diagram


Program RAM Addr Internal Buses D (32) EMIF Data Ram

.D1 .D2

Extl Memory

.M1 .M2
.L1 .L2 .S1 .S2
Control Regs

- Sync
- Async

P E R I P H E R A L S

Regs (A0-A15)

Regs (B0-B15)

CPU
Chapter 2, Slide 102 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

'C6000 Peripherals
Parallel Comm GPIO Internal Memory

External Memory

EMIF

Internal Buses

Serial DMA, EDMA (Boot)


Timers Ethernet Video Ports VCP / TCP PLL
Chapter 2, Slide 103 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

.D1 .D2
.M1 .M2 .L1 .L2

Register Set B

Register Set A

.S1 .S2
CPU

EMIF
Async
Internal Memory

SDRAM

EMIF

Internal Buses

.D1 .D2
.M1 .M2 .L1 .L2

Register Set B

Register Set A

SBSRAM

.S1 External Memory Interface (EMIF) .S2

Glueless access to async/sync memory CPU Works with PC100/133 SDRAM (cheap, fast, and easy!) Byte-wide data access 16, 32, or 64-bit bus widths
Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Chapter 2, Slide 104

HPI / XBUS / PCI


Parallel Comm Internal Memory

External Memory

EMIF

Internal Buses

.D1 .D2

Register Set B

Parallel Communication Interfaces .M2 .M1


HPI:

Dedicated, slave-only, async 16/32-bit bus allows host-P access to C6000 memory .L1 .L2

XBUS: Similar to HPI but provides .S1 .S2 Master/slave and sync modes Glueless i/f to FIFOs (up to single-cycle xfer rate) CPU PCI: Standard 32-bit, 33MHz/66MHz PCI interface These interfaces provide means to bootstrap the C6000
Chapter 2, Slide 105 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Register Set A

GPIO
Parallel Comm GPIO Internal Memory

External Memory

EMIF

Internal Buses

.D1 .D2
.M1 .M2 .L1 .L2

Register Set B

General Purpose Input/Output (GPIO) CPU


C64x and C6713 provide 8-16 bits of general purpose bit I/O Use to observe or control the signal of a single-pin
Chapter 2, Slide 106

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Register Set A

.S1 .S2

McBSP and Utopia


Multi-Channel Buffered Serial Port (McBSP) Parallel
Internal 2 (or 3) full-duplex, synchronous serial-ports Comm Memory Up to 100 Mb/sec performance GPIO Supports multi-channel operation (T1, E1, MVIP, )

External Memory

EMIF

Internal Buses

Serial

.D1 .D2

Register Set B

Multi-Channel Audio Serial Port.M1 .M2 (McASP)


McBSP features plus more Up to 8 stereo lines (16 channels) IIC support On DM642, C6713

Register Set A

.L1 .L2

.S1 .S2
CPU

Utopia (C64x)
ATM connection 50 MHz wide area network connectivity
Chapter 2, Slide 107

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

DMA / EDMA
Parallel Comm GPIO Internal Memory

External Memory

EMIF

Internal Buses

Serial DMA, EDMA (Boot)

.D1 .D2
.M1 .M2

Register Set B

.L1 .L2 Direct Memory Access (DMA / EDMA)


Transfers any set of memory locations to another .S1 4 / 16 / 64 channels (transfer parameter sets) .S2 Transfers can be triggered by any interrupt (sync) CPU Operates independent of CPU On reset, provides bootstrap from memory

Chapter 2, Slide 108

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Register Set A

Timer/Counter
Parallel Comm GPIO Internal Memory

External Memory

EMIF

Internal Buses

Serial DMA, EDMA (Boot)


Timers

.D1 .D2
.M1 .M2 .L1 .L2

Register Set B

Register Set A

Timer / Counter

.S1 .S2
CPU

Two (or three) 32-bit timer/counters Can generate interrupts Both input and output pins
Chapter 2, Slide 109

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Ethernet MAC
Parallel Comm GPIO Internal Memory

External Memory

EMIF

Internal Buses

Serial DMA, EDMA (Boot)


Timers Ethernet Video Ports Ethernet (DM642 only) VCP / TCP 10/100 Ethernet MAC

.D1 .D2
.M1 .M2 .L1 .L2

Register Set B

Register Set A

.S1 .S2
CPU

Pins are muxed with PCI PLL TCP/IP stack available from TI
Chapter 2, Slide 110 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Video Ports
Parallel Comm GPIO Internal Memory

External Memory

EMIF

Internal Buses

Serial Video Ports (DM642 only)

Each configurableEDMA DMA, for Capture or Display .D1 .D2 Dual 8/10-bit BT656 or raw modes (Boot) .M1 .M2 16/20-bit raw modes and 20-bit Y/C for high definition Timers Horz Scaling and Chroma Resampling Support for 8-bit modes .L1 .L2 Supports transport interface mode Ethernet
Register Set B Register Set A Video Ports VCP / TCP PLL

.S1 .S2
CPU

Chapter 2, Slide 111

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

VCP / TCP -- 3G Wireless


Parallel Comm GPIO Internal Memory

External Internal Buses EMIF Memory Coprocessor (TCP) (C6416 only) Turbo
Supports 35 data channels at 384 kbps McBSPs 3GPP / IS2000 Turbo coder .D1 .D2 Utopia Programmable parameters include mode, rate and frame length Register Set B DMA, EDMA Viterbi Coprocessor (VCP) (C6416 only) .M1 .M2 (Boot) Supports > 500 voice channels at 8 kbps .L1 .L2 Programmable decoder parameters include constraint length, Timers code rate, and frame length .S1 .S2 Video Ports VCP / TCP CPU PLL
Chapter 2, Slide 112 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Register Set A

Phase Locked Loop (PLL)


Parallel Comm GPIO Internal Memory

External Memory

EMIF

Internal Buses

Serial

PLL

DMA, EDMA (Boot) DMA, EDMA

Input .D1 .D2


CLKIN .M1 .M2 Register Set A

Register Set B

Timers (Boot) Clock multiplier Ethernet Reduces EMI and cost Timers Rate is Pin selectable Video Ports VCP / TCP PLL
Chapter 2, Slide 113

Output .L2 .L1

CLKOUT1 .S1 CLKOUT2.S2 (reduced rate clkout) CPU

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Clock Cycle
What is a clock cycle? The time between successive instructions
C6000
CLKOUT1 (C6000 clock cycle) CLKIN

PLL

CLKOUT2 (, , or 1/6 CLKOUT1)


When we talk about cycles ...

CLKIN (MHz)
60
30

PLL Rate
x12
x10

CPU Clock Frequency


720 MHz
300 MHz

CPU Clock Cycle Time


1.39 ns
3.33 ns

MIPs (max)
5760
2400

50
25
Chapter 2, Slide 114

x4
x4

200 MHz
100 MHz

5 ns
10 ns

1600
800

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

'C6000 Peripherals Summary


Parallel Comm GPIO Internal Memory

External Memory

EMIF

Internal Buses

Serial DMA, EDMA (Boot)


Timers Ethernet Timers Video Ports VCP / TCP PLL
Chapter 2, Slide 115 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

.D1 .D2
.M1 .M2 .L1 .L2

Register Set B

Register Set A

.S1 .S2
CPU

C6x Family Part Numbering

Example = TMS320LC6201PKGA200

TMS320 L C6 2 01 PKG A 200

= TI DSP = Place holder for voltage levels = C6x family = Fixed-point core = Memory/peripheral configuration = Pkg designator (actual letters TBD) = -40 to 85C (blank for 0 to 70C) = Core CPU speed in Mhz

Chapter 2, Slide 116

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Module 1 Exam
1. Functional Units

a. How many can perform an ADD? Name them.


b. Which support memory loads/stores?

.M 2. Memory Map

.S

.D

.L

a. How many external ranges exist on C6201?

Chapter 2, Slide 117

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

3. Conditional Code
a. Which registers can be used as condl registers?
b. Which instructions can be conditional?

4. Performance
a. What is the 'C6711 instruction cycle time?
b. How can the 'C6711 execute 1200 MIPs?

Chapter 2, Slide 118

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Coding Problems
a. Move contents of A0-->A1

Chapter 2, Slide 119

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Coding Problems
a. Move contents of A0-->A1

b. Move contents of CSR-->A1 c. Clear register A5

Chapter 2, Slide 120

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Coding Problems (contd)


d. A2 = A02 + A1

e. If (B1 0) then B2 = B5 * B6 f. A2 = A0 * A1 + 10

g. Load an unsigned constant (19ABCh) into register A6.

Chapter 2, Slide 121

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Coding Problems (contd)


h. Load A7 with contents of mem1 and postincrement the selected pointer.

Chapter 2, Slide 122

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Module 1 Exam (solution)


1. Functional Units

a. How many can perform an ADD? Name them.


six; .L1, .L2, .D1, .D2, .S1, .S2 b. Which support memory loads/stores?

.M 2. Memory Map
Four

.S

.D

.L

a. How many external ranges exist on C6201?

Chapter 2, Slide 123

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

3. Conditional Code
a. Which registers can be used as condl registers?
A1, A2, B0, B1, B2 b. Which instructions can be conditional? All of them

4. Performance
a. What is the 'C6711 instruction cycle time? CLKOUT1
b. How can the 'C6711 execute 1200 MIPs? 1200 MIPs = 8 instructions (units) x 150 MHz

Chapter 2, Slide 124

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Coding Problems
a. Move contents of A0-->A1 MV .L1 ADD .S1 MPY .M1 A0, A1 A0, 0, A1 A0, 1, A1

or or

(whats the problem with this?)

Chapter 2, Slide 125

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Coding Problems
a. Move contents of A0-->A1 MV .L1 ADD .S1 MPY .M1 A0, A1 A0, 0, A1 A0, 1, A1

or or

(A0 can only be a 16-bit value)

b. Move contents of CSR-->A1

MVC CSR, A1 c. Clear register A5


or or or or or
Chapter 2, Slide 126

ZERO .S1 SUB .L1 MPY .M1 CLR .S1 MVK .S1 XOR .L1

A5 A5, A5, A5 A5, 0, A5 A5, 0, 31, A5 0, A5 A5,A5,A5


Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Coding Problems (contd)


d. A2 = A02 + A1 MPY.M1 ADD.L1 [B1] MPY.M2 A0, A0, A2 A2, A1, A2 B5, B6, B2

e. If (B1 0) then B2 = B5 * B6 f. A2 = A0 * A1 + 10 MPY A0, A1, A2 ADD 10, A2, A2 g. Load an unsigned constant (19ABCh) into register A6. value .equ 0x00019abc mvkl .s1 0x00019abc,a6 mvkl.s1 value,a6 mvkh .s1 0x00019abc,a6 mvkh.s1 value,a6
Chapter 2, Slide 127 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

5. Coding Problems (contd)


h. Load A7 with contents of mem1 and postincrement the selected pointer.
x16 mem A7 mem1 10h

load_mem1:

MVKL
MVKH LDH

.S1
.S1 .D1

mem1,
mem1,

A6
A6

*A6++, A7

Chapter 2, Slide 128

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Architecture

Links:

C6711 data sheet: tms320c6711.pdf C6713 data sheet: tms320c6713.pdf C6416 data sheet: tms320c6416.pdf User guide: spru189f.pdf Errata: sprz173c.pdf

Chapter 2, Slide 129

Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Chapter 2 TMS320C6000 Architectural Overview - End -

You might also like