
Computer Architecture

Processor Structure

Before we look at basic processor structure, we need to briefly touch on two concepts: von Neumann machines and pipelined, clocked logic systems.

von Neumann machines

In the mid-1940s, John von Neumann proposed the concept of a stored program computer - an architecture which has become the foundation for most commercial processors used today. In a von Neumann machine, the program and the data occupy the same memory. The machine has a program counter (PC) which points to the current instruction in memory. The PC is updated on every instruction. When there are no branches, program instructions are fetched from sequential memory locations. (A branch simply updates the PC to some other location in the program memory.) Except for a handful of research machines and a very small collection of commercial devices, all of today's commercial processors work on this simple principle. Later, we will examine some non-von Neumann architectures.

Synchronous Machines

Again, with very few exceptions - a handful of research machines and a small number of commercial systems - most machines nowadays are synchronous, that is, they are controlled by a clock.

Datapaths

Registers and combinatorial logic blocks alternate along the data-paths through the machine. Data advances from one register to the next on each cycle of the global clock: as the clock edge clocks new data into a register, its current output (processed by passing through the combinatorial block) is latched into the next register in the pipeline. The registers are master-slave flip-flops which allow the input to be isolated from the output, ensuring a "clean" transfer of the new data into the register. (Some very high performance machines, eg DEC's Alpha, use dynamic latches here to reduce propagation delays, cf Dobberpuhl et al.)

In a synchronous machine, the longest propagation delay, tpdmax, through any combinatorial block must be less than the shortest clock cycle time, tcyc - otherwise a pipeline hazard will occur and data from a previous stage will be clocked into a register again. If tcyc < tpd for any operation in any stage of the pipeline, the clock edge will arrive at the register before data has propagated through the combinatorial block.
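
As a minimal sketch of the timing constraint above, the following Python fragment checks a set of stage delays against a candidate clock period. The stage names and delay values are invented for illustration, and the check ignores register setup time and clock skew, which a real design would also have to budget for.

```python
# Hypothetical worst-case propagation delays (in ns) through the
# combinatorial block of each pipeline stage. Values are illustrative only.
stage_delays_ns = {"fetch": 4.0, "decode": 3.5, "alu": 6.0, "writeback": 2.5}


def longest_delay(delays):
    """Return tpdmax, the longest delay through any combinatorial block."""
    return max(delays.values())


def violating_stages(delays, t_cyc):
    """Return the stages whose delay is not strictly less than t_cyc."""
    return sorted(name for name, t_pd in delays.items() if t_pd >= t_cyc)


t_pdmax = longest_delay(stage_delays_ns)
print(t_pdmax)                                  # 6.0
print(violating_stages(stage_delays_ns, 5.0))   # ['alu']
print(violating_stages(stage_delays_ns, 6.5))   # []
```

With a 5 ns clock the ALU stage has not settled when the clock edge arrives, so the next register would capture stale data; any period longer than tpdmax clears the check.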

Of course, there may also be feedback loops - in which the output of the current stage is fed back and latched in the same register: a conventional state machine. This sort of logic is used to determine the next operation (ie the next microcode word or the next address for branching purposes).

Basic Processor Structure

Here we will consider the basic structure of a simple processor. We will examine the flow of data through such a simple processor and identify bottlenecks in order to understand what has guided the design of more complex processors.

Here we see a very simple processor structure - such as might be found in a small 8-bit microprocessor. The various components are:

ALU
Arithmetic Logic Unit - this circuit takes two operands on the inputs (labelled A and B) and produces a result on the output (labelled Y). The operations will usually include, as a minimum:

add, subtract
and, or, not
shift right, shift left
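
The minimum operation set listed above can be sketched as a single Python function. This is an illustration only: the operation names, the 8-bit width and the one-bit shift distance are assumptions, not details given in these notes.

```python
MASK = 0xFF  # assumed 8-bit datapath, matching the small microprocessor above


def alu(op, a, b=0):
    """Sketch of a combinatorial ALU: operands A and B in, result Y out."""
    a &= MASK
    b &= MASK
    operations = {
        "add": lambda: a + b,
        "sub": lambda: a - b,
        "and": lambda: a & b,
        "or": lambda: a | b,
        "not": lambda: ~a,      # unary: the B input is ignored
        "shl": lambda: a << 1,  # shift left by one bit
        "shr": lambda: a >> 1,  # logical shift right by one bit
    }
    # Truncate to the datapath width, discarding any carry or borrow.
    return operations[op]() & MASK


print(alu("add", 200, 100))     # 44  (300 wraps around in 8 bits)
print(alu("sub", 3, 5))         # 254 (two's complement of -2)
print(hex(alu("not", 0x0F)))    # 0xf0
```

The string passed as op plays the role of the control signals that select which operation the ALU performs.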

ALUs in more complex processors will execute many more instructions.

Register File
A set of storage locations (registers) for storing temporary results. Early machines had just one register - usually termed an accumulator. Modern RISC processors will have at least 32 registers.

Instruction Register
The instruction currently being executed by the processor is stored here.

Control Unit
The control unit decodes the instruction in the instruction register and sets signals which control the operation of most other units of the processor. For example, the operation code (opcode) in the instruction will be used to determine the settings of control signals for the ALU which determine which operation (+,-,^,v,~,shift,etc) it performs.

Clock
The vast majority of processors are synchronous, that is, they use a clock signal to determine when to capture the next data word and perform an operation on it. In a globally synchronous processor, a common clock needs to be routed (connected) to every unit in the processor.

Program counter
The program counter holds the memory address of the next instruction to be executed. It is updated every instruction cycle to point to the next instruction in the program. (Control for the management of branch instructions - which change the program counter by other than a simple increment - has been omitted from this diagram for clarity. Branching instructions and their effect on program execution and efficiency will be examined extensively later.)

Memory Address Register
This register is loaded with the address of the next data word to be fetched from or stored into main memory.

Address Bus
This bus is used to transfer addresses to memory and memory-mapped peripherals. It is driven by the processor acting as a bus master.

Data Bus
This bus carries data to and from the processor, memory and peripherals. It will be driven by the source of data, ie processor, memory or peripheral device.

Multiplexed Bus
Of necessity, high performance processors provide separate address and data buses. To limit device pin counts and bus complexity, some simple processors multiplex address and data onto the same bus: naturally this has an adverse effect on performance. See multiplexed buses.

Executing Instructions

Let's examine the steps in the execution of a simple memory fetch instruction, eg
101c₁₆: lw $1,0($2)

This instruction tells the processor to take the address stored in register 2, add 0 to it and load the word found at that address in main memory into register 1.

In this, and most following, examples, we'll use the MIPS instruction set. This is chosen because it's simple, it exists in one widely available range of machines produced by SGI and there is a public domain simulator for MIPS machines, which we will use for some performance studies.

As the next instruction to be executed (our lw instruction) is at memory address 101c₁₆, the program counter contains 101c.

For convenience, most numbers - especially memory addresses and instruction contents - will be expressed in hexadecimal. When orders of magnitude and performance are being discussed, decimal numbers will be used: this will generally be obvious from the context and the use of exponent notations, eg 5 x 10¹².

Execution Steps

1. The control unit sets the multiplexor to drive the PC onto the address bus.
2. The memory unit responds by placing 8c410000₁₆ - the lw $1,0($2) instruction as encoded for a MIPS processor - on the data bus from where it is latched into the instruction register.
3. The control unit decodes the instruction, recognises it as a memory load instruction and directs the register file to drive the contents of register 2 onto the A input of the ALU and the value 0 onto the B input. At the same time, it instructs the ALU to add its inputs.
4. The output from the ALU is latched into the MAR. The controller ensures that this value is directed onto the address bus by setting the multiplexor.
5. When the memory responds with the value sought, it is captured on the internal data bus and latched into register 1 of the register file.
6. The program counter is now updated to point to the next instruction and the cycle can start again.

As another example, let's assume the next instruction is an add instruction:
1020₁₆: add $1,$3,$4

This instruction tells the processor to add the contents of registers 3 and 4 and place the result in register 1.
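
The instruction words that memory returns in these walkthroughs can be reproduced by packing the instruction fields into a 32-bit word. The sketch below assumes the standard MIPS I-type and R-type field layouts; the helper names are hypothetical and chosen only for this illustration.

```python
LW_OPCODE = 0x23   # MIPS opcode for lw (load word)
ADD_FUNCT = 0x20   # MIPS function code for add (the opcode field is 0)


def encode_i_type(opcode, rs, rt, imm):
    """Pack opcode(6) | rs(5) | rt(5) | immediate(16) into one word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_r_type(rs, rt, rd, shamt, funct):
    """Pack opcode=0(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# lw $1,0($2): base register rs=2, destination rt=1, offset 0
lw_word = encode_i_type(LW_OPCODE, rs=2, rt=1, imm=0)
# add $1,$3,$4: sources rs=3 and rt=4, destination rd=1
add_word = encode_r_type(rs=3, rt=4, rd=1, shamt=0, funct=ADD_FUNCT)

print(f"{lw_word:08x}")    # 8c410000
print(f"{add_word:08x}")   # 00640820
```

Note that in the assembly syntax the destination register is written first, whereas in the encoded word the source register fields come first.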

1. The control unit sets the multiplexor to drive the PC onto the address bus.
2. The memory unit responds by placing 00640820₁₆ - the encoded add $1,$3,$4 instruction - on the data bus from where it is latched into the instruction register.
3. The control unit decodes the instruction, recognises it as an arithmetic instruction and directs the register file to drive the contents of register 3 onto the A input of the ALU and the contents of register 4 onto the B input. At the same time, it instructs the ALU to add its inputs.
4. The output from the ALU is latched into the register file at register address 1.
5. The program counter is now updated to point to the next instruction.

Key terms

von Neumann machine
A computer which stores its program in memory and steps through instructions in that memory.

pipeline
A sequence of alternating storage elements (registers or latches) and combinatorial blocks, making up a datapath through the computer.

program counter
A register or memory location which holds the address of the next instruction to be executed.

synchronous (system/machine)
A computer in which instructions and data move from one pipeline stage to the next under the control of a single (global) clock.
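
The two instruction walkthroughs in this section can be sketched as a small fetch-decode-execute loop in Python. This is a simplified illustration, not a model of any real MIPS implementation: the register and memory contents are invented, memory is a dictionary keyed by byte address, the add word is derived from the standard MIPS R-type layout, and add wraps on overflow instead of trapping.

```python
def sign_extend16(value):
    """Interpret a 16-bit field as a signed offset."""
    return value - 0x10000 if value & 0x8000 else value


def run(memory, regs, pc, count):
    """Execute `count` instructions starting at `pc`; return the final PC."""
    for _ in range(count):
        ir = memory[pc]                 # PC onto address bus, word into IR
        opcode = ir >> 26               # control unit decodes the fields
        rs = (ir >> 21) & 0x1F
        rt = (ir >> 16) & 0x1F
        rd = (ir >> 11) & 0x1F
        if opcode == 0x23:              # lw rt, offset(rs)
            # ALU adds base and offset; the result is latched into the MAR.
            mar = (regs[rs] + sign_extend16(ir & 0xFFFF)) & 0xFFFFFFFF
            regs[rt] = memory[mar]      # data word latched into register rt
        elif opcode == 0 and (ir & 0x3F) == 0x20:   # add rd, rs, rt
            regs[rd] = (regs[rs] + regs[rt]) & 0xFFFFFFFF
        else:
            raise ValueError(f"unsupported instruction {ir:08x}")
        pc += 4                         # point to the next instruction
    return pc


# Program and data share one memory, as in a von Neumann machine.
memory = {
    0x101C: 0x8C410000,   # lw  $1,0($2)
    0x1020: 0x00640820,   # add $1,$3,$4
    0x2000: 7,            # illustrative data word
}
regs = [0] * 32
regs[2], regs[3], regs[4] = 0x2000, 10, 32   # illustrative register values

pc = run(memory, regs, 0x101C, 2)
print(regs[1], hex(pc))   # 42 0x1024
```

After the lw, register 1 briefly holds the value 7 loaded from address 2000; the add then overwrites it with the sum of registers 3 and 4.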
John Morris, 1998
