
CHAPTER 1
William Stallings, Computer Organization and Architecture, 8th Edition

Objectives
By the end of this module, you should be able to:
Differentiate between the concepts of computer architecture and computer organization.
Describe and draw the basic architecture of a computer using its basic components.
Define, describe, and graphically illustrate the main functions of a computer.
Draw, using a top-down approach, the decomposition of a computer from the main level to the CPU level and down to the control level.

How do you define a modern-day computer?

What is a Computer?

An electronic device, operating under the control of instructions stored in its own memory unit, that can accept data (input), process data arithmetically and logically, produce output from the processing, and store the results for future use.

Definition of a Computer System
A computer system consists of a computer and its peripherals. Computer peripherals include input devices, output devices, and secondary memories.

Basic Principles of Computers: Von Neumann Architecture

Characteristics of the Von Neumann Architecture
Data and instructions are stored in a single read/write memory.
The contents of this memory are addressable by location, without regard to what is stored there.
Instructions are executed sequentially unless the order is explicitly modified.

Why von Neumann Architecture?
General-purpose and programmable: the same machine can solve very different problems by executing different programs.
Automatic execution of instructions.
It can be built with very simple electronic components:
  Data processing is performed by electronic gates.
  Data storage is provided by memory cells.
  Data communication is achieved by electrical wires.

5 Functional Units of a Computer
A computer has five functionally independent main parts: the input, memory (storage), arithmetic and logic, output, and control units.

Basic Functions of a Computer System

System Bus Model (Von Neumann Model)
Most important to the system bus model is the communication among the components by means of a shared pathway called the system bus, which consists of:
  Data bus: carries the information being transmitted.
  Address bus: identifies where the information is being sent.
  Control bus: describes aspects of how the information is being sent, and in what manner.
  Power bus: carries electrical power to the components; it is not shown, but its presence is understood.

Some Definitions
Processor: ALU + control unit.
Storage: main memory + secondary memory (RAM + cache + hard disks).
I/O: input/output.
Information: (machine) instructions or data.
Instructions: commands that transfer information or perform arithmetic and logic operations.
Program: a list of instructions.
Data: a piece of information to be processed, in the form of numbers, characters, images, audio, video, and the like.
Data path: ALU + registers + MUXes + data lines.
Bit: binary digit, 0 or 1.

Types or Classes of Computers

Desktop / Personal
Computers dedicated for individual use at home and in education, business, medical communities, and the like.

Servers & Enterprise Systems
Computers that are meant to be shared by a potentially large number of users, such as database servers.

Embedded
Computers integrated into another system to automatically monitor or control a process.

Supercomputers & Grid Computers
Computers used for the highly demanding computations needed in weather forecasting, engineering design and simulation, and scientific work.

Cloud Computing
Computers with widely distributed computing and storage server resources for individual, independent computing needs.

Computer Architecture
Levels of Abstraction of a Digital Computer Design

I System Design

Processor-Memory-Switch level (PMS level)
The highest level is the processor-memory-switch level. This is the level at which an architect views the system. It is simply a description of the system components and their interconnections. The components are specified in the form of a block diagram.

Instruction Set Level
The next level is the instruction set level. It defines the function of each instruction. The emphasis is on the behavior of the system rather than on its hardware structure.

Register Transfer Level
Below the instruction set architecture (ISA) level is the register transfer level. Hardware structure is visible at this level. In addition to registers, the basic elements at this level are multiplexers, decoders, buses, buffers, etc.

II Logic Design

Logic Design Level
The logic design level is also called the gate level. The basic elements at this level are gates and flip-flops. The behavior is less visible, while the hardware structure predominates.

III Circuit Design

Circuit Level
The key elements at this level are resistors, transistors, capacitors, diodes, etc.

Mask Level
The lowest level is the mask level, dealing with the silicon structures and their layout that implement the system as an integrated circuit.

Computer Architecture
Architecture & Organization 1

The Rest of Computer Architecture
Implementation of a computer has two components: organization and hardware.
Organization includes the high-level aspects of a computer's design, such as the memory system, the memory interconnect, and the design of the internal processor or CPU (central processing unit, where arithmetic, logic, branching, and data transfer are implemented).
Hardware refers to the specifics of a computer, including the detailed logic design and the packaging technology of the computer.

Architecture covers all three aspects of computer design:
1. Instruction set architecture (ISA)
2. Organization
3. Hardware
Computer architects must design a computer to meet functional requirements as well as price, power, performance, and availability goals.

Other Definitions for Computer Architecture
The science and art of selecting and interconnecting hardware components to create computers that meet functional, performance, and cost goals.
The theory behind the design of a computer.
The conceptual design and fundamental operational structure of a computer system.
The arrangement of computer components and their relationships.

Architecture & Organization 2

Structure & Function
Top-Down Approach

Main Structure - Top Level
Functions of the main components at the top level:
Central processing unit (CPU): Controls the operation of the computer and performs its data processing functions; often simply referred to as the processor.
Main memory: Stores data.
I/O: Moves data between the computer and its external environment.
System interconnection: Some mechanism that provides for communication among CPU, main memory, and I/O. An example is the system bus, consisting of a number of conducting wires to which all the other components attach.

Structure - The CPU
Functions of the components at the CPU level:
Control unit: Controls the operation of the CPU and hence the computer.
Arithmetic and logic unit (ALU): Performs the computer's data processing functions.
Registers: Provide storage internal to the CPU.
CPU interconnection: Some mechanism that provides for communication among the control unit, ALU, and registers.

Structure - The Control Unit

Functional View
Operations: (a) Data movement; (b) Storage; (c) Processing from/to storage; (d) Processing from storage to I/O.

The Computer: Top-Level Structure

ISA: Instruction Set Architecture
7 dimensions of an ISA:

[5] Operations
The general categories of operations are data transfer, arithmetic, logical, control, and floating point. MIPS is a simple and easy-to-pipeline instruction set architecture, and it is representative of the RISC architectures being used in 2006.

[6] Control flow instructions
Virtually all ISAs, including 80x86 and MIPS, support conditional branches, unconditional jumps, procedure calls, and returns. Both use PC-relative addressing, where the branch address is specified by an address field that is added to the PC.

[7] Encoding an ISA
There are two basic choices on encoding: fixed length and variable length. All MIPS instructions are 32 bits long, which simplifies instruction decoding. The 80x86 encoding is variable length, ranging from 1 to 18 bytes.

Implementation Technologies
If an instruction set architecture (ISA) is to be successful, it must be designed to survive rapid changes in computer technology.

Integrated circuit logic technology: transistor count on a chip grows by about 40% to 55% per year.

DRAM (dynamic random-access memory): capacity increases by about 40% per year.

Magnetic disk technology: prior to 1990, density increased by about 30% per year, doubling in three years. The rate increased to 100% per year in 1996. Since 2004, it has dropped back to 30% per year.

Network technology: network performance depends both on the performance of switches and on the performance of the transmission system.
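The annual growth rates quoted above compound, so each one implies a doubling time. The sketch below is only an arithmetic illustration of that relationship; the function names are hypothetical and it assumes a constant growth rate, which real technology trends do not follow exactly.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a constant compound annual growth rate.

    annual_growth is a fraction, e.g. 0.30 for 30% per year.
    """
    if annual_growth <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / math.log(1 + annual_growth)


def growth_factor(annual_growth: float, years: float) -> float:
    """Total multiplication factor after the given number of years."""
    return (1 + annual_growth) ** years


if __name__ == "__main__":
    # Rates taken from the text; labels are for display only.
    rates = [
        ("disk density at 30%/yr", 0.30),
        ("DRAM capacity at 40%/yr", 0.40),
        ("transistor count at 55%/yr", 0.55),
        ("disk density at 100%/yr", 1.00),
    ]
    for label, rate in rates:
        print(f"{label}: doubles in about {doubling_time_years(rate):.2f} years")
    print(f"30%/yr over 3 years: factor {growth_factor(0.30, 3):.3f}")
```

At 30% per year the factor after three years is about 2.2, which is consistent with the "doubling in three years" figure quoted for disk density.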

END

Introduction
CHAPTER 2
William Stallings, Computer Organization and Architecture, 8th Edition

1st Generation Computer - ENIAC
ENIAC - background
ENIAC - details

Technology: Moore's Law (1965)
The number of transistors on a microchip doubles about every 18-24 months.
The speed of a microprocessor doubles about every 18-24 months.
The price of a microchip drops about 48% every 18-24 months, assuming the performance metric (processor speed or memory capacity) of the chip stays the same.
Official Definition of Moore's Law: http://www.mooreslaw.org/

Moore's Law
Technology Evolution
Generation 1: ENIAC

THE VON NEUMANN MACHINE
To remove the need for human intervention, the von Neumann machine introduced the stored-program concept.
The program could be stored in memory alongside the data.
The computer could get its instructions by reading them from memory, and a program could be set or altered by setting the values of a portion of memory.
The first publication of the idea was in a 1945 proposal by von Neumann for a new computer, the EDVAC (Electronic Discrete Variable Computer).

von Neumann / Alan Turing

IAS computer
In 1946, von Neumann and his colleagues began the design of a new stored-program computer, referred to as the IAS computer, at the Princeton Institute for Advanced Studies. The IAS computer, although not completed until 1952, is the prototype of all subsequent general-purpose computers.

Structure of von Neumann machine
Structure of IAS - detail
IAS - details
IAS Computer: Von Neumann Model
Commercial Computers
IBM
Transistors
Transistor Based Computers
Microelectronics
Generations of Computer
Moore's Law
Growth in CPU Transistor Count
IBM 360 series
DEC PDP-8
DEC - PDP-8 Bus Structure
Semiconductor Memory
Intel
Speeding it up
Performance Balance
Logic and Memory Performance Gap

Solutions
I/O Devices
Typical I/O Device Data Rates
Key is Balance
Improvements in Chip Organization and Architecture
Problems with Clock Speed and Logic Density
Intel Microprocessor Performance
Increased Cache Capacity
More Complex Execution Logic
Diminishing Returns
New Approach: Multiple Cores
x86 Evolution (1)
x86 Evolution (2)
x86 Evolution (3)

Embedded Systems: ARM
ARM evolved from RISC design.
Used mainly in embedded systems:
  Used within a product.
  Not a general-purpose computer.
  Dedicated function.
  E.g. anti-lock brakes in a car.

Embedded Systems Requirements
Different sizes: different constraints, optimization, reuse.
Different requirements:
  Safety, reliability, real-time, flexibility, legislation.
  Lifespan.
  Environmental conditions.
  Static v dynamic loads.
  Slow to fast speeds.
  Computation v I/O intensive.
  Discrete event v continuous dynamics.

Possible Organization of an Embedded System

ARM Evolution
Designed by ARM Inc., Cambridge, England.
Licensed to manufacturers.
High speed, small die, low power consumption.
PDAs, hand held games, phones, e.g. iPod, iPhone.
Acorn produced ARM1 & ARM2 in 1985 and ARM3 in 1989.
Acorn, VLSI and Apple Computer founded ARM Ltd.

ARM Systems Categories
Embedded real time.
Application platform: Linux, Palm OS, Symbian OS, Windows mobile.
Secure applications.

Performance Assessment: Clock Speed
System Clock

Instruction Execution Rate
Millions of instructions per second (MIPS).
Millions of floating-point operations per second (MFLOPS).
Heavily dependent on instruction set, compiler design, processor implementation, cache & memory hierarchy.

Benchmarks
Programs designed to test performance.
Written in a high-level language.

Portable.
Represents style of task: systems, numerical, commercial.
Easily measured.
Widely distributed.
E.g. Standard Performance Evaluation Corporation (SPEC):
  CPU2006 for computation-bound work.
  17 floating point programs in C, C++, Fortran.
  12 integer programs in C, C++.
  3 million lines of code.
  Speed and rate metrics: single task and throughput.

SPEC Speed Metric
Single task.
Base runtime is defined for each benchmark using a reference machine.
Results are reported as the ratio of reference time to system run time: $r_i = \mathrm{Tref}_i / \mathrm{Tsut}_i$
  $\mathrm{Tref}_i$: execution time for benchmark i on the reference machine.
  $\mathrm{Tsut}_i$: execution time of benchmark i on the test system.

SPEC Rate Metric
Measures throughput or rate of a machine carrying out a number of tasks.
Multiple copies of the benchmarks are run simultaneously; typically, the number of copies is the same as the number of processors.
The ratio is calculated as follows: $r_i = N \times \mathrm{Tref}_i / \mathrm{Tsut}_i$
  $\mathrm{Tref}_i$: reference execution time for benchmark i.
  $N$: number of copies run simultaneously.
  $\mathrm{Tsut}_i$: elapsed time from start of execution of the program on all N processors until completion of all copies of the program.
Again, a geometric mean is calculated.

Amdahl's Law
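Amdahl's law relates the speedup obtained from N processors to the fraction f of a program's execution time that can be parallelized. The sketch below is a minimal illustration of that relationship using the standard form of the law; the function names `amdahl_speedup` and `speedup_limit` are hypothetical and chosen only for this example.

```python
import math


def amdahl_speedup(f: float, n: int) -> float:
    """Speedup on n processors when a fraction f of the work is parallelizable.

    The serial part (1 - f) is unchanged; the parallel part f is divided by n.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - f) + f / n)


def speedup_limit(f: float) -> float:
    """Upper bound on speedup as the number of processors grows without limit."""
    return math.inf if f == 1.0 else 1.0 / (1.0 - f)


if __name__ == "__main__":
    for f in (0.1, 0.5, 0.9):
        row = ", ".join(f"N={n}: {amdahl_speedup(f, n):.2f}" for n in (2, 8, 64, 1024))
        print(f"f={f}: {row} (limit {speedup_limit(f):.2f})")
```

Running it shows the diminishing returns directly: for a small f the speedup barely moves as N grows, and for any f below 1 it levels off at the limit.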

Gene Amdahl [AMDA67]
Potential speedup of a program using multiple processors.
Concluded that:
  Code needs to be parallelizable.
  Speedup is bounded, giving diminishing returns for more processors.
Task dependent:
  Servers gain by maintaining multiple connections on multiple processors.
  Databases can be split into parallel tasks.

Amdahl's Law Formula
For a program in which a fraction $f$ of the execution time can be parallelized across $N$ processors:
$\text{Speedup} = \dfrac{1}{(1 - f) + f/N}$
Conclusions:
  When $f$ is small, using parallel processors has little effect.
  As $N \to \infty$, the speedup is bounded by $1/(1 - f)$.
  Diminishing returns for using more processors.

Computer Performance Using Amdahl's Law

CHAPTER 3
William Stallings, Computer Organization and Architecture, 8th Edition

Top Level View of Computer
3.1 Computer Components
3.2 Computer Function
  Instruction Fetch and Execute
  Interrupts
  I/O Function
3.3 Interconnection Structures
3.4 Bus Interconnection
  Bus Structure
  Multiple-Bus Hierarchies

  Elements of Bus Design

Program Concept
Hardwired systems are inflexible.
General-purpose hardware can do different tasks, given the correct control signals.
Instead of re-wiring, supply a new set of control signals.

What is a program?
A sequence of steps.
For each step, an arithmetic or logical operation is done.
For each operation, a different set of control signals is needed.

Function of Control Unit
For each operation a unique code is provided, e.g. ADD, MOVE.
A hardware segment accepts the code and issues the control signals.
We have a computer!
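The idea that a code selects a set of control signals can be sketched as a simple lookup table. This is only an illustration: the operation codes ADD and MOVE come from the text, while the signal names and the `control_unit` function are hypothetical placeholders, not signals of any real processor.

```python
# Hypothetical mapping from operation code to the control signals it triggers.
# Signal names are invented for illustration.
CONTROL_SIGNALS = {
    "ADD": ("read_operands", "alu_add", "write_result"),
    "MOVE": ("read_source", "write_destination"),
}


def control_unit(opcode: str) -> tuple:
    """Accept an operation code and return the control signals to issue."""
    try:
        return CONTROL_SIGNALS[opcode]
    except KeyError:
        raise ValueError(f"unknown operation code: {opcode!r}") from None


def run(program: list) -> list:
    """A program is a sequence of codes; each step issues its own signals."""
    issued = []
    for opcode in program:
        issued.extend(control_unit(opcode))
    return issued


if __name__ == "__main__":
    print(run(["MOVE", "ADD"]))
```

Changing the program changes the sequence of signals issued, with no change to the "hardware" (the table and the function), which is the point of replacing re-wiring with codes.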

Components
The Control Unit and the Arithmetic and Logic Unit constitute the Central Processing Unit.
Data and instructions need to get into the system and results out: input/output.
Temporary storage of code and results is needed: main memory.

Computer Components
The logic circuit for the processor can be divided into two parts: the datapath and the control unit, as shown in the figure. The datapath is responsible for the actual execution of all data operations performed by the processor; it includes the ALU, registers, MUXes, data lines, and some other logic gates.

Top Level View
Von Neumann Architecture

Instruction Cycle
Two steps:
  Fetch
  Execute

Fetch Cycle
Program Counter (PC) holds the address of the next instruction to fetch.
Processor fetches the instruction from the memory location pointed to by the PC.
Increment PC, unless told otherwise.
Instruction is loaded into the Instruction Register (IR).
Processor interprets the instruction and performs the required actions.

Execute Cycle
Processor-memory: data transfer between CPU and main memory.
Processor-I/O: data transfer between CPU and I/O module.
Data processing: some arithmetic or logical operation on data.
Control: alteration of the sequence of operations, e.g. jump.
Combination of the above.
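The fetch and execute steps above can be sketched as a small loop over a toy machine. Everything specific here is an assumption made for illustration: a 16-bit word with a 4-bit opcode and a 12-bit address, a single accumulator (AC), three invented opcode values, and the memory contents of the example program.

```python
# Toy accumulator machine (illustrative assumptions, not a real ISA):
# each 16-bit word is a 4-bit opcode followed by a 12-bit address.
LOAD, STORE, ADD = 0x1, 0x2, 0x5


def step(memory: dict, state: dict) -> None:
    """Run one instruction cycle: fetch, then execute."""
    # Fetch: read the instruction the PC points to, then increment the PC.
    ir = memory[state["pc"]]
    state["pc"] += 1
    opcode, address = ir >> 12, ir & 0x0FFF

    # Execute: interpret the instruction and perform the required action.
    if opcode == LOAD:        # processor-memory transfer (memory -> AC)
        state["ac"] = memory[address]
    elif opcode == STORE:     # processor-memory transfer (AC -> memory)
        memory[address] = state["ac"]
    elif opcode == ADD:       # data processing
        state["ac"] = (state["ac"] + memory[address]) & 0xFFFF
    else:
        raise ValueError(f"unknown opcode {opcode:#x}")


def run(memory: dict, start: int, cycles: int) -> dict:
    """Execute a fixed number of instruction cycles starting at `start`."""
    state = {"pc": start, "ac": 0}
    for _ in range(cycles):
        step(memory, state)
    return state


if __name__ == "__main__":
    memory = {
        0x300: 0x1940,  # LOAD  AC <- memory[0x940]
        0x301: 0x5941,  # ADD   AC <- AC + memory[0x941]
        0x302: 0x2941,  # STORE memory[0x941] <- AC
        0x940: 0x0003,
        0x941: 0x0002,
    }
    final = run(memory, start=0x300, cycles=3)
    print(final, hex(memory[0x941]))
```

A control instruction such as a jump would simply assign a new value to the PC in the execute step instead of leaving the incremented value in place.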

Computer Architecture (Model)
Instruction Set (Hypothetical)
Other Computer Models
Example of Program Execution
Instruction Cycle State Diagram

Interrupts
Mechanism by which other modules (e.g. I/O) may interrupt the normal sequence of processing.
Program: e.g. overflow, division by zero.
Timer: generated by internal processor timer; used in pre-emptive multi-tasking.
I/O: from I/O controller.
Hardware failure: e.g. memory parity error.

Program Flow Control

Interrupt Cycle
Added to the instruction cycle.
Processor checks for an interrupt, indicated by an interrupt signal.
If no interrupt, fetch next instruction.
If interrupt pending:
  Suspend execution of current program.
  Save context.
  Set PC to start address of interrupt handler routine.
  Process interrupt.
  Restore context and continue interrupted program.

Transfer of Control via Interrupts
Instruction Cycle with Interrupts
Program Timing: Short I/O Wait
Program Timing: Long I/O Wait
Instruction Cycle (with Interrupts) - State Diagram
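The interrupt cycle described above can be sketched by adding a check at the end of each instruction cycle. This is a deliberately simplified model: instructions are just labels, the "context" is only the PC, and the `run` function and its parameters are hypothetical names for this example.

```python
def run(user_program: list, handler: list, interrupt_during: set) -> list:
    """Execute user_program, servicing an interrupt after any instruction
    whose index is in interrupt_during. Returns the order of execution."""
    trace = []
    pc = 0
    while pc < len(user_program):
        # Fetch and execute one instruction of the current program.
        trace.append(user_program[pc])
        interrupt_pending = pc in interrupt_during  # signal raised meanwhile
        pc += 1

        # Interrupt cycle: check for a pending interrupt before the next fetch.
        if interrupt_pending:
            saved_pc = pc              # save context
            for instruction in handler:  # PC set to handler; process interrupt
                trace.append(instruction)
            pc = saved_pc              # restore context, resume the program
    return trace


if __name__ == "__main__":
    print(run(["u0", "u1", "u2"], ["h0", "h1"], interrupt_during={1}))
```

The interrupted program resumes at the instruction after the one during which the signal arrived, because the check happens only between instruction cycles.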

Multiple Interrupts
Multiple Interrupts - Sequential
Multiple Interrupts - Nested
Time Sequence of Multiple Interrupts

Connecting
All the units must be connected.
Different type of connection for different type of unit:
  Memory
  Input/Output
  CPU

Computer Modules

Memory Connection
Receives and sends data.
Receives addresses (of locations).
Receives control signals:
  Read
  Write
  Timing

Input/Output Connection (1)
Similar to memory from the computer's viewpoint.
Output:
  Receive data from computer.
  Send data to peripheral.
Input:
  Receive data from peripheral.
  Send data to computer.

Input/Output Connection (2)
Receive control signals from computer.
Send control signals to peripherals, e.g. spin disk.
Receive addresses from computer, e.g. port number to identify peripheral.
Send interrupt signals (control).

CPU Connection
Reads instructions and data.
Writes out data (after processing).
Sends control signals to other units.
Receives (& acts on) interrupts.

Buses
There are a number of possible interconnection systems.
Single and multiple bus structures are most common:
  e.g. Control/Address/Data bus (PC)
  e.g. Unibus (DEC-PDP)

What is a Bus?
A communication pathway connecting two or more devices.
Usually broadcast.
Often grouped:
  A number of channels in one bus.
  e.g. a 32-bit data bus is 32 separate single-bit channels.
Power lines may not be shown.

Data Bus
Carries data.
  Remember that there is no difference between "data" and "instruction" at this level.
Width is a key determinant of performance: 8, 16, 32, 64 bit.

Address Bus
Identifies the source or destination of data.
e.g. the CPU needs to read an instruction (data) from a given location in memory.
Bus width determines the maximum memory capacity of the system.
  e.g. the 8080 has a 16-bit address bus, giving a 64K address space.
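The relationship between bus width and capacity is simple arithmetic, sketched below. The function names are hypothetical; the second function assumes that a word wider than the data bus is moved in consecutive bus transfers, which ignores any real-world protocol overhead.

```python
import math


def addressable_locations(address_bits: int) -> int:
    """Number of distinct locations an address bus of this width can select."""
    return 2 ** address_bits


def transfers_needed(word_bits: int, data_bus_bits: int) -> int:
    """Bus transfers needed to move one word over a data bus of this width."""
    return math.ceil(word_bits / data_bus_bits)


if __name__ == "__main__":
    locations = addressable_locations(16)
    print(f"16-bit address bus: {locations} locations = {locations // 1024}K")
    for width in (8, 16, 32, 64):
        print(f"32-bit word over a {width}-bit data bus: "
              f"{transfers_needed(32, width)} transfer(s)")
```

For the 8080 example, 16 address lines give 65,536 locations, which is the 64K address space quoted in the text.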

Control Bus
Control and timing information:
  Memory read/write signal
  Interrupt request
  Clock signals

Bus Interconnection Scheme

Big and Yellow?
What do buses look like?
  Parallel lines on circuit boards
  Ribbon cables
  Strip connectors on motherboards, e.g. PCI
  Sets of wires

Physical Realization of Bus Architecture

Single Bus Problems
Lots of devices on one bus leads to:
  Propagation delays: long data paths mean that co-ordination of bus use can adversely affect performance.
  A bottleneck if aggregate data transfer approaches bus capacity.
Most systems use multiple buses to overcome these problems.

Traditional (ISA) (with cache)
High Performance Bus

Bus Types
Dedicated:
  Separate data & address lines.
Multiplexed:
  Shared lines.
  Address valid or data valid control line.
  Advantage: fewer lines.
  Disadvantages:
    More complex control.
    Ultimate performance.
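The trade-off of a multiplexed bus can be sketched as a write that takes two phases on one set of shared lines, with a control flag saying whether the lines currently hold an address or data. The classes and the `multiplexed_write` function below are hypothetical and model only this idea, not any real bus protocol.

```python
class MultiplexedBus:
    """One set of shared lines plus address-valid / data-valid control lines."""

    def __init__(self):
        self.lines = 0
        self.address_valid = False
        self.data_valid = False


class Memory:
    """A device that must latch the address because the lines are reused."""

    def __init__(self):
        self.cells = {}
        self.latched_address = None

    def observe(self, bus: MultiplexedBus) -> None:
        if bus.address_valid:
            self.latched_address = bus.lines   # remember address for later
        elif bus.data_valid:
            self.cells[self.latched_address] = bus.lines


def multiplexed_write(bus: MultiplexedBus, memory: Memory,
                      address: int, value: int) -> int:
    """Write one value; returns the number of bus phases used."""
    # Phase 1: the shared lines carry the address.
    bus.lines, bus.address_valid, bus.data_valid = address, True, False
    memory.observe(bus)
    # Phase 2: the same lines now carry the data.
    bus.lines, bus.address_valid, bus.data_valid = value, False, True
    memory.observe(bus)
    return 2


if __name__ == "__main__":
    bus, memory = MultiplexedBus(), Memory()
    phases = multiplexed_write(bus, memory, address=0x10, value=0xAB)
    print(phases, memory.cells)
```

The sketch shows both listed disadvantages in miniature: the device needs extra logic to latch the address, and the transfer needs two phases where dedicated address and data lines could present both at once.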

Bus Arbitration
More than one module may control the bus, e.g. CPU and DMA controller.
Only one module may control the bus at one time.
Arbitration may be centralised or distributed.

Centralised or Distributed Arbitration
Centralised:
  Single hardware device controlling bus access (bus controller or arbiter).
  May be part of the CPU or separate.
Distributed:
  Each module may claim the bus.
  Control logic on all modules.

Timing
Co-ordination of events on the bus.
Synchronous:
  Events determined by clock signals.
  Control bus includes a clock line.
  A single 1-0 is a bus cycle.
  All devices can read the clock line.
  Usually sync on leading edge.
  Usually a single cycle for an event.

Synchronous Timing Diagram

Asynchronous Timing - Read Diagram
Asynchronous Timing - Write Diagram

PCI Bus
Peripheral Component Interconnect.
Intel released to public domain.
32 or 64 bit.
50 lines.

PCI Bus Lines (Required)
System lines: including clock and reset.
Address & data:
  32 time-multiplexed lines for address/data.
  Interrupt & validate lines.
Interface control.
Arbitration:
  Not shared.
  Direct connection to PCI bus arbiter.
Error lines.

PCI Bus Lines (Optional)
Interrupt lines: not shared.
Cache support.
64-bit bus extension:
  Additional 32 lines.
  Time multiplexed.
  2 lines to enable devices to agree to use 64-bit transfer.
JTAG/Boundary Scan: for testing procedures.

PCI Commands
Transaction between initiator (master) and target.
Master claims bus.
Determine type of transaction, e.g. I/O read/write.
Address phase.
One or more data phases.

PCI Read Timing Diagram
PCI Bus Arbiter
PCI Bus Arbitration
Foreground Reading