

UNIT-I Computer Architecture: The art of assembling logical elements into a computing device, and the specification of the relations between the parts of a computer system.

Parallel Computer:
A computer that is capable of parallel computing is known as a parallel computer.

Parallel Computing:
Parallel computing is a form of computation in which many calculations are carried out simultaneously. It operates on the principle that large problems can often be divided into smaller ones, which are then solved concurrently (in parallel). In parallel computing, multiple compute resources are utilized to solve a computational problem, i.e. problems are run using multiple CPUs. A problem is broken into discrete parts that can be solved concurrently, and each part is further broken down into a series of instructions. Instructions from each part execute simultaneously on different CPUs.
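As an illustration of this idea, the following is a minimal sketch in Python; the chunking scheme, worker count and data are illustrative assumptions, not part of the text. A problem is split into discrete parts, each part is handed to a separate worker process, and the partial results are combined.

```python
# Minimal sketch of the "divide into parts, solve concurrently" idea using
# Python's standard multiprocessing module. Chunk size and worker count are
# illustrative choices.
from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker (conceptually, each CPU) handles one discrete part.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_parts = 4                                   # one part per CPU, say
    size = len(data) // n_parts
    parts = [data[i * size:(i + 1) * size] for i in range(n_parts)]

    with Pool(processes=n_parts) as pool:
        results = pool.map(partial_sum, parts)    # parts are solved concurrently

    print(sum(results))                           # combine the partial results
```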

In parallel computing, the compute resources might be:
i. A single computer with multiple processors (simply, a multiprocessor computer);
ii. An arbitrary number of computers connected by a network (simply, a multicomputer);
iii. A combination of both.

Advantages of Parallel Computing:

i. Save time and/or money: In theory, throwing more resources at a task will shorten its time to completion, with potential cost savings. Parallel computers can be built from cheap, commodity components.

ii. Solve larger problems: Many problems are so large and/or complex that it is impractical or impossible to solve them on a single computer, especially given limited computer memory. In parallel computing we can solve them by dividing them into smaller problems.

iii. Provide concurrency: A single compute resource can only do one thing at a time. Multiple computing resources can be doing many things simultaneously.

iv. Use of non-local resources: If the systems are connected by a network, we can also access the resources of other systems. A local resource is information that resides in the present working system; a non-local resource is information that resides in another system.

Computer Development Milestones:


The history of computer development is often described in terms of the different generations of computing devices. Each of the five generations of computers is characterized by a major technological development that fundamentally changed the way computers operate, resulting in increasingly smaller, cheaper, more powerful, more efficient and more reliable devices.

First Generation (1945-1954) Vacuum Tubes: The first computers used vacuum tubes for circuitry and magnetic drums for memory, and were often enormous, taking up entire rooms. They were very expensive to operate and, in addition to using a great deal of electricity, generated a lot of heat, which was often the cause of malfunctions. A magnetic drum is a direct-access, or random-access, storage device: a metal cylinder coated with magnetic iron-oxide material on which data and programs can be stored. First-generation computers relied on machine language, the lowest-level programming language understood by computers, to perform operations, and they could only solve one problem at a time. Input was based on punched cards and paper tape, and output was displayed on printouts. The UNIVAC and ENIAC computers are examples of first-generation computing devices. The UNIVAC was the first commercial computer delivered to a business client, the U.S. Census Bureau, in 1951.

ENIAC (Electronic Numerical Integrator And Calculator) was developed during 1943-46. Following are its features:
• Used 18,000 vacuum tubes
• Used separate memory blocks for the program and data
• Performed addition, subtraction, multiplication, division and square root


• Gave results on an electronic typewriter or on punched cards
• Used 20 electronic memory units (20 accumulators)
• Used numbers that were stored as decimal digits. Each digit had 10 valves, off (0) and on (1): the valve state 0000000001 meant 9, 0000000010 meant 8, ....., and 1000000000 meant 0

Note: 1 word = 40 bits; 1k = 1024 words; 20k = 20 × 1024 words.

EDVAC (Electronic Discrete Variable Automatic Computer) was built in 1951. Following are its features:
• Used 5,900 vacuum tubes
• Used the "stored program concept": the program and data are first stored in the main memory, then fetched, decoded and finally executed by the processor
• Used a common main memory block of 1024 words; holding instructions in main memory allows the system to access the data for an instruction more quickly
• Used a common secondary memory of 20k words for the program (instructions) and data
• Reduced the extent of hardware because the data was processed serially, bit by bit, and numbers were stored as binary digits: 0000 - 0, 0001 - 1, ....., 1001 - 9
• Used the instruction format op A1 A2 A3 A4: "op" was the word that defined the operation, A1 and A2 were the source operand addresses, A3 was the destination operand address, and A4 was the address of the next instruction
• Used a separate instruction format for input and output operations

IAS (Institute of Advanced Studies) computer was built later, also in 1951. Following are its features:
• Used the stored program concept
• Used a common main memory block of 4096 or 1024 words (4k or 1k)
• Used the concept of CPU registers, so data could be accessed more quickly by an instruction
• Introduced the accumulator (AC) concept; the AC was used as a source as well as a destination operand
• Introduced the instruction register, which held the instruction that had been fetched from the main memory and held the next instruction in the next cycle (a cycle consists of fetching and then executing an instruction)


• Used a program counter (PC) in the CPU; the PC holds the address of the next instruction to be executed and is incremented after each instruction is fetched
• Had a secondary common memory of 16k words that used electro-mechanical devices for storing the instructions (program) and data

Second Generation (1955-1964) Transistors: Transistors replaced vacuum tubes and ushered in the second generation of computers. The transistor was invented in 1947 but did not see widespread use in computers until the late 1950s. The transistor was far superior to the vacuum tube, allowing computers to become smaller, faster, cheaper, more energy-efficient and more reliable than their first-generation predecessors. Though the transistor still generated a great deal of heat that subjected the computer to damage, it was a vast improvement over the vacuum tube. Second-generation computers still relied on punched cards for input and printouts for output. Second-generation computers moved from cryptic binary machine language to symbolic, or assembly, languages, which allowed programmers to specify instructions in words. High-level programming languages were also being developed at this time, such as early versions of COBOL and FORTRAN. These were also the first computers that stored their instructions in their memory, which moved from magnetic drum to magnetic core technology. The first computers of this generation were developed for the atomic energy industry.

IBM 1620 and IBM 7094: These were developed during 1954-64. Following are their features:
• Transistors in the computers
• Ferrite core main memory
• Additional registers in the CPU
• Separate I/O processors for the disk drives, tape drives and line printers used in input-output operations
• Addition, subtraction, multiplication and division on fixed-point and floating-point numbers
• Several addressing modes for fetching operands
• Concept of a stack and stack pointer for LIFO data operations
• Concept of the subroutine call
• Programming in assembler and in high-level languages


Third Generation (1965-1974) Integrated Circuits: The development of the integrated circuit was the hallmark of the third generation of computers. Transistors were miniaturized and placed on silicon chips, called semiconductors, which drastically increased the speed and efficiency of computers. Instead of punched cards and printouts, users interacted with third-generation computers through keyboards and monitors and interfaced with an operating system, which allowed the device to run many different applications at one time with a central program that monitored the memory. Computers for the first time became accessible to a mass audience because they were smaller and cheaper than their predecessors.

IBM 360: Built in 1964-65. Following are its features:
• Each IC with 100-1000 electronic logic gates
• 16 general-purpose registers of 32 bits each and 4 floating-point registers of 64 bits each
• Main memory: semiconductor IC (large fixed main memory)
• 200 opcodes (distinct instructions executable at the execution unit)
• Addition, subtraction, multiplication and division of fixed-point and floating-point numbers
• Enhanced number of addressing modes
• Concept of two modes of CPU operation: supervisor mode and user mode
• Concept of a status register holding the flags for exceptional conditions (overflow, interrupt, carry)
• Programming in assembler as well as in HLLs; software compatibility across different processing units when programming in an HLL
• 32-bit instruction format
• CPU with IR (instruction register), AR (address-of-operand register), PC (program counter) and SR (status register)
• Three ALUs: i. fixed-point ALU, ii. floating-point ALU, iii. decimal ALU


Fourth Generation (1975-Present) Microprocessors: The microprocessor brought the fourth generation of computers, as thousands of integrated circuits were built onto a single silicon chip. What in the first generation filled an entire room could now fit in the palm of the hand. The Intel 4004 chip, developed in 1971, located all the components of the computer (from the central processing unit and memory to input/output controls) on a single chip. In 1981 IBM introduced its first computer for the home user, and in 1984 Apple introduced the Macintosh. Microprocessors also moved out of the realm of desktop computers and into many areas of life as more and more everyday products began to use microprocessors. As these small computers became more powerful, they could be linked together to form networks, which eventually led to the development of the Internet. Fourth-generation computers also saw the development of GUIs, the mouse and handheld devices.

IBM PC and Pentium-based computers: IBM PC in 1981 and Pentium since 1993. Following are their features:
• Cache memory
• VLSI chip as the microprocessor
• Concepts of pipelining and superscalar execution units for the execution of instructions
• Operating systems and software with reusable objects and modules
• Microprocessor with cache, CPU and bus-interfacing unit
• Computer with microprocessor, main memory, interrupt handlers, timers, video monitor, mouse, keyboard, hard disks, CD drives, floppy disks and Ethernet card

Cache memory: a small, fast temporary memory in which frequently and recently used instructions are stored. By using it, the processor can access data faster.
Microprocessor: a single-chip CPU.
Pipelining: while one instruction is in the execution phase, another instruction can be fetched by the processor.

Fifth Generation (Present and Beyond) Artificial Intelligence: Fifth-generation computing devices, based on artificial intelligence, are still in development, though there are some applications, such as voice recognition, that are being used today. The use of parallel processing and superconductors is helping to make artificial intelligence a reality. Quantum computation and molecular and nanotechnology will radically change the face of computers in years to come. The goal of fifth-generation computing is to develop devices that respond to natural language input and are capable of learning and self-organization.


Elements of modern computers:


The hardware, software and programming elements of a modern computer system are briefly introduced below in the context of parallel processing.

Computing Problems: It has long been recognized that the concept of computer architecture is no longer restricted to the structure of the bare machine hardware. A modern computer is an integrated system consisting of machine hardware, an instruction set, system software, application programs and user interfaces. These system elements are shown in the figure below.

The computer is used to solve different real-life problems with fast and accurate solutions. Depending on the nature of the problem, the solutions may require different computing resources. For numerical problems in science and technology, the solutions demand complex mathematical formulations and integer or floating-point computations. For alphanumerical problems in business and government, the solutions demand accurate transactions, large database management and information retrieval operations. For artificial intelligence (AI) problems, the solutions demand logic inferences and symbolic manipulations.


These computing problems have been labeled numerical computing, transaction computing, and logical reasoning. Some complex problems may demand a combination of these processing modes.

Algorithms and Data Structures: Special algorithms and data structures are needed to specify the computations and communications involved in computing problems. Most numerical algorithms are deterministic, using regularly structured data. Symbolic processing may use heuristic and nondeterministic searches over large knowledge bases.

Hardware Resources: The system architecture of a computer is represented by three nested circles in the above figure. A modern computer system demonstrates its power through coordinated efforts by hardware resources, an operating system, and application software. Processors, memory and peripheral devices form the hardware core of a computer system. Special hardware interfaces are often built into I/O devices, such as optical page scanners, magnetic-ink character recognizers, modems, voice data entry, printers and plotters. These peripherals are connected to mainframe computers directly or through local- or wide-area networks. In addition, software interface programs are needed. These software interfaces include file transfer systems, editors, word processors, device drivers, network communication programs, etc.

Operating System: An effective operating system manages the allocation and deallocation of resources during the execution of user programs. Beyond the operating system, application software must be developed to benefit the users. Standard benchmark programs are needed for performance evaluation. Mapping is a bidirectional process matching algorithmic structure with hardware architecture, and vice versa; efficient mapping will benefit the programmer and produce better source code. The mapping of algorithms and data structures onto the machine architecture includes processor scheduling, memory maps, interprocessor communication, etc.

System Software Support: Software support is needed for the development of efficient programs in high-level languages. The source code written in a high-level language must first be translated into object code by an optimizing compiler. The compiler assigns variables to registers or to memory words and reserves functional units for operators. An assembler is used to translate the compiled object code into machine code which can be recognized by the machine hardware. A loader is used to initiate the program execution through the OS kernel (the kernel is the heart of the OS).


Resource binding demands the use of the compiler, assembler, loader, and OS kernel to commit physical machine resources to program execution. The effectiveness of this process determines the efficiency of hardware utilization and the programmability of the computer.

Compiler Support: There are three compiler upgrade approaches: the preprocessor, the precompiler and the parallelizing compiler. A preprocessor uses a sequential compiler and a low-level library of the target computer to implement high-level parallel constructs. The precompiler approach requires some program flow analysis, dependence checking, and limited optimizations toward parallelism detection. The parallelizing compiler approach demands a fully developed parallelizing or vectorizing compiler which can automatically detect parallelism in source code and transform sequential codes into parallel constructs.

Evolution of Computer Architecture:


The study of computer architecture involves both hardware organization and programming/software requirements. From the software programmer's point of view, the computer architecture is abstracted by its instruction set, which includes opcodes (operation codes), addressing modes, registers, virtual memory, etc. From the hardware implementation point of view, the computer is organized with CPUs, caches, buses, pipelines, physical memory, etc. Therefore, the study of computer architecture covers both instruction-set architecture and machine implementation organizations. Over the past four decades, computer architecture has gone through evolutionary rather than revolutionary changes; sustaining features are those that have proven to deliver performance. As depicted in the following figure, computer development started with the von Neumann architecture, built as a sequential machine executing scalar data. The sequential computer was improved from bit-serial to word-parallel operations, and from fixed-point to floating-point operations. The von Neumann architecture is slow due to the sequential execution of instructions in programs.


Lookahead, Parallelism and Pipelining: Lookahead techniques were introduced to prefetch instructions in order to overlap I/E (instruction fetch, decode and execution) operations and to enable functional parallelism. Functional parallelism was supported by two approaches: one is to use multiple functional units simultaneously, and the other is to practice pipelining at various processing levels. The latter includes pipelined instruction execution, pipelined arithmetic computations, and memory-access operations. Pipelining has proven especially attractive in performing identical operations repeatedly over vector data strings. Vector operations were originally carried out implicitly by software-controlled looping using scalar pipeline processors.

Parallel/Vector Computers: Intrinsic parallel computers are those that execute programs in MIMD mode. There are two major classes of parallel computers, namely shared-memory multiprocessors and message-passing multicomputers. The major difference between multiprocessors and multicomputers lies in memory sharing and the mechanisms used for interprocessor communication.


Explicit vector instructions were introduced with the appearance of vector processing. A vector processor is equipped with multiple vector pipelines that can be used concurrently under hardware or firmware control. There are two families of pipelined vector processors: memory-to-memory architecture supports the pipelined flow of vector operands directly from the memory to the pipelines and then back to the memory for storing the results; register-to-register architecture uses vector registers to interface between the memory and the functional pipelines.

Flynn's Classification:
In 1966, Michael Flynn proposed a classification of computer architectures based on the number of instruction streams and data streams (Flynn's Taxonomy). Flynn uses the stream concept for describing a machine's structure: a stream simply means a sequence of items (data or instructions). The following are the four possible classifications according to Flynn, known as Flynn's Taxonomy:
1. SISD: Single instruction, single data
2. SIMD: Single instruction, multiple data
3. MISD: Multiple instructions, single data
4. MIMD: Multiple instructions, multiple data

SISD: SISD means single instruction stream and single data stream. SISD corresponds to the traditional mono-processor (von Neumann computer): a single data stream is processed by one instruction stream. It is a single-processor computer (uniprocessor) in which a single stream of instructions is generated from the program. The following figure gives the architecture of the von Neumann computer:


The Von Neumann architecture is a design model for a stored-program digital computer. Its main characteristic is a single separate storage structure (the memory) that holds both program and data.
• Program instructions are coded data which tell the computer to do something
• Data is simply information to be used by the program
• The control unit fetches instructions/data from memory, decodes the instructions and then sequentially coordinates the operations to accomplish the programmed task
• The arithmetic unit performs basic arithmetic operations
• Input/Output is the interface to the human operator

Some important features of the Von Neumann architecture are:
• Both instructions (code) and data (variables and input/output) are stored in memory;
• Memory is a collection of binary digits (bits) that have been organized into bytes, words, and regions with addresses;
• The code instructions and all data have memory addresses;
• To execute an instruction, it has to be moved into registers; only the registers can actually do anything with the instructions, while memory locations cannot;
• To save a result computed in the registers, it has to be moved back to memory;
• Operating systems and compilers keep the instructions and data in memory organized so they do not get mixed up together;
• If program execution goes past its legal last instruction (for example), it can overwrite other instructions/data in memory and cause strange things to happen;
• One of the advantages of modern operating systems and compilers is the concept of relocatable code, i.e. code that can be loaded and run from any location in memory.

SIMD: SIMD means single instruction stream and multiple data streams. Each instruction is executed on a different set of data by different processors, i.e. multiple processing units of the same type operate on multiple data streams. This group is dedicated to array processing machines; sometimes vector processors can also be seen as a part of this group. The following is the architecture of SIMD:
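Independently of the block diagram, the data-parallel idea behind SIMD can be sketched in a few lines of code. This is only an illustration of the concept; the NumPy array library and the specific operation are assumptions, not something the text prescribes.

```python
# Data-parallel flavour of SIMD, sketched with NumPy (assumed to be installed):
# one logical operation ("add 5") is applied to every element of the data
# stream at once, instead of one element per step as in the SISD style.
import numpy as np

data = list(range(8))                  # multiple data elements: 0, 1, ..., 7

sisd_result = [x + 5 for x in data]    # SISD style: one element per step
simd_result = np.array(data) + 5       # SIMD style: one operation over all elements

print(sisd_result)                     # [5, 6, 7, 8, 9, 10, 11, 12]
print(simd_result)                     # [ 5  6  7  8  9 10 11 12]
```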

MISD: MISD means multiple instruction streams and single data stream. Each processor executes a different sequence of instructions; in MISD computers, multiple processing units operate on one single data stream. In practice, this kind of organization has never been used. The following is the architecture of MISD:

MIMD: MIMD means multiple instruction streams and multiple data streams. Each processor has a separate program, an instruction stream is generated from each program, and each instruction operates on different data.

This last machine type builds the group of the traditional multiprocessors: several processing units operate on multiple data streams. The following is the architecture of MIMD:
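As a minimal illustration of the MIMD idea (the processes, functions and data below are hypothetical examples, not part of the text), two independent processes each run a different program on different data at the same time.

```python
# MIMD-flavoured sketch: two independent processes, each running a different
# program (instruction stream) on different data (data stream) at the same time.
from multiprocessing import Process, Queue

def count_words(text, out):            # instruction stream 1, data stream 1
    out.put(("words", len(text.split())))

def sum_numbers(numbers, out):         # instruction stream 2, data stream 2
    out.put(("sum", sum(numbers)))

if __name__ == "__main__":
    results = Queue()
    p1 = Process(target=count_words, args=("parallel computers are fun", results))
    p2 = Process(target=sum_numbers, args=([1, 2, 3, 4], results))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(results.get(), results.get())   # order of the two results may vary
```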

System attributes to performance:


The ideal performance of a computer system demands a perfect match between machine capability and program behavior. Machine capability can be enhanced with better hardware technology, innovative architectural features, and efficient resource management.


There are also many other factors affecting program behavior, including algorithm design, data structures, language efficiency, programmer skill and compiler technology. We introduce below the fundamental factors for projecting the performance of a computer. They can be used to guide system architects in designing better machines, or to educate programmers and compiler writers in optimizing code for more efficient execution by the hardware. Consider the execution of a given program on a given computer. The simplest measure of program performance is the turnaround time, which includes disk and memory accesses, input and output activities, compilation time, OS overhead, and CPU time. In order to reduce the turnaround time, we must reduce all these time factors.

Clock Rate and CPI: The CPU (or simply the processor) of today's digital computer is driven by a clock with a constant cycle time t (in nanoseconds). The inverse of the cycle time is the clock rate, denoted f:

f = 1/t (in megahertz)

The size of a program is determined by its instruction count (Ic), the number of instructions to be executed in the program. Different machine instructions may require different numbers of clock cycles to execute. Therefore, the cycles per instruction (CPI) becomes an important parameter for measuring the time needed to execute each instruction.

Performance Factors: Let Ic be the number of instructions in a given program, i.e. the instruction count. The CPU time (T, in seconds per program) needed to execute the program is estimated by finding the product of three contributing factors:

T = Ic × CPI × t

The execution of an instruction requires going through a cycle of events involving the instruction fetch, decode, operand(s) fetch, execution, and storing of the results. In this cycle, only the instruction decode and execution phases are carried out in the CPU; the remaining three operations may require access to the memory. We define a memory cycle as the time needed to complete one memory reference. Usually, a memory cycle is k times the processor cycle t; the value of k depends on the speed of the memory technology and the processor-memory interconnection scheme used. The CPI of an instruction type can be divided into two component terms corresponding to the total processor cycles and memory cycles needed to complete the execution of the instruction. Then, the above equation becomes

T = Ic × (p + m × k) × t

where p is the number of processor cycles needed for the instruction decode and execution, m is the number of memory references needed, k is the ratio between the memory cycle and the processor cycle, Ic is the instruction count, and t is the processor cycle time.


The above five performance factors (Ic, p, m, k, t) are influenced by four system attributes: instruction-set architecture, compiler technology, CPU implementation and control, and cache and memory hierarchy. The instruction-set architecture affects the program length (Ic) and the processor cycles needed (p). The compiler technology affects the values of Ic, p and m. The CPU implementation and control determine the total processor time (p·t). Finally, the memory technology and hierarchy design affect the memory access latency (k·t).

MIPS Rate: Let C be the total number of clock cycles needed to execute a given program. Then, the CPU time can be estimated as T = C × t = C / f. Furthermore, CPI = C / Ic and

T = Ic × CPI × t = Ic × CPI / f

The processor speed is often measured in terms of million instructions per second (MIPS); we simply call it the MIPS rate of a given processor. It follows that MIPS rate = Ic / (T × 10^6) = f / (CPI × 10^6).
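A small worked example of these formulas is given below; the instruction count, cycle counts and clock rate are assumed illustrative values, not figures taken from the text. The last line also anticipates the throughput rate Wp defined in the next paragraph.

```python
# Worked example of the performance model, using made-up illustrative numbers.
Ic = 200_000          # instruction count of the program (assumed)
p  = 2                # processor cycles per instruction (decode + execute)
m  = 1                # memory references per instruction, on average
k  = 4                # memory cycle = k * processor cycle
t  = 2e-9             # processor cycle time: 2 ns, so f = 500 MHz
f  = 1 / t

CPI  = p + m * k                  # effective cycles per instruction
T    = Ic * CPI * t               # CPU time: T = Ic * (p + m*k) * t
mips = f / (CPI * 1e6)            # MIPS rate = f / (CPI * 10^6)
Wp   = (mips * 1e6) / Ic          # throughput rate: programs per second

print(f"CPI = {CPI}, T = {T * 1e6:.0f} microseconds")
print(f"MIPS rate = {mips:.1f}, Wp = {Wp:.1f} programs/second")
```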

Throughput Rate: The number of programs a system can execute per unit time is called the system throughput rate Wp (in programs per second).

Further, Wp = (MIPS × 10^6) / Ic.

Multiprocessors:
Multiprocessor means more than one processor. In a multiprocessor environment the system utilizes more than one processor to improve performance. Early multiprocessor systems used multiple processors to improve throughput by executing independent jobs on different processors. Using multiprocessors, we can also reduce the execution time of an individual application by dividing a single program's work across multiple processors. By dividing a single program's work among multiple processors, multiprocessors can achieve greater performance than is possible with a single processor.

Multiprocessor Architecture:
Multiprocessors consist of a set of processors connected by a communication network, as shown below. Early multiprocessor systems often used processors that had been specifically designed for use in multiprocessors.


In recent years that has changed, and most current multiprocessors use the same processors that are used in uniprocessor systems, taking advantage of the large sales volumes of uniprocessors to lower prices.

Shared memory multiprocessor:


In this multiprocessor system there is one memory system for the entire multiprocessor, and the memory references from all of the processors go to that memory system.

The advantage of this is that all of the data in the memory is accessible to any processor, and there is never a problem with multiple copies of a given datum existing. However, the bandwidth of the memory does not grow as the number of processors in the machine increases, and the latency of the network is added to the latency of each memory reference. To avoid these problems, many shared-memory multiprocessor systems provide a local cache for each processor and only send requests that miss in the cache over the network to the main memory. Requests that hit in the cache are handled quickly and do not travel over the network, reducing the amount of data that the network must carry and allowing the main memory to support more processors. However, more than one cache may then hold a copy of a given memory location, creating the cache coherence problem.
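As a minimal sketch of the shared-memory model (not of any particular machine), the threads below stand in for processors that all read and update the same memory location; the lock illustrates why concurrent access to shared data has to be coordinated.

```python
# Minimal sketch of the shared-memory model: several threads (standing in for
# processors) read and update the same memory location. The lock coordinates
# the concurrent updates so that none of them are lost.
import threading

shared_counter = 0                     # one location visible to all "processors"
lock = threading.Lock()

def worker(increments):
    global shared_counter
    for _ in range(increments):
        with lock:                     # serialize access to the shared location
            shared_counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_counter)                  # 40000: all workers saw the same memory
```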

Distributed memory multiprocessor:


In this system, each processor has its own memory, which it can access directly. To obtain data that is stored in some other processor's memory, a processor must communicate with that processor to request the particular data.

This system offers the advantage that each processor has its own local memory system. This means that there is more total bandwidth in the memory system than in a centralized shared memory, and that the latency to complete a memory request is lower, because each processor's memory is located physically close to it. The disadvantage is that a processor can directly access only the data in its own memory; if it requires data held by another processor, it must communicate with that processor, which leads to coherence problems, and data concurrency problems may then arise.

Multi-Computer:
A multicomputer is a computer made up of several computers. The term generally refers to an architecture in which each processor has its own memory, rather than multiple processors with a shared memory, as in parallel computing.
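A minimal message-passing sketch of the distributed-memory/multicomputer model is shown below; the processes, the pipe and the stored value are illustrative assumptions. Each process owns its own data, and another process must send an explicit request to obtain a value it does not hold locally.

```python
# Message-passing sketch of the distributed-memory model: each process owns
# its own local memory, and a value held by another process can only be
# obtained by sending it an explicit request message.
from multiprocessing import Process, Pipe

def owner(conn):
    local_memory = {"x": 42}           # data that only this process holds
    key = conn.recv()                  # wait for a request from the other node
    conn.send(local_memory[key])       # reply with the requested value
    conn.close()

if __name__ == "__main__":
    my_end, other_end = Pipe()
    p = Process(target=owner, args=(other_end,))
    p.start()
    my_end.send("x")                   # request data owned by the other node
    print(my_end.recv())               # 42, received over the interconnect
    p.join()
```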

Network topologies used to connect the computers are:


• Point-to-point
• Bus
• Star
• Ring or circular
• Mesh
• Hybrid

Point-to-point: The simplest topology is a permanent link between two endpoints. Switched point-to-point topologies are the basic model of conventional telephony.

Bus: Each computer or server is connected to a single bus cable. A signal from the source travels in both directions to all machines connected on the bus cable until it finds the intended recipient. If the machine address does not match the intended address for the data, the machine ignores the data; alternatively, if the data matches the machine address, the data is accepted. Since the bus topology consists of only one wire, it is rather inexpensive to implement when compared to other topologies.

Star: In a star topology every node (computer workstation or any other peripheral) is connected to a central node called a hub or switch. The switch is the server and the peripherals are the clients. The network does not necessarily have to resemble a star to be classified as a star network, but all of the nodes on the network must be connected to one central device.

Ring or Circular:


A network topology that is set up in a circular fashion, in which data travels around the ring in one direction and each device on the ring acts as a repeater to keep the signal strong as it travels. Each device incorporates a receiver for the incoming signal and a transmitter to send the data on to the next device in the ring.

Mesh: The value of fully meshed networks is proportional to the exponent of the number of subscribers, assuming that communicating groups of any two endpoints, up to and including all the endpoints, is approximated by Reed's Law.
Fully connected

Fully connected mesh topology

The number of connections in a full mesh = n(n - 1) / 2.


Partially connected

Partially connected mesh topology


Partially connected mesh is the type of network topology in which some of the nodes of the network are connected to more than one other node with a point-to-point link; this makes it possible to exploit some of the redundancy provided by a physical fully connected mesh topology without the expense and complexity required for a connection between every node in the network.

Hybrid: Hybrid networks use a combination of any two or more topologies, in such a way that the resulting network does not exhibit one of the standard topologies (e.g., bus, star, ring, etc.). A hybrid topology is always produced when two different basic network topologies are connected. Two common examples of hybrid networks are the star-ring network and the star-bus network.
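As a small illustration, the hypothetical helper below compares how many links the basic topologies need to connect n machines; the full-mesh count uses the n(n - 1)/2 formula above, and the other counts follow from the topology descriptions (the star count assumes a separate central switch).

```python
# Hypothetical helper comparing how many links each basic topology needs to
# connect n machines. The full-mesh count is the n(n-1)/2 formula above.
def link_counts(n):
    return {
        "bus": 1,                      # one shared cable for all machines
        "star": n,                     # each machine wired to the central switch
        "ring": n,                     # each machine wired to the next, loop closed
        "full mesh": n * (n - 1) // 2, # every pair of machines directly linked
    }

print(link_counts(8))
# {'bus': 1, 'star': 8, 'ring': 8, 'full mesh': 28}
```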

A Taxonomy of MIMD computers:


Parallel computers appear as either SIMD or MIMD configurations. The architectural trend for future general-purpose computers is in favor of MIMD configurations with distributed memories having a globally shared virtual address space. Multicomputers use distributed memories with multiple address spaces; they are scalable with distributed memory. Centralized multicomputers are yet to appear. The evolution of fast LAN-connected workstations will create commodity supercomputing.

Multivector and SIMD computers: Here we introduce supercomputers and parallel processors for vector processing and data parallelism. We classify supercomputers either as pipelined vector machines using a few powerful processors equipped with vector hardware, or as SIMD computers emphasizing massive data parallelism.

Vector Supercomputers: A vector computer is often built on top of a scalar processor. As shown in the figure below, the vector processor is attached to the scalar processor as an optional feature. Program and data are first loaded into the main memory through a host computer. All instructions are first decoded by the scalar control unit. If the decoded instruction is a scalar operation or a scalar control operation, it will be directly executed by the scalar processor using the scalar functional pipelines.


If the instruction is decoded as a vector operation, it will be sent to the vector control unit. This control unit supervises the flow of vector data between the main memory and the vector functional pipelines; the vector data flow is coordinated by the control unit. A number of vector functional pipelines may be built into a vector processor. Two pipelined vector supercomputer models are described below.

Vector Processor Models:
• Register-to-register architecture
• Memory-to-memory architecture

The figure above shows a register-to-register architecture. Vector registers are used to hold the vector operands and the intermediate and final vector results. The vector functional pipelines retrieve operands from, and put results into, the vector registers. All vector registers are programmable in user instructions. Each vector register is equipped with a component counter which keeps track of the component registers used in successive pipeline cycles. The length of each vector register is usually fixed, say, sixty-four 64-bit component registers in a vector register in Cray Series supercomputers. Other machines, like the Fujitsu VP2000


series, use reconfigurable vector registers to dynamically match the register length with that of the vector operands. A memory-to-memory architecture differs from a register-to-register architecture in the use of a vector stream unit to replace the vector registers. Vector operands and results are directly retrieved from the main memory in superwords, say, of 512 bits.
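The fixed vector-register length suggests a simple way to picture register-to-register operation: a long vector is processed in strips that each fit one vector register. The sketch below is a hypothetical illustration of this strip-by-strip flow; the 64-element register length is borrowed from the Cray example above, and the code is not a model of any real machine.

```python
# Hypothetical strip-mining sketch: a long vector operation is executed in
# fixed-size strips matching an assumed 64-element vector register, mirroring
# how a register-to-register vector processor loads, operates on, and stores
# one register-full of components at a time.
VECTOR_REGISTER_LENGTH = 64          # assumed Cray-style 64-element register

def vector_add(a, b):
    result = []
    for start in range(0, len(a), VECTOR_REGISTER_LENGTH):
        end = start + VECTOR_REGISTER_LENGTH
        va = a[start:end]            # "load" one vector register of operands
        vb = b[start:end]
        result.extend(x + y for x, y in zip(va, vb))   # one pipelined vector op
    return result

a = list(range(200))
b = list(range(200, 400))
print(vector_add(a, b)[:4])          # [200, 202, 204, 206]
```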
