Session 1
Introduction
Applications of Parallel Processors
• Structural Analysis
• Weather Forecasting
• Petroleum Exploration
New Applications
• Fusion Energy Research
• Medical Diagnosis
• Aerodynamics Simulations
• Artificial Intelligence
• Expert Systems
• Industrial Automation
• Remote Sensing
• Military Applications
• Genetic Engineering
• Socioeconomics
• Encryption
• And Many Other Applications
Common Requirement: a high volume of processing and computation in a limited time (more performance)
Architecture
Application requirements and technological constraints together shape the ARCHITECTURE:
– Hardware Structure
– Software Structure
– Parallel Computing Algorithms
– Optimal Allocations of Resources
Definition
Parallel processing provides a cost-effective
way to achieve high system performance
through concurrent activities
It is a method of organizing the operations
of a computing system so that more than
one operation is performed simultaneously
Scalability
• Scalability is a major objective in the
design of advanced parallel computers
• Scalability means: a proportional increase
in performance with increasing system
resources
• System resources include: Processors,
Memory Capacity, I/O Bandwidth
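The "proportional increase in performance" can be made concrete with the usual speedup and efficiency ratios. The sketch below is illustrative only (the timings are made-up numbers, not measurements from the course):

```python
# Illustrative sketch: speedup and efficiency of a parallel system.
# T1 is the time on one processor, Tp the time on p processors.
def speedup(t1, tp):
    return t1 / tp

def efficiency(t1, tp, p):
    # Ideal (linear) scalability keeps efficiency near 1.0 as p grows.
    return speedup(t1, tp) / p

# A perfectly scalable run: doubling the processors halves the time.
t1 = 100.0
for p, tp in [(2, 50.0), (4, 25.0), (8, 12.5)]:
    print(p, speedup(t1, tp), efficiency(t1, tp, p))
```

In practice communication and synchronization overheads make efficiency drop below 1.0 as resources are added; a scalable design is one that keeps that drop small.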
Course Description
• Introductory Graduate Course
• For Computer Group
• Prerequisite: Computer Organization and Programming Concepts
• Course Work:
– Case Study (Due: Week 8)
– Project (Due: Week 16)
– Possible Homework Assignments
– A Final Exam
• References:
– Advanced Computer Architecture, Parallelism, Scalability, Programmability (Kai
Hwang)
– Scalable Parallel Computing Technology, Architecture, Programming (Kai Hwang
and Zhiwei Xu)
– Parallel Computer Architecture: A Hardware/Software Approach (David Culler,
J. P. Singh, Anoop Gupta)
– Introduction to Parallel Computing (Ted G. Lewis, Hesham El-Rewini)
Course Outline
• Introduction, History, Applications, Classification
• Principles of Parallel Processing and Basic
Concepts
• Parallel Computer Models and Structures
• Programming Requirements
• Interconnection Networks
• Performance and Scalability
• Parallel and Scalable Architectures
• Parallel Programming and Models
History
• Mechanical Computers before 1945
• Five generations of electronic computers
1. (1945-54), Vacuum Tubes, Relay Memories, Fixed-Point Arithmetic, Machine Language
(The age of Dinosaurs!)
2. (1955-64), Discrete Transistors, Multiplexed Memory Access, Floating-Point Arithmetic,
High Level Languages and Compilers, batch processing
3. (1965-74), Integrated Circuits, Microprogramming, Pipelining, Cache, Multiprogramming
and Time-Sharing OS, Multiuser applications
4. (1975-90), LSI/VLSI, Semiconductor Memory, Multiprocessors, Vector Supercomputers,
Multicomputers, Multiprocessor OS, Languages, Compilers, Environment for Parallel
Processing
5. ULSI/VHSIC processors, memories, and switches, High-Density Packaging, Scalable
Architectures, Massively Parallel Processing, Teraflops (10^12 floating-point operations per
second)
• Introduction of Concurrency:
• In the early von Neumann model every instruction is processed serially: the instruction is
fetched, its operands are fetched, the operation is executed, and the results are stored
• The prefetch operation introduced some degree of concurrency
• Extra ALUs allowed multiple execution units within the CPU capable of operating in parallel
• Pipelined operation was introduced in the third generation
• More CPUs were added to computers to be able to perform instructions in parallel and
independently
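The cycle-count benefit of pipelining can be sketched with the textbook idealization: a k-stage pipeline completes n instructions in k + n − 1 cycles instead of n·k. The numbers below are illustrative, not from the slides:

```python
# Sketch: cycle counts with and without a k-stage instruction pipeline
# (idealized textbook model; ignores stalls, hazards, and branches).
def unpipelined_cycles(n, k):
    return n * k          # each instruction occupies all k stages in turn

def pipelined_cycles(n, k):
    return k + (n - 1)    # fill the pipe once, then one result per cycle

n, k = 100, 4
print(unpipelined_cycles(n, k))  # 400
print(pipelined_cycles(n, k))    # 103
```

For large n the speedup approaches k, the pipeline depth, which is why pipelining became a standard third-generation technique.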
Evolution
• From a different perspective the evolution
of computers has gone through three
waves:
– First wave: Mainframes
– Second wave: Minicomputers, high-performance
supercomputers
– Third wave: Personal Computers, Networked
computers
• Parallel computers are the next wave
Levels of Parallelism
• Concurrency is achieved in different levels:
– Job or Program Level: Multiple jobs or programs are processed concurrently
through multiprogramming, timesharing and multiprocessing
• Requires the development of parallel processable algorithms
• Efficient allocation of limited hardware and software resources to multiple programs
– Task or Procedure Level: Multiple procedures or tasks (program segments)
within the same program are executed in parallel
• Requires the decomposition of the program into multiple tasks
– Interinstruction Level: Multiple instructions are executed concurrently
• Requires data dependency analysis
– Intrainstruction Level: Faster and concurrent operations are executed within each
instruction
• Software involvement is highest at the first level and lowest at the
last level
• Hardware involvement increases as hardware speed rises and its cost falls,
while software is getting more expensive
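The task/procedure level above can be illustrated with a small sketch: a program is decomposed into two independent tasks (program segments) that execute concurrently. The decomposition and the thread pool below are illustrative choices, not from the slides:

```python
# Sketch of task-level parallelism: one program decomposed into two
# independent tasks that run concurrently on separate workers.
from concurrent.futures import ThreadPoolExecutor

def sum_part(data):
    # One task: a program segment that can run independently.
    return sum(data)

data = list(range(1000))
with ThreadPoolExecutor(max_workers=2) as pool:
    # Decompose the program into two tasks over halves of the data.
    f1 = pool.submit(sum_part, data[:500])
    f2 = pool.submit(sum_part, data[500:])
    total = f1.result() + f2.result()
print(total)  # 499500
```

Note that the decomposition is only safe because the two tasks share no data dependency, which is exactly the analysis the interinstruction level requires at a finer grain.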
Alternatives
• Parallelism in Uniprocessor Systems
• Multiprocessor Systems
• Distributed Computers
– Cluster (Networked) Computers
– Web Computing
• Parallel Computers with Centralized
Computing Facilities
Elements of Modern Computers
• Computer Architecture:
– Not only structure of the hardware
– But also:
• Instruction Set
• System Software
• Application Programs
• User Interfaces
• Depending on the nature of the problem,
the solution may require different
computing resources, for example:
– Numerical problems require
mathematical formulations and integer
and floating-point operations (Numerical
Computing)
– Alphanumerical problems require
database management and information
retrieval operations (Transaction
Processing)
– Artificial Intelligence requires logic
inference and symbolic manipulation
(Logical Reasoning)
• Respectively, the algorithms and data
structures will be different
• The mapping of the system resources onto
the appropriate algorithms used for specific
computing problems is an objective of
parallel computer design
• Mapping includes:
– Processor Scheduling
– Memory Maps
– Interprocessor Communications
– …
[Diagram: Computing Problem → Algorithms and Data Structures → Mapping → Programming → Binding (Compile, Load) → Applications Software in High-Level Languages, supported by the Operating System and the Hardware Architecture, with Performance Evaluation closing the loop]
Elements of Modern Computers
• The coordinated effort of hardware
resources, the operating system, and
application software determines the
power of a modern computer
system
• The operating system manages the
allocation and deallocation of
resources during the execution of
user programs
• The mapping of algorithmic and
data structures onto the machine
architecture relies on efficient
compiler and operating system
support
• Parallelism can be exploited at:
– Algorithm design
– Programming
– Compilation
– Run time
• Techniques for exploiting
parallelism at the above levels form
the core of parallel processing
technology
• Standard benchmark programs are
needed for performance evaluation
[Diagram repeated: the system architecture links the Computing Problem, Algorithms and Data Structures, Operating System, Hardware Architecture (processors, memory, I/O and peripheral devices), Binding (Compile, Load), High-Level Languages, Applications Software, and Performance Evaluation]
Classification of Parallel Computers
• Flynn Classification:
– Single Instruction Single Data stream
system (SISD)
– Single Instruction Multiple Data stream
system (SIMD)
– Multiple Instruction Single Data stream
system (MISD)
– Multiple Instruction Multiple Data stream
system (MIMD)
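The contrast between the classes can be sketched on one toy computation. The Python below is only an analogy (real SIMD executes in hardware lockstep; real MIMD runs on separate processors), with illustrative data and operations:

```python
# Sketch contrasting Flynn's categories on one toy computation.
from concurrent.futures import ThreadPoolExecutor

data = [1, 2, 3, 4]

# SISD: one instruction stream, one data stream -- a sequential loop.
sisd = []
for x in data:
    sisd.append(x * 2)

# SIMD: one instruction ("multiply by 2") applied across many data
# elements; map models the lockstep broadcast to all processing elements.
simd = list(map(lambda x: x * 2, data))

# MIMD: multiple instruction streams on multiple data streams --
# here, two different operations on two different data partitions.
with ThreadPoolExecutor(max_workers=2) as pool:
    doubled = pool.submit(lambda: [x * 2 for x in data[:2]])
    squared = pool.submit(lambda: [x * x for x in data[2:]])
    mimd = doubled.result() + squared.result()

print(sisd, simd, mimd)  # [2, 4, 6, 8] [2, 4, 6, 8] [2, 4, 9, 16]
```

MISD has no analogous everyday idiom; it is usually identified with systolic arrays, where several instruction streams process the same data stream in sequence.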
Example
[Diagrams: block structures of the Flynn classes.
– SIMD: a control unit (CU) receives the instruction stream (IS) from the host and broadcasts it to n processing elements (PE 1 … PE n), each with its own local memory (LM) and data stream (DS)
– MIMD: n control units (CU 1 … CU n) each drive a processing unit (PU 1 … PU n) over separate instruction and data streams, with a shared memory (program and data) and I/O
– Shared-memory multiprocessor: processors P1 … Pn connect to memories M1 … Mi and I/O units IO 1 … IO j through communication networks]
A related taxonomy divides the architectures into synchronous (pipelining, SIMD vector/array, MISD systolic array) and asynchronous (MIMD) classes.