
Multicore Processors

Raul Queiroz Feitosa

Parts of these slides are from the support material provided by W. Stallings

Objective

“This chapter provides an overview of multicore systems.”

Stallings

Multicore Computers 2

Outline
 Hardware Performance Issues
 Software Performance Issues
 Multicore Organizations
 Intel Core Architecture

Hardware performance issues
Chip Density
Microprocessor performance has increased due to
a) Improved organization, e.g.,
 Pipelining
 Superscalar
 Multithreading
 …
b) Increased clock frequency

both made possible by 1. increasing chip density!

By 2015 → 100 billion transistors on a 300 mm² die.
Hardware performance issues
Relative Performance

Hardware performance issues
Relative Performance/cycle

Pollack’s rule: performance is roughly proportional to the square root of the increase in complexity
 Double complexity gives 40% more performance

2. Diminishing gains with complexity increase!
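
The 40% figure follows directly from the square root in Pollack's rule. A minimal sketch of the arithmetic is given here; the function name `pollack_relative_performance` is a hypothetical helper chosen for illustration, and the rule itself is only a rough empirical estimate.

```python
import math


def pollack_relative_performance(complexity_ratio: float) -> float:
    """Relative performance predicted by Pollack's rule.

    complexity_ratio is the factor by which complexity grows
    (2.0 means twice the complexity).
    """
    if complexity_ratio <= 0:
        raise ValueError("complexity_ratio must be positive")
    return math.sqrt(complexity_ratio)


# Doubling complexity yields roughly 1.41x the performance (about 40% more);
# quadrupling complexity is needed to double performance.
for ratio in (1, 2, 4):
    gain = pollack_relative_performance(ratio)
    print(f"complexity x{ratio}: performance x{gain:.2f}")
```

Under the same rule, spending the doubled transistor budget on a second core of the original complexity could ideally double throughput for parallel work, which is the motivation for multicore.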


Hardware performance issues
Power

3. Power requirements grow exponentially with chip density and clock frequency!
Hardware performance issues
Increased Complexity

4. Memory transistors have a power density one order of magnitude lower than that of logic.

Outline
 Hardware Performance Issues
 Software Performance Issues
 Multicore Organizations
 Intel Core Architecture

Software Performance Issues
Small amounts of serial code impact performance
 According to Amdahl’s law

$$\text{Speedup} = \frac{\text{time to execute program on a single processor}}{\text{time to execute program on } N \text{ parallel processors}} = \frac{1}{(1 - f) + \dfrac{f}{N}}$$

where $f$ is the fraction of code infinitely parallelizable with no scheduling overhead.
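
To see how quickly the serial fraction dominates, Amdahl's law can be evaluated for a few processor counts. This is a minimal sketch; the function name `amdahl_speedup` and the choice of f = 0.9 are assumptions made for illustration only.

```python
def amdahl_speedup(f: float, n: int) -> float:
    """Speedup on n processors when a fraction f of the code is
    perfectly parallelizable and the remaining (1 - f) is serial."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - f) + f / n)


# With 10% serial code the speedup can never reach 10,
# no matter how many processors are added.
for n in (2, 8, 64, 1024):
    print(f"N = {n:4d}: speedup = {amdahl_speedup(0.9, n):.2f}")
```

The upper bound as N grows is 1 / (1 - f), so even a small serial fraction caps the benefit of adding cores.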

Software Performance Issues
Small amounts of serial code impact performance, and communication, distribution of work and cache coherence overheads reduce it further.

[Figure: curves for different percentages of sequential code]

Software Performance Issues
More recently, software engineers have developed applications that
effectively exploit multiprocessor architectures, e.g., database
applications.

5. New applications exploit multiprocessor architecture!



Outline
 Hardware Performance Issues
 Software Performance Issues
 Multicore Organization
 Intel Core Architecture

Multicore Organization
In view of:
1. Increasing chip density.
2. Diminishing gains with complexity increase.
3. Power requirements that grow exponentially with chip density
and clock frequency.
4. Memory transistors having a power density an order of
magnitude lower than that of logic.
5. Applications that exploit multiprocessor architecture.

What to do with the extra transistors made available by the semiconductor industry?

Multicore Organization
What to do with extra transistors made
available by the semiconductor industry?
 Reduce complexity, so that multiple complete processors
fit on a single chip
 Reduce clock frequency and increase the proportion of
chip occupied by cache to reduce power requirements

Multicore Organization

Main variables in a multicore organization:
 Number of processor cores on chip
 Number of levels of cache on chip
 Amount of shared cache

Multicore Organization Alternatives

[Figure: four example organizations: ARM11 MPCore, AMD Opteron, Intel Core Duo, Intel Core i7]

Private vs. shared L2 Cache
Advantages of shared L2 Cache
 Constructive interference reduces overall miss rate
 Data shared by multiple cores not replicated at cache level
 With proper frame replacement algorithms, the amount of shared
cache dedicated to each core is dynamic
 Threads with less locality can have more cache
 Easy inter-process communication through shared memory
 Cache coherency confined to L1

Advantages of private L2 Cache


 Dedicated L2 cache gives each core more rapid access

Shared L3 cache may also improve performance



Outline
 Hardware Performance Issues
 Software Performance Issues
 Multicore Organization
 Intel Core Architecture

Intel Core Architecture
Intel Core Duo uses superscalar cores
Intel Core i7 uses simultaneous multithreading (SMT)
 Scales up number of threads supported
 4 SMT cores, each supporting 4 threads, appear as 16 logical cores.

Intel x86 Core Duo Organization

Intel x86 Core Duo Organization
Introduced in 2006
Two x86 superscalar cores, shared L2 cache
Dedicated L1 cache per core implementing the MESI protocol
Protocol extended to accommodate multiple chips (SMP)
Thermal control unit per core
 Manages chip heat dissipation
 Maximizes performance within thermal constraints
 If the temperature of a core exceeds a threshold, its clock rate is reduced.
Advanced Programmable Interrupt Controller (APIC)
 Inter-processor interrupts between cores
 Routes I/O interrupts to the appropriate core.
 Each APIC includes a timer, so that the OS can interrupt the local core.
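
The MESI protocol mentioned for the L1 caches can be illustrated with a toy model of one memory line held in the private caches of two cores. This is a simplified textbook sketch, not Intel's implementation: it tracks only the four states (Modified, Exclusive, Shared, Invalid), ignores write-back traffic and timing, and all names (`State`, `Line`, `read`, `write`, `states`) are hypothetical.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class Line:
    """State of one memory line in one core's private L1 cache."""

    def __init__(self) -> None:
        self.state = State.INVALID


def read(caches: list, core: int) -> None:
    """Core `core` reads the line."""
    me = caches[core]
    if me.state is not State.INVALID:
        return  # read hit: state unchanged
    holders = [c for i, c in enumerate(caches)
               if i != core and c.state is not State.INVALID]
    for c in holders:
        # A Modified copy would be written back first (not modelled);
        # every other holder ends up Shared.
        c.state = State.SHARED
    me.state = State.SHARED if holders else State.EXCLUSIVE


def write(caches: list, core: int) -> None:
    """Core `core` writes the line: all other copies are invalidated."""
    for i, c in enumerate(caches):
        if i != core:
            c.state = State.INVALID
    caches[core].state = State.MODIFIED


def states(caches: list) -> str:
    return "".join(c.state.value for c in caches)


caches = [Line(), Line()]
read(caches, 0)
print(states(caches))   # EI: core 0 holds the only copy
read(caches, 1)
print(states(caches))   # SS: both cores share the line
write(caches, 0)
print(states(caches))   # MI: core 1's copy is invalidated
read(caches, 1)
print(states(caches))   # SS: core 0 supplies the data, both share
```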

Intel x86 Core Duo Organization
Power Management Logic
 Monitors thermal conditions and CPU activity
 Adjusts voltage and power consumption
 Can switch individual logic subsystems on and off
2 MB shared L2 cache
 Dynamic allocation
 MESI support for L1 caches
 Extended to support multiple Core Duo chips in SMP
 L2 data shared between local cores or external
Bus interface
Intel Core i7 Organization

[Figure: Core 1 … Core n, each with 32 KB I&D L1 caches and a 256 KB L2 cache; shared L3 cache of up to 15 MB; DDR3 memory controllers and QuickPath Interconnect]

Intel Core i7 Organization
Introduced in November 2008, 2nd generation in January 2011
Up to six x86 SMT processors
Dedicated L2, shared L3 cache
Speculative pre-fetch for caches
On-chip DDR3 memory controller
 Three 8-byte channels (192 bits) giving 32 GB/s
 No front-side bus
Turbo Boost
 Clock frequency is incrementally adjusted on demand.
Hardware Virtualization
 A facility that allows multiple operating systems to simultaneously share processor
resources in a safe and efficient manner
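
The memory controller figures above can be cross-checked with simple arithmetic. The sketch below assumes 1 GB = 10^9 bytes and that every channel moves its full 8-byte width on each transfer; the per-channel transfer rate is not stated on the slide and is only derived here from the other two numbers.

```python
CHANNELS = 3
BYTES_PER_CHANNEL = 8
AGGREGATE_BYTES_PER_S = 32e9  # 32 GB/s, taken from the slide

# Total bus width: 3 channels x 8 bytes x 8 bits = 192 bits
bus_width_bits = CHANNELS * BYTES_PER_CHANNEL * 8

# Transfer rate per channel implied by the aggregate bandwidth
implied_transfers_per_s = AGGREGATE_BYTES_PER_S / (CHANNELS * BYTES_PER_CHANNEL)

print(f"bus width: {bus_width_bits} bits")
print(f"implied rate: {implied_transfers_per_s / 1e9:.2f} GT/s per channel")
```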

Intel Core i7 Organization

QuickPath Interconnect (QPI)
 Cache-coherent point-to-point link
 High-speed communications between processor chips
 6.4 GT/s (giga-transfers per second), 16 bits per transfer
 Dedicated bi-directional pairs
 Total bandwidth 25.6 GB/s
 Intel QPI animated demo
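
The total bandwidth quoted for the QuickPath link follows from the transfer rate and transfer width, counting both directions of the bi-directional pair. A minimal sketch of that arithmetic, assuming 1 GB = 10^9 bytes (the constant names are illustrative):

```python
TRANSFERS_PER_S = 6.4e9   # 6.4 giga-transfers per second
BITS_PER_TRANSFER = 16
DIRECTIONS = 2            # dedicated pair: one link in each direction

# 16 bits = 2 bytes per transfer, so 12.8 GB/s in each direction
bytes_per_s_per_direction = TRANSFERS_PER_S * BITS_PER_TRANSFER / 8
total_bytes_per_s = bytes_per_s_per_direction * DIRECTIONS

print(f"per direction: {bytes_per_s_per_direction / 1e9:.1f} GB/s")
print(f"total:         {total_bytes_per_s / 1e9:.1f} GB/s")
```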

Intel Multicore Processors
Cache Latency

Compare Intel Core Processors


General Information about Intel Processors

Text Book References

These topics are covered in
 Stallings - chapter 18

Multicore Processors

END