Introduction to Computing System Design
Dr. Pradeep C.
Principal, Mar Baselios Christian College of Engineering and Technology, Kuttikkanam

Digital Systems
DIGITAL CIRCUITS
The Concept of a Computer
[Figure: a user interacting with a layered system. Application software (programs the user writes and runs) sits on top of systems software (operating system, compiler, assembler), which sits on top of the hardware.]
Software
Application software, MIPS compiler output, MIPS binary machine code:

A program in C:
    swap (int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }

Compiler output, an assembly language program:
    swap: muli $2, $5, 4
          add  $2, $4, $2
          lw   $15, 0($2)
          lw   $16, 4($2)
          sw   $16, 0($2)
          sw   $15, 4($2)
          jr   $31

Assembler output, machine instructions (32-bit words stored in memory):
    00000000101000010000000000011000
    00000000000110000001100000100001
    10001100011000100000000000000000
    10001100111100100000000000000100
    10101100111100100000000000000000
    10101100011000100000000000000100
    00000011111000000000000000001000
Binary Machine Code
    00000000101000010000000000011000
    00000000000110000001100000100001
    10001100011000100000000000000000
    10001100111100100000000000000100
    10101100111100100000000000000000
    10101100011000100000000000000100
    00000011111000000000000000001000
[Figure: each 32-bit word is made up of an instruction code (opcode) and encoded data.]
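To make the split between instruction code and encoded data concrete, the sketch below decodes one 32-bit word using the standard MIPS field layout: a 6-bit opcode in the top bits, then register and immediate fields. The function name decode_mips_word is illustrative and not from the slides. Words with opcode 0 are treated as R-format, everything else as I-format; J-format words are ignored for brevity.

```python
def decode_mips_word(bits):
    """Split one 32-bit MIPS instruction word (a string of 0s and 1s) into fields."""
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of 32 binary digits")
    word = int(bits, 2)
    opcode = (word >> 26) & 0x3F      # top 6 bits: instruction code
    rs = (word >> 21) & 0x1F          # source register
    rt = (word >> 16) & 0x1F          # second source / destination register
    if opcode == 0:
        # R-format: the operation is selected by the low 6 "funct" bits
        return {
            "format": "R", "opcode": opcode, "rs": rs, "rt": rt,
            "rd": (word >> 11) & 0x1F,
            "shamt": (word >> 6) & 0x1F,
            "funct": word & 0x3F,
        }
    # Assumption: every non-zero opcode is handled as I-format (16-bit immediate)
    return {"format": "I", "opcode": opcode, "rs": rs, "rt": rt, "imm": word & 0xFFFF}


first_word = decode_mips_word("00000000101000010000000000011000")
load_word = decode_mips_word("10001100011000100000000000000000")
last_word = decode_mips_word("00000011111000000000000000001000")
```

Here load_word decodes to opcode 35 (lw) with base register 3, target register 2, and offset 0, and last_word decodes to funct 8 (jr) on register 31.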
The Hardware of a Computer
[Figure: the FIVE PIECES of computer hardware: input, output, memory, and the control and datapath that together form the Central Processing Unit (CPU), or "processor".]
Hardware Processes Machine Code
• The user program is translated into binary machine code by the compiler and assembler and is stored in memory.
• The control unit reads the program from memory, one word at a time (fetch operation).
• The control unit deciphers the instruction bits of each program word and configures the datapath logic, which processes data and saves results in memory (decode and execute operations).
Digital Hardware of Computer
[Figure: control Finite State Machine (FSM); datapath (arithmetic logic and registers); memory; input/output bus.]
George Boole, 1815-1864
• Born in Lincoln, England
• Professor of Mathematics, Queen's College, Cork, Ireland
• Book: The Laws of Thought, 1854
• Wife: Mary Everest Boole
Claude E. Shannon (1916-2001)
• A Symbolic Analysis of Relay and Switching Circuits, Master's Thesis, MIT, 1940. Perhaps the most influential master's thesis of the 20th century.
• An Algebra for Theoretical Genetics, PhD Thesis, MIT, 1940.
• Founded the field of Information Theory.
• C. E. Shannon and W. Weaver, The Mathematical Theory of Communication, University of Illinois Press, 1949. A "must read."
Transistor Inventors
William Shockley (seated), John Bardeen, Walter Brattain
Jan 23, 1948: first junction transistor
Nobel Prize in Physics, 1956
Integrated Circuit (1958)
Jack Kilby (1923-2005), Nobel Prize, 2000
Moore’s Law
Comparison of Today's Computing Systems
System       Flexibility               Area overhead
Processors   Instruction flexibility   90% (cache, predictions)
FPGA         Device-wide flexibility   99% (configuration)
ASIC         No flexibility            20% (testing)
[Figure axes: flexibility versus speed and power efficiency. Open question: reliability?]

Back to Basics…
• What does the word "Computer" mean?
   • For some, it is a box sitting under the desk at home.
   • For others, it is a device used to check email.
   • The 5-stage pipeline processor.
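Computing Final Grade
grade = 0.1 × mt1 + 0.2 × mt2 + 0.3 × hw + 0.4 × proj
[Figure: computing in SPACE. Four multipliers and three adders are laid out side by side and wired together, so the expression is evaluated in one pass through the hardware.]

2 Ways to Compute
[Figure: computing in TIME. A processor reuses one multiplier (×) and one adder (+), holding intermediate results in tmp over clock cycle 1, clock cycle 2, clock cycle 3, ..., clock cycle n, after which grade is available. An Application Specific Integrated Circuit (ASIC) instead provides a separate × or + unit for each operation.]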
Processor vs ASIC
Processor (Temporal Computing)         ASIC (Spatial Computing)
• Takes longer to compute: slow        • Takes a shorter time to compute: fast
• Flexible                             • Not flexible
• Needs instructions to determine      • No instructions: same calculation
  what to do on each cycle               every cycle
• Space is bounded                     • Space is unbounded
Open question: branches?
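The contrast between the two styles can be sketched with the grade formula from the earlier slide. This is only an illustration under stated assumptions: the function names are made up, and the temporal version assumes one multiply or one add per clock cycle on shared units, which is a simplification of a real processor.

```python
WEIGHTS = (0.1, 0.2, 0.3, 0.4)  # weights for mt1, mt2, hw, proj


def grade_spatial(mt1, mt2, hw, proj):
    """Spatial style: four multipliers side by side, then a tree of three adders."""
    products = [w * s for w, s in zip(WEIGHTS, (mt1, mt2, hw, proj))]
    return (products[0] + products[1]) + (products[2] + products[3])


def grade_temporal(mt1, mt2, hw, proj):
    """Temporal style: one shared multiplier and adder, one operation per cycle.

    Returns (grade, cycles). The cycle count is an assumption of this sketch.
    """
    grade = 0.0
    cycles = 0
    for weight, score in zip(WEIGHTS, (mt1, mt2, hw, proj)):
        tmp = weight * score   # cycle spent on the shared multiplier
        cycles += 1
        grade = grade + tmp    # cycle spent on the shared adder
        cycles += 1
    return grade, cycles


spatial_grade = grade_spatial(80, 90, 70, 100)
temporal_grade, temporal_cycles = grade_temporal(80, 90, 70, 100)
```

Both functions compute the same grade; the temporal one needs a sequence of steps (eight in this model), while the spatial one spends hardware on every operator instead.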

Visualizing Spatial Computing
[Figure: die photos with the area doing actual computation highlighted.]
              AMD Opteron 64-bit processor    Full custom ASIC
Contents      1 MB L2 cache                   4x4 SVD decomposition
Die area      193 mm²                         3.5 mm²
Process       0.18 micron CMOS                90 nm CMOS
Power         89 W @ 1.8 GHz                  34 mW @ 100 MHz clock
Throughput    ~3 Op/cycle (int op)            70 GOPS = 700 Op/cycle
Between Temporal & Spatial Computing
[Figure: a spectrum with the single processor (temporal: slow, flexible) at one end and the ASIC (spatial: fast, inflexible) at the other. What lies in between? Reconfigurable computing.]
Reconfigurable Computing
• No standard definition
• "Computing via a post-fabrication and spatially programmed connection of processing elements." - John Wawrzynek, Sp04
• A computer that can RE-configure itself to perform computation spatially as needed
• How often do we RE-configure?
• Coarse-grain? Fine-grain?
• Example: FPGA
Introduction to FPGA
• Field Programmable Gate Array
• Began as ASIC replacements
   • An ASIC that can be configured "in the field"
   • At power-up, the configuration is loaded onto the chip
   • The chip acts as an ASIC until power-down
• Modern FPGAs are more like computers
   • Exploit dynamic, partial reconfiguration
   • Embedded processors
• Xilinx and Altera are the two major market leaders
FPGA Principles
• A Field-Programmable Gate Array (FPGA) is an integrated circuit that can be configured by the user to emulate any digital circuit as long as there are enough resources
• An FPGA can be seen as an array of Configurable Logic Blocks (CLBs) connected through programmable interconnect (Switch Boxes)
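FPGA structure
    CLB  SB  CLB
    SB   SB  SB
    CLB  SB  CLB
[Figure: Configurable Logic Blocks (CLBs) joined by an interconnection network of switch boxes (SBs), with I/O signals (pins) at the edge.]

Simplified CLB Structure
[Figure: a Look-Up Table (LUT), a D flip-flop (D, SET, CLR, Q), and a MUX that drives the CLB output, shown inside the same CLB/SB array with its I/O signals (pins).]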

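Example: 4-input AND gate
Inputs A, B, C, D; output O.

A B C D | O
0 0 0 0 | 0
0 0 0 1 | 0
0 0 1 0 | 0
0 0 1 1 | 0
0 1 0 0 | 0
0 1 0 1 | 0
0 1 1 0 | 0
0 1 1 1 | 0
1 0 0 0 | 0
1 0 0 1 | 0
1 0 1 0 | 0
1 0 1 1 | 0
1 1 0 0 | 0
1 1 0 1 | 0
1 1 1 0 | 0
1 1 1 1 | 1

[Figure: the CLB configured as the AND gate. The LUT configuration bits are the O column of the truth table (fifteen 0s followed by a single 1); a further configuration bit sets the MUX.]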
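The truth table above maps directly onto a look-up table: the inputs form an address, and the stored configuration bit at that address is the output. The class below is a minimal software model of that idea, not a description of any vendor's CLB. The name LUT and the convention that the first input is the most significant address bit are assumptions of this sketch.

```python
class LUT:
    """Software model of a look-up table holding 2**n configuration bits."""

    def __init__(self, config_bits):
        self.bits = list(config_bits)
        n = len(self.bits)
        if n < 2 or n & (n - 1):
            raise ValueError("number of configuration bits must be a power of two")
        self.num_inputs = n.bit_length() - 1

    def read(self, *inputs):
        """Return the configuration bit addressed by the input values."""
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        index = 0
        for value in inputs:          # first input is the most significant bit
            index = (index << 1) | (value & 1)
        return self.bits[index]


# 4-input AND gate: fifteen 0s followed by a single 1
and4 = LUT([0] * 15 + [1])
# The same hardware becomes a 2-input XOR with different configuration bits
xor2 = LUT([0, 1, 1, 0])
```

Changing the function of the block means changing only the stored bits, which is the sense in which the logic is "configurable".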
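Example 2: Find the configuration bits for the following circuit
[Figure: a 2-to-1 MUX with data inputs A0 and A1 and select input S drives the D input of a clocked D flip-flop. The circuit is to be mapped onto one CLB (LUT, flip-flop, and output MUX).]

Truth table to complete (the output column gives the LUT configuration bits):
A0 A1 S | LUT output
0  0  0 |
0  0  1 |
0  1  0 |
0  1  1 |
1  0  0 |
1  0  1 |
1  1  0 |
1  1  1 |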
[Figure: a switch box (SB) in the CLB/SB array and its six configuration bits.]
Example 3
• Determine the configuration bits for the following circuit implementation in a 2x2 FPGA, with I/O constraints as shown in the following figure. Assume 2-input LUTs in each CLB.
[Figure: the circuit has inputs Input1, Input2, and Input3, one D flip-flop, and one output, Output. The 2x2 FPGA consists of CLB0 to CLB3 and switch boxes SB0 to SB4, with Input1, Input2, Input3, and Output tied to fixed pins.]

CLBs required
[Figure: the circuit is split across two CLBs. CLB 1 takes Input1 and Input2 and produces an intermediate signal O (configuration bits 0 0 0 1 1). CLB 2 takes O and Input3 and produces Output (configuration bits 0 1 1 0 0).]
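Placement: Select CLBs
[Figure: the two required CLBs are assigned to CLBs of the 2x2 array, given the pin positions of Input1, Input2, and Input3.]

Routing: Select path
[Figure: paths through the switch boxes are chosen to connect the pins and the placed CLBs; the configuration bits of SB1 and SB4 are set accordingly.]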
Configuration Bitstream
• The configuration bitstream must include ALL CLBs and SBs, even unused ones
   • CLB0: 00011
   • CLB1: 01100
   • CLB2: XXXXX
   • CLB3: ?????
   • SB0: 000000
   • SB1: 000010
   • SB2: 000000
   • SB3: 000000
   • SB4: 000001
Realistic FPGA CLB: Xilinx
Die Photo of an FPGA
Entire Chip is for Computation
Spartan-3, 90 nm CMOS
Static Reconfiguration
Traditional FPGA architectures are primarily statically programmed devices, allowing only one configuration to be loaded at a time.
• Compile-time reconfiguration
• One configuration per application
• System must be halted and then restarted with the new program
• Most common approach
Dynamic Reconfiguration
Whereas static reconfiguration allocates logic for the duration of an application, dynamic reconfiguration (often referred to as run-time reconfiguration) uses a dynamic allocation scheme that re-allocates hardware at run time (i.e. during execution of the application).
• The physical hardware is smaller than the sum of the required resources.
• With dynamic reconfiguration we can swap configurations in and out of the actual hardware as they are needed.
Partial Reconfiguration
• Addresses are used to specify the target location of
the configuration data
• Allows reconfiguration of only a part of a device
while the rest of the device executes.
• Reduces the amount of data that must be transferred
to the FPGA
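The addressing idea can be sketched as follows. This is a toy model, not a real FPGA configuration format: the configuration memory is a Python list of frames, a partial bitstream is a list of (address, frame) pairs, and the function name and frame labels are invented for illustration.

```python
def apply_partial_bitstream(config_memory, partial_bitstream):
    """Overwrite only the addressed frames; return how many frames were written."""
    # Validate every address first so a bad bitstream changes nothing
    for address, _frame in partial_bitstream:
        if not 0 <= address < len(config_memory):
            raise IndexError("frame address outside the device")
    for address, frame in partial_bitstream:
        config_memory[address] = frame
    return len(partial_bitstream)


# Hypothetical device: 4 frames of static logic, 4 frames holding module A
device = ["static"] * 4 + ["module_A"] * 4
# Partial bitstream that swaps two of module A's frames for module B
partial = [(4, "module_B"), (5, "module_B")]
frames_written = apply_partial_bitstream(device, partial)
```

Only two of the eight frames are transferred, and the static frames are never touched, which mirrors the two benefits listed above.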
PR Architecture Example
[Figure: partial reconfiguration (PR) architecture. Labels recovered from the diagram: battery; FPGA; ICAP; Modules A, B, and C (each shown disabled or enabled); flash controller; controller (MicroBlaze); JTAG; bitstream storage; base system configuration; external I/O; module request; static area; reconfigurable area.]
1. System controller does not need to be placed in an external device
2. Access to fast Internal Configuration Access Port (ICAP – 32 bits, 100 MHz)
3. Smaller partial bitstreams
4. No need to halt complete system when reconfiguring a module
5. Time multiplexing of FPGA resources, load and unload HW modules on demand
Relocation and Defragmentation
• Loading and unloading configurations fragments the free space
• May reconfigure any part of the device
• Configurations most likely occupy contiguous areas
• Relocation is needed if fragmentation of the free space prevents loading new configurations
[Figure: configurations c1 to c5 are scattered over the device, so a new configuration does not fit; after relocation/defragmentation they are packed together and the new configuration can be loaded.]
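A one-dimensional toy model shows why relocation helps. The assumptions here are loud ones: the device is a row of equal columns, each configuration occupies a contiguous range given as (start, width), and defragmentation simply packs everything to the left. Real devices are two-dimensional and relocation has its own cost; the function names are invented for this sketch.

```python
def free_gaps(size, placed):
    """Return the free (start, width) ranges of a device with `size` columns."""
    gaps, cursor = [], 0
    for start, width in sorted(placed.values()):
        if start > cursor:
            gaps.append((cursor, start - cursor))
        cursor = start + width
    if cursor < size:
        gaps.append((cursor, size - cursor))
    return gaps


def defragment(placed):
    """Relocate every configuration so they sit back to back from column 0."""
    packed, cursor = {}, 0
    for name, (_start, width) in sorted(placed.items(), key=lambda kv: kv[1][0]):
        packed[name] = (cursor, width)
        cursor += width
    return packed


def load(size, placed, name, width):
    """Place a new configuration, defragmenting only if no gap is large enough.

    Returns the new placement dict, or None if the device is too full.
    """
    for layout in (placed, defragment(placed)):
        for start, gap in free_gaps(size, layout):
            if gap >= width:
                result = dict(layout)
                result[name] = (start, width)
                return result
    return None


fragmented = {"c1": (0, 2), "c2": (3, 2), "c3": (6, 2), "c4": (8, 1)}
after_load = load(10, fragmented, "new", 3)
```

In the example, three columns are free but no single gap is wider than one column, so the new three-column configuration only fits after the existing ones are relocated.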
Reliability?
Reliability: Fault Tolerant Computing
After a fault event, detection and compensation can take place at three levels:
• Software-based fault detection and compensation (specific): works only for transient faults!
• HW logic and RT-level detection and compensation (universal): typically works for transient and permanent faults!
• Transistor- and switch-level compensation (specific): typically works for specific types of transient faults only!
Triple Modular Redundancy
[Figure: one input signal feeds Execution Units 1, 2, and 3; a comparator/voter outputs the majority result and an error-detect signal.]
• Can detect and compensate almost any type of fault
• Overhead about 200-300%, additional signal delays
• The voter itself is not covered but must be a "self-checking checker"
• Standard (by law) in avionics applications!
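The voting step can be sketched in a few lines. This is a software illustration only: the three "execution units" are ordinary Python functions, one of them with an injected bit-flip fault, and the names tmr_vote and run_tmr are not from the slides.

```python
def tmr_vote(r1, r2, r3):
    """Bitwise majority of three integer results, plus an error-detect flag."""
    majority = (r1 & r2) | (r1 & r3) | (r2 & r3)
    error_detected = not (r1 == r2 == r3)
    return majority, error_detected


def run_tmr(unit1, unit2, unit3, value):
    """Feed the same input to three units and vote on their results."""
    return tmr_vote(unit1(value), unit2(value), unit3(value))


def good_unit(x):
    return x + 1


def faulty_unit(x):
    # Injected fault for illustration: bit 2 of the result is flipped
    return (x + 1) ^ 0b100


result, error = run_tmr(good_unit, faulty_unit, good_unit, 2)
```

With one faulty unit, the voter still outputs the correct result (3 here) and raises the error flag; the model says nothing about faults in the voter itself, which matches the caveat above.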
FPGA-based COMPUTING SYSTEMS
In-System FPGA Repair
Self Repairable System Model
• System with Dynamic Partial Reconfiguration feature
• System controller is placed internally
• Configuration files are stored in the flash memory
• External Placer and Scheduler is used to control the reconfigurations
• Access to fast Internal Configuration Access Port
• Time multiplexing of FPGA resources: load and unload HW modules on demand
Self Repair Algorithm - SPARe
King Spare Allocation
• Uses the king spare allocation technique.
• A spare cell is differentiated to replace the faulty cell.
• The faulty cell undergoes a fault identification test to determine whether the fault is permanent or transient.
• The spare cell undergoes dedifferentiation.
• Transient faults can be repaired if at least one spare cell is available in the system.

SPARe Algorithm
King Spare Allocation
• If the fault is permanent, the faulty cell undergoes apoptosis and the spare becomes the new working cell.
• In the case of a second fault in the same section, spare cells in other sections are used (Dijkstra).
• The system fails if the fault is permanent and no spare cell is available in the system.
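The decision flow in the bullets above can be loosely sketched as follows. This is not the published SPARe algorithm. It assumes that each section is described only by its count of free spare cells, that the spare returns to the pool (dedifferentiation) when the fault turns out to be transient, and that the "nearest" donor section is found by index distance rather than by Dijkstra's shortest path. The slides do not say what happens for a transient fault with no spare, so that case is reported separately. All names are hypothetical.

```python
def nearest_section_with_spare(spares, section):
    """Index of the closest section that still has a free spare, or None."""
    candidates = [s for s, count in enumerate(spares) if count > 0]
    if not candidates:
        return None
    # Simplification: index distance stands in for a shortest-path search
    return min(candidates, key=lambda s: (abs(s - section), s))


def handle_fault(spares, section, permanent):
    """Return (status, updated spare counts) for one fault event."""
    donor = nearest_section_with_spare(spares, section)
    if donor is None:
        status = "system_failure" if permanent else "not_repaired"
        return status, list(spares)
    updated = list(spares)
    if permanent:
        # Faulty cell is retired (apoptosis); the spare becomes a working cell
        updated[donor] -= 1
        return "spare_took_over", updated
    # Transient fault: the spare covers during the test, then is released
    return "spare_released", updated


status1, pool = handle_fault([1, 1], 0, permanent=True)
status2, pool = handle_fault(pool, 0, permanent=True)
status3, pool = handle_fault(pool, 0, permanent=True)
```

In the usage lines, the first permanent fault in section 0 consumes that section's spare, the second borrows the spare of section 1, and the third finds no spare left.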
Multi Objective Repair Algorithm (MORe)
Properties of MORe Algorithm
• Recovery from an unlimited number of transient faults.
• Recovery from an increased number of permanent faults.
• Size feasibility checking.
• Minimum routing overhead.
Example: Video Compression – Sum of Absolute Differences
[Figure (a): digitized frame 1 (1 Mbyte) and digitized frame 2 (1 Mbyte); the only difference is the ball moving. Figure (b): digitized frame 1 (1 Mbyte) and the difference of frame 2 from frame 1 (0.01 Mbyte): just send the difference.]
• Video is a series of frames (e.g., 30 per second)
• Most frames are similar to the previous frame
• Compression idea: just send the difference from the previous frame
RTL Example: Video Compression – Sum of Absolute Differences
[Figure: corresponding blocks of Frame 1 and Frame 2 are compared. Each square is a pixel, assumed to be represented as 1 byte (actually, a color picture might have 3 bytes per pixel, for the intensity of the red, green, and blue components of the pixel).]
• Need to quickly determine whether two frames are similar enough to just send the difference for the second frame
   • Compare corresponding 16x16 "blocks"
   • Treat each 16x16 block as a 256-byte array
   • Compute the absolute value of the difference of each array item
   • Sum those differences: if above a threshold, send the complete frame for the second frame; if below, can use the difference method (using another technique, not described)
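The block comparison just described can be sketched as below. The threshold value and the function names are assumptions of this sketch, and a sum exactly equal to the threshold is treated as "below", which the slides leave open.

```python
BLOCK_ITEMS = 256  # one 16x16 block, one byte per pixel


def block_sad(block_a, block_b):
    """Sum of absolute differences of two 256-item blocks."""
    if len(block_a) != BLOCK_ITEMS or len(block_b) != BLOCK_ITEMS:
        raise ValueError("each block must hold 256 items")
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def choose_encoding(block_a, block_b, threshold):
    """Return 'full' if the blocks differ too much, otherwise 'difference'."""
    return "full" if block_sad(block_a, block_b) > threshold else "difference"


still_block = [100] * BLOCK_ITEMS
slightly_changed = [100] * 250 + [110] * 6     # six pixels changed by 10
very_different = [0] * BLOCK_ITEMS

# Hypothetical threshold of 1000, chosen only for the example
decision_small = choose_encoding(still_block, slightly_changed, threshold=1000)
decision_large = choose_encoding(still_block, very_different, threshold=1000)
```

The first pair has a SAD of 60 and is sent as a difference; the second has a SAD of 25600 and is sent in full.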
[Figure: the SAD component. Inputs: A (256-byte array), B (256-byte array), go. Output: sad (integer).]
• Want a fast sum-of-absolute-differences (SAD) component
   • When go=1, it sums the absolute differences of the element pairs in arrays A and B and outputs that sum
RTL Example: Video Compression – Sum of Absolute Differences
Inputs: A, B (256 byte memory); go (bit)
Outputs: sad (32 bits)
Local registers: sum, sad_reg (32 bits); i (9 bits)

High-level state machine:
• S0: wait for go (stay in S0 while !go, move to S1 on go)
• S1: initialize sum and index (sum = 0; i = 0)
• S2: check if done (i>=256): go to S3 if i<256, go to S4 if (i<256)'
• S3: add difference to sum, increment index (sum = sum + abs(A[i]-B[i]); i = i + 1), then back to S2
• S4: done, write to output (sad_reg = sum)
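The state machine can be stepped in software to check its behavior and to count clock cycles. The sketch below assumes one state per clock cycle and starts in S1, i.e. just after go has been seen in S0; the function name and the returned cycle counters are additions for illustration.

```python
def sad_hlsm(A, B):
    """Step through states S1..S4 once.

    Returns (sad_reg, loop_cycles, total_cycles), where loop_cycles counts the
    cycles spent in S2 and S3 while i < 256.
    """
    state = "S1"              # assumption: go was already asserted in S0
    total_sum = i = sad_reg = 0
    loop_cycles = total_cycles = 0
    while True:
        total_cycles += 1
        if state == "S1":     # initialize sum and index
            total_sum, i = 0, 0
            state = "S2"
        elif state == "S2":   # check if done
            if i < 256:
                loop_cycles += 1
                state = "S3"
            else:
                state = "S4"
        elif state == "S3":   # accumulate one absolute difference
            total_sum += abs(A[i] - B[i])
            i += 1
            loop_cycles += 1
            state = "S2"
        else:                 # S4: write the result to the output register
            sad_reg = total_sum
            return sad_reg, loop_cycles, total_cycles


ramp = list(range(256))
zeros = [0] * 256
sad_value, loop_cycles, total_cycles = sad_hlsm(ramp, zeros)
```

Two states per array item give 2 × 256 = 512 loop cycles; under this model the initialization, the final check, and the output state add three more.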
• Step 2: Create datapath
[Figure: datapath. A 9-bit register i (control inputs i_inc, i_clr) supplies AB_addr and feeds a "<256" comparator that produces i_lt_256. The 8-bit A_data and B_data feed a subtractor followed by abs; the result is added to the 32-bit sum register (control inputs sum_ld, sum_clr). A 32-bit sad_reg register (control input sad_reg_ld) drives the output sad.]

• Step 3: Connect to controller
State   High-level action                        Control signals
S0      wait for go                              (none)
S1      sum = 0; i = 0                           sum_clr=1; i_clr=1
S2      check i<256                              branch on i_lt_256
S3      sum = sum + abs(A[i]-B[i]); i = i + 1    sum_ld=1; AB_rd=1; i_inc=1
S4      sad_reg = sum                            sad_reg_ld=1

• Step 4: Replace high-level state machine by FSM

• Comparing software and custom circuit SAD
   • Circuit: two states (S2 & S3) for each i, 256 i's, giving 512 clock cycles
   • Software: loop (for i = 1 to 256), but for each i, must move memory to local registers, subtract, compute absolute value, add to sum, increment i – say about 6 cycles per array item, giving 256*6 = 1536 cycles
   • Circuit is about 3 times as fast (1536/512 = 3)
Behavioral Level Design: C to Gates
C code:
    #include <stdlib.h>

    /* Sum of absolute differences of two 256-byte arrays. */
    unsigned int SAD(const unsigned char A[256], const unsigned char B[256])
    {
        unsigned int sum;
        unsigned short i;
        sum = 0;
        i = 0;
        while (i < 256) {
            sum = sum + abs(A[i] - B[i]);
            i = i + 1;
        }
        return sum;
    }
[Figure: the high-level state machine S0 to S4 from the earlier slides, shown next to the C code.]
• Earlier sum-of-absolute-differences example
   • Started with high-level state machine
   • C code is an even better starting point -- easier to understand
Behavioral-Level Design: Start with C (or Similar Language)
• Replace the first step of the RTL design method by two steps
   • Capture in C, then convert C to a high-level state machine
   • How do we convert from C to a high-level state machine?
Step 1A: Capture in C
Step 1B: Convert to high-level state machine
Converting from C to High-Level State Machine
• Convert each C construct to equivalent states and transitions
• Assignment statement
      target = expression;
   • Becomes one state with the assignment (state action: target = expression)
• If-then statement
      if (cond) {
          // then stmts
      }
   • Becomes a state with a condition check, transitioning to the "then" statements if the condition is true (cond), otherwise (!cond) to the ending state
   • The "then" statements would also be converted to states

Converting from C to High-Level State Machine
• If-then-else
      if (cond) {
          // then stmts
      }
      else {
          // else stmts
      }
   • Becomes a state with a condition check, transitioning to the "then" statements if the condition is true (cond), or to the "else" statements if the condition is false (!cond)
• While loop statement
      while (cond) {
          // while stmts
      }
   • Becomes a state with a condition check, transitioning to the while loop's statements if true (cond), then transitioning back to the condition check; if false (!cond), transitioning to the end
Simple Example of Converting from C to High-Level State Machine
Inputs: uint X, Y
Outputs: uint Max

(a) C code:
    if (X > Y) {
        Max = X;
    }
    else {
        Max = Y;
    }
(b) [Figure: a condition state branches on X>Y to (then stmts) and on !(X>Y) to (else stmts); both lead to (end).]
(c) [Figure: the same machine with the placeholders replaced by the states Max=X and Max=Y.]

• Simple example: computing the maximum of two numbers
   • Convert the if-then-else statement to states (b)
   • Then convert the assignment statements to states (c)
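The converted machine can be executed directly to confirm that it behaves like the original if-then-else. The state names check, then, else, and end are labels chosen for this sketch, and the returned trace is only there to show which path was taken.

```python
def run_max_hlsm(X, Y):
    """Execute the Max state machine; return (Max, list of visited states)."""
    regs = {"X": X, "Y": Y, "Max": None}
    state = "check"
    trace = []
    while state != "end":
        trace.append(state)
        if state == "check":       # condition state: branch on X > Y
            state = "then" if regs["X"] > regs["Y"] else "else"
        elif state == "then":      # assignment state: Max = X
            regs["Max"] = regs["X"]
            state = "end"
        else:                      # assignment state: Max = Y
            regs["Max"] = regs["Y"]
            state = "end"
    return regs["Max"], trace


max_a, trace_a = run_max_hlsm(7, 3)
max_b, trace_b = run_max_hlsm(2, 9)
```

Each C assignment has become one state and the condition has become a branching state, so every run visits exactly two states before reaching the end.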
HDL Design Verification
Design flow step    Verification step
Verilog HDL         Behavioral Simulation
Synthesis           Functional Simulation
Implementation      Timing Simulation
Download            In-Circuit Verification
• HDL: implement your design using Verilog HDL

Synthesis Design Verification
• Synthesis: synthesize the design to create an FPGA netlist

Implementation Design Verification
• Implementation: translate, place and route, and generate a bitstream to download in the FPGA
On-Chip Verification
ChipScope ILA System Diagram
[Figure: a PC running ChipScope connects over JTAG (MultiLINX Cable or Parallel Cable III) to the target board. In the target FPGA, each USER FUNCTION is paired with a ChipScope ILA core, and a control block links the ILA cores to JTAG.]
FPGA Development Boards
An FPGA-based development platform with a large FPGA and I/O devices
to support a wide range of digital circuits, including a complete computer
system.
Applications of Reconfigurable Systems
• Space Missions
• Defence
• Adaptive Embedded Systems
• Cognitive Computing
• Entertainment
Research labs
1. http://rise.cse.iitm.ac.in/rise1/index.html
2. https://ece.gmu.edu/research-interests/reconfigurable-computing
3. https://www.cs.washington.edu/affiliates/abstracts/vlsi/vlsi.abstracts.html
4. http://brass.cs.berkeley.edu/
5. http://www.ece.auckland.ac.nz/en/about/our-research/research-areas/parallelandreconfigurablecomputingresearchgroup.html
The End
Time for some questions...
