
Design & Implementation of Mitigation Techniques for Single Event Upset in SRAM FPGA
Major Project Report

Submitted in partial fulfillment of the requirements for the degree of
Master of Technology
in Electronics and Communication (VLSI Design)
By
SAVANI VIJAY GOPALBHAI

(09MEC017)
Department of Electronics and Communication Engineering
Institute of Technology
Nirma University
Ahmedabad-382 481
May 2011
Guided By
Prof. N. P. Gajjar
DECLARATION
I hereby declare that:
1) This thesis comprises my original work towards the degree of Master of Technology in Electronics and Communication (VLSI Design) at Nirma University and has not been submitted elsewhere for a degree.
2) Due acknowledgement has been made in the text to all other material used.
- Vijay G Savani
CERTIFICATE
This is to certify that the M.Tech Major Project thesis entitled "Design & Implementation of Mitigation Techniques for Single Event Upset in SRAM FPGA", submitted by Savani Vijay Gopalbhai (09MEC017) towards the partial fulfillment of the requirements for the Degree of Master of Technology (Electronics & Communication) in the field of VLSI Design at Institute of Technology, Nirma University, Ahmedabad, is the record of the work carried out under our supervision and guidance. The work submitted has, in our opinion, reached the level required for being accepted for examination. The results embodied in this project work have not, to the best of our knowledge, been submitted to any other University or Institute for the award of any Degree.
Date:
Place: Ahmedabad
Prof. N. P. Gajjar Sr. Associate Professor, EC-VLSI Design, Institute of Technology, Nirma University, Ahmedabad.
Dr. N. M. Devashrayee PG Co-ordinator, VLSI Design, Institute of Technology, Nirma University, Ahmedabad.
Prof. A. S. Ranade HOD, EE Department, Institute of Technology, Nirma University, Ahmedabad.
Dr. K Kotecha Director, Institute of Technology, Nirma University, Ahmedabad.
ACKNOWLEDGEMENT
First, I would like to thank Almighty GOD for the wonderful life that I have. Thank you very much GOD for keeping the courage and the enthusiasm in me every day. I would like to thank my FAMILY for always looking after me and for continuous encouragement.
I express my sense of gratitude to my project guide Prof. N. P. Gajjar, Sr. Associate Professor, EC Department, Institute of Technology, Nirma University, for all of his guidance, encouragement and patience, which enabled me to carry out my project work. He inspired me with his original and innovative ideas and provided technical guidance throughout the project. Without his never-ending support this project would never have been completed.
It is my pleasure to thank Dr. N. M. Devashrayee (PG Coordinator-VLSI Design), Department of Electronics and Communication Engineering, Institute of Technology, Nirma University, Ahmedabad, for his support and for the knowledge and experience he has helped me obtain.
Appreciation is expressed to all research colleagues and all those with whom I worked and interacted at Institute of Technology, for their help and co-operation in the course of my project work. I am also thankful to Nirma University Management for providing me an opportunity to work in the prime institute of the country and for providing an excellent working environment together with the required resources.
- Vijay G. Savani (09MEC017), M.Tech (VLSI Design), Institute of Technology, Nirma University.
ABSTRACT
SRAM-based FPGAs operating in the space environment are perturbed by charged particles, which affect the circuit in different ways. This work details mitigation techniques for one of these effects, the Single Event Upset (SEU). As technology progresses, highly scaled devices exhibit an increased sensitivity to SEUs due to a reduced feature size and a proportional increase in device density. FPGA-based designs are more susceptible to SEUs than ASIC designs, and the consequences are more harmful when FPGAs are used in space and defence applications. The goal of the project is to design, develop and implement mitigation techniques for SEUs. This thesis addresses solutions developed to make CMOS memory cells SEU-immune at the system level, where software solutions and hardware redundancy are used to mitigate SEUs. Various design-based solutions such as Spatial Redundancy (Triple Modular Redundancy), Temporal Redundancy and Scrubbing (Reconfiguration) are discussed. Apart from mitigation, the infrequent and unpredictable nature of real SEUs makes small-scale testing of their effects and system verification impractical. To perform these tasks, an SEU monitor system is designed and implemented on the FPGA. The system, together with a controller macro, can emulate an SEU by deliberately injecting an error into the FPGA configuration so that its subsequent detection and correction can be confirmed. It can also be used to assess SEU mitigation circuits implemented in a design. Finally, a Self Correcting System is implemented using the SEU monitor system and a self-reconfigurable system; it detects and corrects the error and, if correction is not possible, reconfigures the user design. The thesis describes the operation and architecture of the proposed logic design as well as its implementation in a Xilinx Virtex-5 FPGA.
Contents
Declaration
Certificate
Acknowledgement
Abstract
Contents
List of Figures
List of Tables
1 Introduction
1.1 Radiation Effects
1.2 Different Types of Single Event Effects
1.3 Effects of SEUs on FPGAs
1.4 Report Organization
2 SEU Mitigation Techniques
2.1 Technology Based Techniques
2.2 Design Based Techniques
2.2.1 Spatial Redundancy
2.2.2 Temporal Redundancy
2.2.3 Scrubbing (Reconfiguration)
3 Implementation of Mitigation using TMR & PR
3.1 Basics of Triple Modular Redundancy
3.2 Granularity of TMR
3.3 Various TMR Implementation Methodology
3.3.1 TMR with Single Voter
3.3.2 TMR with Triplicated Voter
3.3.3 Temporal Data Sampling
3.3.4 Optimal Design of TMR
3.4 Basics of Partial Reconfiguration
3.5 Methods of Partial Reconfiguration
3.5.1 Module Based Partial Reconfiguration
3.5.2 Difference Based Partial Reconfiguration
3.6 Bus Macro Communication
3.7 Programming Medium for Configuration
3.8 Development of a Self Reconfigurable System
4 Error Detection and Correction
4.1 Configuration Memory
4.2 FRAME ECC Primitives
4.2.1 Readback CRC Algorithm
4.3 Internal Configuration Access Port (ICAP)
5 Implementation of Self Correcting System
5.1 SEU Controller macro
5.2 SEU Monitor System
5.3 Self Correcting System
6 Implementation of SEU mitigation Techniques
6.1 TMR Implementation
6.2 Implementation of Self Reconfigurable System
6.3 Implementation of Systems
7 Conclusion and Future Work
7.1 Conclusion
7.2 Future Work
A VIRTEX-5 Platform FPGA
A.1 Virtex-5 Device Functional Blocks
B Tools & PR Design Implementation Flow
B.1 Tools and Design Board
B.1.1 Xilinx EDK 9.2.02
B.1.2 Xilinx ISE 9.2.04i with Partial-Reconfiguration Patch
B.1.3 Xilinx PlanAhead 10.1
B.1.4 Virtex-V Evaluation Board
B.2 PR Design Implementation Flow
Glossary
References
List of Publication
List of Figures
1.1 SEU in CMOS
1.2 6-Transistor Based SRAM Storage Cell
1.3 SRAM based FPGA architecture (regular array)
1.4 SEU Sensitive Configuration Bit Storage
2.1 Basics of TMR
2.2 Temporal Redundancy
3.1 Majority Voter Circuit and Truth Table
3.2 TMR of One-Bit Counter
3.3 Single Voter TMR Counter with Sequential SEUs
3.4 Triple Voted TMR of One-Bit Counter
3.5 Triple Voted TMR Counter with Sequential SEUs
3.6 Proposed Temporal Data Sampling
3.7 Clocking Scheme for Temporal Sampling
3.8 TMR register with voters and scrubbing
3.9 Module Based Partial Reconfiguration Design Flow
3.10 Bus macro used between PR Logic and Fixed Logic
3.11 Bus Macro between two Region
3.12 Physical Implementation of Bus Macro
3.13 Methods of FPGA Reconfiguration
3.14 Bitstream Flow for the reconfiguration
3.15 System Architecture
3.16 Up Sampler block diagram
3.17 Down Sampler block diagram
3.18 Interconnection of the reconfigurable system
4.1 Configuration Memory Data Frame
4.2 FRAME ECC primitive (Virtex-5)
4.3 ICAP Primitive (Virtex-5)
5.1 SEU Controller macro
5.2 SEU Monitor System
5.3 Port mapping of System
5.4 Internal signal Integration
5.5 Software for Monitor System using SDK
5.6 Block Diagram of Self Correcting System
6.1 TMR with single voter (Technology Schematic)
6.2 Simulation Results
6.3 Implementation of TMR with triple voter
6.4 Technology Schematic of TMR (with Voters and Scrubbing)
6.5 Temporal Data Sampling Technology Schematic
6.6 Up Sampler RTL
6.7 Up Sampler Simulation Results
6.8 Implementation of Down Sampler
6.9 Testing of Reconfigurable System
6.10 Finding no of clock cycles to Reconfigure
6.11 Dynamic PR (Performing Reconfiguration of UP Sampler)
6.12 RTL of SEU Monitor System
6.13 Testing of SEU Monitor System
6.14 Checking CRC of SEU Monitor System
6.15 Error Injection using Monitor System
6.16 Testing of Self Correcting System
A.1 Virtex FPGA architecture
B.1 PlanAhead PR Design Flow
B.2 Create XPS Project
B.3 Add Software Application
B.4 Draw Partial Module
B.5 DRC Run
B.6 Launch Run of static RM
B.7 PR assemble of Design
List of Tables
4.1 Interpretation of syndrome bit
4.2 Frame ECC Error Codes
4.3 Parity Bit Error Diagnosis
5.1 Controller Macro Operational Modes
6.1 Comparison of TMR techniques
6.2 Device Utilization Summary (For combined Up and Down Sampler)
6.3 Area Comparison for Up and Down Sampler
6.4 Timing Analysis for Reconfiguration
6.5 Resource Utilization comparison for Monitor System
Chapter 1 Introduction
Given the remarkable success of reprogrammable logic devices in areas such as telecommunications and defence applications, it is only natural that an interest should arise in their use for space-based electronics as well. However, while such devices present numerous advantages in terms of design flexibility, they come with the drawback of being susceptible to bit upsets induced by radiation, more commonly known as Single Event Upsets (SEUs). FPGAs have been very attractive for space applications over the past decade. The main advantage provided by gate arrays is the elimination of the large overhead cost of developing custom ASICs. Another major reason is the prospect of performing post-launch design optimizations or accommodating changes in spacecraft objectives. In addition, their inherent re-programmability has been fully exploited for prototyping purposes. Antifuse technology, by contrast, has several inherent limitations that make SRAM-based FPGAs more attractive. First, once an antifuse device is programmed it cannot be changed; additional devices have to be programmed and must physically replace the installed devices. Second, available antifuse gate arrays are considerably smaller in gate count than SRAM configurable gate arrays [7]. The problem, however, is that SRAM-based FPGAs are known to be highly susceptible to SEUs, which can result in deviations from expected component behavior.
1.1 Radiation Effects
There are two main categories of radiation effects that are relevant for Static Random Access Memory (SRAM) based Field Programmable Gate Arrays (FPGAs) in space:
1. Total Dose Effects
2. Single Event Effects (SEEs)
Total Dose Effects are cumulative effects that induce degradation of electrical parameters at the device, circuit, and system levels. They are induced by the total amount of ionizing energy deposited by photons or particles such as electrons, protons, or heavy ions. Single Event Effects are induced by the passage of a single high-energy proton or heavy ion through a device or a sensitive region of a microcircuit. SEEs in digital integrated circuits (ICs) can be either destructive (e.g., Single Event Latch-up) or non-destructive (e.g., Single Event Upset), the latter including transient faults in combinational and sequential logic [7]. A single event upset (SEU) is defined by NASA as "Radiation induced errors in microelectronic circuits caused when charged particles (usually from the radiation belts or from cosmic rays) lose energy by ionizing the medium through which they pass, leaving behind a wake of electron-hole pairs" [7]. When designing in-orbit, space-based, or extra-terrestrial applications in hostile radiation environments, users must consider the effects of charged particles such as heavy ions or protons. As these charged particles travel through the FPGA, as shown in Figure 1.1, they can alter the logic state of any static memory element, resulting in SEUs. An SEU in the configuration memory array can have an adverse effect on the expected FPGA functionality [2].

Figure 1.1: SEU in CMOS

Recent advancements in the semiconductor industry have led to the fabrication of highly scaled devices. These devices exhibit an increased sensitivity to SEUs due to a reduced feature size and a proportional increase in device density. In the case of programmable logic devices such as FPGAs, SEU effects can be much more severe. Since FPGAs utilize a configuration memory array to define the logic function, an SEU occurring in a single bit of the array can lead to an unexpected alteration of the original design [8].

SEUs are soft errors and are non-destructive. In FPGAs, a multitude of latches, also called memory cells or RAM bits, define all logic functions and on-chip interconnects. Such latches are similar to the 6-transistor storage cells used in SRAMs (shown in Figure 1.2, whose function is described in Section 1.3), which have proved to be sensitive to single event upsets caused by high-energy neutrons. The faults have been observed as bit errors in memories. As the microelectronics industry has advanced, integrated circuit (IC) design in general, and reconfigurable architectures (FPGAs, reconfigurable SoCs, etc.) in particular, have experienced a dramatic increase in density and speed due to the decrease in the feature sizes with which these devices are manufactured. The effects of scaling on the single event response of microelectronics are a direct result of the physics of energy loss, charge collection, and upset due to a cosmic ray striking a junction in an IC device.

When an energetic ion passes through any material, it loses energy through interactions with the bound electrons, causing ionization of the material and the formation of a dense track of electron-hole pairs. The rate at which the ion loses energy is the stopping power (dE/dx). The incremental energy dE is usually measured in MeV, while the material thickness dx is usually measured as a mass thickness in mg/cm². The radiation effects community has adopted the term LET (Linear Energy Transfer) for the stopping power. An ion with an LET of 100 MeV·cm²/mg deposits approximately 1 pC of electron-hole pairs along each micron of its track through silicon [1].
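The relation between LET and deposited charge quoted above can be checked with a short calculation. The sketch below is illustrative only: the silicon density (2.33 g/cm³) and the mean energy per electron-hole pair (3.6 eV) are standard textbook values assumed here, not figures taken from this report, and the function name is hypothetical.

```python
# Assumed physical constants for silicon (textbook values, not from the report).
SI_DENSITY_MG_PER_CM3 = 2330.0      # 2.33 g/cm^3 expressed in mg/cm^3
EV_PER_PAIR = 3.6                   # mean energy to create one electron-hole pair
ELEMENTARY_CHARGE_C = 1.602e-19     # charge of one electron in coulombs


def charge_per_micron_pc(let_mev_cm2_per_mg: float) -> float:
    """Charge deposited per micron of ion track in silicon, in picocoulombs."""
    # LET (MeV*cm^2/mg) times density (mg/cm^3) gives energy loss per cm (MeV/cm).
    mev_per_cm = let_mev_cm2_per_mg * SI_DENSITY_MG_PER_CM3
    # Convert MeV -> eV (x 1e6) and per cm -> per micron (/ 1e4).
    ev_per_um = mev_per_cm * 1e6 / 1e4
    pairs_per_um = ev_per_um / EV_PER_PAIR
    # Each pair carries one elementary charge; convert coulombs to picocoulombs.
    return pairs_per_um * ELEMENTARY_CHARGE_C * 1e12


if __name__ == "__main__":
    print(f"{charge_per_micron_pc(100.0):.2f} pC per micron")
```

Under these assumptions an LET of 100 MeV·cm²/mg works out to roughly 1.04 pC per micron, consistent with the approximate figure of 1 pC cited from [1].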
1.2 Different Types of Single Event Effects
Single event effects fall into the categories listed below, which are based on the definitions provided by the Joint Electron Device Engineering Council (JEDEC) association [16]. Any observable or measurable change in state or performance occurring in a microelectronic device, component or system (digital, analog or optical) as a result of a single energetic particle strike is called a Single Event Effect (SEE). SEEs comprise SEUs and several other effects, defined as follows.
1. Single Event Upset (SEU): A soft error resulting from a transient signal induced by a single energetic particle strike.
2. Single Event Transient (SET): A transitory voltage spike (change of state) at a node of an integrated circuit resulting from a single particle collision.
3. Single Hard Error (SHE): An SEU which causes a permanent change to the operation of a device. An example is a stuck bit in a memory device.
4. Single Event Latchup (SEL): An abnormal high-current state caused when a single energetic particle passes through sensitive regions of the device, resulting (in most cases) in loss of device functionality.
5. Single Event Burnout (SEB): An event caused by a single particle collision inducing a localized high-current state in a device and resulting in its destruction.
1.3 Effects of SEUs on FPGAs
In the presence of electric fields, the electron-hole pairs quickly separate as they drift in opposite directions in the field and are quickly collected by whatever voltage sources are responsible for the field, thus producing a current transient. In bulk CMOS designs, such electric fields are present across every PN junction in the device. If an ion strikes a junction connected to a signal node, a current transient is subsequently observed on the signal node as the electric fields in the junction and funnel regions separate the electron and hole carriers [1]. SEUs are inherently transient and might go undetected in several cases, but in storage circuits such as latches and memory structures these events get stored. The SRAM structure that commonly acts as the configuration memory for FPGAs is based on the 6-transistor storage cell. This cell is known to be vulnerable to SEUs, which consequently compromises the functionality of the FPGA [8].

Figure 1.2 shows one way in which an SEU might affect the cell. The nMOS pass gates T5 and T6 are used to read and write values to the cell. For a write operation, the bit line is set to the desired logic value, say 0, and the complementary bit line to logic 1. The pass transistors are then activated via the RD/WR Enable line and the value is latched by the cross-coupled inverters. Similarly, to read the state of the cell the pass gates are once again enabled. Now suppose that the cell holds a logic 0 when a current pulse from a particle collision arrives at node A. The pulse travels to the gate of T3 and, if its magnitude is high enough, is capable of turning on T3, thereby pulling node B to logic 0. The positive feedback ensures that the change propagates to the other two transistors, eventually resulting in a logic 1 being stored in the cell. If the structure originally held a logic 1, then a similar event occurring at node B would result in a logic reversal. It is important to note that the occurrence of such logic changes also depends on the design, and a particle collision might not necessarily result in a bit-flip [8].

Figure 1.2: 6-Transistor Based SRAM Storage Cell

FPGA memory is divided into two parts, user memory and configuration memory, and an SEU can affect either of the two. An upset in user memory results in an undefined state being latched in a register element. Though this might cause a temporary disruption in normal functionality, the original design remains unchanged. An upset in configuration memory, however, can alter the logic itself, and the affected area needs to be reprogrammed to remove the fault. Faults in this category can occur in the routing, in the logic resources or in the IOBs.
Xilinx FPGA
In general, Xilinx FPGAs have an array composed of configurable logic blocks (CLBs) surrounded by programmable input/output blocks (IOBs), all interconnected by a hierarchy of fast and versatile routing resources. Each CLB has a set of look-up tables (LUTs), multiplexers, and flip-flops, which are divided into slices. A LUT is a logic structure able to implement a Boolean function as a truth table. The CLBs provide the functional elements for constructing logic, while the IOBs provide the interface between the package pins and the CLBs. Figure 1.3 shows an SRAM-based FPGA architecture organized as a regular array.
Figure 1.3: SRAM based FPGA architecture (regular array)

The CLBs are interconnected through a general routing matrix (GRM) that comprises an array of routing switches located at the intersections of horizontal and vertical routing channels. The FPGA matrix also has dedicated memory blocks called Block SelectRAMs (BRAMs), clock delay locked loops (DLLs) for clock distribution delay compensation and clock domain control, and other components that vary according to the FPGA family. These devices are quickly programmed by loading a configuration bitstream (a collection of configuration bits) into the device. The device functionality can be changed at any time by loading a new bitstream. The bitstream is divided into frames and contains all the information needed to configure the programmable storage elements in the matrix, located in the LUTs and flip-flops, configuration cells and interconnections. The characteristics of the CLB logic and slices may vary with the FPGA family.
Figure 1.4: SEU Sensitive Configuration Bit Storage

There are mainly five areas of the CLB that are affected by SEUs, as shown in Figure 1.4:
1. Upsets in the logic (LUT)
2. Upsets in the customization routing bits inside the CLB
3. Upsets in the routing connecting CLBs and pins
4. Upsets in the CLB flip-flops
5. Upsets in BRAM
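To illustrate the first category in the list above, the following minimal sketch models a LUT as a truth table held in configuration bits and shows how flipping a single bit changes the implemented Boolean function. It is a simplified software model, not a description of the Virtex-5 LUT hardware; the function names and the 2-input example are assumptions made for illustration.

```python
def lut_eval(config_bits: int, inputs: tuple) -> int:
    """Evaluate a LUT: the inputs form an address into the stored truth table."""
    address = 0
    for position, bit in enumerate(inputs):
        address |= (bit & 1) << position
    return (config_bits >> address) & 1


def inject_seu(config_bits: int, bit_index: int) -> int:
    """Model an SEU by flipping one configuration bit."""
    return config_bits ^ (1 << bit_index)


def truth_table(config_bits: int) -> list:
    """Outputs for (a, b) = (0,0), (1,0), (0,1), (1,1), i.e. addresses 0..3."""
    return [lut_eval(config_bits, (a, b)) for b in (0, 1) for a in (0, 1)]


# 2-input AND gate: only address 3 (a=1, b=1) stores a 1.
AND2 = 0b1000
upset = inject_seu(AND2, 1)   # flip the entry for address 1 (a=1, b=0)

if __name__ == "__main__":
    print("original:", truth_table(AND2))
    print("upset   :", truth_table(upset))
```

After the single-bit upset the table no longer implements AND: the output follows input a regardless of b, and it stays that way until the configuration bit is rewritten.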
1.4 Report Organization
The rest of the report is organized as follows.
Chapter 2, SEU Mitigation Techniques, describes the various techniques used to mitigate the effects of SEUs.
Chapter 3, Implementation of Mitigation using TMR & PR, describes the various types of TMR that can be implemented to overcome the effects of single event upsets. It also describes the methods for reconfiguring FPGA devices through external or internal interfaces, and Dynamic Partial Reconfiguration (DPR) as one of the solutions for mitigating SEUs.
Chapter 4, Error Detection and Correction, describes the means of emulating SEUs and the detection and correction of SEU-induced errors using FRAME ECC and ICAP.
Chapter 5, Implementation of Self Correcting System, shows the design and development of the SEU monitor system, which uses the SEU controller macro and a soft-core processor to detect, inject and correct errors. The latter part of the chapter shows the design and implementation of the self correcting system using the SEU monitor system and the DPR of the FPGA.
Chapter 6, Implementation of SEU mitigation Techniques, shows the implementation and results of the different types of TMR, of Dynamic Partial Reconfiguration, and finally of the SEU monitor system and Self Correcting System used to mitigate SEUs.
Chapter 7, Conclusion and Future Work, gives concluding remarks on the implementation of the different SEU mitigation techniques and outlines possible extensions of the work.
Chapter 2 SEU Mitigation Techniques

Radiation hardening against SEUs can be done in many ways. It can either be inherently built into the device during the manufacturing phase or be realized by making modifications to the design. The most commonly used approach incorporates redundant components that reduce the chances of interruptions in operation due to upsets. Correction of an upset following detection is usually desirable to prevent accumulation of errors, especially in the case of FPGAs. This chapter discusses different methods that are commonly employed to cope with the SEU problem. One must be aware nonetheless that an overhead is always involved, in the form of cost, power consumption or area. This thesis is mainly concerned with design-based mitigation techniques; finally, a system has been designed to inject, detect and correct SEUs. The major techniques are broadly divided into two categories [8]:
1. Technology Based Techniques
2. Design Based Techniques
2.1 Technology Based Techniques
Silicon on Insulator (SOI)
In this method a different substrate, consisting of an insulating SiO2 layer over a silicon substrate, is used in place of the conventional bulk silicon substrate. Bulk structures used when fabricating CMOS-based devices utilize wells to provide isolation between adjacent configurations. SOI structures isolate devices with the insulating layer instead, and can also provide some resistance against SEUs by reducing the charge collection area. Another common type of SOI is called Silicon-on-Sapphire (SOS). In this case a thin epitaxial Si layer is grown over a sapphire substrate. Both SOI and SOS improve device resistance against SELs and can also reduce the number of SEU events, but the fabrication processes are known to be costly.
Epitaxial CMOS Process

This is a widely used technique, though not as advanced as SOI. An epitaxial layer is grown over a highly doped Si substrate. The low resistivity of the highly doped region limits charge collection, thereby improving immunity against SEUs. This arrangement offers only limited SEL protection since, unlike in SOI, the parasitic latch-up elements are still present.
2.2 Design Based Techniques

2.2.1 Spatial Redundancy
A classic example of spatial redundancy is Triple Modular Redundancy (TMR). TMR is a widespread method and is applied in several domains other than SEU mitigation. It works on the principle of using three identical design blocks and then voting on their results to detect failures. A correct output is obtained as long as at least two of the blocks are functioning correctly. If a module encounters an error, it is detected by the voter and can then be corrected by reconfiguring the block. The module can represent a physical hardware resource such as a processor or a board, or can be redundant logic on the same chip. An apparent weakness of the simple approach shown in Figure 2.1 is that an upset in the voter or in the input cannot be countered. This can be resolved by triplicating the voter and supplying the input from three identical computations through different paths. Because scrubbing does not correct the content of the Configurable Logic Block (CLB) flip-flops, a feedback path is necessary to correct the content of each flip-flop at the next clock cycle. In addition to the TMR scheme for functional blocks, inputs and outputs need to be triplicated. The outputs are tied together externally, using minority voting to prevent conflicts in the I/O current [14][12].
Figure 2.1: Basics of TMR
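The voting principle described in this section can be sketched in a few lines of software. This is only a behavioural model under stated assumptions: the `increment` block is a hypothetical stand-in for a user design, faults are modelled as XOR masks on a copy's output, and the voter itself is assumed fault-free (the weakness of the single-voter scheme noted above).

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three redundant outputs."""
    return (a & b) | (b & c) | (a & c)


def tmr_run(module, x, upsets=None):
    """Run three copies of `module` on x.

    `upsets` maps a copy index (0..2) to an error mask XORed into that
    copy's output. Returns (voted_output, raw_outputs).
    """
    upsets = upsets or {}
    outputs = [module(x) ^ upsets.get(copy, 0) for copy in range(3)]
    return majority(*outputs), outputs


def increment(x: int) -> int:
    """Hypothetical 8-bit design block used only for illustration."""
    return (x + 1) & 0xFF


if __name__ == "__main__":
    # One faulty copy: the voter masks the error.
    print(tmr_run(increment, 41, upsets={1: 0b100}))
    # Two copies hit in the same bit: the voter is outvoted by the faults.
    print(tmr_run(increment, 41, upsets={0: 0b100, 1: 0b100}))
```

A single upset is masked, but two copies corrupted in the same bit defeat the vote, which is why upsets should be repaired before they accumulate.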
2.2.2 Temporal Redundancy
A variant of TMR provides redundancy in the time domain. This technique is generally used to mitigate errors in inputs or in combinatorial logic. It functions by sampling the signal at different points in time, as shown in Figure 2.2. A delay element is added to CLKA, and the delayed clock is used to clock elements B and C. Assuming positive edge-triggered flip-flops, if an error is present in the input during the rising edge of CLKA, it gets latched in block A. However, the transient dies off before blocks B and C are affected, and hence the error can be voted out. This scheme requires fewer elements than spatial TMR, but pays the price in terms of lower speed.
Figure 2.2: Temporal Redundancy
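The sampling scheme of Figure 2.2 can be modelled behaviourally as follows. The sketch assumes an idealized one-bit signal that is inverted during a transient of given start time and width, and it follows the text in clocking B and C from the same delayed edge; the function names and time units are hypothetical.

```python
def signal_with_transient(true_value: int, glitch_start: float, glitch_width: float):
    """Return a sampler for a 1-bit signal that is inverted during the glitch."""
    def sample(t: float) -> int:
        in_glitch = glitch_start <= t < glitch_start + glitch_width
        return true_value ^ 1 if in_glitch else true_value
    return sample


def temporal_vote(sample, t_edge: float, delay: float):
    """Sample at the CLKA edge (A) and at the delayed edge (B and C), then vote."""
    a = sample(t_edge)
    b = sample(t_edge + delay)
    c = sample(t_edge + delay)
    voted = (a & b) | (b & c) | (a & c)
    return voted, (a, b, c)


if __name__ == "__main__":
    # Transient shorter than the clock delay: A latches the error, B and C do not.
    short = signal_with_transient(1, glitch_start=9.5, glitch_width=1.0)
    print(temporal_vote(short, t_edge=10.0, delay=2.0))
    # Transient longer than the clock delay: all samples are corrupted.
    wide = signal_with_transient(1, glitch_start=9.5, glitch_width=3.0)
    print(temporal_vote(wide, t_edge=10.0, delay=2.0))
```

In this model the error is voted out only when the clock delay exceeds the transient width, which suggests why the inserted delay, and with it the loss of speed, has to grow with the longest transient the design must tolerate.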
2.2.3 Scrubbing (Reconfiguration)
The use of hardware redundancy by itself is not sufficient to avoid errors in the FPGA; the bitstream must be reloaded regularly to avoid the accumulation of faults [4]. This continuous reload of the bitstream is called scrubbing. It uses inherent functions of these devices, such as readback and reconfiguration, to provide SEU mitigation, and works in two phases: detection and correction. There are two methods to do this. The first performs a readback operation that reads the current data in the configuration memory. A bit-by-bit comparison is then carried out between the readback file and the original (golden) bitstream to determine the number and position of the upsets. The device is then scrubbed to remove the faults by reconfiguring it with the golden configuration [8]. Several modifications to this flow can speed up the process. The easiest is to skip the detection phase altogether and scrub the device periodically at a predefined rate called the scrub rate. This has the added advantage of faster upset correction as well as reduced overhead, since the detection process requires additional memory to perform the comparison. In the second method, the error is detected using the ECC facility [32][34] of the FPGA and corrected using partial or full reconfiguration. This gives the designer the option to scrub only the affected frames instead of the entire device, greatly lowering the scrubbing duration. The scrub rate needs to be carefully selected depending on the estimated SEU rate. Ideally it should be high enough to correct a fault before the arrival of another upset. Scrubbing is commonly used in conjunction with the other mitigation techniques discussed earlier, like TMR, to give a higher degree of immunity against SEUs.
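The detection phase of the first method, a bit-by-bit comparison against the golden bitstream, can be sketched as follows. This is a minimal illustration on byte strings, assuming the readback data and the golden data have the same length and layout; the function name and the sample data are hypothetical, and bits that legitimately change at run time are ignored.

```python
def find_upsets(readback: bytes, golden: bytes):
    """Return the bit positions at which the readback differs from the golden data."""
    if len(readback) != len(golden):
        raise ValueError("readback and golden data must have the same length")
    positions = []
    for byte_index, (r, g) in enumerate(zip(readback, golden)):
        diff = r ^ g                      # set bits mark flipped cells
        for bit in range(8):
            if diff & (1 << bit):
                positions.append(byte_index * 8 + bit)
    return positions


golden = bytes([0b10101010, 0b11110000, 0b00001111])
readback = bytes([0b10101010, 0b11010000, 0b00001111])  # one bit flipped in byte 1
upsets = find_upsets(readback, golden)
print(len(upsets), upsets)
```

The length of the returned list gives the number of upsets and its entries give their positions, which is the information the scrubber needs before reloading the golden configuration.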
Chapter 3 Implementation of Mitigation using TMR & PR
3.1 Basics of Triple Modular Redundancy
Several well-known and proven SEU mitigation techniques can be implemented in hardware logic for digital microcircuits, as discussed in the previous chapter. All of these techniques are required to work with existing semiconductors. To keep the trade-offs of cost versus functionality in check, component-level SEU mitigation techniques prove beneficial in reducing radiation-induced failures. One requirement common to these techniques is the ability to perform a localized reset of the specific circuitry through an external reset mechanism, typically implemented with an SEU-tolerant FPGA. Triple voting, termed Triple Modular Redundancy (TMR), is the most common technique and the easiest to implement. TMR uses redundant hardware to mask circuit faults. A circuit protected by TMR in its most basic form has three redundant copies of the original circuit and a majority voter. A fault in any one of the three replicas of the original circuit does not produce an error at the output, because the majority voter selects the correct output from the other two replicas. Triplicated voters are often used to avoid a single point of failure. The basic structure of the voter circuit and its truth table are shown in Figure 3.1. The basic concept of triple redundancy is that a sensitive circuit can be hardened against SEUs by implementing three copies of the same circuit and performing a bit-wise majority vote on the outputs of the triplicated circuit. The function of the majority voter is to output the logic value (1 or 0) that corresponds to at least two of its inputs. For example, if two or more of the voter's inputs are 1, then the output of the voter is 1. If the inputs of the voter are labeled A, B, and C, and the output V, then the Boolean equation for the voter is V = AB + AC + BC [9].
Figure 3.1: Majority Voter Circuit and Truth Table
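The voter equation V = AB + AC + BC can be checked with a few lines of Python. This is only a software illustration of the Boolean function; the function name is chosen here for the example. Because the operators are bit-wise, the same function votes on whole words as well as on single bits.

```python
from itertools import product


def majority_vote(a: int, b: int, c: int) -> int:
    """Bit-wise majority: V = AB + AC + BC, applied to every bit position."""
    return (a & b) | (a & c) | (b & c)


# Truth table for single-bit inputs.
for a, b, c in product((0, 1), repeat=3):
    print(a, b, c, "->", majority_vote(a, b, c))

# Bit-wise vote on 8-bit words: one replica has a single flipped bit.
good = 0b10110010
bad = good ^ 0b00001000
print(bin(majority_vote(good, bad, good)))
```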
3.2 Granularity of TMR
There are basically three types of TMR, according to the granularity of module triplication implemented in the design.
1. Local TMR: Triplicates sequential elements only (flip-flops, shift registers, block RAMs and sequential DSPs). Reduces the occurrence of SEEs from frequency-dependent SET capture. Clock trees, global routes and I/Os are still susceptible [15].
2. Distributed TMR: Triplicates sequential and combinational logic (global routes and I/Os are not triplicated). Reduces SEE occurrence. Clock trees, global routes and I/Os are still susceptible [15].
3. Global TMR: Triplicates the entire design, including global buffers (sequential elements, combinational logic, voters and global buffers). Gives a very high level of radiation protection and is the preferred scheme for commercial SRAM-based FPGAs [15].
3.3 Various TMR Implementation Methodology
3.3.1 TMR with Single Voter
Here the design module is triplicated, as shown in figure 3.2, but a single voter is used and is not itself triplicated. In practice, the three signals will not always be the same because of SEUs. In particular, when a sequential circuit is used, the outputs of the replicas change independently of each other. Over time this can corrupt all three inputs to the voter, and the condition shown in figure 3.3 can be observed [9]. The implementation of this method is shown in section 6.1, figure 6.1.
Figure 3.2: TMR of One-Bit Counter
Figure 3.3: Single Voter TMR Counter with Sequential SEUs
3.3.2 TMR with Triplicated Voter
To avoid the problem discussed in the previous section, the voter is also triplicated along with the design module for sequential circuits, as shown in figure 3.4. The behaviour of this design is shown in figure 3.5 [9]. Section 6.1 (figure 6.1) shows the implementation of this method for a one-bit counter.
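The difference between the two schemes can be seen in a small behavioural model of a one-bit (toggle) counter. This is a sketch under simplifying assumptions: the voters are treated as fault-free, an upset flips one flip-flop at the start of a cycle, and the upset schedule is made up for illustration. It is not the VHDL used in this work.

```python
def vote(a, b, c):
    """Majority of three bits."""
    return (a & b) | (a & c) | (b & c)


def step_single_voter(state):
    """Each replica toggles its own value; only the final output is voted."""
    return [s ^ 1 for s in state]


def step_triple_voter(state):
    """Each replica toggles the voted value fed back from its own voter."""
    v = vote(*state)          # the three voters compute the same function
    return [v ^ 1] * 3


def run(step, upsets, cycles=6):
    """Simulate the counter; `upsets` maps a cycle number to the replica hit."""
    state, outputs = [0, 0, 0], []
    for cycle in range(cycles):
        if cycle in upsets:
            state[upsets[cycle]] ^= 1     # SEU flips one flip-flop
        outputs.append(vote(*state))
        state = step(state)
    return outputs


upsets = {1: 0, 3: 1}                     # two sequential SEUs in different replicas
print("fault-free   :", run(step_single_voter, {}))
print("single voter :", run(step_single_voter, upsets))
print("triple voter :", run(step_triple_voter, upsets))
```

With a single voter the first upset is masked but never repaired, so the second upset corrupts the voted output. With voted feedback each replica is resynchronised on the next clock edge, so the two sequential upsets never coincide.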
3.3.3 Temporal Data Sampling
Figure 3.4: Triple Voted TMR of One-Bit Counter
Figure 3.5: Triple Voted TMR Counter with Sequential SEUs
The first key step in the newly proposed technique is Temporal Data Sampling. A simple embodiment of Temporal Data Sampling is shown in Figure 3.6. The circuit consists of five edge-sensitive flip-flops. Each flip-flop operates in Sampling Mode when its clock signal is high and in Blocking Mode when its clock signal is low. In Blocking Mode the flip-flop holds its data and data changes are blocked. In Sampling Mode the flip-flop is transparent to the incoming data [1]. The Temporal Sampling stage stores data samples at different time intervals. These samples are used in the voting logic to eliminate single event upsets. Three different clocks (CLKA, CLKB and CLKC) are used. These three clocks are derived from the main clock and have a 90-degree phase shift and a 25% duty cycle to cope with SEEs, as shown in Figure 3.7.
Figure 3.6: Proposed Temporal Data Sampling
If an SEU is observed on any one of the clock lines, the phase shift in the remaining clock signals helps the respective flip-flops store the correct data at different time intervals, thereby voiding the effect of a spurious radiation-induced glitch on the clock line. Any transient due to radiation lasts for a short period of time; if it occurs at the negative edge of any clock signal, it will die out before the other temporal flip-flops start their operation, because of the phase shift between the clock signals. Therefore, this clocking scheme helps to cope with single event transients either on the data line or on any one of the clock signals. A conventional (SEU-susceptible) sequential circuit satisfies its timing constraints when the maximum combinatorial logic transition time is less than the period of the master clock minus the setup time of the D flip-flop. In the proposed technique, data is released on the edge of CLKC and must reach the next sampling stage before the edge of CLKA (minus the setup time). The two extra clock phases (CLKB and CLKC) are inserted for the additional temporal sampling. Figure 3.7 illustrates the clocking scheme. The implementation of this method is shown in figure 6.5. The effective on-chip computational frequency is exactly one half the frequency of the master clock. A speed penalty of a factor of two is therefore incurred to ensure complete upset immunity.
Figure 3.7: Clocking Scheme for Temporal Sampling
The proposed design technique has two stages: data sampling and data release. Flip-flops L1, L3 and L5 constitute the data sampling stage, while L2, L4 and L5 constitute the data release stage; flip-flop L5 is common to both stages. The sampling-stage flip-flops capture data at different time intervals based on their respective clock signals. CLKC serves as the sampling clock as well as the sample release clock. For any given data, two samples are stored at different time intervals (CLKA, CLKB). The third sample is stored at time t (CLKC), and at the same time the previously stored samples are released to the majority voting logic along with this sample [1].
3.3.4 Optimal Design of TMR
The majority voters perform a very important task in the TMR approach, because they block the effects of an upset from propagating through the logic to the final output. The voters can therefore be placed at the end of the combinational and sequential logic blocks, creating barriers for the upset effects [3]. If an upset occurs in one of the redundant combinational logic parts (LUTs or routing), its effect remains until the next load of the bitstream. Constant reconfiguration of the device avoids the accumulation of upsets in the programmable matrix. This continuous loading of the bitstream is called scrubbing, and it does not interrupt the application. It is important to notice that in throughput logic structures composed of registers, the only way to correct an upset in a register is to load new data at the input of the register, or to implement a refreshing structure with voters (as shown in figure 3.8). In the case of registers, it is not possible to load the configuration bitstream without interrupting the application, because the correct state of the register cannot be saved and loaded by the bitstream.
Figure 3.8: TMR register with voters and scrubbing
State machine logic is any structure where a registered output, at any register stage within the module, is fed back into any prior stage within the module, forming a registered logic loop. This structure is used in accumulators, counters, or any custom state machine or state sequencer where the state of the internal registers depends on its own previous state. In this case, it is necessary to triplicate the logic and to have majority voters at the outputs. The register must not be locked in a wrong value, and for this reason there is a voter for each redundant logic part in the feedback path, making the system able to recover by itself. The structure presented in figure 3.8 can be used. One majority voter can be implemented in one LUT. Because the LUT can be upset (a permanent effect), the voters are also triplicated. In this way, if one voter is upset, there are still two voters working properly. The primary purpose of using a TMR design methodology is to remove all single points of failure from the design [3].
Since full triple modular redundancy triplicates every logic path, the TMR output majority voters inside the output logic block allow the outputs to converge again to one signal outside the FPGA.
3.4 Basics of Partial Reconfiguration
In researching and developing a system capable of detecting and gracefully recovering from errors in its circuitry, many new technologies and concepts were encountered; partial reconfiguration of the FPGA is one of them. Partial reconfiguration is the capability of reprogramming a portion of an FPGA while the rest of the device does not change. It is a relatively new and uncommon technology in FPGA design. To create a system that has no down time while repairing modules in error, partial reconfiguration must be used to reprogram these modules without affecting the rest of the system. Certain areas of a device can be reconfigured while other areas remain operational and unaffected by the reprogramming; partial reconfiguration is done while the device is active.
Reconfiguration of the FPGA (partial or full) is one of the solutions against SEUs. There are two possibilities. One is to reconfigure the device at a frequent rate (the frequency of reconfiguration must be higher than the frequency of occurrence of errors). The other is to reconfigure the device as and when an error has been detected. This chapter describes the development of the first method, and chapter 6 shows the implementation and results of the designed system. Chapter 5 gives the details of the second method, which we call the running repair strategy [10].
Normally, reconfiguring an FPGA requires it to be held in reset while an external controller reloads a design onto it. Partial reconfiguration allows critical parts of the design to continue operating while a controller, either on the FPGA or off it, loads a partial design into a reconfigurable module. Partial reconfiguration can also be used to save storage space for multiple designs by storing only the partial designs that change between them. Xilinx is one of the few producers of reconfigurable devices in the market. Its Virtex series of FPGAs provides an important feature called partial reconfiguration. Partial reconfiguration can be divided into two groups:
1. Dynamic (or active) partial reconfiguration changes part of the device while the rest of the FPGA is still running. The user design is not suspended, and no reset or start-up sequence is necessary.
2. Static (or shutdown) partial reconfiguration is performed while the device is not active. While the partial data is sent into the FPGA, the rest of the device is stopped (in shutdown mode) and is brought up after the configuration is completed. The non-reconfigurable area of the FPGA is held in reset, and the FPGA enters the start-up sequence after partial reconfiguration is completed.
3.5 Methods of Partial Reconfiguration
3.5.1 Module Based Partial Reconfiguration
Module based partial reconfiguration makes it possible to reconfigure distinct modular parts of the design. To ensure communication across the reconfigurable module boundaries, special bus macros have to be prepared (these are explained in section 3.6). A bus macro works as a fixed routing bridge that connects the reconfigurable module with the rest of the design. Module based partial reconfiguration requires following a set of specific guidelines at the design specification stage. Finally, a separate bitstream is created for each reconfigurable module of the design. This bitstream is used to perform the partial reconfiguration of the FPGA. Module based design typically requires floor planning to specify where all of the reconfigurable modules will be placed on the physical layout of the FPGA. When the device is to be reprogrammed, the entire reconfigurable module is overwritten with the new partial bitstream [17].
I. Multi-Column Partial Reconfiguration, Independent Designs: For designs where the modules are completely independent (no common I/O except for clocks; no communication between modules). In this case, reconfiguring one module does not affect the operation of another module. The Module Based Partial Reconfiguration flow is used for these designs.
II. Multi-Column Partial Reconfiguration, Communication between Designs: For modules that communicate with each other, a special bus macro allows signals to cross a partial reconfiguration boundary. Without this special consideration, inter-module communication would not be feasible, as it is impossible to guarantee routing between modules. The bus macro provides a fixed bus for inter-design communication. Each time partial reconfiguration is performed, the bus macro is used to establish unchanging routing channels between modules, guaranteeing correct connections. The Module Based Partial Reconfiguration flow is used for these designs.
Module Based Design Flow: Similar to the standard design flow, this flow comprises the following steps. The detailed flow of this approach, with the required tool sets, is shown in Appendix B.
1. Design Entry and Synthesis: Both the top-level design and the modules are created using an HDL (Verilog or VHDL) or any other established design entry method. They can be synthesized with the Xilinx synthesis tool, Xilinx Synthesis Technology (XST), which produces a netlist in NGC format.
2. Design Implementation: This step consists of three phases:
(1) Initial Budgeting - Creating the constraints and floor plan for the top-level design.
(2) Active Module Implementation - Implementing the top-level design with one module expanded at a time.
(3) Final Assembly - Assembling the top-level design and all implemented modules into a complete design.
Figure 3.9: Module Based Partial Reconfiguration Design Flow
This flow is based on the Xilinx Modular Design methodology, which allows large designs to be partitioned into self-contained modules that can be developed independently and in parallel to save time. Later on, the implemented modules are merged into one complete FPGA design [17]. Figure 3.9 shows an overview of this flow.
3.5.2 Difference Based Partial Reconfiguration
Difference based partial reconfiguration can be used when a small change is made to the design. It is especially useful for changing LUT equations or the contents of dedicated memory blocks. The partial bitstream contains only the differences between the current design structure (which resides in the FPGA) and the new content of the FPGA. There are two kinds of difference based reconfiguration, known as front-end and back-end. The first is based on modifying the design in a hardware description language (HDL), and therefore requires the synthesis and implementation processes to be repeated in full. Back-end difference based partial reconfiguration makes changes at the implementation stage of the prototyping flow, so there is no need to re-synthesize the design. Either method (front-end or back-end) leads to the creation of a partial bitstream that can be used for partial reconfiguration of the FPGA [11].
3.6 Bus Macro Communication
Reconfigurable modules communicate with other modules, both fixed and reconfigurable, through a special bus macro. Figure 3.10 shows how the bus macro is used between fixed logic and PR logic. The bus macro is required to facilitate communication across reconfigurable module boundaries while still conforming to the partial reconfiguration requirement that routing resources across such boundaries be completely fixed and static. In Figure 3.11 the left half, A, is one module and the right half, B, is another. Either A, B or both could be partially reconfigurable. To support communication between modules A and B, a special bus macro is used. Partial reconfiguration requires that signals used as communication paths between or through reconfigurable modules use fixed routing resources. That is, the routing resources used for such intermodule signals must not change when a module is reconfigured. The bus macro is a pre-routed hard macro that specifies the exact routing channels and does not change from compilation to compilation. For each of the different design implementations, there is absolutely no variation in the bus macro routing. Route locking is required because if any of the designs chose a different routing for the bus macro, it would not align properly with the other designs and the communication between the two halves would effectively be broken.
Figure 3.10: Bus macro used between PR Logic and Fixed Logic
The current implementation of the bus macro uses eight 3-state buffers (TBUFs) connected in an arrangement that allows one bit of information to travel either left-to-right or right-to-left, using one TBUF long line per bit, as shown in figure 3.11. Each row of the device can support four bits of a bus macro. The bus macro position exactly straddles the dividing line between designs A and B, using four columns of TBUFs on the A side and four columns of TBUFs on the B side. Design A connects only to the TBUFs in the two or four columns on the Design A side. Likewise, Design B connects only to the TBUFs in the two or four columns on the Design B side. The pre-routed fixed bridge consists of the TBUF output long lines, which ensures reliable communication between the two sides. Figure 3.12 shows the physical implementation of a bus macro.
Figure 3.11: Bus Macro between two Regions
Figure 3.12: Physical Implementation of Bus Macro
The bus macro must be physically locked in such a way that it straddles the boundary line between A and B, and it must be locked in exactly the same position for all compilations. The bus macro can be wired so that signals go in either direction (left-to-right or right-to-left). It is strongly recommended that once the direction is defined, it should not change for that particular FPGA design. Bus macro signals should be neither bidirectional nor reconfigurable. The number of bus macro communication channels is limited by the number of horizontal long line routing resources available in each CLB tile.
3.7 Programming Medium for Configuration
Partial reconfiguration can be carried out externally, through a JTAG connection to a PC, or internally, through custom logic or an on-board processor. Partially reprogramming the FPGA through internal circuitry, referred to as self-reconfiguration, is a much more useful method since it eliminates the need for an external PC. The partial bitstreams are stored in memory and are written to the Internal Configuration Access Port (ICAP) of the FPGA in order to reconfigure the specified region of the device with the new logic [5][18]. There are two methods for reconfiguring the device, as shown in figure 3.13.
Figure 3.13: Methods of FPGA Reconfiguration
One is self-reconfiguration and the other is external reconfiguration. The first method requires the use of the ICAP or SelectMAP together with a controller, while in the latter case the configuration is carried out through JTAG, UART or other methods of external reconfiguration of the FPGA.
SelectMAP Interface: The SelectMAP interface provides an 8-bit, 16-bit, or 32-bit bidirectional data bus interface to the FPGA configuration logic. It can be used either in Master mode, with the CCLK signal as an output from the FPGA, or in Slave mode, with the CCLK signal as an input. In Slave mode, SelectMAP allows both configuration and readback, while in Master mode only configuration is possible. The clock is therefore set up to be generated outside the FPGA (Slave mode), since readback may be required for user verification purposes.
Internal Configuration Access Port (ICAP): The ICAP is the element provided by Xilinx that allows self-reconfiguration in Xilinx devices. It is almost the same as the SelectMAP interface explained previously. The fixed part of the FPGA needs a mechanism to reprogram the reconfigurable part; this mechanism is provided by Xilinx in some of its FPGA families. The ICAP interface is a subset of the SelectMAP interface, and it allows the internal logic to access the configuration data of the FPGA. The ICAP is located in the lower right-hand corner of the FPGA, which introduces a restriction on our system: the fixed part responsible for reprogramming the FPGA must be located on the right-hand side. The ICAP can accept data at up to 50 MHz without a handshaking protocol, and by controlling the CCLK input, data can be downloaded at lower rates. The ICAP accepts only partial bitstreams because it cannot stop the FPGA during the reconfiguration process. To manage the ICAP from the processor (PowerPC or MicroBlaze), a controller is connected to the system bus. This controller has two registers mapped into the memory of the microprocessor: a data register, which holds the data to be written to or read from the ICAP, and a control register, which indicates when the transfer has to begin or when it has finished. The ICAP controller module tracks whether the process has completed according to the configuration command, and in this way controls the configuration of the FPGA. The OPB [19]/PLB [22] controller modules handle the bus interface between the ICAP and the processor IP core [20] and update the FPGA according to the configuration. The partial bitstreams are stored in BRAM or SRAM as the application requires, which reduces the reconfiguration time of the FPGA and increases the performance of dynamic partial reconfiguration.
Bitstream flow using ICAP: The partial bitstream is stored in external storage memory and transferred to the FPGA. Figure 3.14 shows how the bitstream is transferred from the external memory into the FPGA.
Figure 3.14: Bitstream Flow for the reconguration
Embedded Processor: Reconfiguration requires some soft IP cores [18] in order to function correctly. A self-reconfiguring system on a Xilinx Virtex FPGA is implemented by making use of the ICAP. A partial bitstream is written to the ICAP, which then reconfigures the specified portions of the FPGA with the new logic. Communication with the ICAP can be implemented either through a soft-core processor (IP) or through custom VHDL logic. Basically, two processor cores are available:
1. PowerPC [28]
2. MicroBlaze [26]
For this project, the MicroBlaze soft-core processor has been used to implement the design. The MicroBlaze embedded processor soft core is a reduced instruction set computer (RISC) optimized for implementation in Xilinx Field Programmable Gate Arrays (FPGAs). It is highly configurable, allowing the designer to select the specific set of features required by the design.
3.8 Development of a Self Reconfigurable System
Figure 3.15: System Architecture
The system has been developed as a proof of concept of DPR and of how it can be used to mitigate SEUs. Figure 3.15 shows the block diagram of the developed system. To demonstrate the concept and to explore the efficiency of partial reconfiguration, one reconfigurable region (RR) and two reconfigurable modules (RMs) have been developed. The reconfigurable modules are samplers: one is a digital upsampler and the other is a digital downsampler. They were designed and configured using dynamic partial reconfiguration. The system accepts data serially at its input and produces serial output at a different sampling rate, depending on which reconfigurable module is currently loaded into the FPGA.
Figure 3.16: Up Sampler block diagram
Figure 3.16 shows the block diagram of the up sampler (RM) and figure 3.17 shows the block diagram of the down sampler. The RTL schematic and the simulation results of the up sampler are shown in figures 6.6 and 6.7 respectively, and those of the down sampler in figures 6.8(a) and 6.8(b). Figure 3.18 shows the interconnection of the various blocks in the self-reconfigurable system. At present, reconfiguration is controlled over a serial link from HyperTerminal. An algorithm can be developed to reconfigure the system automatically at a desired rate, which is the SEU mitigation method discussed in section 2.2.
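The behaviour of the two reconfigurable modules can be sketched in software. The text does not state the sampling factor or the exact sampler structure, so the following assumes the textbook definitions (zero insertion for upsampling, sample dropping for downsampling) and a factor of 2. The function names and the `MODULES` table are hypothetical and only mimic the idea that the output depends on which module is currently loaded.

```python
def upsample(samples, factor):
    """Insert factor - 1 zeros after every input sample."""
    out = []
    for s in samples:
        out.append(s)
        out.extend([0] * (factor - 1))
    return out


def downsample(samples, factor):
    """Keep every factor-th sample, starting with the first."""
    return samples[::factor]


# Hypothetical stand-in for the reconfigurable region: one module is "loaded" at a time.
MODULES = {"up": upsample, "down": downsample}


def process(stream, loaded_module, factor=2):
    """Run the serial stream through whichever module is currently loaded."""
    return MODULES[loaded_module](stream, factor)


data = [1, 2, 3, 4]
print(process(data, "up"))
print(process(data, "down"))
```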
Figure 3.17: Down Sampler block diagram
Figure 3.18: Interconnection of the reconfigurable system
The following terminology is used in PR:
(i) Partial Reconfigurable Region (PRR): The area defined as a reconfigurable region, out of the entire device area, during floorplanning.
(ii) Reconfigurable Module (RM): The different designs or modules that are loaded into the FPGA during PR.
(iii) Static Logic: The fixed part of the design.
(iv) Bus Macro: As defined in section 3.6.
(v) Partial Bitstream: The bit file stored on the external CompactFlash card.
(vi) Merged Bitstream: The final bit file, which contains the hardware and software parts of the design.
The implementation results of the system are shown in chapter 6. A detailed analysis of the area and timing requirements of DPR has also been carried out, and its results are also given in chapter 6.
Chapter 4 Error Detection and Correction
4.1 Configuration Memory
Like any other RAM, the configuration memory of an FPGA is partitioned into words, also called frames, which represent the smallest addressable unit of the memory for write and read operations. Values stored in static memory cells control the configurable logic elements and interconnect resources. These values are loaded into the memory cells on power-up, and can be reloaded if necessary to change the function of the device. The configuration memory cells lie close to the specific functions they control and are laid out in a regular pattern. A data-frame is a 1-bit-wide slice of the memory array along the vertical axis. Configuration data is written to the configuration memory from the configuration registers one data-frame at a time. Therefore, the smallest portion of configuration data that may be read from, or written to, the configuration memory is one data-frame. As shown in figure 4.1, a single data-frame contains portions of the configuration data for each and every block that lies in that column. Hence, multiple data-frames are required to describe the complete width of a column. In order to read and write individual data-frames, each must be uniquely addressed by the configuration logic [6].
Figure 4.1: Configuration Memory Data Frame
Virtex-5 frames consist of 1,312 bits [34][32]. Each frame includes a 12-bit field (bits 640 to 651), consisting of 11 Hamming bits and an overall parity bit for the frame data, which provides single error correction (SEC) as well as double error detection (DED) for the frame data, and 16 unused bits (bits 656 to 671). The parity and Hamming bits are generated outside the FPGA by the configuration bitstream generation software and are subsequently downloaded with the application-specific configuration data. However, system memory data that is subject to change during the operation of the FPGA, such as the contents of block RAMs and look-up tables (LUTs) used as distributed RAMs, is not covered by the parity and Hamming bits [35][21].
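The frame layout quoted above can be checked with some simple arithmetic. The sketch below assumes that frame bits are numbered consecutively and grouped into 32-bit words with word 0 first; that numbering convention and the helper name are assumptions made for illustration.

```python
FRAME_BITS = 1312
WORD_BITS = 32

ECC_FIELD = range(640, 652)      # bits 640..651: 11 Hamming bits + overall parity
UNUSED_FIELD = range(656, 672)   # bits 656..671: unused

words_per_frame, remainder = divmod(FRAME_BITS, WORD_BITS)


def word_and_offset(bit_index):
    """Map a frame bit index to (word index, bit offset inside that word)."""
    return divmod(bit_index, WORD_BITS)


print(words_per_frame, remainder)
print(word_and_offset(ECC_FIELD[0]), word_and_offset(ECC_FIELD[-1]))
print(word_and_offset(UNUSED_FIELD[0]), word_and_offset(UNUSED_FIELD[-1]))
```

Under this numbering, a frame is a whole number of 32-bit words, and both the ECC field and the unused bits fall inside the same word in the middle of the frame.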
4.2 FRAME_ECC Primitives
Virtex-5 provides a specialized primitive called FRAME_ECC (Error Correcting Code), shown in figure 4.2, for the detection and identification of single-bit and double-bit errors in the configuration frame data [1][32]. It uses SECDED (Hamming code) parity values based on the frame data generated by BitGen (the configuration bitstream generation software).
Figure 4.2: FRAME ECC primitive (Virtex-5)
During readback (each frame read from the configuration memory), the Frame ECC module calculates the Hamming bits as well as the overall parity for the frame data, and compares these bits with the Hamming bits and parity for that frame stored in the configuration memory. Based on this comparison, the Frame ECC module produces indications for the no-error, single-bit error, and double-bit error conditions, in addition to a syndrome indicating the location of single-bit errors. If the bits have not changed from the originally programmed values, then the syndrome bits are all 0s. If a single bit has changed, including any of the ECC bits, then the location of the bit is indicated by syndrome bits 10 to 0 and syndrome bit 11 is 1. If two bits have changed, then syndrome bit 11 is 0 and the remaining bits are non-zero and meaningless. If more than two bits have changed, then the syndrome bits are indeterminate. The error output of the block is asserted if one or two bits have changed, indicating that action needs to be taken. The syndrome bits S[10:0] are derived from the Hamming parity bits, while S[11] is derived from the overall parity bit. Table 4.1 shows the interpretation of the syndrome bits [32].
S[11]    S[10:0]    Interpretation
0        = 0        No error
1        ≠ 0        Single bit (SED) error; S[10:0] denotes location of bit to patch (indirectly)
1        = 0        Single-bit error; overall parity bit p[11] is in error
0        ≠ 0        Double-bit error, not correctable
Table 4.1: Interpretation of syndrome bits
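The syndrome interpretation described above can be written as a small decision function. This is a software sketch of the rules stated in the text, not the logic of the primitive itself; the function name and the returned strings are chosen here for illustration.

```python
def classify_syndrome(syndrome: int) -> str:
    """Interpret a 12-bit Frame ECC syndrome S[11:0]."""
    if not 0 <= syndrome < (1 << 12):
        raise ValueError("syndrome must be a 12-bit value")
    parity = (syndrome >> 11) & 1      # S[11], from the overall parity bit
    hamming = syndrome & 0x7FF         # S[10:0], from the Hamming parity bits
    if parity == 0 and hamming == 0:
        return "no error"
    if parity == 1 and hamming != 0:
        return "single-bit error at S[10:0]"
    if parity == 1 and hamming == 0:
        return "single-bit error in overall parity bit"
    return "double-bit error, not correctable"


for value in (0x000, 0x800 | 704, 0x800, 704):
    print(f"{value:#05x}: {classify_syndrome(value)}")
```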
System memory contents (block RAMs and LUT RAMs, for example) are masked from the internal parity and Hamming calculation by the Frame ECC. Table 4.2 [1][34] summarizes the type of error as a function of the Hamming bits and the parity bit while the syndrome valid signal is asserted high.
Type of Error Condition(syndrome valid = 1) No bit error Hamming match, no parity error 1-bit correctable error (SEC) Hamming mismatch, parity error 2-bit error detection (DED) Hamming mismatch, no parity error Table 4.2: Frame ECC Error Codes
Repair implementation using FRAME ECC

To use the Frame ECC logic, the FRAME_ECC_VIRTEX5 primitive must be instantiated in the user's design, and readback must be performed through SelectMAP, JTAG, or ICAP. At the end of each frame of readback, the syndrome valid signal is asserted for one cycle of the readback clock. The number of cycles required to read back a frame varies with the interface used. The FRAME_ECC_VIRTEX5 logic does not repair changed bits; this requires a user design. The design must be able to store at least one frame of data, or be able to fetch original frames of data for reload. The repair implementation steps are as follows:
(I). A frame is read out through ICAP and stored in block RAM. The frame address must be generated as each frame is read.
(II). If an error is indicated by the error output of the FRAME ECC block, readback is halted and the syndrome value is saved. If bit 11 is 0, the whole frame must be restored. If bit 11 is 1, bits 10:0 are used to locate the error bit in the saved frame, and the bit is inverted.
(III). The repaired frame is written back to the frame address generated in step (I).
(IV). Readback then begins again with the next frame address.
In case of a single-bit error in the frame data, the syndrome bits S[10:0] point to the flipped bit in the address space from 704 (location of the first bit in the frame) to 2047 (last bit in the frame). To convert the syndrome value S[10:0] to the index of the flipped bit in the range 0 to 1311, subtract 704 decimal (2C0 hexadecimal or 01011000000 binary) if the syndrome is less than 1,024 decimal; otherwise, subtract 736 decimal (2E0 hexadecimal or 01011100000 binary). This is equivalent to subtracting 22 or 23 decimal from S[10:5]. An efficient algorithm for determining the bit offset of the error in the range 0 to 1311 is shown in Equation 4.1 [32], where S[10:0] are the Frame ECC syndrome outputs [13][34].
Offset = {S[10:5] - 6'd22 - S[10], S[4:0]}    (4.1)
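The conversion from syndrome to bit offset can be checked with a short sketch. The Python below is illustrative only: the function names are hypothetical, and the concatenation form assumes that the lower field is S[4:0], which is what the "subtract 22 or 23 from S[10:5]" description implies.

```python
def data_bit_offset(syndrome: int) -> int:
    """Offset (0..1311) of a flipped data bit, using the subtraction rule."""
    s = syndrome & 0x7FF                      # keep S[10:0]
    if s < 704:
        raise ValueError("S[10:0] below 704 does not address a data bit")
    return s - 704 if s < 1024 else s - 736


def data_bit_offset_concat(syndrome: int) -> int:
    """Same offset, computed as {S[10:5] - 22 - S[10], S[4:0]}."""
    s = syndrome & 0x7FF
    upper = (s >> 5) - 22 - ((s >> 10) & 1)   # S[10:5] minus 22 or 23
    return (upper << 5) | (s & 0x1F)          # append S[4:0] unchanged
```

Both forms map the first frame bit (syndrome 704) to offset 0 and the last frame bit (syndrome 2047) to offset 1311.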
If the binary value of syndrome[10:0] is 0 or a power of 2, then the error is located in one of the parity bits, in which case the location of the bit error is determined as shown in Table 4.3 [34].
SYNDROME[11:0] | Offset
100000000001 | 640
100000000010 | 641
100000000100 | 642
100000001000 | 643
100000010000 | 644
100000100000 | 645
100001000000 | 646
100010000000 | 647
100100000000 | 648
101000000000 | 649
110000000000 | 650
100000000000 | 651
Table 4.3: Parity Bit Error Diagnosis
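The parity-bit case follows a regular pattern: a single set bit k in S[10:0] maps to offset 640 + k, and S[10:0] = 0 (overall parity bit) maps to 651. A minimal Python sketch of that lookup, with a hypothetical function name, is:

```python
def parity_bit_offset(syndrome: int) -> int:
    """Offset of an erroneous parity bit for a 12-bit syndrome S[11:0].

    Only valid when S[11] = 1 and S[10:0] is 0 or a power of two.
    """
    if not (syndrome >> 11) & 1:
        raise ValueError("S[11] must be 1 for a single-bit error")
    s = syndrome & 0x7FF
    if s == 0:
        return 651                       # overall parity bit
    if (s & (s - 1)) != 0:
        raise ValueError("S[10:0] is not a power of two: data bit error")
    return 640 + s.bit_length() - 1      # Hamming parity bit k -> 640 + k
```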
4.2.1 Readback CRC Algorithm
The readback CRC in the Virtex-5 FPGA operates as follows:
(I). Configuration data is read back continuously in the background of the user design.
(II). Dedicated logic performs this readback and checks the CRC of the configuration memory content.
(III). The readback CRC value of the first round is latched as the golden value for later comparison.
(IV). The readback CRC values of subsequent rounds are compared against the golden value.
(V). When a CRC mismatch is found, the CRCERROR pin of the FRAME_ECC_VIRTEX5 primitive is driven high.
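The golden-value scheme in the steps above can be modelled in software. The sketch below is a behavioural illustration only: the class and attribute names are hypothetical, and Python's zlib.crc32 is used as a stand-in because the CRC computed by the device's dedicated hardware is not specified here.

```python
import zlib


class ReadbackCrcMonitor:
    """Behavioural model: latch the first CRC, compare later rounds to it."""

    def __init__(self):
        self.golden = None       # CRC latched after the first readback round
        self.crc_error = False   # models the CRCERROR indication

    def scan(self, frames):
        """Run one readback round over an iterable of frame byte strings."""
        crc = 0
        for frame in frames:
            crc = zlib.crc32(frame, crc)   # accumulate CRC over all frames
        if self.golden is None:
            self.golden = crc              # first round: latch golden value
        else:
            self.crc_error = crc != self.golden
        return crc
```

In this model a frame is passed as 164 bytes (1312 bits); flipping any single bit between two rounds makes the second round's CRC differ from the golden value.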
4.3 Internal Configuration Access Port (ICAP)
In addition, Virtex-5 contains a 32-bit internal configuration access port (ICAP) primitive that provides access to the configuration memory from within the FPGA core. Figure 4.3 shows the ICAP primitive of the Virtex-5 FPGA. Its detailed functionality was discussed in Chapter 3.
Figure 4.3: ICAP Primitive (Virtex-5)
The next chapter describes the design of the monitor system, which uses the FRAME_ECC and ICAP primitives of the Virtex-5 FPGA.
Chapter 5 Implementation of Self Correcting System
5.1 SEU Controller macro
Since the FRAME_ECC primitive itself does not provide error correction, circuitry must be added in the FPGA fabric that uses the ICAP and Frame ECC modules to cycle through all frames and to detect and correct SEUs in the configuration memory. The Frame ECC function is performed each time a frame is read via the ICAP. In this project, an SEU controller macro is used for the detection and correction of bit errors; it uses the FRAME_ECC and ICAP primitives of Virtex-5, as shown in Figure 5.1. The macro uses the ICAP_VIRTEX5 and FRAME_ECC_VIRTEX5 primitives to clock and observe the readback CRC circuit, which performs the SEU detection as described earlier. The macro also includes a controller that connects to the other ports of these primitives to perform the operations necessary to locate and correct SEU errors using the built-in ECC facility. For test purposes, the connection to ICAP is used to facilitate the controlled injection of configuration errors.
Figure 5.1: SEU Controller macro
The functionality of all ports of the controller macro is listed below [10]:

mode[2:0]: This 3-bit value specifies the operational mode of the macro as shown in Table 5.1. The value on mode[2:0] is read by the macro only on the rising edge of clk when the mode_en signal is active High.
mode_en: Apply an active-High signal to this input to instruct the macro to read the value on mode[2:0] on the next rising edge of CLK.
rst: Active-High input that instructs the macro to reset on the rising edge of CLK. The reset clears the operational mode to 000 and initializes the error injection pointer to zero. (Macro reset is not possible while the BUSY signal is active.)
clk: Input clock used by all elements within the macro and used to clock the built-in readback CRC logic of the Virtex-5 device.
seu_detect: An active-High level on this output signifies that a configuration error has been detected by the readback CRC circuit. This signal has a direct connection to the CRCERROR output of the FRAME_ECC_VIRTEX5 primitive used within the macro.
detection_active: This output has a direct connection to the SYNDROME VALID output of the FRAME_ECC_VIRTEX5 primitive used within the macro and can be used to confirm that readback CRC is actively scanning the device configuration in order to detect SEUs.
error_inject: This active-High input should only be used in conjunction with modes 4, 5, 6, and 7 (MODE[2:0] = 100, 101, 110, or 111) to initiate the action associated with the particular mode. Each cycle of the ERROR_INJECT signal requires a handshaking interaction with the BUSY output using the following sequence:
1. Confirm BUSY is Low or wait for it to become Low.
2. Assert (drive High) ERROR_INJECT.
3. Wait for BUSY to become High.
4. Deassert (drive Low) ERROR_INJECT.
busy: Active-High output indicating that the macro is actively engaged in performing a task such as error correction or error injection. The main purpose of this signal is the handshake sequencing of the ERROR_INJECT input and determining when an operation has been completed, so that the macro is ready to continue with the same or a different task.
addr[34:0]: This 35-bit input is only used in conjunction with mode 4 (MODE[2:0] = 100) and defines the location at which an error will be injected.
error: The macro drives this output High to indicate that an operation has been unsuccessful. There are two possible reasons why error injection can fail:
1. All frames contain 16 unused bits which cannot be changed.
2. A second error injection is performed at the same location (without previously correcting it using mode 1).
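The four-step ERROR_INJECT/BUSY handshake described above can be sketched as a polling routine. The Python below is only a behavioural illustration: FakeSeuMacro is a hypothetical stand-in that raises BUSY when it sees ERROR_INJECT high and drops it a few polls later; it is not the real macro, and the project's actual software runs on MicroBlaze through GPIO.

```python
class FakeSeuMacro:
    """Hypothetical model of the macro's BUSY response to ERROR_INJECT."""

    def __init__(self, busy_polls: int = 3):
        self.error_inject = 0
        self.injections = 0          # number of operations started
        self._busy_polls = busy_polls
        self._remaining = 0          # polls left while BUSY stays High

    def poll_busy(self) -> int:
        if self._remaining > 0:      # operation in progress
            self._remaining -= 1
            return 1
        if self.error_inject:        # start a new operation
            self.injections += 1
            self._remaining = self._busy_polls
            return 1
        return 0


def wait_for_busy(macro, level: int, max_polls: int = 1000) -> None:
    for _ in range(max_polls):
        if macro.poll_busy() == level:
            return
    raise TimeoutError(f"BUSY never reached level {level}")


def pulse_error_inject(macro) -> None:
    wait_for_busy(macro, 0)          # 1. confirm BUSY is Low
    macro.error_inject = 1           # 2. assert ERROR_INJECT
    wait_for_busy(macro, 1)          # 3. wait for BUSY to become High
    macro.error_inject = 0           # 4. deassert ERROR_INJECT
```

Calling pulse_error_inject twice in a row starts exactly two operations, because the second call first waits for BUSY to return Low.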
Table 5.1 shows the different operating modes of the controller macro [10].
MODE | mode[2:0] | Description
0 | 000 | Detection only (default at power-on or following reset)
1 | 001 | Detection and automatic correction
2 | 010 | Reserved (do not use)
3 | 011 | Reserved (do not use)
4 | 100 | Inject configuration error at location = ADDR
5 | 101 | Increment the error injection pointer (ADDR)
6 | 110 | Reset the error injection index (ADDR = 0)
Table 5.1: Controller Macro Operational Modes
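For software that drives the macro, the mode encoding of Table 5.1 can be captured as an enumeration. This is a hedged sketch in Python; the member names and helper functions are hypothetical and chosen only for readability.

```python
from enum import IntEnum


class SeuMode(IntEnum):
    """Operational modes of the controller macro (values from Table 5.1)."""
    DETECT_ONLY = 0           # default at power-on or after reset
    DETECT_AND_CORRECT = 1
    INJECT_AT_ADDR = 4
    INCREMENT_POINTER = 5
    RESET_POINTER = 6


def mode_bits(mode: SeuMode) -> str:
    """Return the 3-bit pattern to drive on mode[2:0]."""
    return format(int(mode), "03b")


def uses_error_inject(mode: SeuMode) -> bool:
    """Modes 4 and above are triggered through the error inject input."""
    return int(mode) >= 4
```

The reserved modes 2 and 3 are deliberately left out so that they cannot be selected by name.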
The operation of the SEU controller macro is as follows:
(1). A 1312-bit frame of configuration memory is read through the ICAP as forty-one 32-bit words. The frame data is stored in a block RAM.
(2). If an error is indicated by the outputs of the FRAME_ECC primitive, the type of error is determined as shown in Table 4.1. If S[11] = 0 (double-bit error), the frame cannot be repaired by inverting one bit and would have to be restored as a whole; the error output of the SEU controller is latched High and readback continues with the next frame of configuration memory. If S[11] = 1 (single-bit error), the location of the bit is determined from the syndrome bits S[10:0] and the erroneous bit is corrected (i.e., inverted) in the frame data stored in the block RAM.
(3). The repaired frame is written back into the configuration memory at the same frame address from which it was read.
(4). Readback resumes with the first frame of configuration memory in the configuration column containing the newly repaired frame.
(5). When a configuration column has been completely read and repaired, the SEU controller advances to the next configuration column in the array.
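Step (2) of the operation above, the repair of a buffered frame, can be sketched as follows. This Python model is illustrative only: the function names are hypothetical, and it assumes that bit offset k of the frame is stored in word k // 32 at bit position k % 32, which is an assumption about the buffer layout and not a statement about the macro's internals.

```python
WORDS_PER_FRAME = 41   # 41 words x 32 bits = 1312 bits


def error_bit_offset(syndrome: int) -> int:
    """Map a single-bit-error syndrome to a bit offset within the frame."""
    s = syndrome & 0x7FF
    if s == 0:
        return 651                           # overall parity bit
    if (s & (s - 1)) == 0:
        return 640 + s.bit_length() - 1      # Hamming parity bit
    if s < 704:
        raise ValueError("syndrome outside the documented range")
    return s - 704 if s < 1024 else s - 736  # data bit


def repair_frame(words, syndrome: int):
    """Return (frame_words, ok); ok is False for a double-bit error."""
    repaired = list(words)
    if syndrome == 0:
        return repaired, True                # nothing to do
    if not (syndrome >> 11) & 1:
        return repaired, False               # S[11] = 0: not correctable
    offset = error_bit_offset(syndrome)
    repaired[offset // 32] ^= 1 << (offset % 32)   # invert the faulty bit
    return repaired, True
```

The repaired list is what would then be written back to the same frame address in step (3).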
5.2 SEU Monitor System
Apart from the SEU controller macro, a monitor system is required to control and monitor the macro and to provide a user interface for checking its functionality by injecting errors into the configuration memory and then detecting and correcting them. This requires a soft processor. In this project I have used MicroBlaze as the soft-core processor and a UART (RS-232) as the standard I/O of the entire system to provide the user interface.
Figure 5.2: SEU Monitor System
Figure 5.2 shows the block diagram of the monitor system. The SEU monitor system can emulate an SEU by deliberately injecting an error into the FPGA configuration so that its subsequent detection and correction can be confirmed. The design (SEU monitor system) has been created using Xilinx's ISE, EDK and SDK software to handle the functionality of this macro, as shown in Figure 5.2. Figure 5.3 shows the port mapping of the system, which has been created using Xilinx EDK.
Figure 5.3: Port mapping of System
Figure 5.4: Internal signal Integration
The internal signal connections between the monitor system and the SEU controller macro are shown in Figure 5.4. The interface between the system and the macro has been created using the GPIO of the soft processor. The software part of the system has been written using SDK (shown in Figure 5.5).
Figure 5.5: Software for Monitor System using SDK
The software provides the following facilities to test the system. The character shown with each entry is the key to press in HyperTerminal to invoke that function.
(1). S : To know the macro status
(2). C : To know the CRC scan status and FRAME COUNT
(3). R : To reset the SEU controller
(4). M : To set the operation mode
(5). A : To set the address (error injection)
(6). I : To inject an error at the specified address
(7). Z : To set the address pointer to zero
The system has been implemented and tested on the ML505 board with a Virtex-5 (XC5VLX110T-1FF1136) FPGA; the implementation and test results are shown in Chapter 6. The same design has also been synthesized for Virtex-6 (XC6VLX240T-1FF1156) to compare resource utilization; the results are shown in Table 6.5 in Chapter 6. This system can be integrated into any user design to provide SEU detection, injection and correction.
5.3 Self Correcting System
Finally, all the modules have been integrated into one system, called the Self Correcting System. The SEU controller macro has been integrated into the previously created self-reconfigurable system. The block diagram of the self correcting system is shown in Figure 5.6.
Figure 5.6: Block Diagram of Self Correcting System
The concept of the final design is as follows. The user design is triplicated by means of any of the previously discussed TMR methods and forms the Reconfigurable Module (RM). Using the SEU monitor system, an SEU can be emulated by injecting an error, which is then detected and corrected by the monitor system. If correction is not possible, the user logic is reconfigured, either manually or automatically. The system operates in three different modes:
1. SEU Emulation mode
2. Manual Reconfiguration mode
3. Automatic Reconfiguration mode
The first mode (SEU Emulation mode) only emulates SEUs by injecting, detecting and correcting errors in the configuration memory, exactly as the stand-alone monitor system created earlier does. The second mode (Manual Reconfiguration mode) adds reconfiguration of the user logic to the first mode. Reconfiguration is carried out when an error is detected that the monitor system cannot correct, or whenever the user requires it, and it is initiated manually through the user interface. The third mode (Automatic Reconfiguration mode) is automatic. In this mode the system keeps the macro in mode 1 (automatic detection and correction), and if the macro cannot correct an error, the system reconfigures the user logic. The system has been implemented and tested on the ML505 board with a Virtex-5 (XC5VLX110T-1FF1136) FPGA; the implementation and test results are shown in Figure 6.16 in Chapter 6.
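The difference between the three system modes lies in what happens when an error cannot be corrected. The following Python sketch captures that decision as described above; all names and the returned action labels are hypothetical, and the real system implements this behaviour in the MicroBlaze software.

```python
from enum import Enum


class SystemMode(Enum):
    SEU_EMULATION = 1
    MANUAL_RECONFIGURATION = 2
    AUTOMATIC_RECONFIGURATION = 3


def next_action(mode: SystemMode, error_detected: bool, corrected: bool,
                user_requests_reconfig: bool = False) -> str:
    """Return 'idle', 'report' or 'reconfigure' (illustrative labels)."""
    # In manual mode the user may trigger reconfiguration at any time.
    if mode is SystemMode.MANUAL_RECONFIGURATION and user_requests_reconfig:
        return "reconfigure"
    if not error_detected or corrected:
        return "idle"
    # An error was detected that the macro could not correct.
    if mode is SystemMode.AUTOMATIC_RECONFIGURATION:
        return "reconfigure"
    return "report"    # leave the decision to the user
```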
Chapter 6 Implementation of SEU mitigation Techniques
6.1 TMR Implementation
This section covers the implementation of the various TMR methodologies discussed in Chapter 3.
Figure 6.1: TMR with single voter(Technology Schematic)
Figure 6.2: Simulation Results
TMR with single voter: Various TMR methodologies have been implemented for a one-bit counter on the ML505 evaluation board with the XC5VLX110T-FF1136 FPGA, and the designs have been simulated using the ModelSim simulator. Figure 6.1 shows the technology schematic of the one-bit counter in which the logic is triplicated but the voter is not, and Figure 6.2 shows the corresponding simulation results.
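The behaviour of a triplicated one-bit counter with a single output voter can be illustrated with a small software model. The Python below is a behavioural sketch only, not the VHDL used in this project; it assumes the three replicas toggle independently with no feedback from the voter, and the function names are hypothetical.

```python
def majority(a: int, b: int, c: int) -> int:
    """Two-out-of-three majority vote on single bits."""
    return (a & b) | (b & c) | (a & c)


def run_tmr_counter(cycles: int, upset_cycle=None, upset_replica: int = 0):
    """Simulate three one-bit counters feeding one voter.

    An upset flips one replica's register at the start of upset_cycle.
    Returns (voted outputs per cycle, final register values).
    """
    regs = [0, 0, 0]
    outputs = []
    for cycle in range(cycles):
        if cycle == upset_cycle:
            regs[upset_replica] ^= 1        # emulate a single event upset
        outputs.append(majority(*regs))     # voter masks one wrong replica
        regs = [r ^ 1 for r in regs]        # each replica toggles
    return outputs, regs
```

In this model the voted output is unaffected by one upset, but the upset replica stays out of step with the other two, since nothing resynchronizes it.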
(a) Technology Schematic
(b) Simulation Results
Figure 6.3: Implementation of TMR with triple voter
TMR with triple voter: Figure 6.3(a) shows the technology schematic of the one-bit counter in which both the logic and the voter are triplicated, and Figure 6.3(b) shows the simulation results.
Figure 6.4: Technology Schematic of TMR (with Voters and Scrubbing)
TMR registers with Voters and Scrubbing: Figure 6.4 shows the implementation of TMR registers with voters and scrubbing for the one-bit counter. This method has been discussed under the topic Optimal Design of TMR in Section 3.3.
Figure 6.5: Temporal Data Sampling Technology Schematic
TMR with Temporal Data Sampling: Figure 6.5 shows the implementation of TMR with temporal data sampling, the technique discussed in Section 3.3, for the one-bit counter.
Comparison of TMR techniques: All the TMR methodologies have been implemented on the Xilinx Virtex-5 XC5VLX110T-FF1136. Table 6.1 compares the TMR techniques for a simple one-bit counter.
Name of Technique | Delay (Min Period ns / Max Frequency MHz) | Area (# of Slice Registers / Slice LUTs) | Pros | Cons
Without TMR | 1.154 / 866.551 | 1/1 | Simple hardware | No SEU detection
TMR with single voter | 1.386 / 721.501 | 3/5 | Less hardware required for voter circuit | Not efficient for sequential circuits
TMR with triple voter | 1.386 / 721.501 | 3/5 | Efficient for sequential circuits with SEU | More hardware for voter circuit compared to the above method
TMR registers with Voters and Scrubbing | 1.061 / 942.507 | 3/5 | Self correction | More hardware complexity
TMR with Temporal Data Sampling | 0.812 / 1231.527 | 5/2 | Also detects double errors | More FFs and clocks required; effective clock is half of the original CLK
Table 6.1: Comparison of TMR techniques
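The two figures in the delay column of Table 6.1 are related by f_max (MHz) = 1000 / T_min (ns). A short Python check of this relation against the reported values is sketched below; the dictionary keys are shorthand labels introduced here for illustration.

```python
def max_frequency_mhz(min_period_ns: float) -> float:
    """Maximum clock frequency in MHz for a minimum period in ns."""
    return 1000.0 / min_period_ns


# (minimum period in ns, reported maximum frequency in MHz) from Table 6.1
reported = {
    "without_tmr": (1.154, 866.551),
    "single_or_triple_voter": (1.386, 721.501),
    "registers_voters_scrubbing": (1.061, 942.507),
    "temporal_data_sampling": (0.812, 1231.527),
}

for name, (period_ns, freq_mhz) in reported.items():
    # Each reported frequency should match 1000 / period to three decimals.
    assert abs(max_frequency_mhz(period_ns) - freq_mhz) < 1e-3, name
```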
6.2 Implementation of Self Reconfigurable System
As a proof of concept of the reconfiguring system, a system has been designed as discussed in Chapter 3, Section 3.8, which uses the MicroBlaze soft-core processor, the ICAP provided by Xilinx, and an RS-232 interface. Upsampler and downsampler designs have been used as the reconfigurable modules. The goal of this reconfiguring system is to reprogram a portion of the hardware without affecting the performance of the remaining static hardware. The partially reconfigurable sampler accepts data serially and produces a serial output that is downsampled or upsampled depending on which module is currently loaded. A successful reconfiguring system allows the partially reconfigurable module (Upsampler, Downsampler or Blank) to be reprogrammed by pressing the appropriate key on the keyboard through HyperTerminal, without affecting the static module. The static module consists of a start/stop button. Figure 6.6 shows the RTL of the upsampler module, and the simulation results are shown in Figure 6.7.
Figure 6.6: Up Sampler RTL
Figure 6.7: Up Sampler Simulation Results
The RTL schematic of the downsampler module is shown in Figure 6.8(a), whereas Figure 6.8(b) shows the corresponding simulation results.
(a) RTL Schematic
(b) Simulation Results
Figure 6.8: Implementation of Down Sampler
Figure 6.9 shows the testing of the self-reconfigurable system in HyperTerminal, and Figure 6.11 shows the reconfiguration of the Up Sampler being performed by the appropriate command through the user interface. The design has been extended by adding a timer to the soft-core processor to measure the time (number of clock cycles) required to reconfigure the reconfigurable module dynamically; the test is shown in Figure 6.10.
Figure 6.9: Testing of Recongurable System
Figure 6.10: Finding no of clock cycles to Recongure
Figure 6.11: Dynamic PR (Performing Reconfiguration of UP Sampler)
Table 6.2 shows the device utilization summary for the combined Up and Down Sampler. Table 6.3 shows the area comparison for the combined and individual modules (Up and Down Sampler). The timing analysis for reconfiguration is listed in Table 6.4.
Device used: Xilinx Virtex-5 XC5VLX110T-1FF1136
Logic Utilization | Used | Available | Utilization
Number of Slice Registers | 3255 | 69120 | 4.70%
Number of Slice LUTs | 3245 | 69120 | 4.70%
Number of fully used Bit Slices | 1328 | 5172 | 25.67%
Number of bonded IOBs | 22 | 640 | 3.43%
Number of Block RAM/FIFO | 18 | 148 | 12.16%
Number of BUFG/BUFGCTRLs | 5 | 32 | 15.62%
Number of DCM ADVs | 1 | 12 | 8.33%
Number of DSP48Es | 3 | 64 | 4.68%
Table 6.2: Device Utilization Summary (for combined Up and Down Sampler)
Device utilization summary | Available | Up sampler alone | Down sampler alone | Combined up and down sampler | Average % of saving
Number of Slice Registers | 69120 | 3 | 8 | 11 | 45.46 %
Number of Slice LUTs | 69120 | 4 | 9 | 12 | 50.00 %
Number used as Logic | 69120 | 4 | 9 | 12 | 50.00 %
Number of LUT Flip Flop pairs used | | 7 | 17 | 12 | 8.34 %
Table 6.3: Area Comparison for Up and Down Sampler
Reconfigurable Module | Size of bit file | Time required to reconfigure (using number of counts) | No. of processor clock cycles
Down Sampler | 43.379 Kbytes | 0.99 sec (990 msec) | 1 Cycle
Up Sampler | 39.993 Kbytes | 1.16 sec (1160 msec) | 1 Cycle
Blank | 39.863 Kbytes | 0.97 sec (970 msec) | 1 Cycle
Table 6.4: Timing Analysis for Reconfiguration
6.3 Implementation of Systems
SEU Monitor system: Figure 6.12 shows the RTL of the SEU controller monitor system implemented on the Virtex-5 FPGA.
Figure 6.12: RTL of SEU Monitor System
Figure 6.13: Testing of SEU Monitor System
Figure 6.13 shows the testing of the SEU monitor system through HyperTerminal. The operation of the CRC scan has been checked through the SEU monitor system, as shown in Figure 6.14. As discussed earlier, SEUs are emulated through this monitor system by injecting an error into the configuration memory and then correcting it; the result is shown in Figure 6.15.
Figure 6.14: Checking CRC of SEU Monitor System
Figure 6.15: Error Injection using Monitor System
Device: Virtex-5 XC5VLX110T-1FF1136 and Virtex-6 XC6VLX240T-1FF1156
Resource | Virtex-5 Available | Virtex-5 Used | Virtex-5 Utilization | Virtex-6 Available | Virtex-6 Used | Virtex-6 Utilization
Total Number of Slice Registers | 69,120 | 2,599 | 3.76% | 301440 | 155 | 0.05%
Number of LUTs | 69,120 | 2,220 | 3.76% | 150720 | 214 | 0.14%
Number of bonded IOBs | 640 | 4 | 0.62% | 600 | 4 | 0.66%
Number of Block RAM/FIFO | 148 | 66 | 44.59% | 416 | 2 | 0.05%
Number of BUFG/BUFGCTRLs | 32 | 4 | 12.5% | 32 | 2 | 6.25%
Number of ICAP_VIRTEXs | 2 | 1 | 50% | 2 | 1 | 50%
Number of FRAME_ECC_VIRTEX | 1 | 1 | 100% | 1 | 1 | 100%
Table 6.5: Resource Utilization comparison for Monitor System
Self Correcting System: Finally, the implementation and testing of the Self Correcting System are shown in Figure 6.16. The SEU monitor system has also been implemented on Virtex-6 for area utilization comparison; Table 6.5 lists this comparison.
Figure 6.16: Testing of Self Correcting System
Chapter 7 Conclusion and Future Work
7.1 Conclusion
SEU effects can be nullified very effectively using TMR and temporal sampling methods. However, these methods increase the area overhead and the latency of the system. The methods described can also be modified to remove the effect of double event upsets. A small area overhead can be tolerated in order to mitigate single event upsets, especially when the FPGAs are used in space applications. The partial reconfiguration method is extremely useful for space applications, where SEUs have to be mitigated. A small FPGA can also be used for a larger design by dividing the design into modules and loading each module into the FPGA as and when the application requires it, in a time-multiplexed way. In dynamic partial reconfiguration, the total area and the time required to reconfigure the design are much smaller than for full reconfiguration. Because real SEUs are infrequent and unpredictable, small-scale testing of their effects and system verification with real events are impractical. The SEU monitor system has been successfully implemented to solve this problem. The implemented design (which uses the FRAME_ECC and ICAP primitives) is capable of injecting, detecting and correcting single-bit errors for all Virtex-5 FPGAs. The design is easily integrated into any existing user design with minimal resource overhead for the detection and correction of single-bit errors. The entire monitor system requires 66 block RAMs and 2,599 slice registers on Virtex-5 (XC5VLX110T-1FF1136), and 2 block RAMs and 155 slice registers on Virtex-6 (XC6VLX240T-1FF1156), which is only about 3.76% and 0.05% of the available slice registers, respectively. Finally, the Self Correcting System has been implemented using the SEU monitor system and the self-reconfigurable system; it detects and corrects errors, and if an error cannot be corrected it reconfigures the user design.
7.2 Future Work
1. The designed system can be modified to reduce the reconfiguration time, for example by using DMA for memory access.
2. The designed system can be modified to be more area efficient by working in a time-multiplexed way.
Appendix A VIRTEX-5 Platform FPGA

Architecture Overview
Figure A.1: Virtex FPGA architecture
Virtex-5 devices are user-programmable gate arrays with various configurable elements and embedded cores optimized for high-density and high-performance system designs [30]. They are available in five platforms: LX, LXT, SXT, TXT, and FXT [29].
(1). Virtex-5 LX: High-performance general logic applications
(2). Virtex-5 LXT: High-performance logic with advanced serial connectivity
(3). Virtex-5 SXT: High-performance signal processing applications with advanced serial connectivity
(4). Virtex-5 TXT: High-performance systems with double density advanced serial connectivity
(5). Virtex-5 FXT: High-performance embedded systems with advanced serial connectivity
A.1 Virtex-5 Device Functional Blocks
1. I/O blocks provide the interface between package pins and the internal configurable logic. Most popular and leading-edge I/O standards are supported by programmable I/O blocks (IOBs). The IOBs can be connected to very flexible ChipSync logic for enhanced source-synchronous interfacing. Source-synchronous optimizations include per-bit deskew (on both input and output signals), data serializers/deserializers, clock dividers, and dedicated I/O and local clocking resources.
2. Configurable Logic Blocks (CLBs), the basic logic elements for Xilinx FPGAs, provide combinatorial and synchronous logic as well as distributed memory and SRL32 shift register capability. Virtex-5 FPGA CLBs are based on real 6-input look-up table technology.
3. Block RAM modules provide flexible 36 Kbit true dual-port RAM that are cascadable to form larger memory blocks. In addition, Virtex-5 FPGA block RAMs contain optional programmable FIFO logic for increased device utilization. Each block RAM can also be configured as two independent 18 Kbit true dual-port RAM blocks.
4. Cascadable embedded DSP48E slices with 25 x 18 two's complement multipliers and a 48-bit adder/subtracter/accumulator provide massively parallel DSP algorithm support. In addition, each DSP48E slice can be used to perform bitwise logical functions.
5. Clock Management Tile (CMT) blocks provide the most flexible, highest-performance clocking for FPGAs. Each CMT contains two Digital Clock Manager (DCM) blocks (self-calibrating, fully digital) and one PLL block (self-calibrating, analog) for clock distribution delay compensation and clock multiplication/division.
All programmable elements, including the routing resources, are controlled by values stored in static storage elements. These values are loaded into the FPGA during configuration and can be reloaded to change the functions of the programmable elements [30].
Appendix B Tools & PR Design Implementation Flow
B.1 Tools and Design Board
Several hardware and software tools are necessary for the completion of this project. These include the physical FPGA and its associated development board, which allows continual reprogramming of test systems and offers many features for data storage and output display. Xilinx also supplies a suite of software tools that are necessary for this project; they are used for developing the hardware and software aspects of the system. Although many of these tools include documentation from Xilinx, their support of partially reconfigurable systems is currently somewhat lacking.
B.1.1 Xilinx EDK 9.2.02
Xilinx EDK is a software development kit used to utilize the functionality of the onboard MicroBlaze cores of the Virtex FPGA chip. The EDK provides IP cores to bridge the functionality of hardware designs on the FPGA with software designs on the MicroBlaze. This is an attractive feature, as this project implements VHDL designs on the FPGA while being able to partially reprogram those designs using the MicroBlaze. Most of the software developed for this project is created using the Xilinx EDK interface. After a successful software build, the EDK can synthesize the correct VHDL wrappers to instantiate the MicroBlaze core and can inject the MicroBlaze instructions into a block RAM on the FPGA for the MicroBlaze to read and process [25][23][24].
B.1.2 Xilinx ISE 9.2.04i with Partial-Reconfiguration Patch
Xilinx ISE is a popular FPGA development tool used widely in industry and in educational institutions. This tool allows for complete FPGA development. ISE can read VHDL, Verilog, schematic and many other module formats created for a project design and synthesize them into logic elements to be placed on an FPGA. ISE can automatically interpret the HDL syntax, synthesize the description, place and route the logic elements, and then provide a BIT file that connects these logic elements together to create the circuit described in the HDL. Each of these tool flow steps requires its own application program. ISE calls these programs automatically from a GUI, so that the user can create FPGA designs quickly.
B.1.3 Xilinx PlanAhead 10.1
Xilinx PlanAhead is a floorplanning tool provided by Xilinx that gives developers flexibility in how their synthesized designs are placed on the FPGA floorplan. This tool is useful in designs where the locations of logic elements are an important factor in the performance of the application. For the scope of this project, PlanAhead has partial reconfiguration options which make it a required tool in the partial reconfiguration tool flow. Figure B.1 shows the summarized PlanAhead PR design flow.
Figure B.1: PlanAhead PR Design Flow
B.1.4 Virtex-V Evaluation Board
This board provides a variety of features for implementing the entire design with partial reconfiguration, SEU detection and correction, and finally the integration of the entire design. The board contains a Virtex FPGA chip from Xilinx. Of the many features the board provides, this project uses only the FPGA, the onboard memory storage, the CompactFlash memory card, and the inputs/outputs provided by RS-232, LEDs, and push-buttons [27]. In this project work I have used the Virtex-5 evaluation board with the XC5VLX110T-FF1136 FPGA.
B.2 PR Design Implementation Flow
When creating a self-reconfigurable system there are many specific considerations to take into account. This production flow covers all the specific changes that must be made in the tool flows in order to integrate the MicroBlaze wrappers with any custom partial or static VHDL modules. By following these steps it should be possible to recreate the entire design. The following tool flow is best used with the exact versions of the Xilinx tools described in Section B.1. This tool flow is not guaranteed to work with any older or newer version of these Xilinx tools [31].
1. Open EDK and create a new project with the project wizard, as shown in Figure B.2. Specify your board settings. Add the desired peripherals. Set boot memory from the ilmb_cntlr and deselect the Memory Test and Peripheral Selftest options.
Figure B.2: Create XPS Project
2. Once the project has been created, right click on clock generator 0 instance and select delete instance. A selection box will open up. Select Delete instance but keep its ports and click OK.
76
3. Using Project tab, double-click on system.mhs entry to open the le in an editor window. 4. Replace dcm clk s with sys clk s. 5. Add the following line after sys rst pin to bring the dcm lock signal from the toplevel PORT DCM all locked pin = DCM all locked, DIR = I. Now Save the le and close it. 6. Using IP Catalog tab, add three instances of XPS gpio and name them as icap go out, icap done in, and icap bitstreamlength for User ICAP interface processor or Add one XPS gpio for HWICAP processor. 7. Congure icap go out to have 1 bit databus (to provide start conguring signal), unidirectional, output only. 8. Name GPIO d out port connection to icap go. 9. Similarly, congure icap done in to have 1 bit databus (to provide start conguring signal), unidirectional, input only. 10. Name GPIO in port connection to recong done. 11. Congure icap bitstreamlength for 32-bit, output only. 12. Name GPIO d out as bitstreamlength as this port will provide the bitstream length to the icap processor. 13. From IP Catalog tab, add an instance of icap processor and ICAP INTERFACE from Project Local pcores folder. 14. Expand three xps gpio and icap processor instances in System Assembly View window with Bus Interface lter in eect, and connect various busses. 15. Select Addresses lter and set 64K bytes for icap interface 0, icap bitstreamlength, icap done in and icap go out each and then click Generate Addresses.
16. Select the Ports filter, connect each port to the appropriate signal in the system, and make BM_enable an external port.
17. Select Software Platform Settings and tick the xilfatfs check box to enable FAT file system support.
18. In the Applications tab of XPS, double-click Add Software Application Project as shown in Fig. B.3, type TestApp (or any other name for the software project) in the Project Name field, and click OK.
Figure B.3: Add Software Application
19. Right-click Sources and select Add Existing Files (if the source files are not already available, create them using SDK).
20. Browse to the directory where your TestApp is stored and add the source file main.c. Then right-click Headers, select Add Existing Files, browse to the TestApp src directory, and add the header file icap_interface.h.
21. In the Applications tab, deselect Mark to Initialize BRAMs for the Default: microblaze_0_bootloop application and select Mark to Initialize BRAMs for the TestApp project.
22. Select Software=>Build All User Applications to run LibGen, which generates the library files, and the compiler, which compiles the application.
23. Open ISE and create a new project with your board settings. Create a top-level VHDL module; this module will contain port-map references to all static and partial designs.
24. In the top-level ISE project, add an existing source (not a copy): import the *.XMP file created in your EDK project into ISE.
25. Select top.vhd in the Sources window and double-click Synthesize to synthesize the design.
26. Open PlanAhead and create a new project. Import the top-level NGC file created during synthesis. Add references to any folders that contain NGC files for child modules described within the top-level NGC. Select the correct part number for your board, import the UCF file created in ISE, and click Finish.
27. For every static module, right-click the module and click Create New Pblock.
28. Click on site constraints and place all bus macros within the pblock, as shown in Fig. B.4.
Figure B.4: Draw Partial Module
29. For every partial module on the floorplan, add the MODE attribute and set it to Reconfigurable.
30. Next, run Tools=>Run DRC (shown in Fig. B.5). In the window that opens, select Generate script files only and click Finish.
Figure B.5: DRC Run
31. At this stage, the ExploreAhead Runs tab should show entries for static and for the two RMs (this design has two RMs).
32. Before running the PR implementation flow, set the path to the system_stub.bmm file so that the system_stub_bd.bmm file is generated after implementation. The system_stub.bmm file describes the logical BRAM composition where the program will be stored when the FPGA is configured. The system_stub_bd.bmm file describes the actual BRAMs used in the implementation.
33. Select static in the ExploreAhead Runs tab and select the Options tab in the Run Properties view. Select the -bm option under ngdbuild and click the down arrow.
34. Browse to icap ../edk/implementation, select the system_stub.bmm file, click Open, and then click Apply to put the option into effect.
35. Select static in the ExploreAhead Runs tab, right-click, and select Launch Run.
36. Select both RMs simultaneously in the ExploreAhead Runs tab, right-click, and select Launch Run, as shown in Fig. B.6.
Figure B.6: Launch Run of static RM
37. The last step in the PR implementation flow is to run the PR Assemble and PR Verify design steps, which can be launched together by right-clicking one of the RM modules and selecting Run PR Assemble, as shown in Fig. B.7.
Figure B.7: PR assemble of Design
38. Browse to the ../project_1/project_1.runs/floorplan_1/merge directory and copy pblock_reconfig_sampler_blank.bit, up_partial.bit, and down_partial.bit into the ../image folder as blank.bit, up.bit, and down.bit respectively.
39. From the same ../project_1/project_1.runs/floorplan_1/merge directory, copy static_full.bit into the ../edk/implementation folder.
40. Open the EDK shell, change directory to ../edk, and execute the following command to generate the download.bit file (which includes the software component) from static_full.bit (which contains only the hardware component) [33]:
    data2mem -bm implementation/system_stub_bd -bt implementation/static_full.bit -bd TestApp/executable.elf tag microblaze_0 -o b implementation/download.bit
    This generates download.bit in the edk/implementation directory.
41. In the EDK shell, execute the following command to generate the system.ace file:
    xmd -tcl genace.tcl -jprog -target mdm -hw implementation/download.bit -elf TestApp/executable.elf -board ml505 -ace system.ace
    This generates system.ace in the edk directory.
42. Place the CompactFlash memory card in a CF writer.
43. Using Windows Explorer, copy all files (three .bit files and one .ace file) from the ..lab/image folder onto the CF card (make sure the card is empty before copying).
44. Insert the CF card into the ML505 board, open a HyperTerminal window at 115200 baud, and power on the ML505 board.
45. Test the design.
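The file staging and command-line steps near the end of this flow (steps 38 to 41) are easy to get wrong by hand, so a small helper script may be useful. The following Python sketch is illustrative only and is not part of the original flow: the function names, the directory arguments, and the default values are hypothetical, and the file names assume that the underscores lost in the printed text (for example static_full.bit and system_stub_bd) and the "-o b" output option of data2mem are read correctly. The script only copies files and builds the argument lists; it does not invoke data2mem or xmd.

```python
import shutil
from pathlib import Path

# Partial bitstreams produced by PR Assemble and the short names used on the
# CF card (step 38). Underscored names are an assumed reading of the text.
BITSTREAM_RENAMES = {
    "pblock_reconfig_sampler_blank.bit": "blank.bit",
    "up_partial.bit": "up.bit",
    "down_partial.bit": "down.bit",
}
STATIC_BITSTREAM = "static_full.bit"  # full static bitstream (step 39)


def stage_bitstreams(merge_dir, image_dir, edk_impl_dir):
    """Copy the partial bitstreams into image_dir under their short names
    and the static bitstream into edk_impl_dir. Returns the copied paths."""
    merge_dir = Path(merge_dir)
    image_dir = Path(image_dir)
    edk_impl_dir = Path(edk_impl_dir)
    image_dir.mkdir(parents=True, exist_ok=True)
    edk_impl_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for src_name, dst_name in BITSTREAM_RENAMES.items():
        copied.append(shutil.copyfile(merge_dir / src_name, image_dir / dst_name))
    copied.append(
        shutil.copyfile(merge_dir / STATIC_BITSTREAM, edk_impl_dir / STATIC_BITSTREAM)
    )
    return copied


def data2mem_command(impl="implementation", elf="TestApp/executable.elf",
                     cpu="microblaze_0"):
    """Argument list for step 40: merge the ELF into the static bitstream."""
    return [
        "data2mem",
        "-bm", f"{impl}/system_stub_bd",
        "-bt", f"{impl}/{STATIC_BITSTREAM}",
        "-bd", elf, "tag", cpu,
        "-o", "b", f"{impl}/download.bit",
    ]


def genace_command(impl="implementation", elf="TestApp/executable.elf",
                   board="ml505", ace="system.ace"):
    """Argument list for step 41: build the ACE file from download.bit."""
    return [
        "xmd", "-tcl", "genace.tcl", "-jprog", "-target", "mdm",
        "-hw", f"{impl}/download.bit",
        "-elf", elf,
        "-board", board,
        "-ace", ace,
    ]


if __name__ == "__main__":
    # Print the commands so they can be pasted into the EDK shell from ../edk.
    print(" ".join(data2mem_command()))
    print(" ".join(genace_command()))
```

Calling stage_bitstreams with the merge, image, and edk/implementation directories of a project performs the copies of steps 38 and 39; the two printed command lines can then be run from the EDK shell in the edk directory. Keeping the commands as argument lists also makes it straightforward to pass them to subprocess.run later, if the tools are on the path.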
Glossary
Glossary of significant acronyms used in this thesis

FPGA     Field Programmable Gate Array
ASIC     Application Specific Integrated Circuit
CMOS     Complementary Metal Oxide Semiconductor
CLB      Configurable Logic Block
LUT      Look Up Table
NASA     National Aeronautics and Space Administration
SEU      Single Event Upset
SEE      Single Event Effect
SET      Single Event Transient
SHE      Single Hard Error
SEL      Single Event Latchup
SOI      Silicon On Insulator
SOS      Silicon On Sapphire
TMR      Triple Modular Redundancy
JTAG     Joint Test Action Group
ICAP     Internal Configuration Access Port
PR       Partial Reconfiguration
DPR      Dynamic Partial Reconfiguration
PRR      Partial Reconfigurable Region
RM       Reconfigurable Module
ECC      Error Correcting Code
CRC      Cyclic Redundancy Check
SECDED   Single Error Correction Double Error Detection
DED      Double Error Detection
MDD      Microprocessor Driver Description
MHS      Microprocessor Hardware Specification
MPD      Microprocessor Peripheral Description
MSS      Microprocessor Software Specification
PLB      Processor Local Bus
BMM      Block RAM Memory Map
BBD      Black Box Definition
DCM      Digital Clock Manager
UCF      User Constraints File
EDK      Embedded Development Kit
SDK      Software Development Kit
DMA      Direct Memory Access
References
[1] A Stoica, T Arslan and S Baloch. Design of a Single Event Upset (SEU) Mitigation Technique for Programmable Devices. In 7th International Symposium on Quality Electronic Design (ISQED), 2006.
[2] Carl Carmichael, Brendan Bridgford and Chen Wei Tseng. Single Event Upset Mitigation Selection Guide, XAPP987(v1.0). Xilinx Inc., March 2008.
[3] L Sterpone, F Lima Kastensmidt and M Sonza Reorda. On the Optimal Design of Triple Modular Redundancy Logic for SRAM based FPGAs. In Design, Automation and Test in Europe Conference and Exhibition (DATE), 2005.
[4] Michael Carey and Anthony Salazar. Correcting Single-Event Upsets Through Virtex Partial Configuration, XAPP216(v1.0). Los Alamos National Laboratories, June 2000.
[5] Ming Liu, Zhonghai Lu and Wolfgang Kuehn. Run-time partial reconfiguration speed investigation and architectural design space exploration. In International Conference on Field Programmable Logic and its Applications, FPL09, September 2009.
[6] Phil Blain, Carl Carmichael and Michael Carey. SEU Mitigation Techniques for Virtex FPGAs in Space Applications. In Los Alamos National Laboratory, Carmichael.
[7] Philippe Adell and Greg Allen. Assessing and Mitigating Radiation Effects in Xilinx FPGAs. Technical report, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, 2008.
[8] Shadab Gopinath Ambat. Single Event Upset Detection in Field Programmable Gate Arrays. Technical report, University of Kentucky, February 2008.
[9] Carl Carmichael. Triple Module Redundancy Design Techniques for Virtex FPGAs, XAPP197(v1.0.1). Xilinx Inc., July 2006.
[10] Ken Chapman. SEU Strategies for Virtex-5 Devices, XAPP864(v2.0). Xilinx Inc., April 2010.
[11] Emi Eto. Difference Based Partial Reconfiguration, XAPP290(v2.0). Xilinx Inc., December 2007.
[12] Sandi Habinc. Functional Triple Modular Redundancy (FTMR). Technical report, European Space Agency Contract Report, December 2002.
[13] L Jones. Single Event Upset (SEU) Detection and Correction Using Virtex-4 Devices, XAPP714(v1.5). Xilinx Inc., June 2007.
[14] Anthony Lai. Mitigation techniques for electronics in Single Event Upset environments. Technical report, Military Embedded Systems, January 2006.
[15] Oliver Neumann. Radiation Effects Cookbook. Technical report, Mentor Graphics, 2009.
[16] JEDEC Standard. JEDEC Dictionary of Terms for Solid State Technology. JESD88B, 3, May 2006.
[17] Xilinx Inc. Two flows for partial reconfiguration: Module based or small bit manipulations, XAPP290(v1.0), May 2002.
[18] Xilinx Inc. In-Circuit Partial Reconfiguration of RocketIO Attributes, XAPP662(v2.4), May 2004.
[19] Xilinx Inc. OPB HWICAP v1.3, DS280, March 2004.
[20] Xilinx Inc. Processor IP Reference Guide, v1.9, January 2004.
[21] Xilinx Inc. Virtex Series Configuration Architecture User Guide, XAPP151(v1.7), October 2004.
[22] Xilinx Inc. PLB IPIF v2.02a, DS448, April 2005.
[23] Xilinx Inc. EDK 9.2 MicroBlaze Tutorial in Virtex, WT001(v4.0), October 2007.
[24] Xilinx Inc. EDK Concepts, Tools, and Techniques: A Hands-On Guide to Effective Embedded System Design, May 2007.
[25] Xilinx Inc. Embedded System Tools Reference Manual, EDK 9.2i, September 2007.
[26] Xilinx Inc. MicroBlaze Processor Reference Guide, Embedded Development Kit EDK 9.2i, UG081(v9.0), January 2008.
[27] Xilinx Inc. ML505/ML506/ML507 Evaluation Platform User Guide, UG347(v3.1.1), October 2009.
[28] Xilinx Inc. Virtex-4 FPGA Embedded Processor Block with PowerPC 405 Processor (v2.01b), DS306, April 2009.
[29] Xilinx Inc. Virtex-5 Family Overview, DS100(v5.0), February 2009.
[30] Xilinx Inc. Virtex-5 FPGA User Guide, UG190(v5.3), November 2009.
[31] Xilinx Inc. Early Access Partial Reconfiguration User Guide for ISE 9.2.04i, UG208(v1.2), August 2010.
[32] Xilinx Inc. Virtex-5 Configuration Guide, UG191(v3.9.1), August 2010.
[33] Xilinx Inc. Data2MEM User Guide, UG658(v11.2), June 2009.
[34] Xilinx Inc. Virtex-4 FPGA Configuration User Guide, UG071(v1.11), June 2009.
[35] Xilinx Inc. Correcting Single-Event Upsets in Virtex-4 Platform FPGA Configuration Memory, XAPP988(v1.0), March 2008.
List of Publications
1. Vijay Savani, Akash Mecwan, N. P. Gajjar. Single Event Upset Mitigation Techniques Implementation Based on TMR. In Nirma University International Conference on Current Trends in Technology, Institute of Technology, Nirma University, December 2010.
2. Vijay Savani, Akash Mecwan, N. P. Gajjar. Dynamic Partial Reconfiguration of FPGA for SEU Mitigation and Area Efficiency. In International Journal for Engineering and Technology, volume 2, No. 2, pages 17, http://www.ijict.org, December 2011.
3. Vijay Savani, Akash Mecwan, N. P. Gajjar. Design and Implementation of SEU Monitor System for SEU Detection and Correction in Virtex-5 FPGA. In Integration, the VLSI Journal (Elsevier), communicated (in review).


