Yuejian Wu
Northern Telecom, P.O. Box 3511, Station C, Ottawa, Ontario, Canada
Abstract
This paper first analyzes faulty scan chain behaviors. In addition to stuck-at faults, we also consider timing faults due to hold time violations. Test sequences to determine the fault types in a failing scan chain are presented. This is followed by a presentation of two scan design techniques that
simplify scan chain fault diagnosis for both stuck-at and timing faults.
1. Introduction
Scan design is the most popular DFT methodology in the VLSI industry. Many scan-based
test/diagnostic tools assume operational scan chains even if the circuit under test is faulty. Otherwise, it is impossible to apply any scan test vector. Therefore, when zero or very low yield occurs
due to scan chain failures, it is important to locate the fault and correct the design and/or fabrication errors so that scan test can be performed in production test. Unfortunately, with standard scan
design methods, scan chain failure diagnosis is very difficult or even impossible. Recently, a few
scan chain diagnostic techniques have been reported. In [1], a technique uses a sequential ATPG to
generate diagnostic vectors without modifying the scan chain. Unfortunately, its CPU time can be prohibitive for large designs and its resolution is often inadequate. In [2], a new scan design was proposed, where the output of each flip-flop on a chain is sampled by a flip-flop on another chain. So,
when a chain fails, one can always shift a diagnostic vector into another chain and have it loaded
into the failing chain. It costs a pair of wires between each pair of flip-flops from different chains
plus a global diagnostic control. In [3], another scan design method was proposed. With the addition of an extra XOR gate to each flip-flop and a global diagnostic control, it makes scan diagnosis
straightforward. Besides the cost for routing the global diagnostic control, the cost of the XOR
gates can become significant for large designs. In order to save silicon for routing the global diagnostic control, another technique was proposed in [4]. It takes advantage of a special class of scan
flip-flops, where each flip-flop includes a dedicated scan-out latch in addition to a normal master-slave scan flip-flop. Extra circuitry is added to each flip-flop to detect a 1-to-0 transition on SM
(scan mode) when CLK (or sometimes SCLK, a dedicated scan clock) is 0. Upon the detection of
such an event, it forces a constant value into the scan-out latch, which is then shifted out for analysis. Stuck-at faults are assumed in all the previous work. This paper presents two alternative scan
chain diagnostic techniques that consider both stuck-at and timing faults in scan chains.
hold time fault at its SI input as the path to SI input is usually the fastest path in a whole design.
Depending on the amount of clock skew (whatever its cause), there are three types of hold
time problems for scan chain shift operation. In a type-I problem, a faulty flip-flop captures
incorrect data if and only if its SI transits from 0 to 1. The cause of such a fault is that the Q of the
flip-flop that feeds the faulty one has a faster rise time than fall time and the clock skew is just large
enough to fail the rising transition at the SI but small enough to pass the falling transition. Such
failures have been observed in practice. Similarly, in a type-II problem, the preceding flip-flop has a faster fall time on
its Q, and a faulty flip-flop fails if and only if its SI transits from 1 to 0. In a type-III problem, a flip-flop fails whenever its SI transits. This happens for large clock skews.
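The three cases can be sketched as a simple timing check. This is only an illustration: the function name, the parameter names, and the rule that a transition fails when its delay to SI is smaller than the clock skew plus the hold time are assumptions made here, not a model taken from the paper.

```python
def classify_hold_problem(rise_delay, fall_delay, clock_skew, hold_time=0.0):
    """Classify the hold time problem at a scan flip-flop's SI input.

    Assumed rule (for illustration only): a transition is captured
    incorrectly when its delay from the preceding flip-flop's clock to SI
    is smaller than clock_skew + hold_time.
    """
    margin = clock_skew + hold_time
    rising_fails = rise_delay < margin    # 0-to-1 transition arrives too early
    falling_fails = fall_delay < margin   # 1-to-0 transition arrives too early
    if rising_fails and falling_fails:
        return "type-III"
    if rising_fails:
        return "type-I"
    if falling_fails:
        return "type-II"
    return "none"


if __name__ == "__main__":
    # Faster rise than fall, with a skew that falls between the two delays.
    print(classify_hold_problem(rise_delay=0.2, fall_delay=0.5, clock_skew=0.3))
```

With a large enough skew both transitions fail, which corresponds to the type-III case described above.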
The behaviors of stuck-at faults and hold time faults in a scan chain are different. With a
stuck-at fault, the sequence shifted out from the chain is always a constant sequence of the stuck-at value,
regardless of the sequence shifted into the chain. A hold time fault in a scan chain, by contrast, does not
prevent the chain from shifting through sequences of both 0s and 1s. However, if transitions exist
in the sequence, the chain appears shorter, meaning that the bit following each problematic transition always gets shifted out of the chain a cycle earlier than expected. Figure 1 shows some example test responses due to a single hold time fault. It should be pointed out that the sequences shown
in Figure 1 are sequences observed at the scan output (SO) of a scan chain and are independent of
fault locations. As shown in Figure 1, a type-I fault generates an extra 1 for each 0-to-1 transition
shifted through the faulty flip-flop; a type-II fault generates an extra 0 for each 1-to-0 transition;
and a type-III fault makes an extra shift of all transitions towards the scan out of the chain.
FIGURE 1. Example faulty behaviors of single hold time faults in a scan chain
good response:      0001011100
Type-I response:    0001111100
Type-II response:   0000001100
Type-III response:  0000101110
(shift direction is indicated in the original figure)
Figure 2 (a) shows a scan chain with a hold time problem at the SI input of flip-flop i due
to a clock skew. Figure 2 (b) shows a model for the hold time problem. As shown in the model, if
a problematic transition occurs at the faulty flip-flop's input, it incorrectly captures the data from
flip-flop i+2 as opposed to from flip-flop i+1.
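The model can be sketched as a small simulation. This is a minimal illustration under stated assumptions: the function name is hypothetical, bits are listed in the order they arrive at the faulty flip-flop's SI, and the last bit is assumed to be followed by a copy of itself, so it never sees a transition.

```python
def faulty_response(good_bits, fault_type):
    """Apply a single hold time fault to a bit stream.

    good_bits  : bits in arrival order at the faulty flip-flop's SI
    fault_type : "type-I", "type-II" or "type-III"

    On a problematic transition the flip-flop captures the next bit
    (the data of flip-flop i+2) instead of the current one (flip-flop i+1).
    """
    problematic = {
        "type-I": {(0, 1)},            # fails only on 0-to-1
        "type-II": {(1, 0)},           # fails only on 1-to-0
        "type-III": {(0, 1), (1, 0)},  # fails on every transition
    }[fault_type]

    out = []
    for t, bit in enumerate(good_bits):
        # Assumption: the stream is padded with its own last bit.
        nxt = good_bits[t + 1] if t + 1 < len(good_bits) else bit
        out.append(nxt if (bit, nxt) in problematic else bit)
    return out


if __name__ == "__main__":
    stream = [0, 0, 1, 1, 0, 0]
    for kind in ("type-I", "type-II", "type-III"):
        print(kind, faulty_response(stream, kind))
```

For the stream `0 0 1 1 0 0`, the type-I case yields `0 1 1 1 0 0`: the 0 just before the rising transition is lost and an extra 1 appears, so the transition reaches the scan output one cycle early.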
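[Figure 2, caption not recovered: (a) a scan chain with clock skew at flip-flop i; (b) a model in which a problematic transition detector selects, through a 0/1 mux, between the outputs of flop i+1 and flop i+2.]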
To determine whether the fault in a failing scan chain is a stuck-at fault, we suggest using two
flush test sequences, one of all-0s and the other of all-1s. If the response to the all-0 sequence is
an all-1 sequence, the fault must be stuck-at-1. Similarly, if the response to the all-1 sequence
becomes all-0, the fault must be stuck-at-0. This is because the suggested test sequences contain
no transition and thus will not trigger any hold time violation.
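The decision rule above can be sketched as follows. The function name and the returned strings are illustrative choices, and the responses are assumed to be lists of 0/1 values observed at the scan output.

```python
def classify_flush_responses(resp_to_all0, resp_to_all1):
    """Decide whether two flush tests indicate a stuck-at fault.

    resp_to_all0 : bits observed at SO after shifting in an all-0 sequence
    resp_to_all1 : bits observed at SO after shifting in an all-1 sequence
    """
    if not resp_to_all0 or not resp_to_all1:
        raise ValueError("both responses must be non-empty")
    if all(b == 1 for b in resp_to_all0):
        return "stuck-at-1"   # all-0 input came out as all-1
    if all(b == 0 for b in resp_to_all1):
        return "stuck-at-0"   # all-1 input came out as all-0
    return "no stuck-at fault indicated"


if __name__ == "__main__":
    print(classify_flush_responses([1, 1, 1, 1], [1, 1, 1, 1]))
```

A chain with only hold time faults passes both flush tests, since neither sequence contains a transition; further sequences with transitions are then needed to determine the hold time fault type.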
sequence 1: ...111...111000...000...
sequence 2: ...000...000111...111...
sequence 3: ...000...000100000000...
sequence 4: ...111...111011...111...
(the original figure marks N-bit runs and the shift direction)
Note: N represents the scan chain length.
Similarly, we may also determine the combination of different types of hold time faults.
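[Figure, caption not recovered: a scan chain of flip-flops with dm ports driven by a global diag signal and clk, with an sa1 fault at flop 3, as observed on a tester.]
after shifting five 0s into the chain: 0 0 0 1 1 x
after a positive pulse on diag: 1 1 1 0 0 x
shift out the chain: 1 1 1 1 0 0
1 1 1 1 1 0
(error observed; shift direction is indicated in the original figure)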
Figure 5 shows a logical representation of a modified scan flip-flop that is able to invert its
state when dm (diagnostic mode) is set to 1. When dm = 0, the modified flip-flop behaves like a
normal scan flip-flop with an extra mux delay added to its SI input. When dm = 1 and SM = 1, the
flip-flop complements its state at a clock edge. The modification has no performance impact on the
data path. Figure 5 shows only the logical behavior of the modification. In practice, the flip-flop cell itself could
be modified to include the additional mux. In this case, the extra mux would cost only 6-8 transistors with reduced routing complexity.
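The behavior of the modified flip-flop can be sketched as a small behavioral model. The class and method names are hypothetical. The model also assumes that D is captured whenever SM = 0, regardless of dm, which is inferred from the statement that the data path is unaffected.

```python
class DiagScanFlop:
    """Behavioral model of a scan flip-flop with a dm (diagnostic mode) port."""

    def __init__(self, state=0):
        self.q = state

    @property
    def qb(self):
        # Complementary output QB.
        return 1 - self.q

    def clock(self, d, si, sm, dm):
        """Update the state at a clock edge and return the new Q."""
        if sm == 0:
            self.q = d            # functional mode (assumed independent of dm)
        elif dm == 1:
            self.q = 1 - self.q   # diagnostic mode: complement own state
        else:
            self.q = si           # normal scan shift
        return self.q


if __name__ == "__main__":
    ff = DiagScanFlop(state=0)
    ff.clock(d=0, si=1, sm=1, dm=0)   # shift in a 1
    ff.clock(d=0, si=0, sm=1, dm=1)   # invert the stored value
    print(ff.q, ff.qb)
```

Clocking every flip-flop once with dm = 1 and SM = 1 inverts the whole chain in place, which loads values downstream of a fault without shifting them through the fault site.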
FIGURE 5. An example implementation of scan flip-flops with dm ports
(ports shown in the figure: SM, D, SI, dm, CLK, Q, QB)
[Figure residue, captions not recovered. First figure: chain contents after step 1, after step 2, and at the start of step 3 for an sa1 fault at flop 3, with a transition observed at the 3rd clock cycle; panel (b) is labeled "diagnosis of sa1". Second figure: chain contents after step 2 and at the start of step 3, with a transition observed at the 4th clock cycle.]
[Figure, caption not recovered: a scan chain of flip-flops i+2, i+1, i, i-1 and i-2 with rst and set inputs, driven by diag and clk.]
[Figure 9, caption not recovered: chain contents during diagnosis of stuck-at faults; one panel shows an error observed at the 3rd clock cycle, and panel (b), "diagnosis of sa1" with sa1@flop 3, shows an error observed at the 4th clock cycle.]
As shown in Figure 9, an error is observed at the 3rd clock cycle for the sa0 fault. This
indicates a fault between flip-flops 3 and 2, which matches the assumed fault location. However,
for the sa1 fault, an error is observed at the 4th clock cycle, which appears to suggest a fault
between flip-flops 4 and 3 even though the assumed fault is between flip-flops 3 and 2. This is
because the diagnostic vector at the fault site coincides with the faulty value, thus the fault effect is
masked. Therefore, we have to declare a fault between flip-flops 4 and 2, which spans 3 flip-flops.
In general, the diagnostic resolution for this scheme is 3 flip-flops.
[Figure 10, caption not recovered: chain contents from the start of step 1 for hold time faults; recoverable labels are "an error observed at the 4th clock cycle" and "type-ii@flop 3".]
As shown in Figure 10, for a type-I fault, the first error is observed at the 4th clock cycle,
which indicates a fault between flip-flops 4 and 3. This matches the assumed fault location. However, for the type-II fault at the same location, the first error is observed at the 5th clock cycle,
which appears to indicate a fault between flip-flops 5 and 4. This is because in the diagnostic pattern the first transition seen by the faulty flip-flop does not trigger the fault. Therefore, Procedure II
has to declare a fault between flip-flops 5 and 2, which spans 3 flip-flops.
Figure 11 illustrates the diagnosis of type-III faults. Figure 11 (a) assumes a fault at the
input of flip-flop 3 and Figure 11 (b) assumes a fault at the input of flip-flop 2. For such a fault at
the input of flip-flop 3, an error is observed at the 4th clock cycle. This indicates a fault between
the output of flip-flop 4 and the input of flip-flop 2. For a fault at the input of flip-flop 2 shown in
Figure 11 (b), an error is observed at the 3rd clock cycle. This suggests the existence of a fault
between flip-flops 3 and 2, which is exactly the fault location assumed.
FIGURE 11. Diagnosis of type-III faults
[Figure residue: chain contents from the start of step 1 and in steps 2 & 3 for type-iii@flop 3; recoverable label "an error observed at the 3rd clock cycle"; shift direction is indicated in the original figure.]
6. Conclusions
In the case of zero or low yields due to scan chain failures during prototype/early production runs, it is important to locate the fault. In practice, in addition to stuck-at faults, timing faults
due to hold time violations are also an important cause of scan chain failures. This paper analyzed
the faulty behaviors of these faults and presented simple flush test sequences to distinguish these
faults. Two diagnostic techniques were also presented to simplify scan chain diagnosis.
Acknowledgment: The author would like to acknowledge Dr. S. Adham and Mr. K. Brough
for helpful discussions and careful reading of the draft version of this paper.
References
[1] Kundu, S., On Diagnosis of Faults in a Scan Chain, Proc. VTS93, pp. 303-308.
[2] Schafer, J.L., et al., Partner SRLs for Improved Shift Register Diagnostics, Proc. VTS92, pp. 198-201.
[3] Edirisooriya, S., et al., Diagnosis of Scan Path Failures, Proc. VTS95, pp. 250-255.
[4] Narayanan, S., et al., An Efficient Scheme to Diagnose Scan Chains, Proc. ITC97.