96 CW Thesis

Temperature-Aware Self-Refresh Control
Scheme for Low Power Embedded-DRAM

9661621
(Chih-Wen Cheng)
(Prof. Meng-Fan Chang)
Student: Chih-Wen Cheng
Advisor: Prof. Meng-Fan Chang
A Thesis
Submitted to Department of Electrical Engineering
College of Electrical Engineering and Computer Science
National Tsing Hua University
in Partial Fulfillment of the Requirements
for the Degree of
Master of Science
in
Electrical Engineering
July 2010
Hsinchu, Taiwan, ROC
Abstract
Embedded-DRAMs are widely used in many electronic products because they are
more cost-effective than SRAM and offer faster random read/write access than flash memory.
However, increasingly large power consumption is a major problem in SoC systems,
so low power design must be taken into consideration. In an
embedded-DRAM, the stored data must be retained in the cell array by refreshing at the
conventional period in self-refresh mode. At room temperature, however, the cell data
retention time is much longer than at higher temperatures.
Thus, refreshing at the conventional period at room temperature spends an unnecessary
AC component of data retention power.
To solve this problem, we propose a temperature-aware self-refresh control
scheme that extends the self-refresh period at lower temperatures. By using
discrete-time dynamic tracking to monitor a replica cell array, the conventional self-refresh
period can be extended by powers of two as the temperature varies.
We apply our design to an 8Mb eDRAM macro in a 65nm EDRAM low leakage
process. The experimental results show that a 95.92% reduction of the AC
component of data retention power can be achieved at room temperature.
Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements (Chinese)
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Low Power Embedded-DRAM Applications
1.2 Challenges of Low Power Embedded-DRAM
1.3 Basic Operations of DRAM
1.3.1 Read Operation
1.3.2 Write Operation
1.3.3 Refresh Operation
1.4 Structure of This Thesis
Chapter 2 Leakage and Temperature Dependency
2.1 MOSFET Leakage Mechanisms
2.1.1 Sub-threshold Current (I1)
2.1.2 Gate-Induced Drain Leakage (I2)
2.1.3 Gate-Oxide Tunneling Current (I3)
2.1.4 Hot Carrier Injection Current (I4)
2.1.5 Reverse-Biased Junction BTBT Current (I5)
2.1.6 Punch-Through Current (I6)
2.2 Leakage with Temperature Dependency
Chapter 3 Design Issues in Low Power Self-Refresh Mode
3.1 Cell Data Retention
3.1.1 Cell Structure
3.1.2 Data Retention Time
3.2 Power Consumption
3.3 Conventional Self-Refresh Mode
3.4 Temperature Dependency in Self-Refresh Mode
3.4.1 Retention Time with Temperature Dependency
3.4.2 Power Dissipation with Temperature Dependency
3.5 Previous Works
3.5.1 Replica-Cell Based Self-Refresh Control Scheme
3.5.2 Sensor Based Self-Refresh Control Scheme
3.5.3 Temperature Sensor
Chapter 4 Proposed Scheme
4.1 Motivation of Proposed TASFR Control Scheme
4.2 Structure of Proposed TASFR Control Scheme
4.2.1 Replica Cell Array Structure
4.2.2 Differential Sampling Structure
4.2.3 Adaptive Refresh Period Structure
4.3 Algorithm of Proposed TASFR Control Scheme
Chapter 5 Design Considerations and Analyses
5.1 Design Issues
5.1.1 Short Channel Effects (SCE)
5.1.2 Process Variations
5.2 Design Considerations
5.2.1 Resolution
5.2.2 Resistor Ladder
5.2.3 Comparator
5.3 Analyses of Proposed TASFR Control Scheme
5.3.1 Adaptive Refresh Period
5.3.2 Power Reduction in Self-Refresh Mode
Chapter 6 Macro Implementation
6.1 Macro of Embedded-DRAM
6.1.1 Memory Cell Arrays
6.1.2 Peripheral Circuits
6.1.3 I/O Interface Circuits
6.2 Test Chip Design
Chapter 7 Measurement Results and Conclusions
7.1 Measurement Results
7.2 Summary and Conclusions
7.3 Future Works
List of Figures
Chapter 1
Fig. 1-1: (a) A conceptual DRAM array and (b) an actual data-line configuration [6-8].
Fig. 1-2: The read operation [6-8].
Fig. 1-3: The write operation [6-8].
Fig. 1-4: The refresh operation [6-8].
Chapter 2
Fig. 2-1: MOSFET leakage is composed of sub-threshold leakage (I1), Gate Induced Drain Leakage (GIDL) (I2), gate-oxide tunneling current (I3), hot carrier injection current (I4), reverse-biased junction band-to-band tunneling (BTBT) current (I5), and punch-through current (I6).
Fig. 2-2: (a) Cross section of sub-threshold current model. (b) ID versus different VGS with VDS=1.2V and VDS=0.4V.
Fig. 2-3: (a) GIDL effect with small VDS. (b) GIDL effect with large VDS = VDD. (c) Simulation results of GIDL current.
Fig. 2-4: (a) Fowler-Nordheim (FN) tunneling of electrons for applied more positive high gate voltage. (b) Direct tunneling of electrons for applied positive high gate voltage.
Fig. 2-5: (a) The mechanism of hot carrier injection. (b) Energy band diagram of injection of hot electrons from substrate to oxide.
Fig. 2-6: Leakage current with various temperature in 25nm predict model [13].
Chapter 3
Fig. 3-1: Different types of storage capacitor [14] such as (a) trench-type capacitor, (b) stack-type capacitor.
Fig. 3-2: Various logic-compatible embedded-DRAM cell structures.
Fig. 3-3: Cross section view of embedded-DRAM cell and leakage mechanisms.
Fig. 3-4: (a) Memory cell is subjected to leakage current, thus cell data retention time is also much relevant to leakage current. (b) The loss of stored binary data due to the data line L disturbances. QC = 0 and QN = 0 are assumed [6].
Fig. 3-5: Current versus cycle time [6].
Fig. 3-6: (a) The refresh operation [6]. (b) The self-refresh operation.
Fig. 3-7: (a) Cell bias conditions in the self-refresh mode. The charge of storage node (SN) in memory cell will be discharged by leakage current. (b) The voltage drop in memory cell is sensitive to temperature because IBTBT is also temperature-dependent.
Fig. 3-8: Power consumption in self-refresh mode versus temperatures with three self-refresh periods.
Fig. 3-9: Power consumption in self-refresh mode versus self-refresh period in two cases, temperature 25 °C and 85 °C.
Fig. 3-10: (a) The developed refresh timer with special self-refresh control. (b) Timing diagram of the self-refresh control scheme.
Fig. 3-11: Comparison of cell-leakage monitoring scheme. (a) Conventional fixed-plate scheme [31]. (b) Plate-floating leakage monitoring (PFM) scheme. (c) Timing diagram with PFM and conventional scheme.
Fig. 3-12: (a) Block diagram of the temperature-insensitive self-recharging circuitry. (b) Memory cell. (c) Comparator.
Fig. 3-13: (a) Self refresh controller with temperature detecting circuit. (b) Timing diagram for temperature detecting circuit.
Fig. 3-14: (a) Schematic of the temperature sensor with binary weighted variable resistors. (b) Simplified function diagram. (c) Block diagram and timing diagram for the self-refresh period control with temperature sensor.
Fig. 3-15: (a) Temperature sensor. (b) Simplified function diagram.
Fig. 3-16: (a) Thermometer block. (b) Self refresh and thermometer control scheme. (c) Temperature sensor. (d) Dual-slope integrating analog-to-digital converter.
Fig. 3-17: (a) Operating principle of the temperature sensor. (b) Temperature dependency of the key voltages in the sensor. (c) Block diagram of the modulator. (d) Block diagram of the temperature sensor.
Fig. 3-18: Schematic of a novel low overhead CMOS temperature sensor that uses a differential amplifier to amplify the temperature dependence of mobility and to minimize the temperature dependence of VTH.
Chapter 4
Fig. 4-1: Block diagram of the proposed scheme.
Fig. 4-2: Simplified block diagram of the proposed TASFR control scheme.
Fig. 4-3: Replica cell array structure is incorporated into this scheme. It can make a copy of the actual memory cell array and detect the replica cells to obtain the on-chip information.
Fig. 4-4: Differential sampling structure is adopted in TASFR control scheme. Global process variations and system offset can be cancelled by using differential sampling structure.
Fig. 4-5: Adaptive refresh period structure. Conventional self-refresh period can be divided into slower self-refresh period at lower temperature in order to reduce power dissipation of embedded-DRAM.
Fig. 4-6: A simplified case study is used to describe how the TASFR control scheme works.
Fig. 4-7: Flow chart of TASFR control scheme.
Chapter 5
Fig. 5-1: (a) For long-channel devices, the drain voltage has negligible effect on the barrier at the source-channel interface. (b) For short-channel devices, the drain voltage tends to reduce the barrier at the source end. (c) VT across effective channel length at VDS = 1.2V and VDS = 0.1V.
Fig. 5-2: (a) Process variations. (b) Distributions of MOSFET threshold voltage (VT) across multiples of short channel length with W = 1 µm.
Fig. 5-3: Two steps of signal amplification in sense amplifiers. (a) Selected WL turning on and charge shared between cell capacitance CS and data-line parasitic capacitance CDL. (b) Sense amplifier starts to amplify the small voltage swing data-line after charge sharing.
Fig. 5-4: Yield of latch-type sense amplifier in eDRAMs.
Fig. 5-5: Block diagram of the resistor ladder.
Fig. 5-6: Monte-Carlo simulation for resistor ladder for 10K trials with local statistical variations.
Fig. 5-7: Two types of latch-type sense amplifiers widely used in memory.
Fig. 5-8: Yield versus channel width with different MOSFET as channel length of MOSFETs all are 0.5 µm in the latch-type comparator.
Fig. 5-9: Yield versus input difference VIN with different VINDC.
Fig. 5-10: Yield versus input difference VIN with different VDD.
Fig. 5-11: The self-refresh period oscillator is generated from eDRAM macro. (a) Self-refresh period variations with five process corners. (b) Self-refresh period variations with core supply voltage fluctuations in the range of VDD-10% to VDD+10%.
Fig. 5-12: Self-refresh period across various temperatures with five process corners.
Fig. 5-13: Self-refresh period across various temperatures with different supply voltage VARY.
Fig. 5-14: (a) Normalized power consumption of AC component of data retention power with and without TASFR control scheme across various temperatures. (b) Power reduction of AC component of data retention power with TASFR control scheme with various temperatures.
Chapter 6
Fig. 6-1: Conventional DRAM basic architecture [6, 7].
Fig. 6-2: Scrambling techniques [50] have been widely used in DRAM circuit.
Fig. 6-3: Internal voltage generators for modern DRAMs [53].
Fig. 6-4: Annotated macro layout of the 8M-bit embedded-DRAM in 65nm EDRAM low leakage process.
Fig. 6-5: Block diagram of the test chip design for the 8M-bit eDRAM macro with TASFR control scheme in 65nm EDRAM low leakage process.
Fig. 6-6: Annotated layout shot of the test chip in 65nm EDRAM low leakage process.
Chapter 7
Fig. 7-1: Conventional self-refresh period generated from the self-refresh oscillator.
Fig. 7-2: Oscilloscope plot showing adaptive self-refresh period at different temperatures.
Fig. 7-3: (a) Power consumption of AC component of data retention power with and without TASFR control scheme across various temperatures. (b) Power reduction of AC component of data retention power with TASFR control scheme across various temperatures.
Fig. 7-4: Die photograph of the 8M-bit eDRAM macro with proposed TASFR control scheme in UMC 65nm EDRAM low leakage process. Top level of test-chip is filled with dummy metal layer obscures the features in the interior of the die photograph.
List of Tables
Chapter 3
Table 3-1: Maximum data retention time across various temperatures from the results simulated by the predicted 65nm EDRAM cell model.
Chapter 5
Table 5-1: Design considerations for area penalty and dc current consumption about resistor ladder in our application.
Chapter 7
Table 7-1: Performance summary table of proposed TASFR control scheme.
Chapter 1
Introduction
This chapter outlines the applications of and requirements for embedded
dynamic random access memory (embedded-DRAM) in embedded systems. Leakage
current mechanisms in MOSFETs are then described, because the data stored in a
memory cell are highly susceptible to leakage current. The organization of this thesis
is given at the end of the chapter.
1.1 Low Power Embedded-DRAM Applications

Low power and low energy design techniques are increasingly important for modern
VLSI and System on a Chip (SoC) design, especially in consumer electronics
applications. Rapid improvements in semiconductor technology and rapid progress in
memory density have given memories an important role in enhancing the
performance and reducing the cost of electronic systems. Many memory systems are
now embedded in SoCs to raise system performance. Because these memory systems
consume a large amount of power, low power embedded memory has become an
emerging issue.
Embedded SRAM macros are widely used as SoC memory because of their faster
random access, while embedded-DRAM is more cost-effective because of its smaller
memory cell size [1]. DRAM has the advantages of lower cost per bit than SRAM and
faster random read/write access than flash memory. Embedded-DRAMs have
previously been widely used in graphics engine chips for their high data rate, which
has resulted in dual ports on the sense amplifiers and wide I/O buses on the memory
arrays. When considering the requirements for DRAMs in battery-operated portable
equipment, it is expected that a DRAM data retention current as small as SRAM
standby current will be needed so as to reduce the size and weight allocated to
batteries. Recently, portable multimedia equipment that handles not only audio/speech
data but also moving-picture data has increased rapidly in the worldwide market. Such
portable multimedia equipment is based on the MPEG standard; examples include
videophones, personal digital assistants (PDAs), digital video disk recorders, and
codec systems for satellite broadcast. The emerging market in mobile applications
such as third-generation and wide-band code-division multiple-access (WCDMA)
phones, personal digital assistants, and hand-held personal computers has nourished
application-specific DRAM with mobile features attractive to system users seeking a
tightened power budget for memory [2]. In computer system applications, an n-bit
macro cell that implements one static cell and n-1 dynamic cells has been
proposed [3]. This cell is intended for use in an n-way set-associative first-level data
cache.
Although multimedia systems have previously used discrete SDRAM memory
chips, demand for SoCs with embedded-DRAM has become strong for the sake of
equipment portability. An encoder chip for high-definition TV, for example, needs not
only low power consumption and a small macro size but also a high data rate. To
realize an embedded-DRAM macro that covers all of these demands in a single chip
design, our design goals are high-speed access at low voltage, low power consumption,
and small macro size. Furthermore, in the SoC field, compatibility with customer
requests and improved testability have become key points [4, 5].
1.2 Challenges of Low Power Embedded-DRAM

The simplest way to reduce the power dissipation in the active mode of memories
is to reduce the supply voltage or to increase the cycle time [6-8]. DRAMs dissipate
somewhat more power in standby mode than SRAMs, since the data must be
refreshed periodically, so standby power reduction is also important. One of the most
important criteria for mobile DRAM is a small data retention current. Because the
time spent in standby mode is normally much longer than that spent in active mode,
even a small power dissipation in standby mode is not negligible. Refresh
current for data retention accounts for a large part of the power dissipation in standby
mode. Besides refresh current, a considerable standby current still exists in
DRAMs, since they have various internal voltage generators such as
high word-line voltage generators (VPP), substrate back-bias generators (VBB), and
internal voltage down converters (VINT). Most of this current is due to the level
detectors (level comparators) or voltage reference circuits. Therefore, to reduce the
standby current, the bias current through the level detectors or voltage reference
circuits should be small enough. But if the bias current is too small, the desired voltage
level of the level detector or voltage reference circuit can become very sensitive to
variations in process parameters or temperature, which makes practical application
rather difficult.
Power consumption in DRAM can be divided into two parts, the dc and the ac
current components, and reducing both components in
data-retention mode is a prime concern. To reduce the dc current component, it is
necessary to minimize the power of the on-chip voltage converters, such as the voltage
down converter, the voltage up converter, the substrate back-bias generator, the VREF
generator, and the half-VDD generator. To reduce the ac current component, on the
other hand, it is necessary to extend the refresh time and reduce the refresh charge.
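As a rough illustration of why extending the refresh time reduces the ac component, the sketch below assumes that each refresh pass over the array costs a fixed energy, so that the average ac refresh power scales inversely with the refresh period. The function names, the energy per pass, and the base period are hypothetical values chosen for illustration and are not taken from the measured macro; the power-of-two multipliers follow the description in the abstract.

```python
def ac_refresh_power(energy_per_pass_j: float, period_s: float) -> float:
    """Average ac refresh power, assuming a fixed energy per refresh pass."""
    if period_s <= 0:
        raise ValueError("refresh period must be positive")
    return energy_per_pass_j / period_s


def ac_power_reduction(period_multiplier: float) -> float:
    """Fractional ac power saving when the refresh period is stretched."""
    if period_multiplier < 1:
        raise ValueError("period multiplier must be >= 1")
    return 1.0 - 1.0 / period_multiplier


if __name__ == "__main__":
    energy_per_pass = 1e-9   # hypothetical: 1 nJ per refresh pass
    base_period = 1e-3       # hypothetical: 1 ms conventional period
    for k in range(6):
        multiplier = 2 ** k  # period extended by a power of two
        period = base_period * multiplier
        power = ac_refresh_power(energy_per_pass, period)
        saving = ac_power_reduction(multiplier)
        print(f"x{multiplier:<3d} period={period * 1e3:.1f} ms  "
              f"power={power * 1e6:.3f} uW  reduction={saving:.2%}")
```

Under this idealized model each doubling of the period halves the ac refresh power; a real macro also has dc components and control overhead that this sketch ignores.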
In this thesis, we focus on techniques for reducing the self-refresh current in
standby mode, because the DRAM data-retention current is governed by
the dynamic refresh current, which depends strongly on the refresh cycle. The
data retention time of the memory cell, in turn, depends strongly on temperature.
We therefore propose a temperature-aware self-refresh (TASFR) control scheme for low
power embedded-DRAM, which is introduced in more detail in Chapter 4.
1.3 Basic Operations of DRAM
Fig. 1-1: (a) A conceptual DRAM array and (b) an actual data-line configuration [6-8].
A conceptual DRAM array with an actual data-line configuration is shown in
Fig. 1-1. The conceptual array consists of a conventional 1T1C memory
cell array of n rows by m columns, pre-charge circuits and equalizers (PE), latch-type
CMOS sense amplifiers (SA), and I/O lines. For each column, the memory cells, a
pre-charge circuit and equalizer, and an SA are connected to a pair of data lines (DLs),
which communicate with a pair of common data input/output lines (I/O and its
complement) through a column switch. The 1T1C memory cell is chosen for its high
density, and its operation comprises read, write, and refresh operations. Other
cell structures for CMOS logic process technology and capacitor types for advanced
process technology are described in Chapter 3. The following subsections
introduce these operations in detail.
1.3.1 Read Operation
Fig. 1-2: The read operation [6-8].

In the read operation, the voltage stored at the cell node (N) of each cell along the
word line, as shown in Fig. 1-1, is read out onto the corresponding data line. To
read out the cell data completely, the word-line voltage must be boosted above
VDD, to at least VDD+VT, so that the pass transistor turns on fully. As a result of
charge sharing between the cell capacitor of the cell being read and the parasitic
capacitor of the corresponding data line after the word line is activated, the signal
voltage (VS) developed on the floating data line (DL) is
expressed by
\[ V_S = \frac{V_{DD}}{2} \cdot \frac{C_S}{C_D + C_S} \tag{1-1} \]
Because the readout is destructive, every cell along the word line requires high-gain
amplification and write-back restoration. When the signals SN and SP go to their
respective potentials, VSS and VDD, the sense amplifier amplifies the
small voltage difference between DL and its complementary data line. Then, the
amplified signal on one of the data-line pairs is output as a differential voltage to the
I/O lines by activating a selected column line, YL.
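To give a feel for the magnitude of the signal in Eq. (1-1), the following sketch evaluates the charge-sharing expression, taking CS as the cell capacitance and CD as the data-line parasitic capacitance with the data line pre-charged to VDD/2. The capacitance and supply values are hypothetical and only illustrate the arithmetic; they are not the values of the macro described in this thesis.

```python
def read_signal_voltage(vdd: float, c_s: float, c_d: float) -> float:
    """Signal voltage on a floating data line after charge sharing, Eq. (1-1).

    vdd -- supply voltage in volts (data line assumed pre-charged to vdd / 2)
    c_s -- cell storage capacitance in farads
    c_d -- data-line parasitic capacitance in farads
    """
    if c_s <= 0 or c_d < 0:
        raise ValueError("capacitances must be positive")
    return (vdd / 2.0) * c_s / (c_d + c_s)


if __name__ == "__main__":
    vdd = 1.2                     # hypothetical supply voltage
    c_s = 20e-15                  # hypothetical 20 fF cell capacitor
    for c_d in (50e-15, 100e-15, 200e-15):   # hypothetical data-line loads
        v_s = read_signal_voltage(vdd, c_s, c_d)
        print(f"CD = {c_d * 1e15:.0f} fF -> VS = {v_s * 1e3:.1f} mV")
```

The sketch shows why a long, heavily loaded data line is a concern: as CD grows relative to CS, the signal available to the sense amplifier shrinks.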
1.3.2 Write Operation
Fig. 1-3: The write operation [6-8].

In the write operation, a read operation is always performed first, because writing
before sensing would introduce noise into adjacent bit lines. After the
pre-amplification is almost complete, a pair of differential data-in voltages, VDD and
VSS, is driven from the I/O lines onto the selected pair of data lines, and the new data
are thereby written into the addressed memory cell. As a result of the preceding read
operation, the remaining cells on the selected word line are simultaneously amplified
and restored, which avoids loss of the data already
stored.
1.3.3 Refresh Operation
Fig. 1-4: The refresh operation [6-8].

The voltage stored on the cell capacitor of each cell is subject to leakage current,
so a stored "1" degrades toward a low voltage over time. Eventually the
sense amplifier can no longer distinguish whether the original stored data was "1" or
"0". For this reason, the data stored in the memory cell must be restored by a refresh
operation, which is almost the same as the read operation except that all YLs are kept
inactive. For each word line, the data of the cells on the activated word line are read
and restored by the sense amplifiers, so that all of the cells retain their data for at
least tREFmax. Here tREFmax is the maximum refresh time for the memory cell, which is
determined in subsection 3.1.2. Each memory cell is therefore refreshed periodically
at intervals of tREFmax, as shown in Fig. 1-4, although each cell actually has a
longer data retention time.
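The refresh bookkeeping can be sketched as follows, under the assumption that the n word lines are refreshed one at a time and evenly spaced, so that every row is revisited exactly once per tREFmax. This even spacing, the function names, and the numerical values are illustrative assumptions rather than the timing of the macro described in this thesis.

```python
def row_refresh_interval(t_ref_max_s: float, n_rows: int) -> float:
    """Time between successive word-line refreshes when n_rows are
    spread evenly over one refresh interval t_ref_max_s."""
    if t_ref_max_s <= 0 or n_rows <= 0:
        raise ValueError("t_ref_max_s and n_rows must be positive")
    return t_ref_max_s / n_rows


def refresh_schedule(t_ref_max_s: float, n_rows: int, cycles: int = 1):
    """List of (time_s, row) refresh events for the given number of
    complete passes over the array."""
    step = row_refresh_interval(t_ref_max_s, n_rows)
    return [(i * step, i % n_rows) for i in range(n_rows * cycles)]


if __name__ == "__main__":
    # Hypothetical example: 4 rows, tREFmax = 4 ms, two full passes.
    for time_s, row in refresh_schedule(4e-3, 4, cycles=2):
        print(f"t = {time_s * 1e3:.1f} ms: refresh word line {row}")
```

In this model each word line is refreshed once every tREFmax, so stretching tREFmax directly lowers the number of refresh events per second.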
1.4 Structure of This Thesis

In Chapter 1, the demand for low power and low energy design techniques is
introduced, together with our emphasis on the applications of low power
embedded-DRAM. The basic DRAM operations are then briefly illustrated.
In Chapter 2, leakage mechanisms in nanometer devices are explained in detail:
sub-threshold current (I1), gate-induced drain leakage current (I2), gate-oxide
tunneling current (I3), hot carrier injection current (I4), reverse-biased junction
band-to-band tunneling (BTBT) current (I5), and punch-through current (I6). Moreover,
we show how MOSFET leakage current depends on temperature.
In Chapter 3, design issues in self-refresh mode are discussed. The issues
of data retention time and power consumption are introduced in turn. We then
show the temperature dependency of the self-refresh period and the data retention
power in conventional self-refresh mode. Finally, previous works that offer useful
solutions to the shortcomings of conventional self-refresh mode are reviewed.
In Chapter 4, the proposed TASFR control scheme, which includes three important
features, is described. After presenting the design motivation and the structure
of the proposed TASFR control scheme, we use a simplified case study to explain its
algorithm.
In Chapter 5, design issues in nanometer circuits are illustrated first, followed by
the design considerations of the proposed TASFR control scheme. Simulation
analyses are given at the end of the chapter.
In Chapter 6, we give an overview of the embedded-DRAM circuit. Several
previously developed techniques that mitigate design difficulties and are widely used
in embedded-DRAM circuits are shown first. The test chip design and layout floorplan
are described at the end of the chapter.

Chapter 7 first presents the hardware measurement results of the 8M-bit
embedded-DRAM with the proposed TASFR control scheme, fabricated in a 65nm
EDRAM low leakage process. We then summarize and conclude the thesis. Finally, we
discuss improvements and modifications of the proposed TASFR control scheme for
future work.
Chapter 2
Leakage and Temperature Dependency
Semiconductor devices are very sensitive to temperature variation, which has an
appreciable effect on device characteristics. In this chapter,
MOSFET leakage mechanisms are introduced first. Next, the temperature dependency
of leakage current is considered. Power consumption increases considerably at higher
temperature, and the resulting heat raises the on-die temperature, which in turn
increases power dissipation again. There is therefore a
positive feedback between power consumption and on-die temperature. For this reason,
temperature dependency in MOSFETs is an important design issue for many high
performance applications.
2.1 MOSFET Leakage Mechanisms

[Figure 2-1: cross section of an NMOS transistor (source, drain, N+ regions, STI,
P-substrate, body) annotated with leakage paths I1 to I6.]
Fig. 2-1: MOSFET leakage is composed of sub-threshold leakage (I1), Gate Induced
Drain Leakage (GIDL) (I2), gate-oxide tunneling current (I3), hot carrier injection
current (I4), reverse-biased junction band-to-band-tunneling (BTBT) current (I5), and
punch-through current (I6).
In the deep sub-micrometer regime, threshold voltage, channel length, and gate oxide
thickness are reduced in order to achieve higher density, higher performance, and
lower power dissipation. This aggressive scaling of CMOS technology gives rise to a
significant increase in leakage current in each generation. Moreover, the increasing
statistical variation in process parameters has emerged as a serious problem in
nano-scaled circuit design, so circuit designers face many challenges in nano-scaled
technologies.
MOSFET leakage mechanisms can be categorized into six components, as illustrated
in Fig. 2-1. I1 is the sub-threshold leakage [9]; I2 is the Gate Induced Drain Leakage
(GIDL) [6, 7]; I3 is the oxide tunneling current [10]; I4 is the gate current due to
hot-carrier injection [11-13]; I5 is the reverse-biased junction BTBT leakage [10]; and I6
is the channel punch-through current [6, 11-13]. Currents I1, I2, and I6 are off-state
leakage mechanisms, while I3 and I5 occur in both ON and OFF states. I4 can occur in
the off state, but more typically occurs while the transistor bias states are in transition.
2.1.1 Sub-threshold Current (I1)

The drain current of a MOSFET in weak inversion differs from that in strong
inversion. In strong inversion, the channel is inverted and current flows by drift
under the electric field. In weak inversion, the channel is not inverted and current
flows by diffusion, so the transistor behaves like a Bipolar Junction Transistor
(BJT). Equation (2-1) shows a basic model for the sub-threshold current and total off
current.
$$I_{DS,\mathrm{subthreshold}} = I_0 \exp\left(\frac{V_{GS} - V_T}{n V_{th}}\right) \tag{2-1}$$
where I0 is the drain current at VGS = VT, given in equation (2-2).
[Figure 2-2: (a) MOSFET cross section with VGS, VDS, VBB, COX, CDP, and the depletion
region in the P-substrate; (b) ID (A, log scale, 1E-13 to 1E-03) versus VGS (V, 0 to
1.2) for VDS = 1.2V and VDS = 0.4V.]
Fig. 2-2: (a) Cross section of sub-threshold current model. (b) ID versus VGS
with VDS=1.2V and VDS=0.4V.
$$I_0 = \mu_0 C_{ox} \frac{W}{L} (n - 1) V_{th}^2 \tag{2-2}$$
IDS depends exponentially on VGS, as expected for a diffusion current. VT is the
transistor threshold voltage, Vth is the thermal voltage (Vth = kT/q), and n is the
sub-threshold slope factor (n = 1 + Cd/Cox). If the roll-off at low VDS is taken into
account, equation (2-1) becomes equation (2-3), shown below.
$$I_{DS} = I_0 \exp\left(\frac{V_{GS} - V_T}{n V_{th}}\right) \left[1 - \exp\left(-\frac{V_{DS}}{V_{th}}\right)\right] \tag{2-3}$$
As the channel length continues to shrink in nano-scale technology, short channel
effects (SCE) must be taken into consideration in circuit design. One example is
drain-induced barrier lowering (DIBL), which can be modeled with a linearized
coefficient η that multiplies VDS, as shown in equation (2-4).
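$$I_{DS} = I_0 \exp\left(\frac{V_{GS} - V_T + \eta V_{DS}}{n V_{th}}\right) \left[1 - \exp\left(-\frac{V_{DS}}{V_{th}}\right)\right] \tag{2-4}$$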
The sub-threshold slope (S) of a MOSFET is defined as dVGS/d(log10 IDS), and it can
be approximated by Equation (2-5) for simplicity.
$$S = n V_{th} \ln 10 \tag{2-5}$$
The ideal value of S at room temperature is 60mV/decade, when n = 1. This is hard to
reach in a bulk device, because it requires Cd to approach zero or Cox to approach
infinity. Typically, S is roughly equal to 70mV/decade in modern CMOS technology
for n ≈ 1.4~1.5.
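The sub-threshold model of Equations (2-1) to (2-5) can be sketched numerically as follows. This is a minimal illustration, not a calibrated device model: the function names and the device values (`N_FACTOR`, `V_T`, `I0`) are hypothetical and chosen only to exercise the formulas.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def thermal_voltage(temp_k):
    """Thermal voltage Vth = kT/q in volts."""
    return K_B * temp_k / Q_E


def subthreshold_current(v_gs, v_ds, v_t, n, i0, temp_k=300.0, eta=0.0):
    """Equation (2-4); eta = 0 reduces it to Equation (2-3)."""
    v_th = thermal_voltage(temp_k)
    gate_term = math.exp((v_gs - v_t + eta * v_ds) / (n * v_th))
    drain_term = 1.0 - math.exp(-v_ds / v_th)
    return i0 * gate_term * drain_term


def subthreshold_slope(n, temp_k=300.0):
    """Equation (2-5): S = n * Vth * ln(10), in volts per decade."""
    return n * thermal_voltage(temp_k) * math.log(10.0)


# Hypothetical device values, for illustration only.
N_FACTOR, V_T, I0 = 1.4, 0.4, 1e-7
s = subthreshold_slope(N_FACTOR)
i_high = subthreshold_current(0.2, 1.2, V_T, N_FACTOR, I0)
i_low = subthreshold_current(0.2 - s, 1.2, V_T, N_FACTOR, I0)
decade_ratio = i_high / i_low   # lowering VGS by S cuts the current tenfold
```

With n = 1 and T = 300 K, `subthreshold_slope` evaluates to roughly 0.0595 V/decade, in line with the ideal 60mV/decade quoted above.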
2.1.2 Gate-Induced Drain Leakage (I2)

Gate-induced drain leakage is caused by the high electric field in the drain junction
of a MOSFET. When the gate is biased to form an accumulation layer at the silicon
surface, the surface under the gate has almost the same potential as the p-type
substrate. Because of the accumulated holes, the silicon near the surface behaves like
a p region that is much more heavily doped than the substrate. For this reason, the
depletion layer at the surface becomes much narrower than elsewhere.
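[Figure 2-3: (a), (b) NMOS cross sections under negative gate bias (VG < 0) with
VD > 0 and with VD = VDD, showing holes accumulated under the gate near the n+ drain
in the P-substrate (VBB); (c) simulated drain current with the sub-threshold and GIDL
regions marked.]
Fig. 2-3: (a) GIDL effect with small VDS. (b) GIDL effect with large VDS = VDD. (c)
Simulation results of GIDL current.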
Furthermore, the narrowing of the depletion layer at or near the surface results in
field crowding, that is, an increase in the local electric field, which enhances
high-field effects in that region. If the gate is biased more negatively, the n+ drain
region under the gate can be depleted and even inverted. This causes more field
crowding and a higher peak field, and results in a dramatic increase of high-field
effects such as avalanche multiplication and BTBT. The tunneling probability via
near-surface traps also increases. The minority carriers that accumulate or form in
the drain depletion region under the gate are swept laterally to the substrate,
because the substrate is always at a lower potential than the other nodes. This
current flow is called Gate-Induced Drain Leakage. GIDL current becomes more serious
in deep submicron devices, and it depends strongly on the electric field, as shown in
equation (2-6)
$$I_{GIDL} = A E^{2.5} \exp\left(-\frac{B}{E}\right) \tag{2-6}$$
where A and B are constants.

In short, GIDL is the leakage current that flows from the drain to the substrate
through the band-to-band-tunneling mechanism, and it is exacerbated by a larger
negative gate bias. In DRAM operation, charge must be prevented from leaking away
from the storage capacitor through any of the leakage mechanisms. A negative gate
bias is an effective way to lower the off-state leakage current. As explained above,
however, the leakage becomes more serious again if the gate bias is too negative.
2.1.3 Gate-Oxide Tunneling Current (I3)
[Figure 2-4: energy band diagrams (EC, EV) of the n+ poly-silicon gate, gate oxide,
and p substrate with oxide voltage +VOX and barrier height φOX, for cases (a) and (b).]
Fig. 2-4: (a) Fowler-Nordheim (FN) tunneling of electrons under a more positive
(higher) gate voltage. (b) Direct tunneling of electrons under a positive gate
voltage.
In order to obtain better gate control and larger MOSFET drive capability, the gate
oxide thickness is reduced from one technology generation to the next. Unfortunately,
the MOSFET gate tunneling current increases exponentially as the gate oxide becomes
thinner, which results in larger power dissipation from gate tunneling leakage that
was almost nonexistent in previous technology generations. The tunneling mechanism
between the gate poly-silicon and the substrate can be divided into two types. The
first is Fowler-Nordheim (FN) tunneling, in which most electrons tunnel through a
triangular potential barrier. Fig. 2-4 (a) shows the energy band diagram of FN
tunneling of electrons from the inverted surface to the gate. The current density in
FN tunneling is given by
$$J_{FN} = \frac{q^3 E_{OX}^2}{16\pi^2 \hbar \phi_{OX}} \exp\left(-\frac{4\sqrt{2m^*}\,\phi_{OX}^{3/2}}{3\hbar q E_{OX}}\right) \tag{2-7}$$
where EOX is the field across the oxide; φOX is the barrier height for electrons in the
conduction band; and m* is the effective mass of an electron in the conduction band of
silicon. The FN tunneling current density equation is valid only when VOX > φOX,
where VOX is the voltage drop across the gate oxide. Modern short channel devices
mostly operate at VOX < φOX in normal operation, so the FN tunneling current can
usually be ignored. The second type is direct tunneling, in which most electrons
tunnel from the inverted silicon surface directly to the gate through the forbidden
gap of the SiO2 layer. In direct tunneling, electrons tunnel through a trapezoidal
potential barrier instead of a triangular one, so direct tunneling occurs at
VOX < φOX. The current density in direct tunneling is given by
$$J_{DT} = A E_{OX}^2 \exp\left(-\frac{B\left[1 - \left(1 - \frac{V_{OX}}{\phi_{OX}}\right)^{3/2}\right]}{E_{OX}}\right) \tag{2-8}$$
where A = q³/(16π²ħφOX) and B = 4√(2m*)φOX^(3/2)/(3ħq). Direct tunneling is the
dominant component in modern CMOS devices because the gate oxide thickness keeps
shrinking. Both types of gate tunneling current are very sensitive to the oxide
thickness (tOX), while they are less sensitive to the junction temperature (Tj) and
gate voltage (VGS). To date, high-K gate dielectric materials are the most popular
way to reduce gate leakage effectively. Nevertheless, innovative circuit techniques
to reduce gate direct tunneling still need to be developed, since further scaling
of the high-K insulation film would again result in significant gate leakage power
dissipation.
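To make the oxide-thickness sensitivity of Equations (2-7) and (2-8) concrete, the following sketch evaluates both current densities in SI units. The barrier height of 3.1 eV and the effective mass of 0.4 electron masses are assumptions for illustration, not values taken from this thesis, and the function names are hypothetical.

```python
import math

Q_E = 1.602176634e-19     # elementary charge, C
H_BAR = 1.054571817e-34   # reduced Planck constant, J*s
M_0 = 9.1093837015e-31    # electron rest mass, kg


def tunneling_constants(phi_ox_ev, m_eff):
    """Constants A (A/V^2) and B (V/m) shared by Equations (2-7) and (2-8)."""
    phi_j = phi_ox_ev * Q_E   # barrier height converted to joules
    a = Q_E**3 / (16.0 * math.pi**2 * H_BAR * phi_j)
    b = 4.0 * math.sqrt(2.0 * m_eff) * phi_j**1.5 / (3.0 * H_BAR * Q_E)
    return a, b


def fn_current_density(e_ox, phi_ox_ev, m_eff):
    """Equation (2-7): FN current density in A/m^2 for a field e_ox in V/m."""
    a, b = tunneling_constants(phi_ox_ev, m_eff)
    return a * e_ox**2 * math.exp(-b / e_ox)


def direct_current_density(v_ox, t_ox, phi_ox_ev, m_eff):
    """Equation (2-8): direct tunneling current density in A/m^2."""
    if not 0.0 < v_ox < phi_ox_ev:
        raise ValueError("direct tunneling model assumes 0 < VOX < phi_OX")
    a, b = tunneling_constants(phi_ox_ev, m_eff)
    e_ox = v_ox / t_ox   # uniform field across the oxide
    shape = 1.0 - (1.0 - v_ox / phi_ox_ev) ** 1.5
    return a * e_ox**2 * math.exp(-b * shape / e_ox)


# Assumed parameters, for illustration only.
PHI_OX_EV = 3.1
M_EFF = 0.4 * M_0
j_2nm = direct_current_density(1.0, 2.0e-9, PHI_OX_EV, M_EFF)
j_1p5nm = direct_current_density(1.0, 1.5e-9, PHI_OX_EV, M_EFF)
```

Under these assumptions, thinning the oxide from 2 nm to 1.5 nm at the same VOX raises the direct tunneling current density by more than two orders of magnitude, which reflects the exponential thickness dependence described above.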
2.1.4 Hot carrier injection Current (I4)

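[Figure 2-5: (a) NMOS cross section biased with VGS (< VDS), VDS, and VBB, showing the
inverted channel, the depletion region, the gate current IG, and the substrate
current IBB; (b) energy band diagram (EC, EV, +VOX) across the p substrate, gate
oxide, and n+ poly-silicon gate.]
Fig. 2-5: (a) The mechanism of hot carrier injection. (b) Energy band diagram of the
injection of hot electrons from the substrate into the oxide.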
As the channel length of a MOSFET is scaled down at a fixed VDD, the longitudinal
electric field near the drain strengthens. Eventually, electrons flowing from the
source to the drain gain enough kinetic energy from the high longitudinal field near
the drain end to become channel hot electrons, and they generate electron-hole pairs
by impact ionization at the drain end. Some of the generated electrons flow into the
drain. The others are accelerated toward the oxide by the transverse electric field,
injected into the gate insulator as gate current (IG), and trapped there. This causes
a gradual shift of VT and a decrease in the transconductance of the MOSFET, a form of
device degradation known as hot-carrier-induced degradation. Hot-carrier-induced
degradation is not thermally activated; it relies on the kinetic energy of the channel
carriers. It is not discussed in more detail in this thesis. On the other hand, some
of the generated holes flow into the substrate, resulting in a substrate current
(IBB). The IBB of an NMOS is 3 orders of magnitude larger than that of a PMOS, because
of a larger impact ionization coefficient and a higher electric field that stems from
a sharper impurity profile near the drain. For this reason, the device parameters of
an NMOS degrade more noticeably. The maximum substrate current, IBBMAX, which usually
develops at VGS ≈ VDS/2, is expressed by the following equation.
$$I_{BBMAX} \propto \exp\left(-\frac{\alpha}{V_{DS}}\right) \tag{2-9}$$
where α is a constant. The lifetime (τHC) of an NMOS, defined as the time at which VT
has degraded by 100mV due to channel hot electrons, is given by equation (2-10),
$$\tau_{HC} \propto \frac{(I_{BBMAX}/W)^{-n}}{f\, t_{SUB}} \tag{2-10}$$
where n is a factor of about 2.5~3, W is the channel width, tSUB is the width of an
IBB pulse with amplitude IBBMAX, and f is the pulse frequency. Clearly, lowering VDS,
for example by lowering VDD, is the most efficient way to reduce IBB, because IBBMAX
decreases exponentially as VDS is lowered. A one-order reduction of IBBMAX extends
the MOSFET lifetime by about 3 orders.
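The scaling relations of Equations (2-9) and (2-10) can be checked with a few lines of arithmetic. The sketch below assumes W, f, and tSUB are held fixed and treats the constant in Equation (2-9) as a free parameter `alpha`; the function names and the sample value of `alpha` are hypothetical.

```python
import math


def substrate_current_ratio(v_ds_new, v_ds_old, alpha):
    """I_BBMAX(new) / I_BBMAX(old) from Equation (2-9)."""
    return math.exp(-alpha / v_ds_new) / math.exp(-alpha / v_ds_old)


def lifetime_gain(ibb_ratio, n=3.0):
    """tau_HC(new) / tau_HC(old) from Equation (2-10) at fixed W, f, and tSUB."""
    return ibb_ratio ** (-n)


# A tenfold drop in I_BBMAX with n = 3 gives a lifetime about 1000 times longer.
gain_for_one_order = lifetime_gain(0.1, n=3.0)

# Illustrative only: alpha = 10 V is a placeholder, not a measured constant.
ratio_when_lowering_vds = substrate_current_ratio(1.0, 1.2, alpha=10.0)
```

With n = 2.5 instead of 3, the same tenfold reduction yields a gain of about 316, which is why the text quotes the lifetime extension as roughly 3 orders.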
2.1.5 Reverse-Biased Junction BTBT Current (I5)

In an NMOS device, heavily doped n-type impurities are implanted into the source and
drain, while the substrate is p-type. A one-sided n+p junction diode is therefore
formed, where n+ denotes degenerately doped n-type material. In normal operation,
this n+p junction diode is always reverse biased, so the reverse-biased junction
current must be included in the MOSFET leakage model.
The reverse-biased junction leakage current has two main components [10]. One is
the minority carrier diffusion/drift near the edge of the depletion region. The
forward-biased PN junction current is given by equations (2-11) and (2-12)
$$J = J_0\left[\exp\left(\frac{qV_a}{kT}\right) - 1\right] \tag{2-11}$$
$$J_0 = q\left(\frac{D_n n_{p0}}{L_n} + \frac{D_p p_{n0}}{L_p}\right) \tag{2-12}$$
where Va is the applied voltage across the junction, k is Boltzmann's constant
(= 1.38 × 10^-23 J/K), T is the temperature in kelvin, Dn and Dp are the minority
carrier diffusion coefficients, Ln and Lp are the minority carrier diffusion lengths,
and np0 and pn0 are the minority carrier concentrations in thermal equilibrium. The
reverse-biased current has the same form as the forward-biased PN junction current,
with a negative applied bias Va, as shown in equation (2-13)
$$J = J_0\left[\exp\left(\frac{qV_a}{kT}\right) - 1\right] \approx -J_0 \tag{2-13}$$
The other component is due to electron-hole pair generation in the depletion region
of the reverse-biased junction. In a MOSFET, additional leakage can occur across the
drain-to-substrate junction, from gated-diode action or from carrier generation in
the drain-to-substrate depletion region, with the gate influencing these current
components. This is the case for modern advanced MOSFETs, which use heavily doped
shallow junctions and halo doping for better SCE control. If both the P and N regions
are heavily doped, band-to-band tunneling (BTBT) current dominates the PN junction
leakage.
At a high electric field (> 10^6 V/cm) across the reverse-biased PN junction,
significant current flows because electrons tunnel from the valence band of the p
region to the conduction band of the n region. The tunneling current density is
given by [12]
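$$J_{BTBT} = A\,\frac{E\,V_{APP}}{E_g^{1/2}} \exp\left(-B\,\frac{E_g^{3/2}}{E}\right), \quad A = \frac{\sqrt{2m^*}\,q^3}{4\pi^3\hbar^2}, \quad B = \frac{4\sqrt{2m^*}}{3q\hbar} \tag{2-14}$$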
where m* is the effective mass of the electron; Eg is the energy bandgap; VAPP
(written Va below) is the applied reverse bias; E is the electric field at the
junction; q is the electronic charge; and ħ is 1/(2π) times Planck's constant. For a
step junction, the electric field at the junction is given by equation (2-15)
$$E = \sqrt{\frac{2 q N_a N_d (V_a + V_{bi})}{\varepsilon_{si} (N_a + N_d)}} \tag{2-15}$$
where Na and Nd are the doping concentrations on the p and n sides; εsi is the
permittivity of silicon; and Vbi is the built-in potential across the junction.
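As a rough numerical check of Equations (2-14) and (2-15), the sketch below computes the step-junction field and the BTBT current density in SI units. The doping levels, built-in potential, bandgap, and effective mass are assumed values for illustration only, and the function names are hypothetical.

```python
import math

Q_E = 1.602176634e-19     # elementary charge, C
H_BAR = 1.054571817e-34   # reduced Planck constant, J*s
M_0 = 9.1093837015e-31    # electron rest mass, kg
EPS_0 = 8.8541878128e-12  # permittivity of free space, F/m
EPS_SI = 11.7 * EPS_0     # permittivity of silicon, F/m


def junction_field(n_a, n_d, v_a, v_bi):
    """Equation (2-15): junction field in V/m for doping given in m^-3."""
    return math.sqrt(2.0 * Q_E * n_a * n_d * (v_a + v_bi) / (EPS_SI * (n_a + n_d)))


def btbt_current_density(e_field, v_app, e_g_ev, m_eff):
    """Equation (2-14): BTBT current density in A/m^2."""
    e_g = e_g_ev * Q_E   # bandgap converted to joules
    a = math.sqrt(2.0 * m_eff) * Q_E**3 / (4.0 * math.pi**3 * H_BAR**2)
    b = 4.0 * math.sqrt(2.0 * m_eff) / (3.0 * Q_E * H_BAR)
    return a * e_field * v_app / math.sqrt(e_g) * math.exp(-b * e_g**1.5 / e_field)


# Assumed values: 1e19 cm^-3 on both sides, Vbi = 1.1 V, Eg = 1.12 eV, m* = 0.2 m0.
N_A = N_D = 1e25
V_BI, E_G_EV, M_EFF = 1.1, 1.12, 0.2 * M_0

field_1v = junction_field(N_A, N_D, 1.0, V_BI)
field_0p5v = junction_field(N_A, N_D, 0.5, V_BI)
j_1v = btbt_current_density(field_1v, 1.0, E_G_EV, M_EFF)
j_0p5v = btbt_current_density(field_0p5v, 0.5, E_G_EV, M_EFF)
```

For these assumed doping levels the field at 1 V reverse bias exceeds 10^8 V/m, that is, 10^6 V/cm, which is the regime where the text says BTBT becomes significant.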
2.1.6 Punch-Through Current (I6)

In a short-channel device, the depletion regions at the drain-substrate and
source-substrate junctions extend into the channel, because the drain and source are
closer to each other. If the doping concentration is kept constant, the separation
between the depletion region boundaries decreases as the channel length is reduced.
In addition, an increase in the reverse bias across the junctions pushes the
depletion regions nearer to each other. When the combination of channel length and
reverse bias causes the depletion regions to merge, punch-through is said to have
occurred. The device parameter commonly used to characterize punch-through is the
punch-through voltage (VPT), which estimates the value of VDS at which punch-through
occurs at VGS=0. VPT is roughly estimated as the value of VDS for which the sum of the
widths of the drain and source depletion regions equals the effective channel length
$$V_{PT} \propto N_B (L - W_j)^3 \tag{2-16}$$
where NB is the doping concentration of the bulk; L is the channel length; and Wj is
the junction width.
2.2 Leakage with Temperature Dependency

[Figure 2-6: leakage current (sub-threshold, BTBT, gate, and total) versus temperature
from 300 K to 400 K.]
Fig. 2-6: Leakage current versus temperature in a 25nm predictive model [13].
As discussed in the previous section, the leakage mechanisms can be related to a few
basic physical phenomena, and different leakage components have different temperature
dependences. First, sub-threshold current is governed by carrier diffusion, which
increases with temperature. Second, the gate and junction BTBT currents are less
sensitive to temperature than sub-threshold current, because the tunneling
probability of an electron through a potential barrier does not depend directly on
temperature. Nevertheless, the bandgap of silicon, which is the barrier height for
BTBT, decreases as temperature increases, so the junction BTBT current increases with
temperature. Fig. 2-6 shows simulation results of how the leakage components vary
with temperature for an NMOS device with Leff = 25 nm. The sub-threshold leakage
increases exponentially with temperature, the junction BTBT current increases slowly
with temperature, and the gate tunneling current is nearly independent of
temperature [13].
Chapter 3
Design issues in Low power Self-Refresh Mode
In this chapter, we introduce the design issues in self-refresh mode, such as cell
data retention, temperature dependency, and power consumption in data retention
mode. After identifying these issues, we present previously published approaches
that address them.
3.1 Cell data retention

This section discusses the design issues of cell data retention. We first introduce
the cell structures, which fall into two categories: those that rely on process
improvements and those based on the CMOS logic process only. We then describe the
data retention time, which determines the refresh period.
3.1.1 Cell Structure

[Figure 3-1: cross sections of (a) a trench cell and (b) a stack cell, labeled with
word-line, bit-line, bit-line contact, cell plate, M1, STI, n+ poly, trench, P-well,
and cell height.]
Fig. 3-1: Different types of storage capacitor [14]: (a) trench-type capacitor, (b)
stack-type capacitor.
Dynamic random access memory (DRAM) is a type of random access memory that uses
charge stored on individual capacitors to hold binary data. The one-transistor,
one-capacitor (1T1C) cell has been universal since the mid-1970s because it offers
the highest density. In the past decade, DRAM cell structures have developed from the
planar-type capacitor into two groups, the trench-type capacitor and the stack-type
capacitor, as shown in Fig. 3-1. In order to achieve a high storage capacitance, the
DRAM process has been improved rapidly in every generation. The capacitor structure
has evolved from the early gate-oxide dielectric with a poly-insulator-silicon (PIS)
capacitor to the poly-insulator-poly (PIP) capacitor, then to the
metal-insulator-silicon (MIS) capacitor, and now to the metal-insulator-metal (MIM)
capacitor [15]. Through this evolution, the dielectric constant has been raised step
by step, enlarging the storage capacitance through a larger εr, as shown in the
following equation
$$C = \varepsilon_r \varepsilon_0 \frac{A}{d}, \qquad \varepsilon_r = \frac{\varepsilon}{\varepsilon_0} \tag{3-1}$$
where ε0 is the permittivity of free space, εr is the dielectric constant, A is the
electrode area, and d is the dielectric thickness. Moreover, several advanced storage
capacitors for embedded-DRAM technologies have been published. Stack-type cells
include capacitor-over-bit-line (COB), COB with a hemi-spherical-grain (HSG)
poly-silicon electrode, and COB with HSG and a cylindrical capacitor. Trench-type
cells include the substrate-plate trench (SPT) capacitor, merged isolation and node
trench (MINT) capacitor, stacked trench capacitor, buried plate trench (BPT)
capacitor, buried strap trench (BEST) capacitor, and fully planarized trench
capacitor [16-19]. The major advantage of these advanced memory cells is their much
smaller area, which enables high density. However, the number of process steps for
embedded-DRAM is larger than for the commodity DRAM process. In other words, the
process cost of integrating an embedded-DRAM macro in a System-on-Chip keeps
increasing.
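The effect of the dielectric constant in Equation (3-1) can be illustrated with a parallel-plate estimate. The geometry and the two dielectric constants below are hypothetical round numbers, not parameters of any process discussed in this thesis.

```python
EPS_0 = 8.8541878128e-12  # permittivity of free space, F/m


def storage_capacitance(eps_r, area, thickness):
    """Equation (3-1): parallel-plate capacitance in farads (SI inputs)."""
    return eps_r * EPS_0 * area / thickness


# Hypothetical capacitor: 1 um^2 electrode area, 5 nm dielectric thickness.
AREA, THICKNESS = 1e-12, 5e-9
c_low_k = storage_capacitance(3.9, AREA, THICKNESS)    # SiO2-like dielectric
c_high_k = storage_capacitance(25.0, AREA, THICKNESS)  # assumed high-k dielectric
```

With the same area and thickness, the capacitance scales linearly with εr, so the assumed high-k film stores about 6.4 times more charge per volt than the SiO2-like film.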

[Figure 3-2: schematics of nine logic-compatible cells, panels (a) [20, 21], (b) [6],
(c) [22, 23], (d) [6], (e) [6], (f) [6], (g) [6], (h) [24-26], and (i) [24-26], built
from write devices (wg), read devices (rg), read-select devices (rs), gated diodes
(gd), and storage capacitors (CS).]
Fig. 3-2: Various logic-compatible embedded-DRAM cell structures.

To lower process cost, logic-compatible memory cells have been adopted for
embedded-DRAM. Many logic-compatible cell structures suitable for embedded-DRAM have
been published, such as the 2T cell [20, 21], 2T2C cell [6], 2T1D cell [22, 23], 3T
cell [6], 3T/3T1C cell, and 3T1D cell [24-26]. As shown in Fig. 3-2, all of these
memory cells can be implemented in a logic CMOS process without additional process
steps. Furthermore, the dual-port nature of some of these cells turns the destructive
read into a nondestructive read. However, these cells are more complicated to operate
than the conventional 1T1C cell described in section 1.3, and their area is larger.
Most importantly, their storage capacitance is much smaller than that of the
conventional 1T1C cell, so a high signal-to-noise ratio is difficult to achieve. The
small storage capacitance also shortens the data retention time, so refresh
operations must be performed more frequently. For these reasons, the key design issue
is to overcome the small storage capacitance.
In summary, the 1T1C cell structure is widely used in SoC systems because of its
high density. Although various methods reduce process cost by constructing new cell
structures with different transistor counts, the conventional 1T1C cell remains the
mainstream in embedded-DRAM. Choosing between cells improved by advanced processes
and cells built from transistor combinations is therefore a trade-off that depends on
product demands. That is why much research continues to improve modern DRAM
processes and circuit techniques.
3.1.2 Data Retention Time

[Figure 3-3: cross section of two embedded-DRAM cells (BL0, BL1, WL0, WL1, N+ regions,
STI, P-well, N-band, P-substrate). Legend: ISUB, sub-threshold leakage; IJUN, reverse
junction leakage; IGATE, gate tunneling leakage; IONO, oxide-nitride-oxide leakage.
The drawing also labels IBTBT.]
Fig. 3-3: Cross section view of embedded-DRAM cell and leakage mechanisms
The key issue in data retention time degradation is OFF-state leakage in the
memory-cell transistors. Data "1" stored at the storage node leaks away while the
memory cell is not selected. The leakage mechanisms were described in detail in
section 2.1. For an embedded-DRAM cell, OFF-state leakage in the memory-cell
transistor consists of sub-threshold channel leakage (ISUB), punch-through current,
gate tunneling current, and reverse-biased junction band-to-band-tunneling current
(IBTBT) [27]. Fig. 3-3 shows that IBTBT flows from the storage node to the substrate,
the gate tunneling current flows to the gate node, and the punch-through current and
sub-threshold leakage flow from the storage node to the data line (DL).
[Figure 3-4: (a) 1T1C cell bias (VWL, VDL = 1/2 VDD, VBB = 0V, SN at t = 0 equal to
VDD) with leakage paths I1 to I4, and storage-node voltage VSN decaying over time
from VDD toward VDD/2 plus margin; (b) waveforms of DL, ZDL, WL, other WLs, and SN
during H WRITE and READ, showing L disturbances from DL and the good and fail cases
relative to the sense limit around VDD/2.]
Fig. 3-4: (a) A memory cell is subject to leakage current, so the cell data retention
time depends strongly on leakage current. (b) Loss of stored binary data due to
data-line L disturbances. QC = 0 and QN = 0 are assumed [6].
The data retention time of a chip is determined by the worst memory cell, the one
with the shortest retention time. Binary data stored in a memory cell is subject to
leakage current, so the data retention time depends strongly on leakage, as plotted
in Fig. 3-4 (a). Even under the worst cell conditions, a memory cell that is not
selected during the tREFmax period must hold its stored data. The worst cell
conditions result from the combination of the maximum junction temperature (Tjmax)
and successive low-level disturbances from the corresponding data line. The influence
of the maximum junction temperature is discussed later. As for the low-level
disturbances, the sub-threshold leakage is further enhanced when the lowest DL
voltage (0V) is applied for as long as possible. Thus, the worst conditions are
successive low-level data-line disturbances, caused by successive operations of other
cells on the same data line at the minimum cycle time, at the maximum ambient
temperature, as shown in Fig. 3-4 (b). Fortunately, when successive high-level (VDD)
data-line disturbances are applied, ISUB, which is the dominant leakage component,
disappears and only ITUNNEL and IJUN remain.
There are several methods to increase data retention time. First, reducing leakage
current improves data retention. It is therefore useful to adopt a high-κ metal gate
to reduce ITUNNEL [12], a negative word-line bias [28] to reduce ISUB, and a negative
back bias [29, 30] to raise the threshold voltage (VT) and thereby reduce ISUB.
Moreover, the channel length of the access transistor in the memory cell is always
chosen longer than the minimum length of the process. This is a design trade-off
between speed and power: an access transistor with the minimum channel length
improves the access time, but non-selected cells suffer from larger leakage current,
whereas a longer channel degrades the access time but protects non-selected cells
from larger leakage. Second, as described in the previous section, adopting new cell
structures that increase the storage capacitance is an effective way to increase data
retention time.
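Both levers discussed in this section, lower leakage and larger storage capacitance, can be seen in a first-order estimate of retention time. The sketch below assumes a constant total leakage current, which is a simplification (real leakage varies with the storage-node voltage), and all numerical values are hypothetical.

```python
def retention_time(c_s, v_initial, v_min, i_leak):
    """First-order retention time in seconds: t = CS * (V_initial - V_min) / I_leak.

    Assumes a constant leakage current i_leak (A) discharging the storage
    capacitor c_s (F) from v_initial down to the minimum sensable level v_min.
    """
    return c_s * (v_initial - v_min) / i_leak


# Hypothetical cell: 10 fF storage capacitor, 1.2 V written, 0.7 V sense limit.
t_base = retention_time(10e-15, 1.2, 0.7, 1e-15)
t_half_leak = retention_time(10e-15, 1.2, 0.7, 0.5e-15)   # leakage halved
t_double_cap = retention_time(20e-15, 1.2, 0.7, 1e-15)    # capacitance doubled
```

In this simplified picture, halving the leakage and doubling the capacitance each double the retention time.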
3.2 Power Consumption

[Figure 3-5: average chip current IT versus cycle time tRC, falling from ITmax at
tRCmin to ITmin at tREFmax/n over the operating range, with the IPH level marked.]
Fig. 3-5: Current versus cycle time [6].

The power dissipation (P) of a DRAM chip that operates with cycle time tRC and
power supply VDD can be expressed by the following equations.
$$P = \sum_j C_j \Delta V_j \frac{V_{DD}}{t_{RC}} + (I_{DC} + I_{PH}) V_{DD} \tag{3-2}$$
$$P = Q_T \frac{V_{DD}}{t_{RC}} + (I_{DC} + I_{PH}) V_{DD} \tag{3-3}$$
$$P = (C_{DT} \Delta V_D + C_{PT} \Delta V_P) \frac{V_{DD}}{t_{RC}} + (I_{DC} + I_{PH}) V_{DD} \tag{3-4}$$
$$P = (Q_{DT} + Q_{PT}) \frac{V_{DD}}{t_{RC}} + (I_{DC} + I_{PH}) V_{DD} \tag{3-5}$$
where Cj and ΔVj are the capacitance and voltage swing at node j; QT, QDT, and QPT
are the total charges of the whole chip, the data lines, and the peripheral circuits,
which are supplied by VDD; IDC denotes the major DC currents, such as the ratio
current at the common I/O circuits and the constant-current sources of the main
amplifiers; IPH denotes the quasi-DC currents of circuits that are always in
operation, such as the on-chip voltage converters and the refresh-related circuits;
and CDT and ΔVD are the total capacitance of the data lines that are charged up
simultaneously and the voltage swing on a data line. Therefore, the average current
IT of the chip is expressed as equation (3-6).
$$I_T = \frac{Q_{DT} + Q_{PT}}{t_{RC}} + I_{DC} + I_{PH} \tag{3-6}$$
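The array current IA and the peripheral current IP when the chip operates at the
minimum cycle time tRCmin, and the resulting total current, are given by the
following equations.
$$I_A = \frac{Q_{DT}}{t_{RCmin}} \tag{3-7}$$
$$I_P = \frac{Q_{PT}}{t_{RCmin}} \tag{3-8}$$
$$I_T = (I_A + I_P) \frac{t_{RCmin}}{t_{RC}} + I_{DC} + I_{PH} \tag{3-9}$$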
In the VDD/2 pre-charging scheme, only one of each pair of data lines contributes to
power consumption: a charging current flows from the VDD supply to raise that data
line from VDD/2 to VDD during the sense amplifier (SA) amplification cycle, so ΔVD is
equal to VDD/2. As shown in Fig. 3-5, the total current IT of a chip has a maximum
value ITmax at cycle time tRCmin and a minimum value ITmin at cycle time tREFmax/n.
Reducing both ITmax and ITmin is important, because lowering ITmax reduces the
on-chip junction temperature (Tj), and lowering ITmin extends battery back-up time in
portable applications. As equation (3-6) shows, the main ways to lower power
consumption are to reduce VDD, QT, and IDC.
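The relation between the average current and the cycle time in Equations (3-6) to (3-9) can be sketched as follows. The charge and current values are hypothetical placeholders chosen only to show that the two forms agree and that IT falls toward the DC floor as tRC grows.

```python
import math


def average_current(q_dt, q_pt, t_rc, i_dc, i_ph):
    """Equation (3-6): IT = (QDT + QPT) / tRC + IDC + IPH."""
    return (q_dt + q_pt) / t_rc + i_dc + i_ph


def average_current_scaled(i_a, i_p, t_rc_min, t_rc, i_dc, i_ph):
    """Equation (3-9): IT = (IA + IP) * (tRCmin / tRC) + IDC + IPH."""
    return (i_a + i_p) * (t_rc_min / t_rc) + i_dc + i_ph


# Hypothetical values: charges in coulombs, times in seconds, currents in amperes.
Q_DT, Q_PT = 2e-10, 1e-10
T_RC_MIN, I_DC, I_PH = 10e-9, 1e-4, 5e-5

i_a = Q_DT / T_RC_MIN   # Equation (3-7)
i_p = Q_PT / T_RC_MIN   # Equation (3-8)

i_t_fast = average_current(Q_DT, Q_PT, T_RC_MIN, I_DC, I_PH)
i_t_slow = average_current(Q_DT, Q_PT, 1000 * T_RC_MIN, I_DC, I_PH)
i_t_slow_scaled = average_current_scaled(i_a, i_p, T_RC_MIN, 1000 * T_RC_MIN, I_DC, I_PH)
```

At long cycle times the charge-related term becomes small and the current approaches IDC + IPH, which is why extending the refresh interval only helps the AC part of the retention current.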
3.3 Conventional Self-Refresh Mode

A DRAM cell is subject to leakage current and hence loses its stored data if it is
left alone for a long time. For this reason, the cell charge needs to be restored
periodically. When a refresh cycle starts, all of the memory cells along the selected
row (word line) are refreshed simultaneously, and all rows are selected in sequence
so that every memory cell is refreshed, as shown in Fig. 3-6(a). Self-refresh mode
works in the same way, except that the memory cells are refreshed continuously and no
normal access occurs, as plotted in Fig. 3-6(b). Because the chip only retains the
stored data, power dissipation is reduced, which makes this mode suitable for battery
back-up. In self-refresh mode, the embedded-DRAM macro generates all the
refresh-related signals, such as the refresh request and the row address, internally
on the chip, without external control signals.
[Figure 3-6: word-line timing (WL1, WL2, ..., WLn, WL1) over one TREFmax period with
row interval TREFmax/n; (a) refresh cycles interleaved with normal random access at
TRCmin, (b) refresh cycles only, with no normal random access.]
Fig. 3-6: (a) The refresh operation [6]. (b) The self-refresh operation.
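The row sequencing of Fig. 3-6(b) can be sketched as a simple schedule in which each of the n rows is refreshed once per TREFmax, so consecutive refresh requests are spaced TREFmax/n apart. This is an illustrative model of the timing only; the function name and the small example values are hypothetical, and it does not describe the on-chip refresh controller.

```python
def self_refresh_schedule(n_rows, t_ref_max, n_events):
    """Return (time, row) pairs for sequential row refresh in self-refresh mode.

    Rows are visited in order 0 .. n_rows-1 and then wrap around, with one
    refresh request every t_ref_max / n_rows seconds.
    """
    interval = t_ref_max / n_rows
    return [(k * interval, k % n_rows) for k in range(n_events)]


# Hypothetical example: 4 rows that must each be refreshed every 4.0 time units.
schedule = self_refresh_schedule(n_rows=4, t_ref_max=4.0, n_events=5)
```

In this example, row 0 is refreshed at time 0 and again at time 4.0, exactly one TREFmax later.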
3.4 Temperature Dependency in Self-Refresh Mode

To keep the data stored in the DRAM cells correct, the self-refresh period must
satisfy the requirement at the highest operating temperature and the worst process
condition. However, if a fixed self-refresh period is adopted, more power is wasted
when the chip operates at a lower temperature. In this section, we first introduce
the temperature dependency of retention time. We then show that refresh power can be
reduced with an adaptive refresh period.
3.4.1 Retention Time with Temperature Dependency

[Figure 3-7: (a) cell schematic in self-refresh mode with bias labels (VWL, VDL = 1/2
VDD, VBB = 0V, SN at t = 0 equal to VDD), storage capacitor CS, and leakage paths
IGATE, ISUB-VT, IGIDL, IJUN, and IONO; (b) VSN (storage node) versus time at low and
high temperature.]
Fig. 3-7: (a) Cell bias conditions in the self-refresh mode. The charge on the storage
node (SN) of the memory cell is discharged by leakage current. (b) The voltage drop in
the memory cell is sensitive to temperature because IBTBT is also
temperature-dependent.
Because the storage node of a memory cell floats after a write operation, it is
strongly affected by leakage current. As described in section 2.1, there are many
leakage mechanisms in a MOSFET. Fig. 3-7(a) shows the cell bias conditions in
self-refresh mode. In this embedded-DRAM, the substrate (body) of the cell pass gate
is tied to ground, so a substrate-bias voltage generator is not necessary, and the dc
component of the data retention current is significantly reduced. In self-refresh
mode, the word-lines of non-selected rows are biased at a negative voltage of -0.5V.
The negative word-line voltage is used to reduce the dominant leakage component,
sub-threshold leakage. However, other cell leakage paths remain, including the
reverse-biased junction BTBT current (IBTBT), the gate tunneling current (IGATE), and
the oxide-nitride-oxide leakage (IONO). Section 2.2 shows that sub-threshold current
(ISUB) and reverse-biased junction band-to-band-tunneling current (IBTBT) are more
temperature-dependent than gate oxide tunneling current (IGATE). For this reason,
IBTBT and IONO are the more temperature-dependent leakage components in this case. As
Fig. 3-7(b) shows, the voltage drop in the memory cell is therefore also sensitive to
temperature. Table 3-1 lists the data retention times at various temperatures,
simulated with a predicted 65nm EDRAM cell model.
Table 3-1: Maximum data retention time across various temperatures from the results
simulated by the predicted 65nm EDRAM cell model.
Temperature (°C)    25         55        85
TRET (sec)          258.54m    55.23m    3.85m
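The retention times in Table 3-1 can be turned into a rough bound on how far the refresh period could be stretched in powers of two. The sketch below is only an illustration: it assumes the conventional period is sized exactly for the 85°C retention time and ignores any guard-band, and the helper name `max_doublings` is hypothetical.

```python
import math

# Maximum retention times from Table 3-1 (seconds), keyed by temperature in deg C.
T_RET = {25: 258.54e-3, 55: 55.23e-3, 85: 3.85e-3}

def max_doublings(temp_c, worst_case_temp_c=85):
    """Largest k such that 2**k times the worst-case retention still fits T_RET.

    Assumption: the conventional refresh period tracks the retention time at
    worst_case_temp_c with no extra margin.
    """
    ratio = T_RET[temp_c] / T_RET[worst_case_temp_c]
    return math.floor(math.log2(ratio))

for temp in sorted(T_RET):
    k = max_doublings(temp)
    print(f"{temp} C: retention ratio {T_RET[temp] / T_RET[85]:.1f}x, "
          f"period could grow by up to 2**{k} = {2 ** k}x")
```

Under these assumptions the 25°C retention time leaves room for six doublings and the 55°C retention time for three. A real design would keep additional margin for tail bits and process variation.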
3.4.2 Power Dissipation with Temperature Dependency

As mentioned in the previous sub-section, the power consumption of an embedded-DRAM
can be divided into two parts, the dc and the ac current. Reducing both the dc and the
ac current components in data retention mode is a prime concern. To reduce the dc
current component, it is necessary to minimize the power of the on-chip voltage
converters, such as the voltage down converter, the voltage up converter, the substrate
back-bias generator, the VREF generator, and the half-VDD generator. To reduce the ac
current component, on the other hand, it is necessary to extend the refresh period and
reduce the refresh charge. The current consumed during the refresh operation is a major
concern for data retention power. With an adaptive self-refresh period, the ac component
of the self-refresh current can be reduced according to the junction temperature.
[Fig. 3-8 graphic: normalized data retention current (1.0 to 3.0) versus temperature
(25 to 125 °C) for rfclk_100ns, rfclk_200ns, and rfclk_500ns.]
Fig. 3-8: Power consumption in self-refresh mode versus temperatures with three
self-refresh periods.
Fig. 3-8 shows the power consumption in self-refresh mode versus temperature for three
self-refresh periods. As shown, the power consumption in self-refresh mode increases
with temperature: the higher the temperature, the more power is dissipated. This is
because the off-current of non-selected circuits is larger in nanometer devices and the
on-current drawn by the on-chip voltage converters becomes larger at higher temperature.

[Fig. 3-9 graphic: normalized data retention current (1.0 to 2.5) versus self-refresh
period for Temp = 25 °C and Temp = 85 °C.]
Fig. 3-9: Power consumption in self-refresh mode versus self-refresh period in two
cases, temperature 25°C and 85°C.
Fig. 3-9 shows the power consumption in self-refresh mode versus the self-refresh
period in two cases, 25°C and 85°C. At either temperature, the power consumption is
reduced with a longer refresh period. To achieve low power consumption in self-refresh
mode, the self-refresh period should therefore be extended carefully according to the
temperature condition. For this reason, we design a circuit that automatically selects
a reasonable self-refresh period, one that does not cause data loss in the real cell
array at the current temperature. More importantly, it should achieve low self-refresh
power.
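The trade-off discussed above can be sketched with a first-order model in which the average retention current is a dc term plus the refresh charge delivered once per period. This is a simplified illustration, not the simulation behind Fig. 3-8 and Fig. 3-9; the function names and the numbers in the example are hypothetical.

```python
def retention_current(period_s, i_dc_a, q_refresh_c):
    """Average data-retention current: dc part plus refresh charge per period."""
    return i_dc_a + q_refresh_c / period_s

def ac_reduction(base_period_s, extended_period_s):
    """Fractional reduction of the ac component when the period is extended."""
    return 1.0 - base_period_s / extended_period_s

# Hypothetical example: extending the period by 2**4 removes 93.75% of the ac part.
print(f"ac reduction for a 16x longer period: {ac_reduction(1.0, 16.0):.4f}")
```

In this model an extension by 2^k removes a fraction 1 - 2^(-k) of the ac component, while the dc term is untouched, which is why the two components are treated separately in the text.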
3.5 Previous Works

Here we introduce previously published works to see how they use circuit techniques or
system architectures to meet the demand for low power consumption. The self-refresh
control schemes for various DRAM applications can be classified into three categories:
the replica-cell-based self-refresh control scheme, the sensor-based self-refresh
control scheme, and the temperature sensor. All of them are introduced in detail in the
following subsections.
3.5.1 Replica-Cell based Self-Refresh Control Scheme

An effective way to control the refresh period is to track the voltage of memory cells
to determine whether the stored data is about to be lost. A cell leak monitoring scheme
monitors the cell leakage current using memory cells in the peripheral region. When the
monitored cell voltage reaches the minimum voltage level of the memory cells, the output
of the monitoring scheme goes high and starts a burst self-refresh operation.
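[Fig. 3-10 graphic: (a) refresh timer with a 1k-bit monitor cell (CS, node VN, signal
P), a comparator against VREF, an S-R latch (signal E), and a TIMER producing REF;
(b) timing of VN, E, and REF, with refresh bursts of 512 cycles (0, 1, ..., 511) at
period T1 separated by a pause T2.]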
Fig. 3-10: (a) The developed refresh timer with special self-refresh control. (b) Timing
diagram of the self-refresh control scheme.
Two techniques are reported in [31]. One is a low-power back-bias generator (VBB
generator) with a new back-bias level sensor that reduces the dc component of the data
retention current, and the other is a temperature-dependent self-refresh timer with a
unique internal refresh control that reduces the ac component of the data retention
current. Referring to Fig. 3-10(a), the memory cell monitor uses 1K memory cells of the
same structure as the actual memory cells. When the voltage VN drops below the reference
voltage level, the output of the comparator flips and drives signal E high. The
self-refresh operation is thus activated and generates 512 refresh cycles with period
T1. The period T1 is less temperature-dependent than the pause period T2. Because T2 is
controlled by the cell monitoring scheme, its duration is determined by the leakage of
the memory cells, which varies with temperature. Thus, the period T2 is
temperature-dependent, which reduces the ac component of the data retention current.
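The way the pause T2 follows the cell leakage can be illustrated with a constant-current discharge model. This is a minimal sketch under stated assumptions: the monitored node is treated as a single capacitor discharged by a constant leakage current, and the names and values are hypothetical rather than taken from [31].

```python
def pause_time(c_node_f, v_start, v_ref, i_leak_a):
    """Time (s) for the monitored node to fall from v_start to v_ref.

    Assumes a constant leakage current, so t = C * dV / I.
    """
    return c_node_f * (v_start - v_ref) / i_leak_a

def burst_cycle_time(t1_s, t2_s, cycles=512):
    """Length of one burst-plus-pause cycle: 512 refreshes at T1, then pause T2."""
    return cycles * t1_s + t2_s

# Hypothetical numbers: doubling the leakage (hotter chip) halves the pause.
cold = pause_time(1e-12, 1.0, 0.5, 1e-12)
hot = pause_time(1e-12, 1.0, 0.5, 2e-12)
print(f"T2 cold = {cold:.3f} s, T2 hot = {hot:.3f} s")
```

Because T1 is fixed by the timer while T2 scales inversely with leakage in this model, the cycle length shrinks automatically as temperature rises.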
[Fig. 3-11 graphic: (a) fixed-plate monitor with 1k dummy cells, plate at 1/2 VCC,
dummy word-line VDWL, monitored storage node VN, comparator against VREF, S-R latch,
TIMER, and Int. RAS; (b) plate-floating monitor in which the plate node VPLT is
controlled by VPLC and VPLD; (c) timing of VN, VPLT (fixed plate and floating plate),
VPLC, VPLD, and Int. RAS, with 512-cycle refresh bursts at period T1 and pause T2.]
Fig. 3-11: Comparison of cell-leakage monitoring scheme. (a) Conventional fixed-plate
scheme [31]. (b) Plate-floating leakage monitoring (PFM) scheme. (c) Timing diagram
with PFM and conventional scheme.
However, when the dummy cells have a structure identical to the actual cells, the
leakage of bad cells cannot be monitored. To monitor the data retention time more
accurately, the plate-floating leakage monitoring scheme presented in [32] accelerates
the discharge of the monitored node (VN) of the replica cells. While VN is being
monitored, the plate node (VPLT) is left floating, so the effective capacitance is
reduced and VN discharges faster. When VN drops to the reference level (VREF), the
plate level of the cell array of the DRAM chip is reset by VPLD and the refresh
operation is activated again.
[Fig. 3-12 graphic: (a) memory cell, comparator (VREF), and control signal generator
(CSG) connected by the Cell Voltage, Enable, and OSC_OUT signals; (b) memory cell
schematic with Recharge input, capacitor C1, leakage current ILEAK, and output
Cell_Voltage; (c) comparator schematic with inputs Cell_voltage and VREF, bias nodes
VBIAS, VBIAS2, and IBIAS, and output DET.]
Fig. 3-12: (a) Block diagram of the temperature-insensitive self-recharging circuitry.
(b) Memory cell. (c) Comparator.
A temperature-insensitive self-recharging circuit has been reported in [33, 34].
Fig. 3-12(a) shows its block diagram, which is composed of a memory cell, a voltage
comparator, and a control signal generator. Fig. 3-12(b) shows the scheme that monitors
the voltage drop of a memory cell. As soon as the cell voltage drops by a threshold
voltage (VTHN), the recharging operation is triggered to recharge the memory cell and
retain the stored data. Fig. 3-12(c) shows the schematic of the comparator, whose inputs
are the output of the memory cell and a reference voltage VREF = VDD - VTHP. The key to
this design is that VTH and ILEAKAGE are temperature-dependent, and the time the memory
cell takes to drop by this voltage is determined by the variation of VTH and ILEAKAGE.
As a result, the duration of the voltage drop of the memory cell also depends on
temperature, so the refresh period varies with temperature.
The abovementioned schemes share a critical disadvantage: the refresh time is
determined by replica cells that represent the average case rather than the worst case.
It is very difficult to monitor a DRAM cell that is a tail bit, so a larger design
margin or an on-chip or off-chip calibration technique is necessary for these schemes.
3.5.2 Sensor Based Self-Refresh Control Scheme

[Fig. 3-13 graphic: (a) internal period controller and internal period selector driven
by temperature detecting circuits for TA = 50 and TA = 75 (N-well resistor N-NW,
signals SEN, SEP, DTC, and DTQ), selecting among three timers for the temperature
ranges; (b) timing of SEP, SEN, VN, and DTQ.]
Fig. 3-13: (a) Self refresh controller with temperature detecting circuit. (b) Timing
diagram for temperature detecting circuit.
Fig. 3-13(a) shows a self-refresh controller with a temperature detecting circuit, and
Fig. 3-13(b) shows a timing diagram of the controller [35]. As shown in the figure, a
latch-type comparator is used, and a resistive voltage divider generates the input
signal at each input side. An N-well resistor and a poly-silicon resistor are used at
the respective input sides. As temperature changes, the resistance of the N-well
resistor varies much more strongly than that of the poly-silicon resistor, whose
resistance is almost constant over temperature. This creates a voltage difference
between the two input sides as the ambient temperature (TA) increases, and this voltage
difference is amplified by the latch-type comparator. To detect the three temperature
ranges TA>75°C, 75°C>TA>50°C, and 50°C>TA, two temperature detecting circuits, which
detect 50°C and 75°C respectively, are used in this controller circuit. The output
signals of the temperature detecting circuits are used to select a suitable internal
self-refresh period corresponding to the ambient temperature.
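The period selection described above amounts to a three-way choice driven by two threshold detectors. The sketch below models each detector as an ideal comparison; the relative periods are hypothetical placeholders, since the timer values of [35] are not given here.

```python
def detect(temp_c, trip_c):
    """Model one temperature detecting circuit as an ideal threshold."""
    return temp_c > trip_c

def select_period(temp_c, periods=(1.0, 2.0, 4.0)):
    """Pick the internal period for the ranges TA>75, 75>TA>50, and 50>TA.

    `periods` holds hypothetical relative periods for the hot, middle,
    and cold ranges, in that order.
    """
    hot, mid, cold = periods
    if detect(temp_c, 75):
        return hot
    if detect(temp_c, 50):
        return mid
    return cold

for temp in (25, 60, 85):
    print(f"TA = {temp} C -> relative period {select_period(temp)}")
```

With only two trip points the period changes in coarse steps, which is the granularity limit of this kind of sensor-based scheme.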
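[Fig. 3-14 graphic: (a) temperature sensor with currents IA and IO, enable ENB,
transistors P1 and P2, local VDD, output T45, and a binary weighted resistor string
(RA, 2RA, 4RA, 8RA, 16RA, 32RA) selected by A0 to A5; (b) output current versus
temperature T(°C) with the T45 trigger point at 45; (c) self-refresh command,
oscillator, counter, sampling clock generator, temperature sensor, and sample-and-hold
blocks producing the self-refresh clock, with timing of temperature, T45, EN, and CTRL.]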
Fig. 3-14: (a) Schematic of the temperature sensor with binary weighted variable
resistors. (b) Simplified function diagram. (c) Block diagram and timing diagram for the
self-refresh period control with temperature sensor.
Fig. 3-14(a) shows the temperature sensor based on the low-voltage bandgap reference
circuit with the binary searching scheme, Fig. 3-14(b) shows the simplified function
diagram, and Fig. 3-14(c) shows the block diagram and timing diagram of the self-refresh
period control with the temperature sensor [2]. This temperature sensor has a trigger
point at 45°C and generates the converted output T45. However, without a die-to-die
post-fabrication adjustment, the trigger point deviates from its target by as much as
20°C because of significant process variations. To alleviate this deviation, a
reference searching scheme composed of binary weighted variable resistors is proposed,
which raises the trigger point of the sensor by shorting the binary weighted resistors.
In this way, the searching scheme can correct the trimming error caused by process
variations. With this temperature sensor, the self-refresh period can be extended if
the on-chip temperature is below 45°C. To reduce the static current dissipated by the
temperature sensor, the sensor output T45 is periodically latched and held during the
self-refresh mode.
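[Fig. 3-15 graphic: (a) temperature sensor with currents I0, I1, and I2, transistors
P1 and P2, resistors R0, R1, and R2, enable TEMPEN#, and outputs T50 and T100;
(b) current versus temperature T(°C) with trigger points T50 at 50 and T100 at 100.]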
Fig. 3-15: (a) Temperature sensor. (b) Simplified function diagram.

An extended use of the abovementioned on-chip temperature sensor is shown in [36].
Referring to Fig. 3-15(a), two temperature sensors are combined to detect two target
temperatures. The output signals are obtained from the difference in the temperature
dependency of each current. To compensate for process variation, the values of R1 and
R2 can be measured with test signals and modified by fuse blowing in the same way as in
[2]. Fig. 3-15(b) shows the simplified function diagram, with two trigger points at
50°C and 100°C, respectively. In this application, however, the on-chip temperature
detection is used to adjust the sense-amplifier-enable (SAEN) delay, which varies with
temperature and affects the read cycle time in SRAM. Without thermal compensation, the
SAEN delay increases at high temperature, which is not suitable for high-speed
operation, and decreases at low temperature, which loses operational margin.
[Fig. 3-16 graphic: thermometer block with temperature sensor, dual-slope integrating
ADC, control logic, and thermometer registers; timing of the self-refresh mode,
Thermometer ON, burst refresh, and refresh signals with intervals T1 to T4 and 8K-cycle
bursts; temperature sensor schematic with Q1:Q2 = 20:1, currents IPTAT and IREF,
resistors RPTAT, RCTAT, and RCOMP, OPAMP1, chopper, and mirror transistors P1 to P6;
dual-slope ADC with switches SW1 to SW5, capacitors C1, C2, and CINT, OPAMP2, Integ and
Deinteg phases, node VX, and counter.]
Fig. 3-16: (a) Thermometer block. (b) Self refresh and thermometer control scheme. (c)
Temperature sensor. (d) Dual-slope integrating analog-to-digital Converter.
In the self-refresh period control scheme presented in [37], the thermometer block is
composed of a temperature sensor, a dual-slope integrating analog-to-digital converter,
control logic, and thermometer registers. The block diagram of the thermometer block is
shown in Fig. 3-16(a). The temperature sensor, shown in Fig. 3-16(c), is based on the
bandgap circuit and uses the difference in the base-emitter voltages of two bipolar
junction transistors operated at different current densities. It generates three
currents, IPTAT, ICTAT and IREF, which carry the temperature information. Auto-zero and
chopper techniques are used to reduce the input offset of the operational amplifier and
the mismatch in current mirrors such as transistors P1 and P2, respectively. The
dual-slope integrating ADC, shown in Fig. 3-16(d), generates a pulse width that is
stored in the register. The thermometer registers act like a counter to transform the
pulse width into a digital code. After this series of operations, the self-refresh
period becomes adaptive and temperature-dependent.
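The principle of a dual-slope integrating converter can be sketched in a few lines. The example assumes the usual textbook arrangement, in which an input current is integrated for a fixed number of clock cycles and a reference current then de-integrates the result while a counter runs; which of the currents of [37] plays which role is not stated in the text, so `i_in` and `i_ref` are generic, hypothetical names.

```python
def dual_slope_count(i_in, i_ref, n_integrate):
    """Clock cycles needed to de-integrate with i_ref after integrating i_in.

    Charge is tracked in units of (current x clock period), so integer
    inputs give an exact result of ceil(i_in * n_integrate / i_ref).
    """
    charge = i_in * n_integrate      # integrate phase, fixed length
    count = 0
    while charge > 0:                # de-integrate phase, counted
        charge -= i_ref
        count += 1
    return count

# The count depends only on the ratio of the two currents and n_integrate.
print(dual_slope_count(3, 4, 100))
```

Because the same integrator is used in both phases, the capacitor value and the clock period cancel out of the result, which is one reason this converter type suits slow, accurate measurements such as temperature.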
In addition, the Joint Electron Device Engineering Council (JEDEC) set an industrial
standard for low-power DRAMs in 2001 with a temperature-compensated self-refresh (TCSR)
scheme. The TCSR scheme, however, cannot measure the exact on-chip junction temperature
because the temperature information is provided by a thermometer external to the DRAM
[37]. For this reason, the TCSR scheme cannot respond to the real on-chip temperature
and cannot meet the requirement of good accuracy.
3.5.3 Temperature Sensor

Temperature sensors based on a sigma-delta modulator were published earlier in [38-41].
They meet the high accuracy specifications required of temperature sensor applications.
To convert temperature information to a digital value, both a temperature-dependent
signal and a temperature-independent signal are necessary. The operating principle of
this circuit is shown in Fig. 3-17(a). The difference between two base-emitter voltages,
ΔVBE, which is proportional to the thermal voltage kT/q, is multiplied by a gain factor
α to give VPTAT = α·ΔVBE, and is added to a base-emitter voltage VBE to yield a
temperature-independent reference voltage VREF = VBE + α·ΔVBE, as illustrated in
Fig. 3-17(b). These two voltages, VPTAT and VREF, are the inputs to the sigma-delta
analog-to-digital converter. Referring to Fig. 3-17(c), the negative feedback of the
sigma-delta ADC ensures that the average current flowing into the integrator is zero.
In the next step, the decimation filter converts the bitstream output of the sigma-delta
ADC to a digital representation of the temperature. All of the elements of the
temperature sensor are shown together in Fig. 3-17(d).
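The ratio that such a converter digitizes can be illustrated numerically. The sketch below assumes a PTAT voltage formed by scaling the base-emitter voltage difference with a gain alpha, and a reference formed by adding that PTAT voltage to one base-emitter voltage. It further assumes a linear base-emitter voltage model (1.2 V extrapolated to 0 K, slope -2 mV/K) and a current-density ratio of 8; these are illustrative textbook-style values, not data from [38-41].

```python
import math

K_OVER_Q = 8.617333e-5  # Boltzmann constant over electron charge, V/K

def delta_vbe(temp_k, ratio):
    """Difference of two base-emitter voltages at a given current-density ratio."""
    return K_OVER_Q * temp_k * math.log(ratio)

def vbe(temp_k, vbe0=1.2, slope=2e-3):
    """Assumed linear model of one base-emitter voltage (hypothetical values)."""
    return vbe0 - slope * temp_k

def alpha_for_flat_vref(ratio, slope=2e-3):
    """Gain that makes VBE + alpha * delta_VBE independent of temperature."""
    return slope / (K_OVER_Q * math.log(ratio))

def mu(temp_k, ratio=8):
    """Ratio VPTAT / VREF that the converter would digitize."""
    v_ptat = alpha_for_flat_vref(ratio) * delta_vbe(temp_k, ratio)
    return v_ptat / (vbe(temp_k) + v_ptat)

# With this linear model, temperature in kelvin is 600 * mu.
print(f"mu at 300 K = {mu(300.0):.4f} -> {600.0 * mu(300.0):.1f} K")
```

In this idealized model the digitized ratio is exactly linear in absolute temperature, so a fixed scale and offset recover the temperature; real sensors add curvature correction and trimming.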
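[Fig. 3-17 graphic: (a) base-emitter voltages combined into VPTAT and VREF and fed to
an ADC with output DOUT; (b) V (Volt, 0 to 1.2) versus temperature (axis marks -273,
-55, 125, and 330) showing VBE, VPTAT, and VREF; (c) second-order modulator with
inputs labeled VBE/R1 and VBEtrim/R2, integrator node VINT, clock clk, and bitstream
bs; (d) sensor chip with biasing and oscillator, calibration transistor, modulator,
decimation filter and control, PROM, and I2C bus interface (address select, serial
I/O).]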
Fig. 3-17: (a) Operating principle of the temperature sensor. (b) Temperature
dependency of the key voltages in the sensor. (c) Block diagram of the sigma-delta
modulator. (d) Block diagram of the temperature sensor.
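[Fig. 3-18 graphic: sensor schematic with transistors M1 to M6, bias VBIAS, sensing
nodes V1 and V2, and a differential amplifier with input resistors R1 and R2, feedback
resistors Rf, and output VOUT.]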
Fig. 3-18: Schematic of a novel low overhead CMOS temperature sensor that uses a
differential amplifier to amplify the temperature dependence of mobility and to
minimize the temperature dependence of VTH.
Fig. 3-18 shows the schematic of a new low overhead CMOS temperature sensor
[42]. The voltage difference between V1 and V2 is amplified by the differential
amplifier, and the voltage values of V1 and V2 can be expressed by the following
equations:

    V_1 = V_{gs1} = \sqrt{\frac{I_1 L_1}{\mu_n C_{OX} W_1}} + V_{TH1}        (3-10)

    V_2 = V_{gs2} = \sqrt{\frac{I_2 L_2}{\mu_n C_{OX} W_2}} + V_{TH2}        (3-11)

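where W and L are transistor sizes. Choosing R1 equal to R2, and with matched threshold
voltages (VTH1 = VTH2) so that the VTH terms cancel, the output voltage (VOUT) can be
expressed as:

    V_{OUT} = \frac{R_f}{R_1} \cdot \frac{1}{\sqrt{\mu_n C_{OX}}}
              \left( \sqrt{\frac{I_1 L_1}{W_1}} - \sqrt{\frac{I_2 L_2}{W_2}} \right)        (3-12)
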
The differential amplifier is used to cancel the VTH dependence and to amplify the
mobility dependence. Furthermore, it cancels common-mode noise on V1 and V2, which
improves the reliability of the temperature sensor. However, it consumes a significant
amount of power.
Chapter 4
Proposed Scheme
Since the data retention time of DRAM cells is strongly dependent on temperature and
process variations, it is useful to detect the on-chip conditions in order to compensate
for the effect of varying junction temperature. As mentioned in the previous sections,
adopting an adaptive refresh period is the most effective way to reduce data retention
power. This chapter describes our proposed TASFR control scheme, which is one kind of
adaptive self-refresh control scheme. First, we introduce our design motivation and
show a simple block diagram. Next, the architecture of the proposed scheme, which
incorporates three important features, is introduced. Finally, the algorithm of the
proposed TASFR control scheme is elaborated.
4.1 Motivation of Proposed TASFR control Scheme

[Fig. 4-1 graphic: test chip in which the eDRAM 65nm 8Mb macro sends the conventional
self-refresh clock to the proposed TASFR scheme and receives the adaptive self-refresh
clock.]
Fig. 4-1: Block diagram of the proposed scheme.

As discussed in section 3.5, adaptively controlling the refresh period is a popular
technique in self-refresh control schemes. Fig. 4-1 shows a simplified diagram of the
proposed scheme. First, the conventional refresh clock generated by the eDRAM macro is
the input signal to the proposed scheme. Once activated, the proposed scheme detects
on-chip information and feeds back an adaptive self-refresh clock to the eDRAM macro in
order to reduce power consumption at lower junction temperature.
4.2 Structure of Proposed TASFR Control Scheme

[Fig. 4-2 graphic: adaptive refresh period structure (timing control circuit and
adaptive clock control, taking the conventional refresh clock and a 1-bit control
signal and producing the adaptive refresh clock and the replica cell sample period);
discrete-time dynamic sampling structure (control logic and decoder circuit, reference
voltage source dividing the external VREF into VREF[0~N], latch-type comparators with
N-bit output); replica cell structure (replica cell arrays for cell arrays 1 to M).]
Fig. 4-2: Simplified block diagram of the proposed TASFR control scheme.
The proposed TASFR control scheme takes several advantages from earlier published
papers [31-33, 43, 44], and we further develop a new self-refresh control scheme that
tries to overcome their shortcomings. There are three major structures in the proposed
TASFR control scheme: the replica cell array structure, the discrete-time dynamic
sampling structure, and the adaptive refresh period structure, as shown in Fig. 4-2.
The following subsections explain these structures in detail.
4.2.1 Replica Cell Array Structure

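[Fig. 4-3 graphic: replica cell array of 1-bit memory cells (storage node SN,
deep-trench capacitor CDT) sharing a word-line WL and a data-line DL with capacitance
CDL at VDL; comparator (SA) enabled by SAEN, with nodes VSA and VZSA, comparing the DL
voltage with VREF to produce Output.]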
Fig. 4-3: Replica cell array structure is incorporated into this scheme. It can make a
copy of the actual memory cell array and detect the replica cells to obtain the on-chip
information.
The replica cell array structure is used to track the state of the memory cells by
detecting a replica of the actual cell array. It makes a copy of the actual memory cell
array and detects the replica cells to obtain the on-chip information. With reference
to Fig. 4-3, a large number of memory cells and a voltage comparator are used to sense
the voltage stored in the memory cells at one time. The voltage stored in the memory
cells drops because of the leakage current introduced in section 2.1. At the sampling
time, the word-line (WL) of the replica cells turns on, and the charge stored in the
replica cell array is simultaneously read out to the data-line (DL) by charge sharing.
Next, the control signal Sense-Amplifier-Enable (SAEN) turns on the comparator, which
compares the voltage on the DL with the reference voltage. If the voltage on the DL is
lower than the reference voltage, the output goes high. Conversely, if the voltage on
the DL is still higher than the reference voltage, the output stays low.
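The read-out step described above can be sketched as ideal charge sharing followed by a comparison. The example assumes that all replica cells on the word-line hold the same storage-node voltage and that the data-line is precharged to a known level; the capacitance values and function names are hypothetical.

```python
def dataline_voltage(v_sn, n_cells, c_cell, c_dl, v_dl_pre):
    """Data-line voltage after n_cells identical cells share charge with it."""
    c_cells = n_cells * c_cell
    return (c_cells * v_sn + c_dl * v_dl_pre) / (c_cells + c_dl)

def comparator_code(v_dl, v_ref):
    """Digital code: 1 if the read-out voltage is below the reference, else 0."""
    return 1 if v_dl < v_ref else 0

# Hypothetical units: 1024 cells of unit capacitance against a data-line of 1024 units.
v_dl = dataline_voltage(v_sn=1.0, n_cells=1024, c_cell=1.0, c_dl=1024.0, v_dl_pre=0.5)
print(v_dl, comparator_code(v_dl, v_ref=0.8))
```

Reading many cells at once averages the individual storage-node voltages onto the data-line, which gives a large, easily sensed signal but, as with any replica approach, reflects typical rather than worst-case cells.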
4.2.2 Differential Sampling Structure

[Fig. 4-4 graphic: control logic and decoder circuit; reference voltage source dividing
the external VREF into VREF[0~N]; latch-type comparators with N-bit output serving
replica cell arrays 1 to M; detail of one replica cell array in which the 1-bit memory
cells on data-lines DL0 to DLN share a word-line WL, and comparators 0 to N compare
each DL with VREF0 to VREFN to produce Code 0 to Code N.]
Fig. 4-4: Differential sampling structure is adopted in TASFR control scheme. Global
process variations and system offset can be cancelled by using differential sampling
structure.
As mentioned in subsection 4.2.1, a replica cell array structure unit consists of 1K
replica cells and one voltage comparator. This subsection introduces the differential
sampling structure. With reference to Fig. 4-4, a sampling cell is composed of N
replica cell array structure units. In the TASFR control scheme, the N units are used
to detect the charge currently remaining in the replica memory cells. In the
embedded-DRAM, the storage node of a normally operating cell lies between VDD and half
VDD plus a voltage margin. This operating voltage range is derived from the data
retention ability of the actual memory cells in the embedded-DRAM macro. For this
reason, the external reference voltage (VREF) is divided into N reference voltages
(VREF0~VREFN) for the comparators, spanning the range from VDD to half VDD plus the
margin. The N reference voltages are generated by a resistor ladder. In one sampling
period, the control signal turns on one of the sampling cells, and its comparators
output an N-bit digital code that indicates the charge currently remaining in the
memory cells.
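The way a resistor ladder and a bank of comparators turn the remaining cell voltage into a digital code can be sketched as follows. The sketch assumes evenly spaced taps ordered from the highest reference to the lowest and a single read-out voltage seen by every comparator; the tap count and voltages in the example are hypothetical.

```python
def ladder_references(v_high, v_low, n):
    """n evenly spaced reference taps from v_high down to v_low (n >= 2)."""
    step = (v_high - v_low) / (n - 1)
    return [v_high - i * step for i in range(n)]

def sample_code(v_read, refs):
    """One bit per comparator: 1 where the read-out voltage is below the tap."""
    return [1 if v_read < r else 0 for r in refs]

refs = ladder_references(1.0, 0.5, 3)      # hypothetical taps: 1.0, 0.75, 0.5
print(sample_code(0.8, refs))              # only the highest tap is crossed
print(sample_code(0.4, refs))              # every tap is crossed
```

The resulting code behaves like a thermometer code: as the cells leak, bits turn to 1 starting from the highest reference, and the bit for the lowest reference turns to 1 last.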
4.2.3 Adaptive Refresh Period Structure

[Fig. 4-5 graphic: adaptive refresh period structure consisting of a timing control
circuit and an adaptive clock control; a logic control circuit receives the 1-bit
control signal and RESET, and a 12-bit clock divider built from two ripple counters of
D flip-flops (D-FF) derives the adaptive refresh clock and the replica cell sample
period.]
Fig. 4-5: Adaptive refresh period structure. Conventional self-refresh period can be
divided into slower self-refresh period at lower temperature in order to reduce power
dissipation of the embedded-DRAM.
The adaptive refresh period structure is shown in Fig. 4-5. Because the read operation
of an embedded-DRAM memory cell is destructive, the TASFR control scheme needs many
groups of sampling cells. In its own sampling period, each sampling cell detects the
replica memory cells that belong to it by reading out the charge currently remaining in
them. The N read-out voltages are then compared with the N corresponding reference
voltages to obtain an N-bit digital code, which represents the current on-chip
information. The sampling period for each sampling cell is determined by comparing the
two most recent N-bit digital codes. There are M sets of sampling cells, and each
digital code is transferred to the control logic circuit. The control logic circuit
computes a 1-bit control signal and transfers it to the timing control circuit. When
the timing control circuit receives the 1-bit control signal, the adaptive clock control
circuit changes the adaptive refresh clock and the replica cell sampling period. When
the TASFR control scheme finishes its operation, the adaptive refresh clock stops
changing its period. As a result, the adaptive refresh clock is the conventional
self-refresh clock divided down so that its period is a binary-weighted (power-of-two)
multiple of the conventional refresh period, which reduces the power dissipation of the
embedded-DRAM in self-refresh mode.
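A ripple counter built from toggling flip-flops is a common way to obtain clocks whose periods are power-of-two multiples of an input clock. The following is a behavioral sketch of that idea only; it is not the 12-bit divider of Fig. 4-5, and the function name is hypothetical.

```python
def ripple_counter_states(n_edges, stages):
    """State of each stage after every input clock edge.

    Each stage toggles when the previous stage falls from 1 to 0, so stage s
    completes one full cycle every 2**(s + 1) input edges.
    """
    bits = [0] * stages
    history = []
    for _ in range(n_edges):
        for s in range(stages):
            bits[s] ^= 1
            if bits[s] == 1:     # this stage rose: nothing ripples further
                break
        history.append(tuple(bits))
    return history

for state in ripple_counter_states(8, 3):
    print(state)
```

Selecting the output of a later stage as the refresh clock doubles the period for each stage skipped, which matches the binary-weighted extension described above.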
4.3 Algorithm of Proposed TASFR Control Scheme

[Fig. 4-6 graphic: 8-bit digital codes read from the 1st to 4th sampling cells at
sampling times t1 to t4; the codes at t1, t2, and t3 are (00000000) and the code at t4
is (10000000).]
Fig. 4-6: A simplified case study is used to describe how the TASFR control scheme
works.
The structures used in the TASFR control scheme were briefly introduced in section 4.2.
This section describes the algorithm and its operation steps. Fig. 4-3 shows a
simplified schematic view of a replica cell array structure that detects the replica
memory array, which is a group of replica memory cells each formed by an NMOS
transistor and a deep-trench storage capacitor. The replica cell array structure detects
the voltage stored on the deep-trench storage capacitors of the replica memory array,
and the voltage comparator compares the read-out voltage with a reference voltage to
generate a digital code of 0 or 1. If the read-out voltage on the DL from the
deep-trench capacitors is lower than the reference voltage, the digital code 1 is
generated; if it is higher than the reference voltage, the digital code 0 is generated.
Each replica memory structure unit has a corresponding digital code. For example, the N
replica memory structure units shown in Fig. 4-4 produce N digital codes from the
comparison results of the voltage comparators, which form a first state. The other
sampling cells are detected one by one after specific periods of time, and each likewise
produces N digital codes, which form a second state. The control logic circuit compares
the first state and the second state to determine whether or not to extend the refresh
period; if the refresh period is extended, the sampling period used to read out the next
replica array is extended as well. Fig. 4-6 shows a simplified case study of the TASFR
control scheme, in which the deep-trench storage capacitor in the replica memory cell
operates at a voltage between VDD and half VDD plus a voltage margin.
At time t=0, the first state is set to a default value in which all digital codes are
zero. Moreover, the initial sampling period (t1) is set to the conventional self-refresh
period divided by the total number of word lines, in order to guarantee that the stored
data is correct in the worst-case condition (i.e., high temperature and the fast-NMOS
fast-PMOS process corner).
At time t=t1, the 1st sampling cell performs the first detection. The voltage
comparators compare the read-out voltage of the storage capacitors of the replica memory
cells with the corresponding reference voltages to form a second state with digital
codes equal to (00000000). Since the first state and the second state have the same
digital codes, the refresh period for the data stored in the DRAM macro is doubled
(2t1). The second sampling period is likewise 2t1, twice the first sampling period t1,
so the second sampling takes place at t2=t1+2t1.
At time t=t2, the 2nd sampling cell performs the second detection. The voltage
comparators again compare the read-out voltage of the storage capacitors of the replica
memory cells with the reference voltages to form a second state with digital codes
equal to (00000000). The first state is now the digital code obtained at t=t1, which is
also (00000000). Since the first state and the second state have the same digital
codes, the refresh period is doubled again (2×2t1). The third sampling period is
therefore 4t1, and the third sampling takes place at t3=t2+4t1=t1+2t1+4t1.
At time t=t3, the 3rd sampling cell performs the third detection. The voltage
comparators again compare the read-out voltage of the storage capacitors of the replica
memory cells with the reference voltages to form a second state with digital codes
equal to (00000000). The first state is now the digital code obtained at t=t2, which is
also (00000000). Since the first state and the second state have the same digital
codes, the refresh period is doubled again (2×4t1). The fourth sampling period is
therefore 8t1, and the fourth sampling takes place at t4=t3+8t1=t1+2t1+4t1+8t1.
In the same way, at time t=t4 we obtain a fourth state with digital codes equal to
(10000000), while the third state has digital codes equal to (00000000). Because the
previous and current digital codes differ, the self-refresh period of the eDRAM macro
and the next sampling period are not changed. The fifth sampling therefore takes place
at t5=t4+8t1=t1+2t1+4t1+8t1+8t1.
If the voltage of every replica memory cell is lower than its reference voltage, the
self-refresh period detecting process is terminated. That is to say, when the LSB of
the digital code is set to 1, the tracking operation terminates. The tracking operation
also terminates when the total number of sampling operations reaches 16. The resulting
adaptive self-refresh period is then fed back to the eDRAM macro for refreshing the
stored data.
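The tracking steps of the case study can be summarized in a short behavioral sketch. It is a simplified model under stated assumptions, not the implemented control logic: the digital codes are supplied as a list, the last element of each code is taken to be the LSB (the lowest reference), time is measured in units of t1, and the function name `tasfr_track` is hypothetical.

```python
def tasfr_track(codes, t1=1, max_samples=16):
    """Replay the tracking rule on a sequence of sampled digital codes.

    Returns (period, sample_times): the final refresh/sampling period and
    the time of each sampling, both in units of t1.
    """
    prev = (0,) * len(codes[0])    # default first state: all zeros
    period = t1                    # current refresh and sampling period
    now = 0
    sample_times = []
    for code in codes[:max_samples]:
        now += period
        sample_times.append(now)
        code = tuple(code)
        if code[-1] == 1:          # LSB set: every reference crossed, stop
            break
        if code == prev:           # no change since last sample: double
            period *= 2
        prev = code                # otherwise keep the period unchanged
    return period, sample_times

zero = (0,) * 8
first_bit = (1,) + (0,) * 7
print(tasfr_track([zero, zero, zero, first_bit, first_bit]))
```

For the code sequence of Fig. 4-6 the sampling times come out as t1, 3t1, 7t1, 15t1, and 23t1, matching t1 through t5 in the text. Because each sampling consumes one sampling cell, the limit of 16 samples also bounds the number of sampling cells needed in this model.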
The operation steps have been illustrated in detail by a simplified case study. In
summary, a flow chart of the operating method of the proposed TASFR control scheme is
shown in Fig. 4-7.
Fig. 4-7: Flow chart of TASFR control scheme.
Chapter 5
Design Considerations and Analyses
Having described our proposed TASFR control scheme, we introduce the design issues at
the beginning of this chapter. The mismatch caused by inter-die and intra-die
variations should be carefully taken into account. Next, the design considerations and
design steps are listed, together with the layout considerations needed for better
performance. Finally, we show the analyses of our proposed TASFR control scheme,
including the adaptive refresh period and the refresh power saving.
5.1 Design issues

Variation of the MOSFET threshold voltage (VT) is the most important issue in
advanced process technologies. The main factors that cause VT variation include the
short channel effect (SCE) and process variations. We discuss these factors in the
following sub-sections.
5.1.1 Short Channel Effects (SCE)

[Figure 5-1 panels: (a) long-channel device, barrier not changed much between VD = 0 and
VD > 0; (b) short-channel device, barrier reduced under large VDS; (c) VTH (V) versus
Leff (nm), 50 to 300 nm, for VDS = 0.1 V and VDS = 1.2 V; condition: TT, 25°C.]
Fig. 5-1: (a) For long-channel devices, the drain voltage has negligible effect on the
barrier at the source-channel interface. (b) For short-channel devices, the drain voltage
tends to reduce the barrier at the source end. (c) VT across effective channel length at
VDS = 1.2V and VDS = 0.1V.
Drain Induced Barrier Lowering (DIBL) is one of the factors that cause VT
variation. Referring to Fig. 5-1(a), there is a barrier at the source-channel interface and
another barrier at the drain-channel interface. In long-channel devices these two
potential barriers are treated as independent because they do not affect each other.
For short-channel devices, however, the drain bias can influence the barrier height at
the source end of the channel, as illustrated in Fig. 5-1(b). If the drain voltage is large
enough and the channel short enough, the barrier EB from the source to the channel is
reduced, and the threshold voltage therefore decreases. The DIBL effect becomes more
pronounced, and the threshold voltage decreases further, as the channels get shorter [10].

Short Channel Effect (SCE or VTH roll-off) is an undesirable phenomenon in short
channel devices in which VT decreases as the channel length is reduced. As SCE worsens
with the increasing DIBL effect described above, variation in critical device dimensions
translates into a larger variation in the threshold voltage. For this reason,
non-uniform HALO doping or pocket implants are used to mitigate the problem by
making the depletion widths narrower and hence reducing the DIBL effect. As a
by-product of HALO doping and pocket implants, a short-channel device shows
Reverse Short Channel Effect (RSCE) behavior, in which VT decreases as the channel
length is increased [45]. Fig. 5-1(c) shows the threshold voltage as a function of
channel length at VDS = 1.2V and VDS = 0.1V. In the super-threshold region
(VDS = 1.2V), a strong roll-off is observed at the minimum channel length due to the
high DIBL effect. At the smaller drain-source voltage, as in the sub-threshold region
(VDS = 0.1V), the HALO doping instead makes VT higher as the channel length decreases,
because the DIBL effect is weaker and RSCE is stronger. Thus, the VT variation is
strongly dependent on drain-source voltage and channel length.
Reverse narrow width effect (Anomalous Narrow Width Effect in NMOS and
PMOS Surface Channel Transistors Using Shallow Trench Isolation)
If the drain-source voltage becomes large, higher lateral electric fields occur
in short-channel MOSFETs. While the average velocity of carriers saturates at high
fields, the instantaneous velocity, and hence the kinetic energy, of the carriers
continues to increase, especially as they accelerate towards the drain. This is referred
to as the hot carrier effect. These hot carriers may collide with silicon atoms at high
speed, causing impact ionization that generates new electrons and holes. The
electrons are absorbed by the drain and the holes by the substrate, so a finite
substrate current appears, as shown in subsection 2.1.4. Unfortunately, if the carriers
acquire a very large energy, they may be injected into the gate oxide. These electrons
are then trapped in the gate oxide, which shifts VT [46].
5.1.2 Process Variations

[Figure 5-2 panels: (a) sources of process variation in a MOSFET (gate, source, drain):
line-edge roughness and grain boundaries, insulator thickness and electron trapping,
stress, strain and temperature, and random dopant fluctuation; (b) histogram of VTH (V),
0.66 to 0.81 V, for channel lengths 1Lmin, 2Lmin, 3Lmin and 4Lmin; condition: TT, 25°C,
1024 trials with local variations.]
Fig. 5-2: (a) Process variations. (b) Distributions of MOSFET threshold voltage (VT)
across multiples of the minimum channel length with W = 1 µm.
As technology continues to scale down into the deep sub-micrometer and further
into the nanometer era, semiconductor devices suffer from significant
mismatch due to uncertainties in each step of the manufacturing process. Process
variations are usually categorized into two groups: inter-die (die-to-die or D2D)
variations and intra-die (within-die or WID) variations. Inter-die variations are
systematic variations, which are related to processing temperature, equipment
properties, wafer placement, and so on. As shown in Fig. 5-2(a), intra-die variations,
such as Random Dopant Fluctuation (RDF), Line Edge Roughness (LER), oxide thickness
fluctuation, and oxide degradation, exist because of the inherent unpredictability of
the semiconductor process. Because of this random nature, intra-die variations are
more difficult to compensate than inter-die variations.
It is well known that the standard deviation of the threshold voltage, σ(VT),
satisfies the relationship shown in equation (5-1).
$$\sigma(V_T) = \frac{A_{VT}}{\sqrt{WL}} \tag{5-1}$$
where W is the channel width and L is the channel length. In a sense, the slope AVT is
a normalized σ(VT), but only with respect to the channel dimensions. Furthermore, the
threshold voltage formula of the MOSFET and the analytical σ(VT) model that takes the
dopant-induced fluctuation into account are shown in the following equations [47].
$$V_T = V_{FB} + 2\phi_F + \frac{q N_{SUB} W_{DEP}}{C_{INV}} \tag{5-2}$$
$$\sigma(V_T) = \frac{q}{C_{INV}} \sqrt{\frac{N_{SUB} W_{DEP}}{3WL}} \tag{5-3}$$
where WDEP is the width of the depletion layer under the gate, NSUB is the substrate
doping concentration, and CINV is the electrical gate dielectric capacitance. Combining
equations (5-2) and (5-3), the standard deviation can be rearranged as the following
equation
$$\sigma(V_T) = B_{VT} \sqrt{\frac{T_{INV}\,(V_T + 0.1\,\mathrm{V})}{WL}} \tag{5-4}$$
where the coefficient BVT is a normalization of σ(VT) and TINV is the electrical gate
dielectric thickness. As shown in Fig. 5-2(b), the shorter the channel length, the larger
the threshold voltage fluctuation between transistors. These results are
consistent with equation (5-4). From the above, it can be concluded
that Random Dopant Fluctuation (RDF) is the dominant component of intra-die
variations and continues to increase in newly developed nanometer technologies.
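The area scaling in equation (5-1) can be illustrated numerically. The sketch below is
only an illustration: the AVT value and the minimum channel length are hypothetical
placeholders, not values taken from the process used in this work.

```python
import math


def sigma_vt(a_vt, width_um, length_um):
    """Equation (5-1): sigma(VT) = AVT / sqrt(W * L)."""
    return a_vt / math.sqrt(width_um * length_um)


A_VT_MV_UM = 3.0   # hypothetical mismatch coefficient, in mV*um
L_MIN_UM = 0.06    # hypothetical minimum channel length, in um
W_UM = 1.0         # channel width, as in Fig. 5-2(b)

for multiple in (1, 2, 3, 4):
    s = sigma_vt(A_VT_MV_UM, W_UM, multiple * L_MIN_UM)
    print(f"L = {multiple} x Lmin: sigma(VT) = {s:.2f} mV")
```

Quadrupling the channel length at fixed width halves σ(VT), which is the trend of the
narrowing distributions in Fig. 5-2(b).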
5.2 Design considerations

As mentioned in the previous section, process mismatch is an important design concern
in the nanometer technology era. In the following subsections, we consider how to
design our proposed TASFR control scheme based on the impact of process-induced
mismatch, and how to optimize the circuit performance.
5.2.1 Resolution
Fig. 5-3: Two steps of signal amplification in sense amplifiers. (a) The selected WL
turns on and charge is shared between the cell capacitance CS and the data-line
parasitic capacitance CDL. (b) The sense amplifier starts to amplify the small voltage
swing on the data-line after charge sharing.
The read operation involves two steps of signal amplification in the sense
amplifiers. First, the selected word-line turns on, so the stored charge is shared
between the cell capacitance CS and the data-line parasitic capacitance CDL. As a result
of charge sharing, the signal voltage developed on the floating data-line can be
expressed by the following equations.
(5-5)
(5-6)
(5-7)
where VDL is the data-line voltage, VSN is the storage node voltage in the memory cell,
and ΔVDL(1) is the voltage difference on the data-line after reading out a cell storing
"1". A cell storing "1" is the one most severely affected by leakage current, which is
why we only discuss the case in which the memory cell stores "1". Furthermore, from the
above equations we can derive the storage node voltage that must be taken into
consideration in our design.
(5-8)
(5-9)
For this eDRAM designed by UMC, RC extraction results give CDL of about 16fF and
CS of about 7fF. Substituting these values into equation (5-9) yields equation (5-10).
$$V_{SN} = 3.29\,\Delta V_{DL}(1) + 0.7 \tag{5-10}$$
Second, the offset voltage of the sense amplifier is the other issue that can cause
read failures in eDRAMs. According to a 10k-trial Monte-Carlo simulation at the TT
corner and 25°C with local statistical variations, shown in Fig. 5-4, the voltage
difference (ΔVIN) applied to the two input nodes of the sense amplifier should be larger
than 70mV. That is, the signal ΔVDL(1) developed on the data-line (DL) has to be
70mV above the reference voltage (VDD/2) for the sensing yield to reach 100%.
[Figure 5-4: sense-amplifier node waveforms (SP and SN around VDD/2) and yield (%), 50 to
100, versus ΔVIN (mV), 0 to 70, for VARY = 1.3V, 1.4V and 1.5V; condition: TT, 25°C,
local statistical mismatch with 10k trials.]
Fig. 5-4: Yield of latch-type sense amplifier in eDRAMs.

Following the above discussion, the minimum storage node voltage that can still be
read out and restored can be roughly calculated. Recalling equation (5-10) and
replacing ΔVDL(1) with 70mV gives
$$V_{SN} = 3.29 \times \Delta V_{DL}(1) + 0.7 \approx 0.9304\ \mathrm{V} \tag{5-11}$$
Therefore, VSN cannot drop below 0.9304V; otherwise the sense amplifier will fail to
read the stored value. For this reason, the voltage range for cell tracking is from 1.4V
to 0.9304V, so the resolution is 469.6mV / N for an N-bit digital code. To prevent error
codes (bubble bits) in the digital code, the offset voltage of the comparators should be
less than half of 1 LSB. Choosing N = 8 in this case, 1 LSB is 58.7mV and the offset
voltage of the comparators should be less than 29.35mV.
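The numbers in this subsection can be checked with a short calculation. The sketch
below assumes the usual charge-sharing relation, VSN = (1 + CDL/CS) × ΔVDL(1) + VPRE,
with a data-line precharge level VPRE of 0.7V (half of VARY = 1.4V); the variable names
are illustrative and the relation is a reconstruction consistent with the 3.29 and 0.7
constants quoted above, not a copy of the original derivation.

```python
C_S = 7e-15      # cell capacitance from RC extraction (F)
C_DL = 16e-15    # data-line parasitic capacitance from RC extraction (F)
V_PRE = 0.7      # assumed data-line precharge level, VARY / 2 (V)
V_FULL = 1.4     # upper end of the cell tracking range (V)
DV_MIN = 0.070   # minimum data-line signal for 100% sensing yield (V)
N_LEVELS = 8     # number of reference levels


def min_storage_voltage(dv, c_s=C_S, c_dl=C_DL, v_pre=V_PRE):
    """Storage-node voltage needed to develop a signal dv on the data-line."""
    return (1 + c_dl / c_s) * dv + v_pre


v_min = min_storage_voltage(DV_MIN)     # about 0.93 V
lsb = (V_FULL - v_min) / N_LEVELS       # about 58.7 mV
offset_budget = lsb / 2                 # about 29.4 mV

print(f"ratio = {1 + C_DL / C_S:.2f}, VSN,min = {v_min:.4f} V")
print(f"1 LSB = {lsb * 1e3:.2f} mV, offset budget = {offset_budget * 1e3:.2f} mV")
```

The small differences from 0.9304V and 29.35mV come from rounding the capacitance
ratio to 3.29 in the text.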
5.2.2 Resistor ladder

[Figure 5-5 blocks: an external VREF feeds a series string of unit resistors whose taps
VREF[0], VREF[1], VREF[2], ... form the reference voltage source VREF[0:7].]
Fig. 5-5: Block diagram of the resistor ladder.

For the differential sampling structure, a resistor ladder supplies the reference
voltages to one input node of each comparator for tracking the cell storage node
voltage in the replica cell array. In our design, non-salicide p+ poly (RNPPO) resistors
are used to reduce the area penalty of the resistor ladder because of their inherently
high sheet resistance of about 385 (Ω/µm). For process reasons, we use the unit resistor
cell that is widely chosen in UMC eDRAM products for higher implementation yield. Its
resistance is about 11.6607 kΩ and its area is 0.55 µm by 16.035 µm. Because the voltage
divider must supply 8 reference voltages in our application, we choose
58.7mV as the 1-LSB value. Thus, the voltage divider is built from 24 of our unit
resistors. Considering area penalty and DC path current, we choose 6 UMC unit resistor
cells as our unit cell. With this choice the ladder area is 1629.5 µm² and the DC current
is 0.83 µA. The block diagram of the resistor ladder is shown in Fig. 5-5, and the design
considerations for area penalty and DC path current are shown in Table 5-1.
Table 5-1: Design considerations for area penalty and DC current consumption of the
resistor ladder in our application.
Multiple   Resistance (kΩ)   Total res. (MΩ)   Area (µm²)   Current (µA)
3          34.98             0.8396            817.2        1.67
4          46.64             1.1194            1088.1       1.25
5          58.30             1.3993            1358.8       1.00
6          69.96             1.6791            1629.5       0.83
7          81.62             1.9590            1900.1       0.72
8          93.29             2.2389            2170.8       0.63
9          104.95            2.5187            2441.5       0.56
10         116.61            2.7986            2712.2       0.50
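The resistance and current columns of Table 5-1 can be reproduced from the UMC unit
resistance. The sketch below assumes that the ladder consists of 24 series segments and
that 1.4V is applied across it; the 1.4V value is an assumption, chosen because it
matches the tabulated currents.

```python
R_UMC_UNIT = 11.6607e3   # resistance of one UMC unit resistor cell (ohm)
SEGMENTS = 24            # series segments in the voltage divider
V_TOP = 1.4              # assumed voltage across the ladder (V)


def ladder(multiple):
    """Return (segment resistance, total resistance, DC current) for one row."""
    r_segment = multiple * R_UMC_UNIT
    r_total = SEGMENTS * r_segment
    i_dc = V_TOP / r_total
    return r_segment, r_total, i_dc


for multiple in range(3, 11):
    r_seg, r_tot, i_dc = ladder(multiple)
    print(f"{multiple:2d}  {r_seg / 1e3:7.2f} kohm  "
          f"{r_tot / 1e6:6.4f} Mohm  {i_dc * 1e6:4.2f} uA")
```

The area column is not modeled here because it depends on layout spacing.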
[Figure 5-6 panels: histograms (number of trials) of reference-voltage deviation, from
-1.0 to 1.0 mV; (a) VREF0 = 1.342V, VREF2 = 1.225V, VREF4 = 1.108V, VREF6 = 0.992V;
(b) VREF1 = 1.283V, VREF3 = 1.167V, VREF5 = 1.050V, VREF7 = 0.933V; condition: TT, 25°C,
10k trials with statistical variations.]
Fig. 5-6: Monte-Carlo simulation for resistor ladder for 10K trials with local
statistical variations.
However, fluctuation of these reference voltages affects the performance of the
comparators used to track the cell storage node voltage in the replica cell array.
For this reason, we simulate not only the reference voltage fluctuation with the power
supply VDD, as mentioned in the previous subsection, but also the process mismatch of
the RNPPO resistors using a 10k-trial Monte-Carlo simulation with local statistical
variations. As shown in Fig. 5-6, the fluctuations of the 8 reference voltages are less
than 1mV. Therefore, the performance of the comparators is hardly influenced by process
variations of the resistor ladder. Furthermore, the distributions of the 8 reference
voltages are hardly temperature-dependent, because temperature variation acts as a
global variation on the resistor ladder, which is composed of non-salicide p+ poly
(RNPPO) resistors.
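The eight reference levels quoted in Fig. 5-6 are consistent with taps on a 24-segment
divider. The sketch below assumes 1.4V at the top of the ladder and that tap k sits
k + 1 segments below the top; it also shows how a tracked voltage could map to the
8-bit digital code, assuming each comparator outputs 1 once the tracked voltage falls
below its reference. These conventions are assumptions made for illustration.

```python
V_TOP = 1.4     # assumed voltage at the top of the ladder (V)
SEGMENTS = 24   # series segments in the voltage divider
N_TAPS = 8      # number of reference voltages


def reference_voltages():
    """VREF[0..7], from the highest tap to the lowest."""
    return [V_TOP * (SEGMENTS - 1 - k) / SEGMENTS for k in range(N_TAPS)]


def thermometer_code(v_tracked, vrefs):
    """One bit per comparator, VREF[0] first; 1 means v_tracked < VREF[k]."""
    return "".join("1" if v_tracked < v else "0" for v in vrefs)


vrefs = reference_voltages()
print([round(v, 3) for v in vrefs])
print(thermometer_code(1.40, vrefs))   # no reference crossed yet
print(thermometer_code(1.30, vrefs))   # only VREF[0] crossed
print(thermometer_code(0.90, vrefs))   # all references crossed
```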
5.2.3 Comparator
[Figure 5-7 panels (a) and (b): schematics of two latch-type sense amplifiers with
transistors M1 to M9, enable signal SAEN, internal nodes VSA and VZSA with capacitances
CSA and CZSA, and inputs VDL and VREF.]
Fig. 5-7: Two types of latch-type sense amplifiers widely used in memory.
In our proposed TASFR control scheme, the replica cell structure is used to track
the charge stored in a number of memory cells in order to derive an average
characteristic from the replica cell array. Latch-type comparators are widely used to
read the contents of several types of memory since they achieve a fast decision due to
strong positive feedback. Fig. 5-7 shows two types of latch-type comparators.
For our application, the latch-type comparator shown in Fig. 5-7(b) is chosen
because its input and output are separated. Thus, the data-line can be connected
directly to the input node of the comparator, and no further charge sharing occurs after
the charge sharing between the cell storage node and the data-line.
However, mismatch becomes much more severe in nanometer process technologies and
affects the sampling results detected by the comparators. Thus, the
offset voltage (VOS) of the comparators must be carefully taken into consideration in
our design. For yield simulation, the design guidelines in [48, 49] show how to choose
transistor sizes for comparator yield optimization. As mentioned in the previous
subsection, the transistor widths and lengths of the comparators should be
chosen to meet the offset specification of about 30mV. As shown in Fig. 5-8, M1 and M3
are the most significant for yield. The current source M9 can be made small enough to
improve yield at the cost of slower sensing speed. In our design, however, yield in
tracking the dummy cells is more important than sensing speed, because the voltage drop
in a memory cell is about six orders of magnitude slower than the comparator sensing
speed.
[Figure 5-8: yield (%), 70 to 100, versus width (µm), with curves for M1/M3, M2/M4, M5/M6
and M9; condition: TT, 25°C, 10k trials with local statistical variations.]
Fig. 5-8: Yield versus channel width for different MOSFETs in the latch-type comparator,
with the channel length of all MOSFETs set to 0.5 µm.
The offset voltage of the comparator also depends on the reference voltage applied to
its input node. For constant VDD, a larger reference voltage (VINDC) gives worse yield.
In our case, the largest reference voltage must therefore be considered as the worst
case for yield. As shown in Fig. 5-9 (a)(b), VREF0 (=1.342V) and VREF7 (=0.933V) are
respectively the largest and smallest reference voltages applied to one input node of
the comparator for tracking the storage node voltage of the replica cell array. We can
thus guarantee that the offset voltages of the comparators are less than 30mV even with
the largest reference voltage applied, which is the worst-case design condition. The VDD
supply is a further design concern because different VDD supplies influence yield. As
shown in Fig. 5-10, we can also guarantee that the offset voltages of the comparators
are less than 30mV across VDD supplies in our design.

[Figure 5-9 panels (a) and (b): yield (%), 40 to 100, versus ΔVIN (mV), 0 to 30, with
curves for VREF[0], VREF[2], VREF[4] and VREF[6]; condition: TT, 25°C, 10k trials with
local statistical variations.]
Fig. 5-9: Yield versus input difference ΔVIN with different VINDC.
[Figure 5-10: yield (%), 40 to 100, versus input difference (mV), 0 to 30, for
VDD = 1.3V, 1.4V and 1.5V; condition: TT, 25°C, 10k trials with local statistical
variations.]
Fig. 5-10: Yield versus input difference ΔVIN with different VDD.
5.3 Analyses of Proposed TASFR Control Scheme

In this section, we first analyze the adaptive refresh period and the cell data
retention time at different temperatures. Next, the refresh power saving at different
temperatures is discussed. Finally, a comparison between our TASFR control scheme and
other self-refresh control schemes is presented.
5.3.1 Adaptive Refresh Period

In the self-refresh mode, the same word line is refreshed again after the total
number of refresh cycles has elapsed. Accordingly, the total refresh time must be
shorter than the cell data retention time. The total refresh cycles and total refresh
time are given by the following equations.
Total refresh cycles = WL# × refreshed BANK#    (5-12)
Total refresh time = TSFR × Total refresh cycles    (5-13)
where WL# is the total number of word lines in each bank, refreshed BANK# is the
number of banks refreshed simultaneously, and TSFR is the self-refresh period.
In this case, the 8M-bit eDRAM is constructed from four identical 2M-bit sub-banks,
and each sub-bank is composed of 16 sections of 128 word-lines each. Therefore, the
total number of word lines in each bank is 2048 (128 × 16).
Furthermore, the eDRAM macro is divided into two groups of two banks each. The two
banks (1st and 2nd) in each group are refreshed alternately; the 1st banks of both
groups are refreshed simultaneously, and likewise the 2nd banks. Thus the self-refresh
period is internally divided by 2, and refreshed BANK# is 2. The same word line is
therefore refreshed again after 4096 refresh cycles in the self-refresh mode. For this
reason, our initial sampling time (t1) is obtained by dividing the conventional
self-refresh clock (period TSFR) by 4096, that is, t1 = 4096 × TSFR.
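As a rough numerical check of equations (5-12) and (5-13), the sketch below uses the
organization described above together with the 533.4ns self-refresh period measured at
room temperature in Chapter 7. It assumes that the initial sampling time t1 equals the
total refresh time, i.e. 4096 self-refresh periods, which is consistent with the
millisecond range plotted in Fig. 5-11; the variable names are illustrative.

```python
WL_PER_SECTION = 128     # word-lines per section
SECTIONS_PER_BANK = 16   # sections per 2M-bit sub-bank
REFRESHED_BANKS = 2      # refreshed BANK# from the text
T_SFR_NS = 533.4         # self-refresh period measured at about 25 C (ns)

wl_per_bank = WL_PER_SECTION * SECTIONS_PER_BANK       # WL#
total_refresh_cycles = wl_per_bank * REFRESHED_BANKS   # equation (5-12)
total_refresh_ns = T_SFR_NS * total_refresh_cycles     # equation (5-13)
t1_ms = total_refresh_ns * 1e-6                        # assumed t1, in ms

print(wl_per_bank, total_refresh_cycles, f"{t1_ms:.4f} ms")
```

A count of 4096 corresponds to the 12-bit ripple counter used to derive t1 from the
self-refresh clock.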
[Figure 5-11 panels: initial sampling period (msec), 1.5 to 2.5, versus temperature (°C),
0 to 125; (a) TT, FF, SS, FS and SF corners; (b) VDD+10%, VDD and VDD-10%.]
Fig. 5-11: Self-refresh period generated by the self-refresh oscillator of the eDRAM
macro. (a) Self-refresh period variations across five process corners. (b) Self-refresh
period variations with core supply voltage fluctuations in the range of VDD-10% to
VDD+10%.
First, we discuss the conventional self-refresh period generated by the eDRAM
macro. As shown in Fig. 5-11 (a), the self-refresh period decreases with increasing
temperature at every process corner. As shown in Fig. 5-11 (b), fluctuations of the
core supply voltage VDD (=1.2V) also affect the self-refresh oscillator. The 8Mbit
eDRAM is designed this way because self-refresh must be performed more frequently at
higher temperature. The conventional self-refresh period has been verified by UMC at
higher temperature for cell data retention under worst-case conditions. After the
conventional self-refresh clock is divided by the 12-bit ripple counter, the resulting
period is used as the initial sampling time (t1) in our circuit. Therefore, the proposed
TASFR control scheme also guarantees retention of the stored data in the cell array at
higher temperature.
Condition: VARY =1.4V
Fig. 5-12: Self-refresh period across various temperatures with five process corners.
Fig. 5-12 shows the adaptive refresh period computed by the TASFR control scheme at
various temperatures. To keep the data stored in the real memory cell array, the total
refresh time for each cell on the same word line must be shorter than the cell data
retention time. The results show that the refresh period remains shorter than the cell
data retention time. Furthermore, the conventional refresh period can be extended under
different temperature conditions. In our algorithm, the refresh period is increased more
conservatively than in other published work in order to keep the stored data safe.
Furthermore, the adaptive refresh period computed by the TASFR control scheme is not
sensitive to global process corners across temperatures.
Condition: TT corner
Fig. 5-13: Self-refresh period across various temperatures with different supply
voltage VARY.
If the array supply voltage (VARY), which is generated by the on-chip regulator,
fluctuates with the operating conditions on the chip, the performance of the TASFR
control scheme is also affected. The reason is that the initial high voltage of a stored
"1" in the memory array changes with VARY, so the digital codes generated by the
latch-type comparators are affected. Therefore, the self-refresh period computed by the
TASFR control scheme changes with the array supply voltage. Fig. 5-13 shows the
self-refresh period across temperatures for different supply voltages.
5.3.2 Power Reduction in Self-Refresh mode

[Figure 5-14 panels (a) and (b); condition: TT corner, VARY = 1.4V; panel (b) is
annotated with 95.09%.]
Fig. 5-14: (a) Normalized AC component of data retention power with and without the
TASFR control scheme across various temperatures. (b) Reduction of the AC component of
data retention power with the TASFR control scheme across various temperatures.
Fig. 5-14 (a) shows simulation results of the AC component of data retention
power with and without the proposed TASFR control scheme at the TT corner.
With the conventional self-refresh period, the AC component of data retention power at
room temperature is 20 times larger than with the TASFR control scheme. As temperature
increases, the self-refresh period computed by the TASFR control scheme is extended
less and less because the cell data retention time becomes shorter. As shown
in Fig. 5-14 (b), a 95.09% reduction of the AC component of data retention power is
achieved at room temperature.
Chapter 6
Macro implementation
First, we briefly introduce the architecture of the embedded-DRAM designed by UMC
as IP (Intellectual Property) to provide an overview of its structure. Several
techniques are widely used in embedded-DRAM: scrambling techniques are used to optimize
the DRAM macro [50], and data-line transposition techniques are used to
reduce interference noise [6, 51]. Then, we describe the architecture of the TASFR
control scheme for the embedded-DRAM macro and discuss its layout floor plan. Finally,
the test chip design is presented.
6.1 Macro of Embedded-DRAM

[Figure 6-1 blocks: row address buffers, column address buffers, data in/out buffers,
control circuitry, row decoder, array of memory cells (MC) with wordlines and bitlines
DL0 to DLm-1, sense amplifiers, column decoder and I/O interface, and voltage generators
(VPP, VDL, VBB, VREF).]
Fig. 6-1: Conventional DRAM basic architecture [6, 7].

In this section, the basic architecture of a conventional DRAM is described first. As
shown in Fig. 6-1, the DRAM chip is composed of three blocks: memory cell arrays,
peripheral circuits, and input/output (I/O) interface circuits. We introduce these
components one by one and finally give a simplified block diagram of the embedded-DRAM
used in this work.
6.1.1 Memory Cell Arrays
[Figure 6-2: array diagram with word lines WL0 to WL7, bit-line pairs BL0/BLB0 to
BL7/BLB7, and bit-line sense amplifiers (BLSA).]
Fig. 6-2: Scrambling techniques [50] have been widely used in DRAM circuit.
A memory cell array comprises a matrix of N rows and M columns and can store
M × N bits of binary data, as shown in Fig. 6-1. Scrambling techniques
are also used in embedded-DRAM. The types of scrambling used [50, 52] include
folded data-line arrangement, address decoder scrambling, contact and well sharing, and
twisted data-line arrangement, as shown in Fig. 6-2. The folded data-line arrangement
effectively reduces word-line to data-line (WL-DL) interference, and the twisted
data-line arrangement effectively reduces data-line to data-line (DL-DL) interference.
Thus, a combination of twisted and folded data-line arrangements [6, 51] minimizes
interference noise. The contact and well sharing technique is useful for optimizing
cell area and layout geometry. In summary, these scrambling techniques are effective
and commonly used to optimize the memory's layout geometry, address decoder, cell area,
chip performance, yield, and I/O pin compatibility.
6.1.2 Peripheral Circuits

[Figure 6-3 blocks: from the external VDD, a VREF generator, VDL buffers, a VPP
generator, a VBB generator and a half-VDD generator supply VDL to the periphery and
VPP/VWL, VBB, VDL and VDL/2 to the memory array.]
Fig. 6-3: Internal voltage generators for modern DRAMs [53].

The peripheral circuits comprise a row decoder circuit, a column decoder circuit,
timing and control blocks, read/write circuits, sense amplifiers, and on-chip voltage
generators. They play an important role in letting the memory arrays and the I/O
interface circuits communicate with each other. After receiving external control signals
and address signals, the row and column decoder circuits decode the address to
select the assigned memory cells. The read/write circuits and sense amplifiers are
controlled simultaneously by the timing and control blocks, and a row or column is
driven by a driver. As a result, external binary data are written into the memory cell
arrays, or the stored data are read out to the I/O interface circuits.
As shown in Fig. 6-1, the on-chip voltage generators provide the various supply
voltages needed for circuit operation, such as VPP, VDL, VBB, and VREF.
These supply voltages are generated internally from a single
external supply voltage (VDD), as shown in Fig. 6-3. The positively boosted voltage
(VPP) is used for word-line bootstrapping to eliminate the voltage drop caused by the
threshold voltage of the access transistor. The negative substrate bias voltage (VBB)
is applied to the substrate of the memory cells to stabilize memory cell
operation. The half-VDD (VDL) is used for half-VDD data-line pre-charge to provide the
reference voltage for sensing. The reference voltage (VREF) governs the
characteristics of the voltage down converters, which generate the above-mentioned
voltages such as VDL and VBB. The read/write circuits published in previous papers use
several sensing schemes, such as VDD/2 sensing
[54], 2/3 VDD sensing [55], VDD sensing [56], and GND sensing [57-63]. For
embedded-DRAM used in SoC systems, the VDD/2 sensing scheme is widely used in
products in general.
6.1.3 I/O Interface Circuits

The I/O interface circuits are composed of a data input buffer, a data output buffer,
and row/column address buffers. They convert external signals, such as addresses,
clock, control signals, and data inputs, into the corresponding internal signals that
activate the peripheral circuits. In addition, they output data read from the array as
the data output of the chip. Data input and output buffers, a write control buffer, and
control clock circuits are also typical components of the I/O interface circuits [6, 7].
[Figure 6-4 annotations: eight 1Mb cell arrays, WL drivers with local control,
read/write paths, and main control with voltage regulator.]
Fig. 6-4: Annotated macro layout of the 8M-bit embedded-DRAM in 65nm EDRAM
low leakage process.
6.2 Test Chip Design

[Figure 6-5 blocks: pattern generator, test chip (CQFP100) containing the 8Mb eDRAM macro
with TASFR scheme, supplies VDDIO, VDD and VDDQ, logic analyzer and oscilloscope;
CLOAD = CPAD + CPCB + CPROBE ≈ 60pF.]
Fig. 6-5: Block diagram of the test chip design for the 8M-bit eDRAM macro with
TASFR control scheme in 65nm EDRAM low leakage process.
The block diagram of the test chip for the 8M-bit eDRAM macro with our
TASFR control scheme is shown in Fig. 6-5. The power pads of the test chip macro
(VDD and VDDIO) are separated from those of the additional circuitry for power
measurement. The power of the additional circuitry and local output buffers is connected
to the power cell pins (VDDQ) in order to reduce the pin count. At the outputs of the
macro, we use a 60pF output capacitor to model the output loading, which includes the
parasitic loading of the I/O pads, the test board, and the tester probe. For this reason,
digital pads with large driving ability are chosen. The die size is 1.975 mm by
3.989 mm, and the design is area limited (limited by the large 8M-bit eDRAM macro) in
the QFP 100 package type used in our laboratory's cooperation with UMC. Fig. 6-6 shows
the layout of the test chip in the 65nm EDRAM low leakage process.
[Figure 6-6 annotations: TASFR scheme; URAM 8Mb macro.]
Fig. 6-6: Annotated layout shot of the test chip in 65nm EDRAM low leakage process.
Chapter 7
Measurement results and conclusions
To verify that the proposed TASFR control scheme reduces the additional AC
component of data retention power, an 8Mbit eDRAM with the proposed TASFR control
scheme was fabricated in a 65nm, 6-metal EDRAM low leakage process with a core
supply voltage of 1.2V and an IO supply voltage of 3.3V. The TASFR control scheme
reduces the AC component of data retention power by 95.92% at room
temperature. Finally, we summarize this thesis with conclusions and a discussion of
future work.
7.1 Measurement Results

[Figure 7-1 annotation: temperature ≈ 25°C, default period 533.4ns.]
Fig. 7-1: Conventional self-refresh period generated from the self-refresh oscillator.
Measurements of the 8Mbit eDRAM test chip confirm that it is functional at three
temperatures: 25°C, 55°C and 85°C. Fig. 7-1 shows the oscilloscope plot of the
conventional self-refresh period, which is 533.4ns at room temperature (25°C).
[Figure 7-2 annotations: TASFR control scheme; (a) temperature ≈ 85°C, period 500.0ns;
(b) period 2.2421 µs; (c) period 8.755 µs.]
Fig. 7-2: Oscilloscope plot showing adaptive self-refresh period at different
temperatures.
Fig. 7-2 shows oscilloscope plots of the adaptive refresh period computed by the
TASFR control scheme at three different temperatures. At high temperature
(85°C), the self-refresh period computed by the TASFR control scheme is
not changed, because the conventional self-refresh period must guarantee retention of
the stored data through more frequent self-refresh operations. As the ambient
temperature decreases to 55°C, the self-refresh period becomes 2.2421 µs, which is about
four times the conventional self-refresh period. At room temperature (25°C), the
self-refresh period is 8.755 µs, which is about sixteen times the conventional
self-refresh period.
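The extension factors quoted above can be checked against the measured periods. The
sketch below simply divides each measured period by the 533.4ns conventional period and
rounds the base-2 logarithm to the nearest integer; treating the nearest power of two
as the intended extension is an assumption made for illustration.

```python
import math

T_CONV_NS = 533.4   # conventional self-refresh period at about 25 C (ns)

# Measured adaptive periods from Fig. 7-2, keyed by temperature in C.
measured_ns = {85: 500.0, 55: 2242.1, 25: 8755.0}


def extension(period_ns, base_ns=T_CONV_NS):
    """Return (ratio to the conventional period, nearest power-of-two exponent)."""
    ratio = period_ns / base_ns
    return ratio, round(math.log2(ratio))


for temp_c, period_ns in measured_ns.items():
    ratio, exponent = extension(period_ns)
    print(f"{temp_c} C: ratio = {ratio:.2f}, about 2^{exponent}")
```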
Because the TASFR control scheme adapts the self-refresh period, the AC component
of data retention power at room temperature is effectively reduced. As shown in
Fig. 7-3, the 8M-bit eDRAM macro with the TASFR control scheme achieves a 95.92%
reduction of the AC component of data retention power at room temperature (25°C)
and a 78.77% reduction at the intermediate temperature (55°C). To guarantee
retention of the stored data, the self-refresh period is left unchanged at high
temperature (85°C and above).
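As a rough cross-check, suppose the AC component of data retention power is
inversely proportional to the self-refresh period. This is an idealization
introduced here for illustration, not a model taken from the thesis. Under that
assumption, the sketch below computes the reduction expected for the x4 and x16
extensions and prints it next to the measured values from Fig. 7-3. The function
name ideal_ac_reduction is hypothetical.

```python
def ideal_ac_reduction(multiplier):
    """Fractional reduction if AC refresh power scales as 1 / period (assumed)."""
    if multiplier < 1:
        raise ValueError("multiplier must be >= 1")
    return 1.0 - 1.0 / multiplier


# Period multiplier -> measured reduction in percent (values from the text).
measured_percent = {1: 0.0, 4: 78.77, 16: 95.92}

for multiplier, measured in measured_percent.items():
    ideal = 100.0 * ideal_ac_reduction(multiplier)
    print(f"x{multiplier}: ideal {ideal:.2f}%, measured {measured:.2f}%")
```

The measured reductions are of the same order as this simple estimate and
somewhat higher.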
Fig. 7-3: (a) AC component of data retention power with and without the TASFR
control scheme across various temperatures. (b) Reduction of the AC component of
data retention power with the TASFR control scheme across various temperatures.
7.2 Summary and Conclusions

With the development of modern SoC systems and three-dimensional ICs, on-chip
junction temperature keeps rising. In standby mode, however, the junction
temperature falls. For an embedded-DRAM system, the self-refresh period is
determined by the worst case of cell data retention. At high junction temperature,
the conventional self-refresh period guarantees retention of the stored data. At
low junction temperature, however, the conventional self-refresh period always
results in an additional AC component of data retention power. To solve this
problem, we propose a Temperature-Aware SelF-Refresh (TASFR) control scheme that
reduces the additional AC component of data retention power at lower temperature.
The proposed TASFR control scheme has three features: a replica cell structure,
discrete-time dynamic sampling, and an adaptive refresh period. Discrete-time
dynamic sampling of the replica cell array derives the on-chip cell information
without destroying the cell structure. This information is converted into digital
signals, which are then used to extend the conventional self-refresh period by a
power of two.
We implemented an 8Mbit eDRAM macro with the proposed TASFR control scheme
in 65nm EDRAM low leakage technology. Measurement results show a 95.92%
reduction of the AC component of data retention power at room temperature. Finally,
the die photo is shown in Fig. 7-4 and the main performance factors are summarized
in Table 7-1.
Fig. 7-4: Die photograph of the 8M-bit eDRAM macro with the proposed TASFR control
scheme in UMC 65nm EDRAM low leakage process. The top level of the test chip is
filled with a dummy metal layer, which obscures the interior features in the die
photograph.
Table 7-1: Performance summary of the proposed TASFR control scheme

Technology                          65nm, 6 metal layers
Capacity                            8Mb
Configuration                       512k × 16
Power supply                        VDD = 1.2V, VDDIO = 3.3V
Total refresh cycles                4096
Die size                            1.975 × 3.989 mm²
Data retention power reduction      95.92% @ 25°C
ratio with proposed TASFR           78.77% @ 55°C
control scheme                      0% @ 85°C
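Works: This work
  Affiliation: NTHU
  Features: Cell tracking + dynamic sampling
  Technology: 65nm
  Supply voltage (V): VDD = 1.2, VDDIO = 3.3
  Cell type: Trench cell
  Capacity: 8Mb
  Total refresh cycles: 4K
  Refresh IAC (low temp.): 95.92% @ 25°C
  Refresh IAC (middle temp.): 78.77% @ 55°C
  Refresh IAC (high temp.): 0% @ 85°C
  IAC reduction (low temp.): 95.92%

Works: [33] TVLSI'05
  Affiliation: NSYSU
  Features: Cell tracking with ISUB-VT mirroring
  Technology: 0.35µm
  Supply voltage (V): 3.3
  Cell type: N/A
  Capacity: 1Kb
  Total refresh cycles: 32
  Refresh IAC (low temp.): 23.78% @ 30°C
  Refresh IAC (middle temp.): 4.12% @ 50°C
  Refresh IAC (high temp.): 0% @ 60°C
  IAC reduction (low temp.): 23.78%

Works: [2] JSSC'03
  Affiliation: Samsung
  Features: BGR-based temp. sensor
  Technology: 0.15µm
  Supply voltage (V): 1.8
  Cell type: N/A
  Capacity: 128Mb
  Total refresh cycles: N/A
  Refresh IAC (low temp.): 65µA @ <45°C
  Refresh IAC (middle temp.): N/A
  Refresh IAC (high temp.): 100µA @ >45°C
  IAC reduction (low temp.): 35.0%

Works: [37] JSSC'03
  Affiliation: Infineon
  Features: Sensor with dual-slope ADC
  Technology: 0.16µm
  Supply voltage (V): N/A
  Cell type: Stack cell
  Capacity: 256Mb
  Total refresh cycles: 8K
  Refresh IAC (low temp.): <75% @ 20°C
  Refresh IAC (middle temp.): <75% @ 50°C
  Refresh IAC (high temp.): 75% @ 60°C
  IAC reduction (low temp.): 75.0%

Works: [64] JSSC'98
  Affiliation: Hitachi
  Features: Dual-period self-refresh scheme
  Technology: 0.3µm
  Supply voltage (V): 3.3
  Cell type: N/A
  Capacity: 64Mb
  Total refresh cycles: 4K
  Refresh IAC (low temp.): N/A
  Refresh IAC (middle temp.): N/A
  Refresh IAC (high temp.): 91µA @ 75°C
  IAC reduction (low temp.): 52.9%

Works: [35] SVLSIC'93
  Affiliation: Matsushita
  Features: Temp. detector by RDIFFUSION
  Technology: 0.6µm
  Supply voltage (V): 5.0
  Cell type: N/A
  Capacity: 16Mb
  Total refresh cycles: 4K
  Refresh IAC (low temp.): 33µA @ 25°C
  Refresh IAC (middle temp.): 60µA @ 50°C
  Refresh IAC (high temp.): 115µA @ 75°C
  IAC reduction (low temp.): 75.0%

Works: [32] JSSC'95
  Affiliation: Matsushita
  Features: Cell with direct SN tracking
  Technology: 0.5µm
  Supply voltage (V): 3.6
  Cell type: Stack cell
  Capacity: 16Mb
  Total refresh cycles: N/A
  Refresh IAC (low temp.): 0.7µA @ 25°C
  Refresh IAC (middle temp.): 4µA @ 55°C
  Refresh IAC (high temp.): 10µA @ 75°C
  IAC reduction (low temp.): 93.0%

Works: [31] JSSC'91
  Affiliation: Hitachi
  Features: Cell with direct SN tracking
  Technology: 0.8µm
  Supply voltage (V): 2.6
  Cell type: Stack cell
  Capacity: 4Mb
  Total refresh cycles: 512
  Refresh IAC (low temp.): 3µA @ 25°C
  Refresh IAC (middle temp.): N/A
  Refresh IAC (high temp.): 7µA @ 70°C
  IAC reduction (low temp.): 57.14%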
7.3 Future Works

Several directions remain for future work on the TASFR control scheme. First, an
offset cancellation technique can be used to reduce the offset voltage of the
comparators, so that dynamic replica cell tracking achieves a finer and more
accurate resolution.
Second, sparkle correction (bubble bit correction) should be applied to the digital
code comparison, so that the decision to increase the refresh period is more
accurate and closer to the original design intent.
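One widely used form of bubble correction is a three-input majority vote over
neighbouring bits. The Python sketch below illustrates the idea under the
assumption that the comparator outputs form a thermometer code (a run of ones
followed by a run of zeros). That output format, the function names, and the
padding choice are assumptions made for this illustration, not details of the
proposed design.

```python
def majority_filter(bits):
    """Remove isolated bubbles from a thermometer code (lowest bit first).

    Each output bit is the majority of the bit and its two neighbours.
    The code is padded with a 1 below the lowest bit and a 0 above the
    highest bit so that the end bits also have two neighbours.
    """
    padded = [1] + list(bits) + [0]
    return [
        1 if padded[i - 1] + padded[i] + padded[i + 1] >= 2 else 0
        for i in range(1, len(padded) - 1)
    ]


def thermometer_to_level(bits):
    """Count the ones of a bubble-free thermometer code."""
    return sum(bits)


raw = [1, 1, 0, 1, 0, 0, 0]      # a bubble: the third bit dropped to 0
clean = majority_filter(raw)     # contiguous run of ones restored
print(clean, thermometer_to_level(clean))
```

A majority vote only repairs isolated single-bit errors; wider bubbles need a
stronger correction method.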
Finally, a counter can be used to count the total number of tracking cycles, so
that the refresh period comes closer to the actual data retention time and further
reduces the AC component of data retention power.
References
[1] Y. Taito, et al., "A high density memory for SoC with a 143MHz SRAM
    interface using sense-synchronized-read/write," presented at the IEEE
    International Solid-State Circuits Conference Digest of Technical Papers, 2003.
[2] S. Jae-Yoon, et al., "A 1.8-V 128-Mb mobile DRAM with double boosting
    pump, hybrid current sense amplifier, and dual-referenced adjustment scheme
    for temperature sensor," IEEE Journal of Solid-State Circuits, vol. 38, pp.
    631-640, 2003.
[3] A. Valero, et al., "An hybrid eDRAM/SRAM macrocell to implement first-level
    data caches," presented at the 42nd Annual IEEE/ACM International
    Symposium on Microarchitecture, 2009.
[4] S. Tomishima, et al., "A 1.0-V 230-MHz Column Access Embedded DRAM for
    Portable MPEG Applications," IEEE Journal of Solid-State Circuits, vol. 36, pp.
    1728-1737, 2001.
[5] S. Tomishima, et al., "A 1.0 V 230 MHz column-access embedded DRAM
    macro for portable MPEG applications," in IEEE International Solid-State
    Circuits Conference, 2001, pp. 384-385, 469.
[6] K. Itoh, VLSI Memory Chip Design: Springer, 2001.
[7] K. Itoh, et al., Ultra-Low Voltage Nano-Scale Memories: Springer, 2007.
[8] K. Zhang, Embedded Memories for Nano-Scale VLSIs: Springer, 2009.
[9] A. Wang, et al., Subthreshold Design for Ultra Low-Power Systems:
    Springer-Verlag, 2007.
[10] B. L. Anderson and R. L. Anderson, Fundamentals of Semiconductor Devices:
    McGraw-Hill, 2005.
[11] H. Falk, "Prolog to: Leakage current mechanisms and leakage reduction
    techniques in deep-submicrometer CMOS circuits," Proceedings of the IEEE, vol.
    91, pp. 303-304, 2003.
[12] K. Roy, et al., "Leakage current mechanisms and leakage reduction techniques
    in deep-submicrometer CMOS circuits," Proceedings of the IEEE, vol. 91, pp.
    305-327, 2003.
[13] S. Mukhopadhyay, et al., "Modeling and analysis of loading effect on leakage
    of nanoscaled bulk-CMOS logic circuits," IEEE Transactions on
    Computer-Aided Design of Integrated Circuits and Systems, vol. 25, pp.
    1486-1495, 2006.
[14] H. Takato, "Embedded DRAM Technologies," in Proceeding of the 30th
    European Solid-State Device Research Conference, 2000, pp. 13-18.
[15] E. Gerritsen, et al., "Evolution of materials technology for stacked-capacitors in
    65 nm embedded-DRAM," Solid-State Electronics, vol. 14, pp. 1767-1775,
    2005.
[16] H. Ishiuchi, et al., "Embedded DRAM technologies," in International Electron
    Devices Meeting Technical Digest, 1997, pp. 33-36.
[17] H. Takato, "Embedded DRAM Technologies," presented at the Proceeding of
    the 30th European Solid-State Device Research Conference, 2000.
[18] K. Itoh, et al., "VLSI memory technology: Current status and future trends,"
    presented at the Proceedings of the 25th European Solid-State Circuits
    Conference, 1999.
[19] P. W. Diodato, "Embedded DRAM: more than just a memory," IEEE
    Communications Magazine, vol. 38, pp. 118-126, 2000.
[20] D. Somasekhar, et al., "2GHz 2Mb 2T Gain-Cell Memory Macro with 128GB/s
    Bandwidth in a 65nm Logic Process," presented at the IEEE International
    Solid-State Circuits Conference Digest of Technical Papers, 2008.
[21] D. Somasekhar, et al., "2 GHz 2 Mb 2T Gain Cell Memory Macro With 128
    GBytes/sec Bandwidth in a 65 nm Logic Process Technology," IEEE Journal of
    Solid-State Circuits, vol. 44, pp. 174-185, 2009.
[22] W. K. Luk and R. H. Dennard, "2T1D memory cell with voltage gain,"
    presented at the Symposium on VLSI Circuits Digest of Technical Papers,
    2004.
[23] C. Mu-Tien, et al., "A 65nm low power 2T1D embedded DRAM with leakage
    current reduction," presented at the IEEE International SOC Conference, 2007.
[24] W. K. Luk, et al., "A 3-Transistor DRAM Cell with Gated Diode for Enhanced
    Speed and Retention Time," presented at the Symposium on VLSI Circuits
    Digest of Technical Papers, 2006.
[25] L. Xiaoyao, et al., "Process Variation Tolerant 3T1D-Based Cache
    Architectures," presented at the IEEE/ACM International Symposium on
    Microarchitecture, 2007.
[26] L. Xiaoyao, et al., "Replacing 6T SRAMs with 3T1D DRAMs in the L1 Data
    Cache to Combat Process Variability," IEEE Micro, vol. 28, pp. 60-68, 2008.
[27] Y. Mori, et al., "New Method for Evaluating Electric Field at Junctions of
    DRAM Cell Transistors by Measuring Junction Leakage Current," IEEE
    Transactions on Electron Devices, vol. 56, pp. 252-259, 2009.
[28] H. Tanaka, et al., "A Precise On-Chip Voltage Generator for a Gigascale
    DRAM with a Negative Word-Line Scheme," IEEE Journal of Solid-State
    Circuits, vol. 34, pp. 1084-1090, 1999.
[29] Y. Tsukikawa, et al., "An efficient back-bias generator with hybrid pumping
    circuit for 1.5-V DRAMs," IEEE Journal of Solid-State Circuits, vol. 29, pp.
    534-538, 1994.
[30] M. Kyeong-Sik and C. Jin-Yong, "A fast pump-down VBB generator for
    sub-1.5-V DRAMs," IEEE Journal of Solid-State Circuits, vol. 36, pp.
    1154-1157, 2001.
[31] K. Sato, et al., "A 4-Mb pseudo SRAM operating at 2.6±1 V with 3µA data
    retention current," IEEE Journal of Solid-State Circuits, vol. 26, pp. 1556-1562,
    1991.
[32] H. Yamauchi, et al., "A circuit technology for a self-refresh 16 Mb DRAM with
    less than 0.5 µA/MB data-retention current," IEEE Journal of Solid-State
    Circuits, vol. 30, pp. 1174-1182, 1995.
[33] W. Chua-Chin, et al., "A Temperature-Insensitive Self-Recharging Circuitry
    Used in DRAMs," IEEE Transactions on VLSI Systems, vol. 13, pp. 405-408,
    2005.
[34] T. Tung-Han, et al., "Power-Saving Nano-scale DRAMs with an Adaptive
    Refreshing Clock Generator," presented at the IEEE International Symposium
    on Circuits and Systems, 2008.
[35] Y. Kagenishi, et al., "Low power self refresh mode DRAM with temperature
    detecting circuit," presented at the Symposium on VLSI Circuits Digest of
    Technical Papers, 1993.
[36] K. Sohn, et al., "An autonomous SRAM with on-chip sensors in an 80-nm
    double stacked cell technology," IEEE Journal of Solid-State Circuits, vol. 41,
    pp. 823-830, 2006.
[37] K. Jung Pill, et al., "A low-power 256-Mb SDRAM with an on-chip
    thermometer and biased reference line sensing scheme," IEEE Journal of
    Solid-State Circuits, vol. 38, pp. 329-337, 2003.
[38] A. Bakker and J. H. Huijsing, "Micropower CMOS temperature sensor with
    digital output," IEEE Journal of Solid-State Circuits, vol. 31, pp. 933-937,
    1996.
[39] M. A. P. Pertijs, et al., "A high-accuracy temperature sensor with second-order
    curvature correction and digital bus interface," presented at the IEEE
    International Symposium on Circuits and Systems, 2001.
[40] M. A. P. Pertijs, et al., "A CMOS smart temperature sensor with a 3σ
    inaccuracy of ±0.1°C from -55°C to 125°C," IEEE Journal of Solid-State
    Circuits, vol. 40, pp. 2805-2815, 2005.
[41] M. A. P. Pertijs, et al., "A CMOS smart temperature sensor with a 3σ
    inaccuracy of ±0.5°C from -50°C to 120°C," IEEE Journal of Solid-State
    Circuits, vol. 40, pp. 454-461, 2005.
[42] C. Qikai, et al., "A CMOS thermal sensor and its applications in temperature
    adaptive design," presented at the IEEE International Symposium on Quality
    Electronic Design, 2006.
[43] M. Hashimoto and R. Baumann, "Investigation of cell leakage and data
    retention in eDRAM," presented at the Proceedings of the 26th European
    Solid-State Circuits Conference, 2000.
[44] K. Joohee and M. C. Papaefthymiou, "Block-based multiperiod dynamic
    memory design for low data-retention power," IEEE Transactions on Very
    Large Scale Integration (VLSI) Systems, vol. 11, pp. 1006-1018, 2003.
[45] K. Tae-Hyoung, et al., "Utilizing Reverse Short-Channel Effect for Optimal
    Subthreshold Circuit Design," IEEE Transactions on Very Large Scale
    Integration (VLSI) Systems, vol. 15, pp. 821-829, 2007.
[46] B. Razavi, Design of Analog CMOS Integrated Circuits: McGraw-Hill Science,
    2001.
[47] K. Takeuchi, et al., "Understanding Random Threshold Voltage Fluctuation by
    Comparing Multiple Fabs and Technologies," presented at the IEEE
    International Electron Devices Meeting, 2007.
[48] B. Wicht, et al., "Yield and speed optimization of a latch-type voltage sense
    amplifier," IEEE Journal of Solid-State Circuits, vol. 39, pp. 1148-1158, 2004.
[49] A. Nikoozadeh and B. Murmann, "An Analysis of Latch Comparator Offset
    Due to Load Capacitor Mismatch," IEEE Transactions on Circuits and Systems
    II: Express Briefs, vol. 53, pp. 1398-1402, 2006.
[50] M. C. T. Chao, et al., "Fault models for embedded-DRAM macros," presented
    at the 46th ACM/IEEE Design Automation Conference, 2009.
[51] M. Aoki, et al., "A 60-ns 16-Mbit CMOS DRAM with a transposed data-line
    structure," IEEE Journal of Solid-State Circuits, vol. 23, pp. 1113-1119, 1988.
[52] C. M. Chang, et al., "Testing Methodology of Embedded DRAMs," presented
    at the IEEE International Test Conference, 2008.
[53] Y. Nakagome, "Voltage regulator design for low voltage DRAMs," presented at
    the Symposium on VLSI Circuits, Memory Design Short Course, 1998.
[54] N. C. C. Lu and H. H. Chao, "Half-VDD bit-line sensing scheme in CMOS
    DRAMs," IEEE Journal of Solid-State Circuits, vol. 19, pp. 451-454, 1984.
[55] S. H. Dhang, et al., "High-speed sensing scheme for CMOS DRAMs," IEEE
    Journal of Solid-State Circuits, vol. 23, pp. 34-40, 1988.
[56] S. Romanovsky, et al., "A 500MHz Random-Access Embedded 1Mb DRAM
    Macro in Bulk CMOS," presented at the IEEE International Solid-State Circuits
    Conference Digest of Technical Papers, 2008.
[57] J. Barth, et al., "A 300 MHz multi-banked eDRAM macro featuring GND sense,
    bit-line twisting and direct reference cell write," in IEEE International
    Solid-State Circuits Conference Digest of Technical Papers, 2002, pp. 156-157
    vol.1.
[58] H. Pilo, et al., "A 5.6 ns random cycle 144 Mb DRAM with 1.4 Gb/s/pin and
    DDR3-SRAM interface," presented at the IEEE International Solid-State
    Circuits Conference Digest of Technical Papers, 2003.
[59] H. Pilo, et al., "A 5.6-ns random cycle 144-Mb DRAM with 1.4 Gb/s/pin and
    DDR3-SRAM interface," IEEE Journal of Solid-State Circuits, vol. 38, pp.
    1974-1980, 2003.
[60] J. Barth, et al., "A 500MHz multi-banked compilable DRAM macro with direct
    write and programmable pipelining," presented at the IEEE International
    Solid-State Circuits Conference Digest of Technical Papers, 2004.
[61] J. E. Barth, Jr., et al., "A 500-MHz Multi-Banked Compilable DRAM Macro
    with Direct Write and Programmable Pipelining," IEEE Journal of Solid-State
    Circuits, vol. 40, pp. 213-222, 2005.
[62] J. Barth, et al., "A 500MHz Random Cycle 1.5ns-Latency, SOI Embedded
    DRAM Macro Featuring a 3T Micro Sense Amplifier," presented at the IEEE
    International Solid-State Circuits Conference Digest of Technical Papers, 2007.
[63] J. Barth, et al., "A 500 MHz Random Cycle, 1.5 ns Latency, SOI Embedded
    DRAM Macro Featuring a Three-Transistor Micro Sense Amplifier," IEEE
    Journal of Solid-State Circuits, vol. 43, pp. 86-95, 2008.
[64] Y. Idei, et al., "Dual-period self-refresh scheme for low-power DRAM's with
    on-chip PROM mode register," IEEE Journal of Solid-State Circuits, vol. 33,
    pp. 253-259, 1998.
