Stephan Busse, Dr. Marc Hiller, Patrick Himmelmann, Dr. Klemens Kahlen
Siemens AG, Nuremberg, Germany
Abstract – As process interruptions can be very expensive, reliability and availability values of a variable frequency drive (VFD) are often two of the most important requirements when selecting a drive. Reliability figures are hard to compare because methods and values depend on suppliers using different assumptions and definitions of failures. The main medium voltage VSI (voltage source inverter) topologies (3-level, 5-level, multilevel) for oil & gas applications will be compared with respect to their main power components (semiconductors, capacitors, PCBs, etc.) considering component stress and field data experience. The total reliability of these drive systems (including transformers, recooling unit, control cabinet) will be calculated using the same methods and assumptions.

Index Terms – Reliability, MTBF, Medium-voltage drives, Topology, Availability.

I. INTRODUCTION

Interruptions of "critical-to-service" processes can be expensive if they happen unexpectedly. Because of this, reliability and availability are often two of the most important criteria for a customer to consider when specifying and selecting equipment. In general terms, the reliability states the probability that the drive will perform its intended function for a specified amount of time without failure. On the other hand, the availability is an indicator for productive uptime.

A review of several leading oil & gas companies’ specifications reveals required mean time between failures (MTBF) values of 4 to 6 years and target availability figures of more than 99%. Quite often there are also references to redundant (e.g. n+1) components as part of the quest for improved reliability. Specifications often require evidence to back up vendor quoted MTBF calculations or data based on field experience.

The numerical definitions of MTBF and availability imply a high degree of transparency and clarity, but the procedure on how these values are derived is not harmonized and therefore specific to the particular supplier. Even the definition of a failure can be interpreted in many ways, and therefore might differ. In addition, other assumptions such as operating hours per year, ambient conditions and maintenance intervals as well as the degree of detail (main or all components) might vary significantly.

To avoid some of the above mentioned problems, this paper compares the main medium voltage VSI topologies for high power compressor applications by using the same methods and assumptions for all topologies as far as possible.

As a result, the same assumptions for the converter control, the recooling unit and line-side transformers have been made because the available technologies for these system components are state-of-the-art and can be applied to all different topologies.

The main differences between the topologies are the type, arrangement and number of the main components in the power section (e.g. power semiconductors, power section-related PCBs, DC link capacitors, other passive elements).

The total silicon area has been calculated in order to allow an objective comparison of the quantity of power semiconductors used in each topology. This allows a comparison that is independent of the semiconductor packaging. An example illustrates this: two 1200A-IGBT modules containing 2x 12 silicon chips are equal to one 2400A-IGBT module with 24 silicon chips. Applying a semiconductor-technology-specific relative reliability (in FIT per cm² of silicon area) allows an objective comparison of the considered topologies which use very different semiconductor technologies.

The benefit of this approach is that a physical parameter gives a higher degree of transparency compared to a pure number count approach, which heavily depends on manufacturer-dependent arrangement of switches. The common basis of the failure rates of semiconductors and PCBs is the experience and field data obtained by several generations of products in recent decades using LV IGBTs, HV IGBTs and IGCTs.

In case of the DC link capacitors, the energy (in kJ) stored in the film caps has been calculated to evaluate the reliability of the total quantity of capacitors in each topology. Independent of the nominal voltage, the same FIT/kJ-value for all topologies has been assumed.

II. TYPES OF FAILURES

There are several root causes for failures, and it is important to distinguish between certain ones (see also IEC 60050-191). In this paper, the word “failure” is used to describe that a component (e.g. a semiconductor) doesn’t fulfill its original function. This basic failure will cause a failure of the whole device and subsequently the whole drive system if the basic component or the device is not implemented with redundancy.
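The area-based and energy-based weighting described in the introduction can be sketched in a few lines. This is only an illustration under stated assumptions: the chip area, the FIT per cm² and the FIT per kJ figures are hypothetical placeholders and are not values from this paper.

```python
# Hypothetical illustration: none of the numeric values below come from the paper.

def semiconductor_fit(num_chips, chip_area_cm2, fit_per_cm2):
    """FIT contribution of power semiconductors, weighted by total silicon area."""
    return num_chips * chip_area_cm2 * fit_per_cm2


def capacitor_fit(stored_energy_kj, fit_per_kj):
    """FIT contribution of DC link capacitors, weighted by stored energy."""
    return stored_energy_kj * fit_per_kj


CHIP_AREA_CM2 = 1.5  # assumed area of one silicon chip
FIT_PER_CM2 = 2.0    # assumed technology-specific relative failure rate

# Packaging independence: two 1200 A modules with 12 chips each ...
two_small = 2 * semiconductor_fit(12, CHIP_AREA_CM2, FIT_PER_CM2)
# ... carry the same silicon area as one 2400 A module with 24 chips.
one_large = semiconductor_fit(24, CHIP_AREA_CM2, FIT_PER_CM2)

print(two_small, one_large)
print(capacitor_fit(stored_energy_kj=40.0, fit_per_kj=5.0))
```

Because the result depends only on the total silicon area (or stored energy), the way a manufacturer groups chips into modules does not change the figure, which is the point of the comparison basis chosen here.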
When it comes to calculating system figures, it is important to distinguish between different failure causes.

- Statistical random failures
  During the regular lifetime of components, a small quantity will fail due to “non-reproducible” events. Here it is called “random” because the time when it occurs is not defined. Random failures are caused by accidental events such as particle radiation, voltage transients, and damage caused when carrying out service work, leading to momentary excessive stress. This type of failure is not related to the length of service or the age of the device [12].
- Systematic and operating errors
  Improper design in development and engineering phase, incorrect handling/manufacturing, transport, installation/commissioning, etc.
  Improper operation, use in an unapproved application, inadmissible ambient or environmental conditions, etc.
  End of lifetime of single elements or a whole device, etc.

The comparison of the different topologies does not include systematic failures, as they are considered to be avoidable and should be addressed and prevented by using good quality systems, correct installation procedures, inspections, tests, operating and maintenance strategies, etc. IEC 61508-4, which deals with functional safety, states the following in Chapter 3.6.5 (random failures): “A major distinguishing feature between random hardware failures and systematic failures (see 3.6.6), is that system failure rates (or other appropriate measures), arising from random hardware failures, can be predicted with reasonable accuracy, but systematic failures, by their very nature, cannot be accurately predicted. That is, system failure rates arising from random hardware failures can be quantified with reasonable accuracy, but those arising from systematic failures cannot be accurately quantified statistically because the events leading to them cannot easily be predicted.” [13]

[...] mathematical calculations of a system.

In addition to the above mentioned failures, there are also failures that are either tolerable or negligible and therefore can be excluded.

Tolerable failures do not result in a stop or shutdown but may result in (a partial) deviation of performance; e.g. if a UPS (uninterrupted power supply) fails, it can’t provide a backed up supply voltage, which is tolerable as long as the normal power supply is stable. If this is not tolerable, the system has to be designed in a different way.

Negligible failures of single elements will not cause a stop (failure) of the drive system. As an example: an operator panel is required for commissioning and adapting parameters. If this panel fails there is no need to stop the drive and it can still be accessed by other channels (e.g. PROFINET communication).

Using proven hardware, components and concepts as well as thorough testing combined with good quality systems and continuous field observations helps to avoid many failure types.

III. TERMINOLOGY

It is important to define the “key words” and abbreviations used in this paper, because mathematical and statistical calculations have to be interpreted and can only be compared if there is a common understanding of the key figures and data. IEC standard 60050-191 describes most of the terms (but sometimes there are special definitions, e.g. for the software industry).

A. Failure rate

A failure rate can be determined, based on statistics of failed components over a certain time interval and a certain quantity of elements. The “failures in time” (FIT) is a common unit for the failure rate, defined as the expected number of failures per 1 billion (10^9) hours.

$$\mathrm{FIT} = \frac{\text{Number of failures}}{10^{9}\ \mathrm{h}}$$

Failure rates are either provided by suppliers or manufacturers for each component or need to be obtained based on recorded failure data from the field. To know the failure rate over time is also important for predicting the life time of a device.
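As a small worked example of the FIT unit, the sketch below estimates a FIT value from field data and converts it to failures per hour. The field data figures and the 8760 operating hours per year are assumptions chosen for illustration only; they are not taken from this paper.

```python
HOURS_PER_YEAR = 8760  # assumption: continuous operation, 365 days per year


def fit_to_failure_rate(fit):
    """Convert FIT (failures per 1e9 device hours) to failures per hour."""
    return fit / 1e9


def observed_fit(num_failures, num_units, hours_per_unit):
    """Estimate FIT from field data: failures over accumulated device hours."""
    accumulated_hours = num_units * hours_per_unit
    return num_failures / accumulated_hours * 1e9


# Hypothetical field data: 3 failures among 1000 components, 5 years each.
fit = observed_fit(3, 1000, 5 * HOURS_PER_YEAR)
print(f"{fit:.1f} FIT = {fit_to_failure_rate(fit):.3e} failures per hour")
```

The estimate is only as good as the accumulated device hours behind it, which is why small installed bases lead to uncertain failure rates.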
[...] manufacturers) and the second method (based on system failure data from the field) to obtain a system failure rate. This can be the case, for example, if the values based on field data are still not available, or the amount of field data is too small. This can be the result of products often being used in many different applications, industries and countries leading to field data which cannot necessarily be transferred to other applications.

In real life, it is easier to determine statistical statements if units are produced in high quantities, as is the case for consumer electronics or automotive parts.

B. Mean time between failures (MTBF)

$$\mathrm{MTBF} = \frac{1}{\lambda}$$

C. Reliability

$$R(t) = e^{-\lambda t}$$

[...]

$$R_{system}(t) = \sum_{i=0}^{x} \binom{n+x}{i} \left(1 - R_C(t)\right)^{i} \, R_C(t)^{\,n+x-i}$$

This formula only applies if the failure of one component does not affect other components.

The following diagram describes a system with two components, where at least one component must be in operation (n=1 and x=1). Four states are possible for such a simple system. Three of them do not lead to system failure; however, one state, where both subcomponents have failed, causes the system to fail.
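The relations MTBF = 1/λ and R(t) = e^(−λt), together with the redundancy formula for n required plus x redundant identical components, can be checked numerically. The sketch below is a minimal illustration assuming a constant failure rate and independent component failures; the 2000 FIT value and the one-year interval are hypothetical.

```python
import math


def mtbf_hours(failure_rate_per_hour):
    """MTBF as the reciprocal of a constant failure rate."""
    return 1.0 / failure_rate_per_hour


def reliability(failure_rate_per_hour, t_hours):
    """Probability of surviving t_hours without failure: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate_per_hour * t_hours)


def system_reliability(r_c, n, x):
    """Reliability of n required + x redundant identical, independent components.

    The system survives as long as at most x components have failed.
    """
    total = n + x
    return sum(
        math.comb(total, i) * (1.0 - r_c) ** i * r_c ** (total - i)
        for i in range(x + 1)
    )


lam = 2000 / 1e9               # hypothetical component with 2000 FIT
r_one_year = reliability(lam, 8760)

print(mtbf_hours(lam) / 8760)                      # MTBF in years
print(system_reliability(r_one_year, n=1, x=0))    # single component
print(system_reliability(r_one_year, n=1, x=1))    # two components, one required
```

For n=1 and x=1 the sum reduces to 1 − (1 − R_C)², i.e. only the state in which both components have failed counts as a system failure, matching the four-state example in the text.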
Fig. 8: Modular multilevel converter (M2C) topology

B. Series connected (SC) 2L-H bridge converter

Just the same as the M2C, the SC-2L-H bridge converter (see Fig. 9) uses the same LV components for IGBTs and capacitors, which are manufactured in high volumes. Depending on the motor voltage and the number of 2-level cells connected in series per phase, drives with a 7-level up to 15-level output behavior can be realized.

Additionally, the line-side rectifier can use LV line diodes on the cell input side. By phase shifting the line transformer secondaries, e.g. 30-pulse or 36-pulse line-side performance is possible.

In the SC-2L-H bridge topology, redundancy can be implemented by adding 3 (or 6) additional cells and bypass switches to all the cells enabling an n+1 (or n+2) redundancy. Having this type of redundancy means that the drive voltage and current capability do not have to be reduced when a cell fails. This technology also allows the drive to remain operational with reduced power even if no redundant cells have been implemented.

MV IGBTs are available in either single-side cooled module packages or double-side cooled press pack packages. While IGBT modules are usually also used in other applications (e.g. traction drives), IGBT press pack devices are usually single-source devices and their usage is currently limited to fewer applications. The single-source situation might result in problems regarding spare parts availability over the long term. Due to the low production quantities, innovation cycles take longer and the amount of feedback from problems in the field is limited as there is a lower number of devices in operation.

Due to the centralized DC link capacitor, the amount of the total installed film capacitor energy of the 3L-NPP is lower than in the cell-based drives mentioned before. Even though the total capacitance is less, the value of the total capacitance, and therefore stored energy, in a single location, is many times greater than in topologies that are based on a distributed power architecture design.

On the line-side, the 3L-NPP can also be connected to any conventional 12-pulse to 36-pulse diode rectifier using press-pack diodes and RC snubbers.

The press-pack IGBTs and diodes in the 3L-NPP mean that devices can be simply connected in series without excessive stray inductances that would otherwise result in high switching losses. Furthermore, the conduct-on-fail capability of press pack devices is the deciding factor for this type of application.

By adding
- 12 (24) additional press pack IGBTs and
- 12 (24) additional press pack diodes
an (n+1) or (n+2) redundancy can be realized with no decrease in voltage and current in case of a failure.

Fig. 10: 3L-NPP (neutral point piloted) topology

Fig. 12: SC-3L-H bridge topology
SC-2L-H bridge | 60 1.7 kV IGBT modules | 15 PCBs with internal power supply | none | 21 (w/o red.), 25 (n+1, n+2) | 3 additional cells per drive; 1 bypass switch per cell | 30 pulse for 15 cells, 36 pulse for 18 cells
3L-NPP | 36 4.5 kV PP-IGBTs with internal chip-diodes | 36 IGBT gate units with separate central power supply | sine wave filter | 5 | 12 additional IGBTs/diodes per drive; 24 additional IGBTs/diodes per drive | 12-36 pulse
SC-3L-H bridge (5-level converter) | 24 4.5 kV PP-IGCTs and 48 4.5 kV PP-diodes, or 24 4.5 kV PP-IGBTs and 36 4.5 kV PP-diodes | 24 IGCT gate units with separate central power supply, or 24 IGBT gate units with separate central power supply | dv/dt filter | 9 | Not feasible and therefore not available on the market | 3x12 pulse (36 pulse total)
The reliability of the drives is calculated and compared at a drive system level. The drive system (Fig. 13) used for the comparison includes
- the line-side transformer
- the line-side rectifier and motor-side inverter
- the recooling unit
- the converter control
The motor is not considered.

This is possible because each topology requires a comparable number of components for the control unit, I/O units, current/voltage transformers, circuit breakers, sensors, etc.

The same is true for the recooling unit due to the fact that the drive efficiency – and therefore the size of the recooling unit – is comparable for every drive topology having the same power rating. Furthermore, an increase in motor power resulting in larger pumps and water/water heat exchangers does not significantly impact the failure rate.
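A system-level comparison of this kind can be sketched as follows. The sketch assumes a simple series model with constant failure rates, in which any subsystem failure stops the drive and no redundancy is present, so the subsystem failure rates add up. All FIT values and the two example drives are hypothetical and are not taken from this paper.

```python
def system_fit(subsystem_fits):
    """Series model: any subsystem failure stops the drive, so failure rates add."""
    return sum(subsystem_fits.values())


def relative_mtbf_percent(fit, reference_fit):
    """MTBF relative to a reference drive in percent.

    MTBF is the reciprocal of the failure rate, so a lower FIT gives a
    higher percentage.
    """
    return 100.0 * reference_fit / fit


# Hypothetical subsystem FIT values; common subsystems use identical assumptions.
common = {"line_side_transformer": 3000, "recooling_unit": 4000, "converter_control": 5000}
drive_a = dict(common, power_section=12000)  # reference drive
drive_b = dict(common, power_section=15000)

fit_a = system_fit(drive_a)
fit_b = system_fit(drive_b)

print(fit_a, fit_b)
print(round(relative_mtbf_percent(fit_b, fit_a), 1))
```

Because the common subsystems contribute the same failure rate to every drive, differences in the power section are diluted at system level, which is one reason why the system MTBF values of different topologies can end up close together.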
As a result, the cell-based drives using LV IGBTs have a lower relative Si FIT rate per MVA (see Total Si FIT rate per MVA w/o GU/CCB in TABLE II) compared to the drives using higher blocking press pack devices (4.5 kV) which show a higher FIT rate due to their more complex electrical and mechanical structure.

In addition to the power semiconductors, the corresponding PCB-based driver technology has a considerable impact on the reliability of the power section. While IGBT gate units (GU) are designed for a very low gate power to control the MOS gate of the IGBT, an IGCT gate unit has to provide a gate current in the range of the [...]

In all cell-based topologies, the DC link capacitor has to be designed according to the reactive power oscillating with twice the electrical motor frequency provided by the single-phase cells to the motor. Only the 3L drives (Fig. 10, Fig. 11) have a central DC link, where the total reactive power adds up to zero due to the symmetrical 3-phase structure. This results in a considerably lower installed capacitor energy per MVA for the 3L-NPP drive. The cell-based drives – including the SC-3L-H bridge concept – require a relatively high installed capacitor energy to compensate for the energy pulsation caused by the single-phase structure of the cells. Furthermore, the nominal DC link capacitor voltage of all topologies, except the M2C, has to cover line voltage tolerances (e.g. +/-10% at the PCC) under all load conditions. In the M2C topology, the DC link voltage can be controlled independently from the rectified line voltage leading to an optimized capacitor rating.

[...] reciprocal of the failure rate, a higher percentage represents a longer MTBF.
TABLE III
INSTALLED CAPACITOR ENERGY AND FIT RATE OF CAPACITORS OF THE 6.6 KV MV DRIVE TOPOLOGIES BEING CONSIDERED

 | Nom. motor voltage [V] | Nom. motor current [A] | Total avg. cap energy per MVA | FIT rate of total caps per MVA
M2C | 6600 | 1200 | 239% | 309%
SC-2L-H bridge | 6600 | 1250 | 394% | 489%
3L-NPP | 6600 | 1100 | 100% | 100%

[Figure: relative MTBF without redundancy for the topologies M2C, SC-2L-H bridge, 3L-NPP, SC-3L-H bridge (IGCT) and SC-3L-H bridge (IGBT); bar labels range from 85% to 102%]
[Figure: relative MTBF for the topologies M2C, SC-2L-H bridge, 3L-NPP, SC-3L-H bridge (IGCT) and SC-3L-H bridge (IGBT); bar labels range from 85% to 124%, two entries are marked "not applicable"]

Fig. 16: Relative MTBF for the different drive topologies (including cell/IGBT redundancy)

Beyond the associated costs, the step from (n+1) to (n+2) redundancy does not significantly increase the reliability of the cell-based concepts. In other words, the benefit of (n+2) redundancy does not significantly exceed the disadvantage due to the increased component count.

In the 3L-NPP, the effect of the reduced total reliability due to the increased component count already exceeds the theoretical advantage of the (n+2) vs. the (n+1) redundancy. This is why the MTBF for the (n+2) is lower than for (n+1) redundancy.

[Figure: relative MTBF for four configurations: no red. Ctrl. / no Cell/IGBT red.; with red. Ctrl. / no Cell/IGBT red.; no red. Ctrl. / (n+1) Cell/IGBT red.; with red. Ctrl. / (n+1) Cell/IGBT red.; bar labels range from 85% to 216%]

VI. CONCLUSION

[...] specific definitions and interpretations and obtain a clearer understanding of MV drive topologies with respect to their reliability and redundancy capability. A comparison based on just the parts count would lead to wrong conclusions. The approach described in this paper used as many common assumptions as possible, and only differed with respect to components that are specific to each topology.

As a result, it can be seen that nearly all drive topologies have a comparable system MTBF (without redundancy). Due to the existing uncertainty of the MTBF data (i.e. few devices are installed in the field compared to consumer goods) the small differences between the topologies are negligible.

Other benefits that specific topologies can offer (e.g. lower harmonics of multilevel drives, superior fault behavior of decentralized DC concepts), and which can also increase the availability (e.g. time to repair, serviceability), have not been considered but are further differentiating factors.

A second result is that topologies allowing an (n+1) redundancy to be implemented have a significantly higher reliability when compared to drives without redundancy.

For all topologies, except the SC-3L-H bridge, an (n+1) redundancy in the power section can be implemented. This leads to an MTBF that is significantly higher (factor 1.4 to 1.8) than without redundancy, but only for the cell-based topologies (M2C, SC-2L-H bridge). The benefit of an (n+1) redundancy in the 3L-NPP concept is limited.
[1] IEC 60050-191 International electrotechnical vocabulary; chapter 191: dependability and quality of service
[2] M. Hiller, S. Sommer, M. Beuermann; “Converter topologies and power semiconductors for industrial medium voltage converters”; Industry Applications Society Annual Meeting, 2008. IAS '08. IEEE 5-9 Oct. 2008 Page(s):1 - 8, Edmonton, Canada
[3] M. Hiller, D. Krug, R. Sommer, S. Rohner, “A New Highly Modular Medium Voltage Converter Topology for Industrial Drive Applications”, EPE2009, Barcelona, Spain
[4] M. Hiller, R. Sommer, M. Beuermann; “Medium-Voltage Drives - An overview of the common converter topologies and power semiconductor devices”; IEEE Industry Applications Magazine; Mar/Apr 2010
[5] ABB Switzerland Ltd. Semiconductors; Application Note 5SYA 2042-04 “Failure rates of HiPak modules due to cosmic rays”

Stephan Busse graduated from the University of Applied Science in Nuremberg (Germany) in 2004 with a degree in electrical engineering. He has been working as Medium Voltage Drive Product Manager for many years.

Marc Hiller graduated from the University of Federal Armed Forces Munich (Germany) with a PhD degree from the faculty of electrical engineering. He has been an R&D and project manager for industrial Medium Voltage Drives for 10+ years.

Patrick Himmelmann graduated from the Karlsruhe Institute of Technology (Germany) in 2011 with a degree in electrical engineering. He has been working as an R&D Engineer for Medium Voltage Drives for three years.