Authorized licensed use limited to: SRM University. Downloaded on March 01,2010 at 04:50:21 EST from IEEE Xplore. Restrictions apply.
peak performance than its implementation in software running in state-of-the-art microprocessors. The results presented in [1] and [2] show that the architecture performs very well for large matrices, especially when the solver runs continuously. Because of the real time constraint and the nature of the equations involved, these scenarios are rarely applicable to real time simulation of power electronics circuits. This paper
PEs may run at different time steps depending on the natural frequency of the sub-circuit each one is modeling. Therefore, the system should provide a way of up sampling and down sampling the PE inputs and outputs to the same clock domain so that they can exchange information with each other. The scheduler synchronizes the function of all PEs and is responsible for coordinating the following tasks: simulation initialization; generation of the multiple time step reference clocks; synchronization of the simulation; and exchange of data between linear and nonlinear PEs.

Decoupling the electrical network into concurrent PEs reduces the overall complexity of the model, which is highly desirable. However, because the PEs are simulated independently of each other, they only exchange information at the end of a time step. It is well known that this strategy can introduce a delay of one or two timesteps in the feedback loop formed by the linear and non linear sub-circuits, affecting the accuracy of the simulation and in certain cases causing it to become unstable [16]. Also, the use of iterative methods to solve this problem is not usually possible because of the real time constraint.

4. Distributed Concurrent Modules

Concurrent periodic Extended Finite State Machines ("EFSMs") are used in distributed real time systems as a formal representation of concurrent processes [4]. We have modeled the PEs as a parallel composition of concurrent periodic EFSMs synchronized by a scheduler. Figure 3 shows their interfaces.

An EFSM is defined as a 6-tuple where the sixth element of the tuple is a counter holding the elapsed time since it has moved to a new state [4]. Because the devices modeled here do not use elapsed time, we removed it from the PE EFSM representation. In this work an EFSM is defined as a 5-tuple (S, I, V, δ, init) where: S is a finite set of states; I is a finite set of events (i.e., transitions); V is a finite set of variables that represent inputs, outputs and parameters; δ is a finite set of transition rules; and init represents the initial state and initial values of all variables. The transition rules are represented as a logical conjunction of integer linear inequalities on the variables. The EFSMs are made periodic by introducing a dummy transition as the last event of a path starting at the initial state so that it returns to the initial state [4]. In order to execute an event a, the value of the associated transition rule P must be true. Input events are represented as a?x and output events are represented as b!y{y := E(x)}, where x and y denote input and output variables, respectively. b!y{y := E(x)}[P] denotes that the output event b is executed at the end of the time step satisfying P and its output value y is E(x) [4]. This notation will be used in the following sections to describe PEs.

The EFSM representing a PE model is mapped to hardware such that the EFSM variables (input and output voltages and currents, some logical signals, etc.) are part of the data path implemented using MAC ("Multiply-And-Accumulate") units [1] [2] [11]. The remaining signals are part of the control path and are intended to send information to or receive information from the scheduler. The EFSM S and (I, δ) tuple elements are respectively the states and the branches of the PE state machine implementing the control path. All PEs follow the same standard interface specification, which is based on the following processing model: at every time step sample inputs, process information, and hold computed data at outputs. The PE and the scheduler interfaces use the following main signals:

CLK - High frequency clock used as the reference clock for the whole system;
TS1 to TSn - Multiple time step synchronizing signals;
STC_1 to STC_n - The PE inputs are sampled when the STC for the module is asserted;
REG_1 to REG_n - Asserted at the end of each timestep to store the newly computed result at the output registers of the PE;
EOC_1 to EOC_n - Tell the scheduler that all PEs completed their processing before the end of the timestep.

Figure 3. PE and Scheduler Standard Interface

Special care was taken in the development of the non linear models (described in section 5) as well as in the design of the simulator itself to reduce the occurrence of unrealistic oscillations caused by the decoupling strategy used. Also, notice that the simulation becomes more stable and more accurate as the time step decreases relative to the circuit natural frequencies. The scheduler can synchronize the PEs to let them run in series or in parallel using the same or different timesteps. Because non linear PEs can run at a timestep that is a fraction of the time required for the linear sub-circuit PEs, simulating them in series does not increase the total timestep much but reduces the decoupling delay to
one timestep instead of two, as is the case for parallel simulation. This improves the simulation stability. Also, multiple timestep simulation allows us to optimize the FPGA resource allocation. For example, if the linear sub-circuit must be simulated at a 10 µs timestep and a PWM converter requires less than 1 µs, we can tell GenVhdl to allocate less hardware for the state space solver in order to allow more hardware for the PWM converter. Figure 4 shows an example of a series and a parallel sequencing.
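The effect of the two sequencings on the decoupling delay can be illustrated with a toy loop of two registered PEs. The Python sketch below is only an illustration, not the scheduler described in this paper: the function name `run`, the "+1" behaviour of the linear PE, the pass-through behaviour of the non linear PE, and the step count are all assumptions chosen to make the loop delay visible. Each trip around the feedback loop increments the value once, so the final count shows how many timesteps one trip takes.

```python
def run(steps, mode):
    """Toy feedback loop: a linear PE (adds 1) and a non linear PE (pass-through).

    Both PEs hold their results in output registers, as in the
    sample / process / store model. Returns the linear PE output register.
    """
    lin_out = 0  # output register of the (hypothetical) linear PE
    nl_out = 0   # output register of the (hypothetical) non linear PE
    for _ in range(steps):
        if mode == "parallel":
            # Both PEs sample at the start of the timestep, so each one
            # only sees the value the other stored in the previous timestep.
            lin_in, nl_in = nl_out, lin_out
            lin_out, nl_out = lin_in + 1, nl_in  # stored at end of timestep
        elif mode == "series":
            # The linear PE runs first and stores its result ...
            lin_out = nl_out + 1
            # ... then the non linear PE samples it within the same timestep.
            nl_out = lin_out
        else:
            raise ValueError(f"unknown mode: {mode}")
    return lin_out


series_count = run(8, "series")      # one loop trip per timestep
parallel_count = run(8, "parallel")  # one loop trip per two timesteps
print(series_count, parallel_count)
```

In this toy model the series run completes twice as many loop trips as the parallel run over the same number of timesteps, which mirrors the one timestep versus two timestep delay discussed above.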
[Figure 4: series and parallel sequencing of the linear and non linear PEs. Recoverable labels: short timestep, long timestep, sample inputs, process data, store at outputs, ADC/DAC I/O cards.]
largely used consists in simply keeping one sample for every M samples at the input and discarding the remaining (M-1) ones [10]. This method is simple to implement but it has no physical relationship with how power electronics circuits work.

In general, power electronics circuits work based on averaging the current flowing through an inductor or the charge stored in a capacitor. Thus, an averaging based method is the natural choice because it resembles the normal functioning of the circuit. We have studied the moving average, fixed interval average and low pass filtering. The fixed interval average method was found to be the easiest to implement and the one that produced the best results. It consists in averaging the values simulated at each timestep of the fast clock during one timestep of the slow clock. For example, the average current between times Ta and Tb as a function of the fast clock timestep Δt is defined by (1):

\bar{i} = \frac{1}{T_b - T_a} \sum_{k=1}^{n} i\bigl(T_a + (k-1)\Delta t\bigr)\,\Delta t    (1)

\bar{i} = \frac{1}{n} \sum_{k=1}^{n} i\bigl(T_a + (k-1)\Delta t\bigr)    (2)

where n = (T_b - T_a)/\Delta t is the number of fast clock timesteps in one slow clock timestep.

5.2. Up sampling

The fast clock PE inputs are sampled at the beginning of each slow clock timestep. The simplest method to up sample them is to keep their value
constant during the whole slow clock timestep [10]. Because the maximum bandwidth of the input signals can be comparable to the frequency of the slow clock, this approach can incur non negligible simulation errors. A more accurate approach is to use a predictor to extrapolate the value at the next timestep based on the past history of the slow clock sampled data, and update it with the new sampled value at the beginning of the next slow clock timestep. We have considered only lower order predictors for two reasons: higher order methods have a smaller region of stability; lower order methods consume fewer FPGA resources. According to our simulation results, Adams-Bashforth-2 produced about half the error of Mid-Point or Euler. Besides, it consumes little hardware, so we chose to implement the Adams-Bashforth-2 extrapolation rule [13] [14]. We can rearrange its iterative equation to use only addition, subtraction and shift operations:

k_1 = f(x(n), y(n)) + 0.5\, f(x(n), y(n)),    (3)

k_2 = k_1 - 0.5\, f(x(n-1), y(n-1)),    (4)

y(n+1) = y(n) + k_2\, \Delta T,    (5)

As shown in figure 6, the past samples are used to calculate the extrapolated value y(n+1). Afterwards, we use a linear regression to interpolate the upsampled results between y(n) and y(n+1) at the fast clock timesteps. In hardware, the predictor is implemented by the circuit shown in figure 7.

[Figures 6 and 7: captions not recoverable. Surviving labels: y(t), ΔT, Δt, predicted y(n+1).]

6. Power Electronics Devices

Currently the VHDL system library includes five power electronic devices: diode, MOSFET, thyristor, power switch and power bridge. We present the models of the thyristor and the power bridge in the following sections. The diode, MOSFET and power switch are not presented here because their models can be extrapolated from the two models presented. We should notice that a similar approach can also be used to model other types of non linear devices.

6.1. Modeling of the thyristor PE

Figures 8 and 9 show the thyristor VHDL entity definition and its electrical and EFSM representation, respectively. We should notice the module is fully configurable so that it can be used in many different applications. The same is also true for the other modules of the VHDL library. The EFSM configuration variables are passed to the model as VHDL generic parameters while the I/O and control variables are passed as signals [12].

entity thyristor is
  generic (
    Vf: natural := 16; Ic: natural := 0;
    Ts1: natural := 1; Ts2: natural := 1;
    Ron: natural := 16; Lon: natural := 0;
    NBits: natural := 32; NBitsRadix: natural := 8);
  port (
    CLK: in std_logic;
    TS_sync: in std_logic;
    RST: in std_logic;
    EN: in std_logic;
    Reg_output: in std_logic;
    -- (listing truncated in the source)
Using the backward Euler discretization rule we can write (7), which is implemented in hardware using the same approach explained in section 5.

I_{ak}(n) = A_1 V_{ak}(n) + A_2 I_{ak}(n-1),    (7)

where A_1 = \frac{T_s}{L_{on} + R_{on} T_s} and A_2 = \frac{L_{on}}{L_{on} + R_{on} T_s}.

[Figure residue: thyristor and bridge schematics with their EFSM events. Surviving event labels: sample!{Vak=inp}[STC==1], turnon!{Iak=Icalc}[P1], sample!{VQ1=inp}[STC==1], flywheel!{bl=(la/2}[Dl=on& D2=on].]

Bl1on, the ON/OFF state of bridge, during the current and the last two timesteps. Notice the RC snubber is simulated as part of the linear sub-circuit, so the current flowing through it is not readily available to the UPB. However, to accurately model the switching, the UPB should take it into account. The UPB model includes a first order predictor to estimate this current. When the new bridge voltage VQ1 arrives one time step later, the new value is used to correct the estimated value. Figure 11 shows the UPB data path.
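Returning to the thyristor on-state model, the recursion in (7) can be checked numerically. The Python sketch below is a minimal illustration under stated assumptions: the thyristor stays in conduction with a constant Vak, the forward voltage Vf and all turn-on and turn-off logic are ignored, and floating point replaces the fixed-point hardware arithmetic. The function names and the values Ron = 0.1 Ω, Lon = 225 µH, Ts = 10 µs and Vak = 10 V are illustrative choices, not settings taken from the authors' library. With these assumptions the current should settle at Vak/Ron.

```python
def thyristor_coeffs(r_on, l_on, ts):
    """Backward Euler coefficients A1, A2 of Iak(n) = A1*Vak(n) + A2*Iak(n-1)."""
    den = l_on + r_on * ts
    return ts / den, l_on / den


def simulate_on_state(v_ak, r_on, l_on, ts, steps):
    """Iterate equation (7) with a constant Vak, starting from zero current."""
    a1, a2 = thyristor_coeffs(r_on, l_on, ts)
    i_ak = 0.0
    for _ in range(steps):
        i_ak = a1 * v_ak + a2 * i_ak
    return i_ak


# Illustrative (assumed) parameters; 5000 steps is about 22 time constants Lon/Ron.
R_ON, L_ON, TS, V_AK = 0.1, 225e-6, 1e-5, 10.0
i_final = simulate_on_state(V_AK, R_ON, L_ON, TS, steps=5000)
print(f"final current: {i_final:.6f} A, expected Vak/Ron = {V_AK / R_ON:.1f} A")
```

Because A1 and A2 are constants for a given Ts, each update costs two multiplications and one addition, which fits the MAC based data path described in section 4.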
where n is the timestep number and Ad, Bd, Cd and Dd are the discrete state space matrices. The high level architecture of the state space solver is shown in figure 12. It implements the linear sub-circuit PEs. The control signals MemAddr, SigAddr, doing_X_not_U, etc. are generated by its internal state machine.

Its basic module is the VVM ("Vector-To-Vector Multiplier"), which multiplies one column of a matrix (Ad, Bd, Cd or Dd) by a vector (X or U). It includes a MAC unit, a blockRAM memory bank configured as 512 entries x 32 bits, multiplexers and a state machine. The MAC units can run at a minimum clock frequency of 150 MHz with a latency of 6 clock cycles [1] [2].

We should notice the matrices can have very different dimensions. Also, the minimum dimension can vary from 1 up to hundreds of states. GenVhdl maps the ODE state space formulation into the FPGA using an a priori algorithm. GenVhdl takes into account parameters such as the number of VVMs to use for the simulation, the matrix dimensions, the MAC pipeline latency and the number of clocks per timestep to calculate the distribution of VVMs per matrix that minimizes the timestep. The current implementation can solve a state space with 10 states in less than 0.4 µs.

Figure 12. State Space Solver Macro Architecture

8. Others Modules

Besides the modules presented in the previous sections, the VHDL system library includes PEs to realize many other functions such as voltage and current sources, Delta-Sigma and PWM modulators, PI and PID controllers, digital filters, Clark/Park transforms, etc. The implementation of some of these modules and their experimental results are presented in [15] [16].

9. Experimental Results

In this section we present the simulation results using the FPGA simulator on the three-phase DC-AC converter shown in figure 13 [5]. The circuit converts energy from the DC voltage source into a 60 Hz sinusoidal current flowing through the three-phase RL load, according to the configuration parameters set in the PWM modulator. The gates of the MOSFET power switches are controlled by a sinusoidal PWM modulator with its carrier frequency set to 2 kHz and a modulation index of 0.85. The PWM was configured to generate a 60 Hz sinusoid at the output when the high frequency carrier is filtered out. The results are comparable to those obtained with SimPowerSystems from Mathworks, which is a commercial power system simulation tool largely used by electrical utilities and research centers [6]. The simulation used a fixed timestep Ts shown in the schematic diagrams.

[Figure 13: three-phase DC-AC converter schematic. Surviving labels: Lon=225uH, Ron=0.1, Discrete, Ts=1e-005.]

Figure 14. Simulation Results Using FPGASim
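The sinusoidal PWM setting described in this section (2 kHz carrier, modulation index 0.85, 60 Hz reference) can be sketched in Python as a comparison between a sine reference and a triangular carrier. This is a generic natural-sampling model given only as an illustration; it is not the PWM modulator PE of the VHDL library. The function names `triangle` and `spwm_gate`, the carrier shape and phase, and the 1 µs evaluation step are assumptions.

```python
import math


def triangle(t, f_carrier):
    """Triangular carrier in [-1, 1]; starts at +1 for t = 0 (assumed phase)."""
    phase = (t * f_carrier) % 1.0
    return 4.0 * abs(phase - 0.5) - 1.0


def spwm_gate(t, f_ref=60.0, f_carrier=2000.0, m=0.85):
    """Gate signal of one switch: 1 while the sine reference exceeds the carrier."""
    ref = m * math.sin(2.0 * math.pi * f_ref * t)
    return 1 if ref > triangle(t, f_carrier) else 0


# Average the gate signal over one 60 Hz period with an assumed 1 us step.
dt = 1e-6
n = round((1.0 / 60.0) / dt)
duty = sum(spwm_gate(k * dt) for k in range(n)) / n
print(f"mean gate duty over one 60 Hz cycle: {duty:.3f}")
```

Averaged over a full reference period the duty cycle stays close to one half, while the local duty cycle follows the sine reference; low pass filtering the gate signal therefore recovers the 60 Hz component, which is the behaviour the text relies on.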
10. Conclusion

This work presented the implementation of a DRTPSS which is fully realized in an FPGA. The results demonstrate that modern FPGAs are an effective platform for implementing simulation algorithms and can compete favorably with high-performance microprocessors and DSPs in applications where the algorithms can be parallelized. The test environment consists of an AMD XP2400+ microcomputer and a Digilent Inc. XUP Virtex II Pro Development FPGA Card with a 2VP30-7-FF896 Virtex II Pro FPGA. Currently, we can simulate small and medium size power electronics networks such as DC-AC converters, AC-AC cycloconverters, etc. with a timestep smaller than 0.4 µs. We should notice that the smallest timestep reported in the literature is around 2 µs. The non linear PEs can run at a time step of about 0.2 µs. Therefore, the state space solver is the system bottleneck. We recently found inefficiencies in the MAC design that should allow us to run it at 200 MHz. Also, we devised a better way of coordinating the VVMs to increase the number of operations per timestep. We estimate these changes combined should allow the linear PEs to achieve a timestep of 0.2 µs.

11. Acknowledgement

J. C. G. Pimentel thanks Xilinx Inc. for its support providing the FPGA development kit used for the simulation and experimental results, and Dr. Yosef Tirat-Gefen at Castel Research Inc., Dr. Guilherme DeSouza at Univ. of Missouri-Columbia and Dr. Antonio Mesquita at COPPE/UFRJ for valuable feedback during the preparation of this paper.

12. References

[1] G.R. Morris, V.K. Prasanna and R.D. Anderson, "A Hybrid Approach for Mapping Conjugate Gradient onto an FPGA-Augmented Reconfigurable Supercomputer," FCCM'06, 2006.

[2] X. Wang, M. Leeser, and H. Yu, "A Parameterized Floating-Point Library Applied to Multispectral Image Clustering," 7th MAPLD International Conference, 2004.

[3] J.C.G. Pimentel and H. Le-Huy, "Developing a New Architecture for Digital Real-Time Power System Simulators Based on Pentium II and FPGAs," in ICDS'97 Conference Proceedings, 1999.

[4] T. Kitani, Y. Takamoto, K. Yasumoto, A. Nakata and T. Higashino, "A Flexible and High-Reliable HW/SW Co-Design Method for Real-Time Embedded Systems," Proceedings of the 25th IEEE International RTSS, 2004.

[5] T.L. Skvarenina, "The Power Electronics Handbook," CRC Press, 2002.

[6] SimPowerSystems, Matlab Inc., 2005.

[7] ISE Development System, Xilinx Inc., 2005.

[8] M. Matar, M. Abdel-Rahman, A. Soliman, "FPGA-Based Real-Time Digital Simulation," International Conference on Power Systems Transients (IPST'2005), 2005.

[9] G.R. Morris and V.K. Prasanna, "Pipelined Datapath for an IEEE-754 64-Bit Floating-Point Jacobi Solver," 9th High Performance Embedded Computing Workshop, 2005.

[10] J. Franca, A. Petraglia and S. K. Mitra, "Multirate Analog-Digital Systems for Signal Processing and Conversion," Proceedings of the IEEE, Vol. 85, No. 2, February 1997.

[11] B. Parhami, "Computer Arithmetic: Algorithms and Hardware Designs," Oxford University Press, 2000.

[12] D.J. Smith, "HDL Chip Design: A Practical Guide for Designing, Synthesizing and Simulating ASICs and FPGAs Using VHDL or Verilog," Doone Publications, 8th Ed. 2000.

[13] A. Ralston and P. Rabinowitz, "A First Course in Numerical Analysis," 2nd Ed. Dover Publications Inc., 2001.

[14] L.O. Chua and P.Y. Lin, "Computer-Aided Analysis of Electronic Circuits: Algorithms and Computational Techniques," Prentice Hall, 1975.

[15] J.C.G. Pimentel, H. Le-Huy and G. Sybille, "A VHDL Library of IP Cores for Power Drive and Motion Control Applications," CCECE'2000, 2000.

[16] J.C.G. Pimentel, H. Le-Huy and G. Sybille, "An FPGA-Based Real Time Power System Simulator for Power Electronics," 7th MAPLD International Conference, 2004.

[17] T. Maguire and J. Giesbrecht, "Small Time-step (< 2 µSec) VSC Model for the Real Time Digital Simulator," International Conference on Power Systems Transients (IPST'2005), 2005.

[18] C. Dufour, J. Belanger, "A Real-Time Simulator for Doubly Fed Induction Generator based Wind Turbine Applications," Proceedings of IEEE 35th Power Electronics Specialists Conference (PESC 2004), June, 2004.

[19] J.C.G. Pimentel, "A High Performance Architecture for Real Time Simulators Based on FPGA Hardware Acceleration," Real Time Simulation Systems, December, 2006 (submitted for presentation).