You are on page 1of 7

2017 IEEE 11th International Symposium on Embedded Multicore/Many-core Systems-on-Chip

Real-time UAV Attitude Heading Reference System


Using Extended Kalman Filter for Programmable SoC
Shunsuke Mie1, Yuichi Okuyama1, Yusuke Sato1, Ye Chan2, Nam Khanh Dang1 and Ben Abdellash Abderazek1
1
Graduate School of Computer Science and Engineering
The University of Aizu, Japan
2
Universities' Research Centre
University of Yangon, Myanmar

Abstract—We implemented an extended Kalman filter (EKF)


hardware for a real-time attitude heading reference system
(AHRS). Attitude estimation is an important process for small
unmanned aerial vehicle (UAV) control. Small UAVs mounts
smaller sensors due to the limitation of space and power. However,
effects of noise components in sensor output become large if the
size of sensors become smaller. We considered using EKF to
reduce effects of noises by sensor fusion. Moreover, real-time and
highly accurate attitude estimation is required for an autonomous
flight of UAVs. To reduce loads of processors in a real-time
calculation of EKF, we implemented a circuit of EKF by C-based
hardware design in Programmable SoC. We achieved five times
faster EKF processing on FPGA than the processing on ARM
Cortex-A9 @677Hz with single core processing.
Fig. 1. UAV System Overall
Keywords— Attitude Heading Reference System; Extended
Kalman Filter; FPGA; C-based hardware design
For real-time implementation, digital signal processors
I. INTRODUCTION (DSP) integrated CPU is used in several small UAVs systems
[2]. On the other hand, some papers tried to adapt Field
Autonomous flights of small unmanned aerial vehicles Programmable Gate Array (FPGA) to small UAV for this
(UAVs) are widely used for surveillance, observation, problem. Work in [1] uses FPGA for sound filtering and image
investigation in dangerous places and the place where humans processing. Also, work in [4], the FPGA has used image
cannot enter. Also, autonomous small UAVs are developed for processing for self-localization of the UAV. We developed
reconstruction support from disasters, home delivery services EKF circuit for FPGA since faster and efficient processing
and so on. Autonomous flight requires high accurate attitude was needed to implement real-time processing. EKF
estimation and self-localization. In this paper, we focused on algorithm contains matrix multiplications and matrix
the first one. Attitude and heading reference system (AHRS) inversion operation that have high spatial and temporal
computes the estimation of an attitude of a vehicle. AHRS read parallelism. In such a case, we can improve the performance
the observed values from sensors, then, estimate the attitude. of flight system using FPGAs.
The system must estimate the attitude in real-time to give a Some approaches of speeding up EKF with FPGA are
proper control to actuators or motors. proposed [5], [6]. Two kinds of system architecture are
In developments of AHRS, we face to less accuracy of considered in those articles. The first method implements an
estimations and the limitation of computing resources. The accelerator as an external co-processor on a memory bus. The
system that is shown in Fig. 1 obtains data from sensors and second one is designing an accelerator and embed it in a soft-
controls attitude of vehicles in real-time. However, small core processor to implement custom instructions for the
sensors produce inaccurate values affected by noises. processor. Work in [6] implements both methods and evaluate
Accuracy estimation reduces the noises from sensors. In for mobile robot’s position estimation with simultaneous
general, small UAV systems use low-pass or high-pass filter localization and mapping. As a result, the accelerator
for noise filtering [1]. Besides, some AHRSs use extended implemented as custom instructions obtains better
Kalman filter (EKF) to improve attitude estimation accuracy computation performance in comparison with the other
[2] [3]. AHRS can obtain better accuracy than low-pass and methods. This method got speeding up 4 times and 2 times
high-pass filter by EKF. However, the computational compared with software implementation and accelerator on
complexity of EKF prevents to implement real-time attitude the bus. In contrast, [5] is using the second method that
estimation in typically embedded microprocessor on small offloads a part of processing to FPGA as a co-processor. This
UAV system. implementation uses system-on-a-chip FPGA (SoC FPGA)
[7], [8] that is integrated hardened processor which usually use

978-1-5386-3441-7/17 $31.00 © 2017 IEEE 136


DOI 10.1109/MCSoC.2017.26
ARM processor and FPGA. The advantage of this chip is high instead of Euler angles to avoid gimbal lock. The system
clock frequencies of the included processor compared with the translates the quaternion to Euler angles in each cycle.
soft-core processors. Thus, we can obtain better performance
depending on the target system.
The two types of methods are proposed for implementing
of hardware EKF accelerators that contain hardware about all
EKF operations running on matrix operation unit. We can be
expected higher processing performance on the first method,
even though it will increase the development time, area and
reduce flexibility. As for the problem of development time and
flexibility, [6] proposed an implementation using C-to-
Hardware (C2H). This tool can create hardware directly from
ANSI C source code. As a result, this paper obtains 4 times
speeding up using 18% FPGA resources. These methods 㻌
which generate hardware from software program are called Fig. 2. Overview of proposed AHRS
high-level synthesis (HLS) and gathering attention recently.
Work in [4] proposed a pose estimation hardware III. EKF ALGORITHM IN PROPOSED AHRS
accelerator that is made by Vivado HLS, and obtain speeding In this section, we show you EKF algorithm that is used by
up 1.6 times. From this result, we think this method is also AHRS. At first, we describe a basic part of EKF algorithm,
effective to improve processing performance even in EKF, and then, we explain how to adapt the EKF to proposed AHRS.
thus, we use HLS for implementation AHRS circuit.
In this paper, we propose AHRS that has better attitude A. Extended Kalman Filter
estimate accuracy and faster computation speed. For We introduce EKF algorithm to estimate small UAV’s
improvement accuracy, we use EKF that is adapted for a non- attitude. EKF is a kind of Kalman filter adapted to non-linear
linear system from Kalman filter for AHRS. Also, we achieved systems. TABLE I. shows variables, those dimensions and
real time by implementing the EKF algorithm on FPGA. description used in EKF algorithm description. Variable k is
The remainder of this paper is organized as follows. Section the time of calculating. We assume that system inputs of  ∈
III describes extended Kalman filter and the method of  ,  ∈  ,  ∈  are the state, observation and control
adaption to the proposed AHRS. In this paper, the proposed vectors respectively. This algorithm has two steps that are
circuit is implemented by using C-based RTL design. In Prediction Step and Update Step. These steps process every
Section IV, we brief the design method and discuss time sampling time. In the Prediction Step,  and  are
implementation and optimization of the proposed circuit using predicted using the estimated value in previous time
C-base RTL design. Section V presents the evaluation of the represented in equations (1) and (2). The Update Step
circuit and discussion about the result. The last section gives calculates estimated the state vector with predicted value, as
the conclusion of this paper and introduces the future work. shown in equations (3) - (5).
II. PROPOSED SYSTEM ARCHITECTURE OVERVIEW Prediction Step:
We describe the proposed AHRS in this section. This
 = ( ,  ) +  
system inputs 9 values from sensors and outputs 3 values that
are Euler angles of the small UAV’s attitude. EKF is employed
in our system to mix the 9 values to eliminate the sensor noises.
This technique is known as sensor fusion.  = !
 "
 ! + # 
A. System Architecture
The proposed AHRS is shown in Fig. 2. The system has 9 Update Step:
input values from sensors as follows;
$ =  % (%"  % + &)  
  =  ,  ,   from gyroscope
 =  + $  − ℎ( )  
 = 
 ,
 ,
  from magnetometer

 =   ,  ,   from accelerometer  =  − $ % P*  



The attitude of small UAV is described by Euler angle (Pitch, The function  in equation (1) compute the predicted state
Roll, Yaw) that are generally calculated by only the output of of the system from the previous estimated state, and the
a magnetometer ( ). This system, however, also uses  and function ℎ in equation (4) transforms from estimated state to
to fuse sensors for noise elimination. Besides, the proposed measurement a state space. These non-linear functions are
system holds attitude by a quaternion = ( ,  ,  ,  ) linearized at the point of  . Therefore, Jacobian is calculated

137
from functions  and ℎ . Jacobian, H and F, the matrix is − − − 
represented in equations (6) and (7). 1   
O = Q−   R (ST − IUTV )WX
2 
-()    − 
! = / 
- 
0
O [ 
(* ) =  + Y 0 0 0Z 
|O|
-ℎ()  
% = 1 
-  3) Observation Model
0

Observation model is constructed from the quaternion in the


TABLE I. VALIABLES USED IN EKF ALGORITHM state vector and observed value from the magnetic sensor that
Symbol Size Description is represented by
 ,
 ,
 . Equation (11) is shown the
23  Predicted state vector observation equation. First step, the obtained value from the
23  Updated State vector magnetic sensor is rotated body to navigate frame by equation
43 × Predicted estimate covariance (14), and then heading is calculated in equations (12) and (13).
43 × Updated estimate covariance Finally, in equation (11), the predicted measurement value is
53  Control vector
calculated from the predicted state.
73 × Jacobian of state transition matrix
83  Process noise
93  Measurement noise 2(  −   ) 
×
⎡ ⎤
: Covariance matrix of  2(  +   )
ℎ() = ⎢  
 × Covariance matrix of ; 

<3 × Jacobian of state measurement matrix ⎢2(0.5 −  −  )⎥
"
>3  Measurement vector ⎣ &I ⋅a ⎦
?3 × Kalman gain
@ A →  State prediction function
C  →  Mesurement prediction function
 +


e
 
B. Attitude and Heading Reference System a=Q R
0
This section mentions EKF which is adapted to AHRS.
Some researchers used EKF for AHRS [2], [9] to improve the

accuracy. In this paper, we also adapted EKF for AHRS.


 
1) State Vector
f
 g = &I ⋅ f
 g
We used a quaternion for orientation representation. It can


be solved gimbal lock when we used Euler angle for the
representation. In addition, the quaternion can reduce the
computation complexity compared with Euler angle. This &I  
method is used for AHRS by some recent research [2], [9]. The  +  −  −  2(  −   ) 2(  +   )
symbol 0, 1, 2, 3 are using for quaternion representation = h 2(  +   ) 
−  +  −  2(  −   ) i
as the content of the state vector. Furthermore, the state vector 2(  −   ) 2(  +   )  −  −  + 
has ω bias of gyroscope too that is {I , I , I }. As a
result, the state vector is shown in equation (8).
4) Jacobian Matrix
 = J0 1 2 3 I I I K 
"
  There are two Jacobians in this system, these are H and F of
function ℎ and , respectively. Equation (15) represents how
2) Process Model to calculate in Jacobian, the value of O and
are depended on
A process model is calculated from the quaternion, the bias dimensions of x that is the state vector and function f. In this
of gyroscope in the state vector and measured value from case, the dimension of the state vector is 7 (= O). Parameter
gyroscope that is represented as LMN . The predicted
is determined by the dimensions of ℎ and . Therefore,

estimation is calculated by equations (9) and (10). These for % and ! are 6 and 7.
equations multiply differentiated quaternion, observed value
from gyroscope that is subtracted by bias in the state vector
and value dt which is sampling rate of sensors. After
calculation of n, the n is normalized. Finally, the normalized
value is added to the current state. 㻌

138
- - 
 checking. This step is repeated until the functions work
⎡ ⋯ ⎤ correctly.
-
- ⎢  - ⎥
j= = ⋮ ⋱ ⋮ ⎥ In the second step, the C-based program is translated to RTL
- ⎢- - ⎥ code using C synthesizer. We define interfaces and directives
⎢  ⋯ to optimize the defined C functions. The synthesizer reports
⎣ - - ⎦ the number of calculation cycle, the latency in the cycle, the
maximum clock frequency and the occupied FPGA resources.
Hence, developers can optimize the C code to back to the first
5) Noise components step if they need.
For the variance of noises, We set n = 0.000012 for In the final step, the generated RTL code is checked by
quaternion, nI = 0.00303 for biases to angular rate C/RTL co-simulation. This step contains following stages.
measurements, nT = 0.0254 for acceleration, n 
= 0.221
respectively. Thus, we get covariance matrices as follows. 1. Execute the test bench code in C to generate input and
&= diag(nT , nT , nT , n
 
, n 
, n )   output vectors for RTL design
2. Simulate the generated RTL using the input vector

# = diag(n , n , n , n , nI , nI , nI )    generated by stage 1.


3. Compare output signals with the result of 1.
IV. IMPLEMENTATION OF PROPOSED SYSTEM In these procedures, developers make a circuit by checking
We describe an implementation of proposed system. The the result of each steps and optimize. Moreover, this tool also
Board that is named Zynq [7] of designed system, C-based creates control unit and interface circuit automatically. The
FPGA design with Vivado HLS [10] and implementation of interface can select AXI Master, AXI Lite and AXI Slave.
AHRS circuit are explained in this section. B. Overview of AHRS circuit
We explain the details of the system implementation in this
A. C-based RTL Design Flow section. Our circuit is implemented by C-based FPGA design
We describe the flow of implementation using Vivado HLS that is described in the previous section. We implemented
that is provided by Xilinx Inc. Fig. 3 shows steps of hardware AHRS in C language. Fig. 3 shows a flowchart of AHRS. We
design with this tool. used variables in TABLE I. .

p
Fig. 3. Design Flow in Vivado HLS Fig. 4. Flowchart of AHRS

In the first step, we implement algorithms and test-bench. At the first, the state vector, the error covariance matrix, the
Functions are written in C-based languages that are C, C++ or noise measurement matrix and noise model covariance matrix
SystemC, and then, the implemented programs are tested with are initialized. The dataset for evaluation of this
test bench also written in C-based languages. The test bench implementation is embedded in the program. The main loop
code includes dummy inputs, calling functions and result contains time update and the measurement update processes.

139
In time update step, the system updates the estimated state /* matrix service functions */
vector x*A and covariance matrix P*A that represent the /* define D as matrix dimension */
/* Type data_t is 32bit floating point*/
probabilistic model of state description. In measurement
update step, we update state vector  and FRYDULDQFH matrix void matrix_mult(data_t a[D][D],
 with estimatedA A , and observation values obtained data_t b[D][D], data_t c[D][D]){
from sensors. The output is reused in the time update step. In for(int i = 0; i < D; i++)
this paper, we used values from embedded data in the program for(int j = 0; j < D; j++)
instead of sensor data to concentrate with the evaluation of this for(int k = 0; k < D; k++)
c[i][j] += a[i][k] * b[k][j];
circuit. }
In the implementation of a circuit for proposed AHRS, most
of resource and time-consuming parts are matrix void matrix_inverse(data_t a[D][D],
multiplications and inversion. TABLE II. shows the number data_t inv_a[D][D]){
of these operations. Matrix multiplication is executed 10 times, data_t buf;
int i, j, k;
and matrix inverse operation is executed once. These two have
/* initialization */
high computing complexity, however, also have high for(i = 0; i < D; i++)
parallelism, hence, we can improve performance using FPGA. for(j = 0; j < D; j++)
In addition, we describe implementation each critical modules inv_a[i][j] = (i == j) ? 1.0 : 0.0;
of this system with Vivado HLS in the next subsection. /* Gaussian elimination */
for(i = 0; i < D; i++) {
TABLE II. NUMBERS OF MATRIX OPERATIONS IN AHRS CIRUICT buf = a[i][i] != 0 ? 1 / a[i][i] : 0;
for(j = 0; j < D; j++) {
Type of operationss Number of execution a[i][j] *= buf;
Multiplication 10 inv_a[i][j] *= buf;
Inversion 1 }
Addiition 2 for(j = 0; j < D; j++) {
Subtraction 1 if (i != j) {
buf = a[j][i];
Transportation 2
for (k = 0; k < D; k++) {
C. Arithmetic Operations a[j][k] -= a[i][k] * buf;
inv_a[j][k] -= inv_a[i][k] * buf;
A part of arithmetic operation implementation is shown in }
Fig. 5. The function matrix_mult contains straightforward }
implementation for matrix-matrix multiplication using a }
simple triple loop. The function includes the pipeline directive }
}
for optimization at synthesizing. This increases the
performance about 18 times and increases resources about 2 Fig. 5. Synthesizable C-code for matrix multiplication and inversion
times in our case. The matrix_inverse function employs. The
Gaussian elimination algorithm for matrix inversion. We also V. EVALUATION AND DISCUSSION
use a pipelining directive in the most inner loop of the time-
In this section, we evaluate basic matrix operations and the
consuming loop. As a result, this optimization makes
whole AHRS system described in section IV. We used Vivado
performance increase about 2.9 times and increases resources
Design Suite HLx 2016.2 for synthesis tool and Vivado HLS
about 1 % in our case.
2016.2 for C-based implementation. These are provided by
Function % and ℎ described as equations (7) and (11) Xilinx Inc. Moreover, we implement our system on Zynq-
require addition and multiplication, hence, the synthesizer
7020 (XC7Z0201CLG484C).
adds DSPs automatically. However, in this instance, we cannot
1) Evaluation environment
improve performance effectively. For this problem, we adapt
allocation directive to restrain using DSPs. We used Zynq-7020 board that includes xc7z20 for
The data type (data_t) in this calculation is single precision performance evaluation of AHRS as a system. This chip has
consisting 32-bit floating point data format which is defined two areas which are Processing System (PS) and
IEEE-754 standard. Programmable Logic (PL). PS includes ARM Cortex-A9
processor which can configure clock frequency, peripheral
controller and memory interface, thus this system can work
independently. The other hand PL is FPGA and connected with
PS via some ports.
Our proposed circuit is contained in PL, and connected to
32-bit general purpose AXI master and slave port. Slave port
uses MMI/O register for control circuit from the processor in
PS. The AHRS circuit accesses data that is the state vector and
the noise covariance matrix via the AXI Master port.

140
TABLE V. RESOURCE UTILIZATION OF MODULED FUNCTIONS
Resource BRAM DSP FF LUT
Utilization
C(2) 0(0%) 7(3%) 1389(1%) 1807(3%)
<(2) 0(0%) 7(3%) 1062(~0%) 1607(3%)

C. Performance of proposed AHRS Circuit


TABLE VI. and TABLE VII. show the synthesis results of
AHRS circuit. The maximum frequency of the system is
106MHz and the maximum latency and interval are 5598, and
5697 cycles. According to the results, this system calculates one
sample in 425 μsec . Thus, the system can deal 2.3 KHz of
samples from sensors.

Fig. 6. Proposed Programmable SoC Architecture TABLE VI. CALCULATION CYCLES OF PROPOSED CIRCUIT
Minimum Maximum
B. Performance of arithmetic operations Latency (cycles) 5598 5696
TABLE III. shows the number of resources used for a matrix Interval (cycles) 5599 5697
multiplication unit and utilization of programmable logic part
of Zynq-7020. We set the number of dimensions of the matrix TABLE VII. RESOURCE UTILIZATION OF PROPOSED CIRCUIT
to 7 that correspond with the dimension of the state vector . Used resources BRAM DSP FF LUT
The clock frequency of this circuit is 106MHz. The maximum Total 45 48 13270 27088
clock cycle and the interval to compute the multiplication are Available 280 220 106400 53200
211 and 211 cycles, respectively. Utilization (%) 16 21 12 32

TABLE III. RESOURCE UTILIZAITON OF MATRIX MULTIPLICATION Since this circuit calculates the processes of time update and
BRAM DSP FF LUT measurement update autonomously, processors will be
Number of ussed 0 10 2746 2362 released from the work to maintain the dataflow of matrix that
resource general matrix accelerator implementation need.
Utilization (%) 0 4 2 4
D. Comparison with other configurations
TABLE IV. shows the synthesized results of the matrix For the evaluation of proposed AHRS system, we compare
inversion function. The maximum frequency of this module is the performance of three kinds of system configurations show
118MHz. The maximum clock cycle and the interval for the as follows.
computation of the matrix inversion are 1943 and 1493,  RISC-SYS1
respectively.
 RISC-SYS2
TABLE IV. RESOURCE UTILIZATION OF MATRIX INVERSION  Proposed System
BRAM DSP FF LUT We evaluate three types of AHRS systems. RISC-SYS1 and
Number of used 0 10 2766 3633 2 are only software implementation and running on RISC
resource
Utilization (%) 0 4 2 6
processor that is ARM Cortex-A9. The difference of these is a
configuration of the clock frequency. RISC-SYS1 is
We show the basic performance of matrix operations in this configured with 667MHz, the maximum frequency of target
board. On the one hand RISC-SYS2 is configurated with
section. In the system description and synthesis, each function
written in C are embedded to a higher module to reduce a 72MHz that estimate the result of a software system running
overhead of function call by the synthesizer. However, we on Cortex-M4 processor which is often used for small UAV
systems such as LibrePilot [11] and Cleanflight [12]. Finally,
understand the basic behavior the synthesized matrix
operations from C, and we can use the result to estimate the the block diagram of Proposed System is )LJ. The task is
system resource occupation. divided two, attitude estimation and conversion quaternion
between Euler angle. First task is computed in AHRS circuit,
TABLE V. shows the synthesized results of functions ℎ() and second one is calculated by ARM processor. The
%(). The maximum frequency of function ℎ() is 121 MHz configuration of the processor is 667MHz.
and function % is 111MHz. From this result, the FPGA
resource utilization is reduced by optimization that is In this evaluation, we assume that proposed system read
described in section III. input samples from the static memory array embedded in the
program. The dataset contains 9000 samples of 3 sensors data
that injected noise intentionally. We measured the calculation

141
time of for samples in dataset and divide the time by the REFERENCES
number of samples to estimate the time per one sample. [1] R. Konomura and K. Hori, “Phenox: Zynq 7000 based quadcopter robot,”
in 2014 International Conference on ReConFigurable Computing and
TABLE VIII. AHRS EXECUTION RESULTS FPGAs (ReConFig14), 2014, pp. 1–6.
[2] J. F. Guerrero-Castellanos, H. Madrigal-Sastre, S. Durand, N. Marchand,
Processing Maximum W. F. Guerrero-Sanchez, and B. B. Salmeron, “Design and
time (msec) sampling freq. (Hz) implementation of an Attitude and Heading Reference System (AHRS),”
(A) RISC--SYS1 0.345 2898 in 2011 8th International Conference on Electrical Engineering,
(B) Proposed System 0.062 16129 Computing Science and Automatic Control, 2011, pp. 1–5.
(C) RISC--SYS2 3.198 312 [3] J. L. Marins, Xiaoping Yun, E. R. Bachmann, R. B. McGhee, and M. J.
Zyda, “An extended Kalman filter for quaternion-based orientation
TABLE VIII. shows AHRS calculation time per one sample. estimation using MARG sensors,” in Proceedings 2001 IEEE/RSJ
Since the current sampling frequency of typical sensors are International Conference on Intelligent Robots and Systems. Expanding
the Societal Role of Robotics in the the Next Millennium (Cat.
100Hz when set to 667MHz to CPU frequency, real-time can No.01CH37180), vol. 4, pp. 2003–2011.
be achieved without using a special purpose circuit shown in [4] H. Chenini, D. Heller, C. Dezan, J. Diguet, and D. Campbell, “Embedded
(A) of TABLE IV. In a CPU setting to a 72MHz assuming an real-time localization of UAV based on an hybrid device,” in 2015 IEEE
actual small UAV system, we couldn’t obtain real-time International Conference on Acoustics, Speech and Signal Processing
performance in the system shown in (B). However, the system (ICASSP), 2015, pp. 1543–1547.
with a combined CPU and AHRS circuit could achieve real- [5] D. Pritsker, “Hybrid implementation of Extended Kalman Filter on an
time. In addition, the system (B) can handle 16 kHz of the FPGA,” in 2015 IEEE Radar Conference (RadarCon), 2015, pp. 0077–
0082.
sampling rate of sensors that is needed more precise
[6] V. Bonato, R. Peron, D. F. Wolf, J. A. M. de Holanda, E. Marques, and J.
navigation systems. M. P. Cardoso, “An FPGA Implementation for a Kalman Filter with
Application to Mobile Robotics,” in 2007 International Symposium on
VI. CONCLUSION Industrial Embedded Systems, 2007, pp. 148–155.
In this paper, re proposed an extended Kalman filter (EKF) [7] Xilinx, “Zynq-7000 All Programmable SoC Data Sheet: Overview,” 2017.
hardware for real-time attitude heading reference system [8] Intel FPGA, “Intel SoCs Overview,” 2016. [Online]. Available:
(AHRS). EKF reduces effects of noises by sensor fusion, and https://www.altera.com/products/soc/overview.html. [Accessed: 19-Jul-
2017].
real-time and highly accurate attitude estimation is required
[9] A. M. Sabatini, “Quaternion-Based Extended Kalman Filter for
for autonomous flight of UAVs. To reduce loads of processors Determining Orientation by Inertial and Magnetic Sensing,” IEEE Trans.
in real-time calculation of EKF, we implemented a circuit of Biomed. Eng., vol. 53, no. 7, pp. 1346–1356, Jul. 2006.
EKF by C-based hardware design that is called high-level [10] Xilinx, “Vivado High-Level Synthesis,” 2017. [Online]. Available:
synthesis (HLS) in Programmable SoC using Vivado HLS. In https://www.xilinx.com/products/design-tools/vivado/integration/esl-
EKF, most of resource and time-consuming parts are matrix design.html. [Accessed: 10-Jul-2017].
multiplication and inversion. We used straightforward [11] LibrePilot, “LibrePilot – Open – Collaborative – Free,” 2016. [Online].
implementations on these parts and added some optimizations Available: https://www.librepilot.org/site/index.html. [Accessed: 19-Jul-
2017].
for HLS. As a result, we achieved 5 times faster EKF
processing on FPGA than the processing on ARM Cortex-A9 [12] Cleanflight, “Cleanflight ·.” [Online]. Available: http://cleanflight.com/.
[Accessed: 19-Jul-2017].
running on 677Hz with single core processing. Our system can
handle 16KHz of sampling rate of high-speed sensors in spite
of 100Hz of the typical current system.

142

You might also like