
Simulation Modelling Practice and Theory 80 (2018) 104–127


Fault-diagnosis for reciprocating compressors using big data and machine learning
Guanqiu Qi a,b, Zhiqin Zhu a,∗, Ke Erqinhu c, Yinong Chen b, Yi Chai d, Jian Sun e,d
a Collaborative Innovation Center for Industrial Internet of Things, College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
b School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ 85287, USA
c Research Center of China National Offshore Oil Corporation, Beijing 100010, China
d College of Automation, Chongqing University, Chongqing 400044, China
e College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China

Article history: Received 15 September 2017; Revised 16 October 2017; Accepted 30 October 2017

Keywords: Reciprocating compressor; Big data; Cloud computing; Deep learning; RPCA; SVM

Abstract: Reciprocating compressors are widely used in the petroleum industry. A small fault in a reciprocating compressor may cause serious issues in operation. Traditional regular maintenance and fault-diagnosis solutions cannot efficiently detect potential faults in reciprocating compressors. This paper proposes a fault-diagnosis system for reciprocating compressors that applies machine-learning techniques to data analysis and fault diagnosis. The raw data is denoised first. The denoised data is then sparse coded to train a dictionary. Based on the learned dictionary, potential faults are finally recognized and classified by a support vector machine (SVM). The system is evaluated using five years of operation data collected from an offshore oil corporation and processed in a cloud environment. The collected data is evenly divided into two halves: one half is used for training, and the other half is used for testing. The results demonstrate that the proposed system can efficiently diagnose potential faults in compressors with more than 80% accuracy, a better result than current practice.

© 2017 Elsevier B.V. All rights reserved.

1. Introduction

Reciprocating compressors are widely used in the petroleum industry, and it is important to keep them working properly. Reciprocating compressors used in offshore oil and gas production usually operate in a high-temperature, high-pressure, flammable, explosive, and corrosive working environment. Due to the harsh sea environment, it is difficult to perform maintenance, fault detection, and repair on reciprocating compressors. Compared with other types of equipment used on land, offshore reciprocating compressors have higher reliability requirements [1]. Historically, offshore oil-platform explosions have caused huge and irreparable losses, as with the British North Sea Piper Alpha platform [2] and the Deepwater Horizon platform [3]. It is therefore important to identify and repair any equipment defect in time; any small fault that is not repaired in time may finally result in a disaster.


∗ Corresponding author.
E-mail addresses: guanqiuq@asu.edu (G. Qi), zhuzq@cqupt.edu.cn (Z. Zhu), keerqh@cnooc.com.cn (K. Erqinhu), yinong@asu.edu (Y. Chen),
chaiyi@cqu.edu.cn (Y. Chai), cq_jsun@163.com (J. Sun).

https://doi.org/10.1016/j.simpat.2017.10.005
1569-190X/© 2017 Elsevier B.V. All rights reserved.

Regular maintenance and fault diagnosis after a system failure are the two traditional methods still used today. Unfortunately, they cannot efficiently detect faults in advance to avert a disaster. Periodic human inspection lacks timeliness in fault detection; it often focuses on the exterior and cannot easily detect interior faults. Thus, it is difficult for current methods to predict potential faults in advance, and fault diagnosis is usually a post-processing method.
The installation and working conditions of reciprocating compressors vary, and many parameters affect their working status. In most cases, one type of signal cannot point out a potential fault; in fact, it is the interaction of different factors and parameters that causes the faults of a reciprocating compressor. Therefore, it is always difficult to map the related parameters to the corresponding faults. With the development of online testing technology [4–6], real-time information would need to be processed by hundreds of engineers in offshore oil fields, and the analysis costs are expensive.
Fault diagnosis of reciprocating compressors often uses a real-time signal-processing model [7–9]. Reciprocating compressors on offshore oil platforms usually work in a salty and foggy environment, and this harsh environment makes traditional solutions inaccurate. In addition, the real-time information often contains different parameters, such as vibration, temperature, and displacement. The relationships among these parameters are complex, as they affect each other. The external environment, such as temperature and currents, also affects the working status of a reciprocating compressor. Thus, it is difficult to determine the root cause of a system failure, and it is important to identify the relevant information in a large amount of data from heterogeneous, multiple sources [10,11].
As the heart of a drilling platform, a reciprocating compressor cannot tolerate inefficient operation or unplanned shutdowns, which can seriously harm crude-oil extraction. Traditional fault-diagnosis methods focus on monitoring, and they can only detect faults after the compressor fails to work. We cannot rely on traditional measurement technology alone to guarantee the early detection of potential failures and prevent damage. This paper proposes a real-time fault-diagnosis system using a big-data and self-learning approach to minimize failures and their high repair costs. Based on a data-driven classification method, an automatic recognition model is integrated into the proposed fault-diagnosis system. The main contributions of this paper are as follows [12]:

1. A new fault-diagnosis model for reciprocating compressors is proposed, which integrates and customizes existing machine-learning techniques;
2. A large volume of data from reciprocating compressors is processed in a cloud environment to perform real-time data analysis;
3. A multiple-category SVM (support vector machine) is developed to recognize normal and faulty data and identify potential faults;
4. The proposed model is robust to changes in the external environment and can determine the normal working status of a reciprocating compressor to ensure high accuracy of fault identification; and
5. The effectiveness and efficiency of the proposed fault-diagnosis model are evaluated and verified using data collected from real reciprocating compressors.

The rest of the paper is structured as follows: Section 2 introduces the proposed framework; Section 3 presents the fault-diagnosis process; Section 4 discusses the robustness of fault diagnosis; Section 5 evaluates the proposed framework and analyzes the experimental results; and Section 6 concludes the paper.

2. Architecture

Oil exploration along a long coastline, with a large number of oil and gas reserves at the bottom of the sea, requires many drilling platforms, as the geographic conditions of the ocean vary greatly from surface waters to deep oceanic trenches. Thus, compressors in different ocean areas face different conditions.

2.1. Data analysis architecture

A real-time fault-diagnosis system is used to reduce maintenance costs and improve the working efficiency of equipment [12–15]. To date, many methods based on pressure, vibration, and acoustic emission (AE) signals have been used to diagnose faults in reciprocating compressors [1].
A large amount of data needs to be processed to monitor the status of compressors, as each compressor generates about 3 GB of data per hour. A drilling platform often has hundreds of compressors, but it is usually located far from land and does not have enough computing capacity to process such large-scale data. Thus, large volumes of data are sent from the platforms to a cloud environment for processing [16].
Two types of data can be obtained from compressors: structured and unstructured data. Structured data relates to the status of the compressors, such as temperature, speed, and acceleration. Unstructured data comes from video surveillance. This paper focuses on analyzing structured data. The proposed data analysis system has two parts:

1. Learning part: It analyzes data to develop a model for the prediction of future working status;
2. Analysis part: It uses the generated model to predict the status of compressors and identify potential faults.

Fig. 1. Big data analysis architecture in a cloud environment.

The proposed analysis framework is shown in Fig. 1. The data analysis process runs in a cloud environment. The proposed
framework has three parts:

• Data management: It saves all the data from compressors in repositories hosted on a cloud.
• Data analysis: It uses data mining and machine-learning methods to filter, classify, and analyze data. First, it filters out all noise in the original data. Then the filtered data is classified and analyzed. Finally, it formalizes a monitoring model based on the analyzed data; the formalized model is used to predict potential faults. As more data is collected, the model is continuously updated.
• Visualization: The analyzed data is presented as charts and tables to personnel for real-time support, machine manage-
ment, and decision making.

2.2. Two-level data analysis framework

Fig. 2 shows the concurrent design for the data analysis of reciprocating compressors. This is similar to the scalability architecture commonly used in SaaS [17,18]. Many machines work at the same time, and the corresponding data is continuously sent to a cloud for analysis. The data is classified and assigned to different clusters for analysis, and each cluster has multiple servers to handle different tasks in parallel.
The two-level architecture not only automatically balances the workloads across multiple clusters and servers, but also scales up and down with increasing and decreasing loads. The high-level load balancer differentiates and allocates data-analysis tasks to different clusters based on domain information. Compressors from the same ocean area have the same or similar conditions, so it is better to cluster the compressor data by ocean area. The specialized data is stored in the same or closely related databases. The clustered data is assigned to one or several closely related servers for processing, and the servers use the designated databases corresponding to the clustered data. The same type of data from one domain is assigned to one or several clusters that specialize in analyzing that type of data.
At the low level, each cluster has its own local load balancer that dispatches data-analysis tasks to different servers within the cluster. The analyzed data of each server is saved in a local database and shared with other servers in the same cluster. The data collector merges all analyzed data from each cluster, and the finalized results are sent to human users.

2.2.1. Data clustering


The raw data from a drilling platform is not organized. To increase the efficiency of data analysis, the data is assigned to different clusters based on domain information in the proposed two-level data analysis framework. The domain information is closely related to the raw data. Within one ocean area, the same type of data from different drilling platforms may differ significantly; thus, it is not accurate to cluster the raw data based on geographical position. The raw data should be clustered based on the data itself.

Fig. 2. Two-level data analysis framework.

Different clustering algorithms have been used in data analysis. The K-means algorithm requires a priori knowledge of the number of cluster centers, which is difficult to know before clustering. According to the statistics, the raw data from a drilling platform is a collection of Gaussian distributions. Thus, this paper uses a Gaussian-means algorithm to partition the raw data into different zones automatically and adaptively; data points with a certain probability can be chosen randomly.
The raw data set can be formalized as X = {x_n}_{n=1}^{N} ⊂ R^D. A Gaussian kernel density estimate with bandwidth σ is defined as

p(x) = (1/N) Σ_{n=1}^{N} K(‖(x − x_n)/σ‖²),  K(t) ∝ e^{−t/2},  (1)

Algorithm 1, an expectation-maximization (EM) algorithm, is used to calculate the best Gaussians. The EM algorithm is an iterative algorithm for solving maximum-likelihood problems, and it involves the following two steps.

• The expectation step (E-step) calculates the posterior probability; each data point is assigned probabilistically to each cluster.

Algorithm 1 Gaussian-means algorithm for clustering.

Input: raw data {x_n}_{n=1}^{N}
Output: clustered data {z_n}_{n=1}^{N}
1: for n ∈ {1, ..., N} do
2:   x ← x_n
3:   repeat
4:     ∀n′: p(n′|x) ← exp(−(1/2)‖(x − x_{n′})/σ‖²) / Σ_{n″=1}^{N} exp(−(1/2)‖(x − x_{n″})/σ‖²)
5:     x ← Σ_{n′=1}^{N} p(n′|x) x_{n′}
6:   until x's update < tol
7:   z_n ← x
8: end for

Fig. 3. Sequential adding method with load balancing.

• The maximization step (M-step) categorizes the observed data according to the probabilities and updates the hypothesis accordingly.

After clustering, the clustered data is sent to the load balancer. According to the domain information and the computation capacity of each cluster, the balancer assigns the clustered data to different clusters for processing.
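The E-step/M-step loop of Algorithm 1 can be sketched in a few lines. This is an illustrative NumPy implementation, not the authors' production code; `sigma`, `tol`, and the toy data are assumed values:

```python
import numpy as np

def gaussian_means(X, sigma=1.0, tol=1e-6, max_iter=100):
    """Shift each point to the Gaussian-kernel-weighted mean of the data
    (the E-step/M-step loop of Algorithm 1); points that converge to the
    same mode belong to the same cluster."""
    Z = np.empty_like(X)
    for n in range(len(X)):
        x = X[n].copy()
        for _ in range(max_iter):
            # E-step: posterior responsibility p(n'|x) of every data point
            w = np.exp(-0.5 * np.sum(((x - X) / sigma) ** 2, axis=1))
            w /= w.sum()
            # M-step: move x to the responsibility-weighted mean
            x_new = w @ X
            if np.linalg.norm(x_new - x) < tol:   # "x's update < tol"
                x = x_new
                break
            x = x_new
        Z[n] = x
    return Z

# two well-separated blobs collapse onto two distinct modes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
Z = gaussian_means(X, sigma=0.5)
```

Because the update moves each point toward a mode of the kernel density p(x), the number of clusters falls out of the data rather than being fixed in advance, which is the advantage over K-means noted above.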

2.2.2. Load balancing


Traditionally, the sequential adding method adds new workloads to the current cluster until the cluster is full; then the system adds a new cluster. This approach is easy to implement, but it does not address load balancing, as the new cluster may have a lighter workload than the existing clusters, at least at the beginning. Furthermore, the larger the clusters, the more imbalanced the new cluster is.
This paper proposes an N-cluster (N is a natural number, N ≥ 1) method for workloads from different domains, as shown in Algorithm 2. If the N old clusters are fully used, the system adds a new cluster. Then the system moves 1/(N+1) of the contents from

Algorithm 2 Sequential adding algorithm with load balancing.

Input: fully used N clusters, one new cluster
Output: all clusters containing workload contents
1: if N clusters are full then
2:   if N ≥ 1 then
3:     Move 1/(N+1) of the contents of each existing cluster to the newly added cluster
4:     Return N+1 clusters
5:   else
6:     Return Null
7:   end if
8: else
9:   Return Null
10: end if

each old cluster to the new cluster. The freed spaces of the N old clusters and the newly added cluster (1/(N+1) of each cluster) can be used for future workloads. Then, the new workloads are distributed alternately into all N+1 clusters. When all N+1 clusters are full, the system repeats the same process. The process of workload assignment is illustrated in Fig. 3. The strategy of Algorithm 2 is to maintain the high processing efficiency of each cluster. When an old cluster is fully used, new workloads are assigned to the newly added cluster. The old cluster continues processing its unfinished workloads until they are completely processed; unprocessed workloads are dynamically reassigned to the newly added cluster.
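The rebalancing step of Algorithm 2 can be sketched as follows, under the simplifying assumption that each cluster is modeled as a list of workload items; the names and sizes are illustrative:

```python
def add_cluster(clusters):
    """When all N clusters are full, add a new empty cluster and move
    1/(N+1) of each old cluster's contents into it (Algorithm 2)."""
    n = len(clusters)
    if n < 1:
        return None
    new_cluster = []
    for c in clusters:
        k = len(c) // (n + 1)          # 1/(N+1) of this cluster's items
        new_cluster.extend(c[len(c) - k:])
        del c[len(c) - k:]             # free the moved slots
    clusters.append(new_cluster)
    return clusters

# two full clusters of 12 workload items each -> three clusters of 8
clusters = [list(range(12)), list(range(12, 24))]
add_cluster(clusters)
```

After the move, every cluster holds the same N/(N+1) fraction of its capacity, so new workloads can be spread evenly across all N+1 clusters instead of piling onto the newest one.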

2.2.3. Priority scheduling


The efficiency of fault diagnosis is important for reciprocating compressors. The fault-diagnosis result must be returned promptly, without any delay, as even a small delay may cause an irreparable disaster. A reciprocating compressor is a complicated integration of different components, and the working status of each component directly affects the working status of the whole compressor. Only when all components are in good working status can the reciprocating compressor work properly. Therefore, a large amount of data from different components must be analyzed in real time to ensure that any fault can be detected and identified immediately.

Algorithm 3 Priority-based round-robin scheduling algorithm.

Input: S, P(s_i)
Output: s_i
1: while True do
2:   i = (i + 1) mod n;
3:   if i == 0 then
4:     cp = cp − gcd(S);
5:     if cp ≤ 0 then
6:       cp = max(S);
7:       if cp == 0 then
8:         return null;
9:       end if
10:    end if
11:  end if
12:  if P(s_i) ≥ cp then
13:    return s_i;
14:  end if
15: end while
In our two-level data analysis framework, each cluster may have many servers, and each server may have a different computation capacity. The load-balancing solution above distributes workloads to different clusters. To provide real-time fault-diagnosis results, the data analysis system also needs to maximize the usage of each server. New workloads are assigned to clusters continuously, and the fault-diagnosis system needs a set of analyzed results from different components. This paper uses a priority-based round-robin scheduling algorithm to make sure that the set of analyzed results can be returned in time. The priority-based round-robin scheduling algorithm shown in Algorithm 3 uses the following definitions:

• A server set S = {s_0, s_1, ..., s_{n−1}};
• P(s_i) indicates the priority of s_i;
• i indicates the server selected last time; i is initialized to −1;
• cp is the current priority in scheduling; cp is initialized to 0;
• max(S) is the maximum priority of all the servers in S;
• gcd(S) is the greatest common divisor of all server priorities in S.
Priority-based round-robin scheduling is better than traditional round-robin scheduling when the processing capacities of the servers differ. The priority of each server is based on its computing capacity: each server is assigned an integer weight indicating that capacity. Servers with higher priorities receive new time slots before those with lower priorities. Moreover, servers with higher priorities get more time slots than those with lower priorities, and servers with equal priorities get equal time slots. The priority-based round-robin scheduling algorithm thus ensures that a more powerful server has more time to process data analysis.
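The scheduling loop of Algorithm 3 can be written out directly. This is an illustrative Python version; the server weights `[4, 2, 2]` are assumed for the example, not taken from the paper:

```python
from functools import reduce
from math import gcd

def make_scheduler(priorities):
    """Priority-based round-robin (Algorithm 3): higher-priority servers
    are selected first and receive proportionally more time slots."""
    n = len(priorities)
    g = reduce(gcd, priorities)        # gcd(S)
    state = {"i": -1, "cp": 0}

    def next_server():
        while True:
            state["i"] = (state["i"] + 1) % n
            if state["i"] == 0:
                state["cp"] -= g
                if state["cp"] <= 0:
                    state["cp"] = max(priorities)
                    if state["cp"] == 0:
                        return None    # no server has any capacity
            if priorities[state["i"]] >= state["cp"]:
                return state["i"]

    return next_server

sched = make_scheduler([4, 2, 2])          # integer weights per server
picks = [sched() for _ in range(8)]        # -> [0, 0, 1, 2, 0, 0, 1, 2]
```

Over one full cycle, server 0 (weight 4) is picked twice as often as servers 1 and 2 (weight 2), which is exactly the proportional-slot behavior described above.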

2.2.4. Concurrent data collection


The concurrent data collection algorithm shown in Algorithm 4 is proposed to collect the analysis results. Based on domain information d_i, the clustered raw data from machine m_i is assigned to cluster c_i for processing. Finally, the results from each cluster are merged to generate the final results.

3. Fault diagnosis

Any fault may affect the normal operation of an industrial system. As shown in Fig. 4, both the external environment and the internal components affect the working status of a reciprocating compressor. Many critical faults, such as fire and flood, seriously jeopardize the safety of such industrial systems. Some critical faults can be detected in efficient ways; for instance, fire can be detected by a smoke detector. Compared with the impacts of the external environment, the impacts of internal components are more difficult to detect and identify. Although a variety of approaches have been applied to fault diagnosis, accidents still happen in industrial production. Due to the complexity of industrial systems, fault diagnosis is always challenging work, especially for reciprocating compressors in the petroleum industry, where any small fault may cause a serious disaster; numerous disasters have happened in the petroleum industry.

Algorithm 4 Concurrent data collection algorithm.

Input: clustered data from machine m_i, domain d_i, cluster c_i
Output: analyzed result r_i, intermediate analyzed result ir_i
1: c_i = ∪_{i=1}^{n} m_i
2: for each c_i do
3:   return ir_i
4: end for
5: d_i = ∪_{i=1}^{n} c_i
6: r_i = (n, d_i, ir_i)
7: for merge all d_i do
8:   return r_i
9: end for

Fig. 4. Relationship between the reciprocating compressor and the impacts of the external environment and internal components.

To avoid such disasters, different fault-analysis approaches are applied to reciprocating compressors. Fault analysis includes fault detection, fault identification, and fault diagnosis. Fault detection means finding the existence of a fault. Fault identification means recognizing the fault's identity, such as its type and location. Fault diagnosis means finding the cause of a fault. However, in reciprocating compressors, faults of the valves and piston rod accumulate gradually through abrasion, aging, and so on. Traditional methods, which focus on monitoring, have difficulty detecting such potential faults directly while compressors are in normal working status. Without appropriate monitoring, many problems are detected too late or not at all, for example:

• Loose or damaged mechanical connections;
• Liquid carry-over that damages valves, pistons, and piston rings;
• Clogged valves that cause increased cylinder pressure and poor compressor performance;
• Worn seals that cause gas leakage and loss of efficiency; and
• Damage due to piston-rod overloading or loss of rod-load reversal.

Fig. 5. Fault diagnosis of reciprocating compressor data.

A compressor failure can have the following serious consequences:

• Major damage and resulting downtime;
• Loss of production;
• Safety risks to workers;
• Inefficient operation;
• Unnecessary and unplanned maintenance work; and
• High costs for repairs, risk management, and environmental protection.

In most cases, compressor faults are not caused by sudden mutations, which means that potential faults are reflected in normal working data. To efficiently diagnose the faults of reciprocating compressors, a novel big-data-based framework is proposed to analyze the large volume of reciprocating-compressor data in real time. The proposed system monitors and analyzes five types of data:

• Vibration: frame vibration frequency, cylinder vibration frequency, bearing vibration frequency;
• Pressure: cylinder pressure;
• Temperature: environment temperature, suction valve temperature, discharge valve temperature, packing temperature, bearing temperature;
• Noise: noise amplitude;
• Position: rod position.

Any monitored value that deviates from the normal working range may cause chain reactions in the reciprocating compressor. If potential faults are recognized in time, the identified faults can be classified for repair to avoid a disaster. A minor change in a compressor is difficult to detect visually, but data analysis can detect potential faults in advance. For example, a leaking compressor valve can be detected by a decreasing pressure value: if the pressure drops below a certain threshold, a leakage has occurred.
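As a toy illustration of such a range check, the following sketch flags readings that leave their normal band; the ranges below are placeholders in the spirit of Table 1, not the paper's calibrated values:

```python
# normal working ranges (illustrative values, in the spirit of Table 1)
NORMAL_RANGES = {
    "cylinder_pressure": (300.0, 320.0),    # MPa
    "suction_valve_temp": (50.0, 60.0),     # deg C
}

def out_of_range(readings):
    """Return the names of monitored quantities outside their normal
    range, e.g. a pressure drop hinting at a valve leakage."""
    alerts = []
    for name, value in readings.items():
        lo, hi = NORMAL_RANGES[name]
        if not lo <= value <= hi:
            alerts.append(name)
    return alerts

# a pressure reading well below the normal band flags a possible leak
alerts = out_of_range({"cylinder_pressure": 255.0,
                       "suction_valve_temp": 55.0})
```

Simple thresholding of this kind only catches single-variable deviations; the machine-learning pipeline below is what captures interactions among several signals at once.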

3.1. Fault diagnosis

Fig. 5 shows the fault-diagnosis process. The operating data is acquired by the sensors of the reciprocating compressor; it describes the vibration statuses of the different components, temperature, and humidity. The source data of the reciprocating compressor D consists of data vectors d_1, d_2, ..., d_n, where d_i is the operating data at one moment. The source data is randomly divided into training data and testing data [19,20]; in this paper, half of the data is used for training and the rest for testing. Both the training and testing data are then denoised by robust principal component analysis (RPCA) [21–23], which isolates all spike noise in the data. Next, an online dictionary learning and sparse coding process is applied for data feature extraction. All source data is used in online dictionary learning [24–26]. The calculated sparse coefficients of the training data are used to train a multiple-category SVM classifier [27–30], and the trained SVM classifier is used to classify the testing data.
The fault-diagnosis process for reciprocating compressor data shown in Fig. 5 has three main steps.

• RPCA: an unsupervised dimensionality-reduction method, applied to the source data for spike-noise reduction.
• Dictionary learning: extracts the features of the denoised operating data.
• SVM classification: an efficient, high-accuracy vector classifier; the learned data is classified by a multiple-category SVM classifier.

All the data is first denoised by RPCA and then used for dictionary training. The trained dictionary is used for the sparse representation of all denoised data. Finally, the sparse coefficients obtained from the training data are used to train the SVM classifier, and the sparse coefficients obtained from the testing data are classified by the trained classifier. As each sparse coefficient corresponds to a source data vector, classifying the sparse coefficient classifies the data.

3.2. Robust principal component analysis

The source data obtained from the servers is large and contains a lot of noise. Directly performing fault classification on noisy source data may not produce accurate results, so a common strategy is to denoise the source data first. Principal component analysis (PCA) is a way to denoise data with Gaussian-distributed noise and is widely used in signal and image processing [31–33]. However, compressor signals usually come with spike noise that is difficult for PCA to remove. To solve the spike-noise issue, a novel RPCA method is proposed to process compressor signals.

3.2.1. Problem formulation of data denoising


In compressor data, the rank of the data matrix is low in most cases. The data from a compressor can therefore be decomposed into a low-rank matrix and a sparse spike-noise matrix, so the source data can be rewritten as follows:

D = L + S, (2)

where D is the source data of the compressor, L is the low-rank matrix of the signal, and S is a sparse noise matrix. To find the L and S that best describe the source data, a restriction function is proposed in Eq. (3):

min_{L,S} rank(L) + γ‖S‖₀, (3)

If one solves the problem in Eq. (3) for an appropriate γ, one may recover the pair (L, S) that generates the data D. However, Eq. (3) is non-convex: minimizing both rank(L) and ‖S‖₀ is an NP-hard problem. To solve this non-convex problem, Wright relaxes the ℓ0-norm to the ℓ1-norm and replaces the rank with the nuclear norm [34]:

min_{L,S} ‖L‖∗ + λ‖S‖₁, (4)

This relaxation can be motivated by observing that ‖L‖∗ + λ‖S‖₁ is the convex envelope of rank(L) + λ‖S‖₀. Thus, L₀ and S₀ can be calculated by (L₀, S₀) = arg min_{L,S} ‖L‖∗ + λ‖S‖₁ subject to D = L₀ + S₀, where L₀ is the low-rank matrix and S₀ is a sparse matrix.

3.2.2. Augmented Lagrange multiplier (ALM) for RPCA


To solve this problem, the ALM algorithm [35] is applied to RPCA. ALM defines:

X = (L, S)
f(X) = ‖L‖∗ + λ‖S‖₁
h(X) = D − L − S, (5)

Then the Lagrange function is shown in Eq. (6):

L(L, S, Y, μ) = ‖L‖∗ + λ‖S‖₁ + ⟨Y, D − L − S⟩ + (μ/2)‖D − L − S‖²_F, (6)

The optimization flow follows the general ALM method. The initialization Y = Y₀∗ is chosen to make the objective function value ⟨D, Y₀∗⟩ reasonably large.
According to the objective in Eq. (6), the objective function of S can be rewritten as:

f(S) = λ‖S‖₁ + ⟨Y, D − L − S⟩ + (μ/2)‖D − L − S‖²_F, (7)

f(S) = λ‖S‖₁ + ⟨Y, D − L − S⟩ + (μ/2)‖D − L − S‖²_F + (μ/2)‖μ⁻¹Y‖², (8)

f(S) = λ‖S‖₁ + (μ/2)(2⟨μ⁻¹Y, D − L − S⟩ + ‖D − L − S‖²_F + ‖μ⁻¹Y‖²), (9)

Algorithm 5 RPCA via ALM.

Input: source data D ∈ R^{m×n}
Output: (L∗_{k+1}, S∗_{k+1})
1: Y₀∗ = sgn(D)/J(sgn(D)); k = 0;
2: while not converged do
3:   // lines 4–9 solve (L∗_{k+1}, S∗_{k+1}) = arg min_{L,S} L(L, S, Y_k∗, μ_k)
4:   L⁰_{k+1} = L∗_k, S⁰_{k+1} = S∗_k, j = 0;
5:   while not converged do
6:     (U, Σ, V) = svd(D − S^j_{k+1} + μ_k⁻¹ Y_k∗);
7:     L^{j+1}_{k+1} = U S_{μ_k⁻¹}[Σ] V^T;
8:     S^{j+1}_{k+1} = S_{λμ_k⁻¹}[D − L^{j+1}_{k+1} + μ_k⁻¹ Y_k∗]; j = j + 1;
9:   end while
10:  Y∗_{k+1} = Y_k∗ + μ_k(D − L∗_{k+1} − S∗_{k+1}); μ_{k+1} = ρμ_k;
11:  k = k + 1;
12: end while

f(S) = (λ/μ)‖S‖₁ + (1/2)‖S − (D − L + μ⁻¹Y)‖²_F, (10)

The RPCA problem can then be solved by an iterative algorithm. In Algorithm 5, any accumulation point (L∗, S∗) of (L∗_k, S∗_k), k ∈ Z⁺, is an optimal solution to the RPCA problem, and the convergence rate is at least O(ρ⁻ᵏ) in the sense that

|‖L_k‖∗ + ξ‖S_k‖₁ − ‖L∗‖∗ − ξ‖S∗‖₁| = O(ρ^{−(k−1)}), (11)

When the low-rank matrix and the sparse matrix of the source data D are separated, the noise in the source data is eliminated.
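The two inner updates of Algorithm 5 — singular-value thresholding for L and entrywise soft-thresholding for S — can be sketched with NumPy. This is a simplified inexact-ALM variant for illustration; `lam`, `rho`, the initializations, and the synthetic data are assumed defaults, not the authors' settings:

```python
import numpy as np

def rpca(D, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Split D = L + S (low-rank L, sparse S) with an inexact
    augmented-Lagrange-multiplier loop: SVD shrinkage for L,
    soft-thresholding for S, dual update for Y."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    spec = np.linalg.norm(D, 2)
    Y = D / max(spec, np.abs(D).max() / lam)    # dual-variable scaling
    mu = 1.25 / spec
    S = np.zeros_like(D)
    for _ in range(max_iter):
        # L-update: singular-value thresholding of D - S + Y/mu
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # S-update: entrywise soft-thresholding of D - L + Y/mu
        T = D - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        R = D - L - S                            # residual
        Y = Y + mu * R
        mu *= rho
        if np.linalg.norm(R) / np.linalg.norm(D) < tol:
            break
    return L, S

# a rank-1 signal corrupted by a few large spikes is cleanly separated
rng = np.random.default_rng(1)
L0 = np.outer(rng.normal(size=50), rng.normal(size=40))
S0 = np.zeros((50, 40))
S0[rng.random((50, 40)) < 0.05] = 10.0
L, S = rpca(L0 + S0)
```

On such low-rank-plus-spikes data the spikes land in S while the smooth signal lands in L, which is exactly the separation the denoising step relies on.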

3.3. Sparse coding

Once the input source data is denoised, a sparse coding algorithm with online dictionary learning is used. It extracts the core information of the denoised data so that the features of the data are more separable. Each denoised source datum l_i is a vector that contains information about the reciprocating compressor. According to sparse representation theory [36], each vector l_i is decomposed into a dictionary part and a sparse vector:

l_i = Dic x_i, (12)

where Dic is a dictionary that contains the common information of all the vectors l_i in L, and x_i is a sparse feature vector of l_i. In this case, classifying the corresponding feature vectors helps to classify the input vectors l_i.
The problem can then be written as follows:

min_{Dic, x} Σ_{i=1}^{m} ‖l_i − Dic x_i‖²₂ + λ‖x_i‖₁, (13)

Optimizing Eq. (13) finds a value of x_i for each l_i.


The optimization algorithm contains two steps: the first is the sparse coding process, shown in Algorithm 6; the second is the dictionary learning process, shown in Algorithm 7.
When the dictionary is trained, an orthogonal matching pursuit (OMP) algorithm is used as the solver to obtain the sparse vectors. As shown in Fig. 6, all the data is used for dictionary learning; when the dictionary is trained, the testing and training data are represented using the trained dictionary.
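Assuming an off-the-shelf implementation is acceptable, the dictionary-learning and OMP steps can be sketched with scikit-learn, whose `MiniBatchDictionaryLearning` follows the same online method of Mairal et al. cited above; all sizes and parameters here are illustrative, and the data is a synthetic stand-in:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.linear_model import orthogonal_mp

# synthetic stand-in for the denoised compressor vectors l_i (one per row)
rng = np.random.default_rng(0)
L_data = rng.normal(size=(200, 16))

# online dictionary learning in the style of Algorithms 6-7
# (scikit-learn's mini-batch variant of Mairal et al.'s method)
dl = MiniBatchDictionaryLearning(n_components=32, alpha=1.0,
                                 batch_size=10, random_state=0)
dl.fit(L_data)
Dic = dl.components_                      # 32 atoms of dimension 16

# sparse codes x_i via orthogonal matching pursuit, as in Section 3.3
X = orthogonal_mp(Dic.T, L_data.T, n_nonzero_coefs=5).T
recon = X @ Dic                           # sparse reconstruction of l_i
```

Each row of `X` has at most five nonzero coefficients, and it is these sparse coefficient vectors, not the raw signals, that are fed to the SVM classifier in the next step.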

3.4. SVM classifier

An SVM classifier is used to classify the denoised data. The classifier uses a Gaussian radial basis function (RBF) kernel [37], denoted in Eq. (14):

k(v_i, v_j) = φ(v_i)^T φ(v_j) = exp(−γ‖v_i − v_j‖²), (14)

where γ is the RBF width parameter, v_i and v_j represent the ith and jth feature vectors of the training data, and φ is the infinite-dimensional feature mapping function of the RBF kernel.

Algorithm 6 Online dictionary learning.

Input: l ∈ R^m ∼ p(l), λ, initial dictionary Dic₀, number of iterations T
Output: a trained dictionary Dic_T
1: A₀ ← 0 ∈ R^{k×k}, B₀ ← 0 ∈ R^{m×k}
2: for t = 1 to T do
3:   Sparse coding: compute x_t using LARS:
     x_t = arg min_{x ∈ R^k} (1/2)‖l_t − Dic_{t−1} x‖²₂ + λ‖x‖₁
4:   A_t ← A_{t−1} + x_t x_t^T
5:   B_t ← B_{t−1} + l_t x_t^T
6:   Compute Dic_t using Algorithm 7, with Dic_{t−1} as a warm restart, so that
     Dic_t = arg min_D (1/t)((1/2)Tr(D^T D A_t) − Tr(D^T B_t))
7: end for

Algorithm 7 Dictionary update.

Input: Dic = [d₁, ..., d_k] ∈ R^{m×k}
B = [b₁, ..., b_k] ∈ R^{m×k}
A = [a₁, ..., a_k] ∈ R^{k×k}
Output: an updated Dic
1: repeat
2:   for j = 1 to k do
3:     Update the jth column to optimize:
4:     u_j ← (1/A[j, j])(b_j − Dic a_j) + d_j
       d_j ← (1/max(‖u_j‖₂, 1)) u_j
5:   end for
6: until convergence

In this work, a one-versus-one strategy is applied to train $l(l-1)/2$ non-linear SVMs, where $l$ is the number of fault categories. Given training vectors $v_i \in \mathbb{R}^n$, $i = 1, \ldots, s$, in two classes with class labels $y_i \in \{-1, 1\}$, the SVM solves the following constrained convex optimization problem:

$\min_{\psi, b, \xi} \frac{1}{2} \|\psi\|^2 + C \sum_{i=1}^{s} \xi_i$
s.t. $y_i (\psi^T \phi(v_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0$, (15)

where $C \sum_{i=1}^{s} \xi_i$ is the regularization term for non-linearly separable datasets and $\psi^T \phi(v) + b = 0$ defines the separating hyperplane. The optimal $\psi$ satisfies Eq. (16):

$\psi = \sum_{i=1}^{s} y_i \alpha_i \phi(v_i)$, (16)

where $\alpha_i$ is the Lagrange multiplier of the ith constraint. The decision function for the two classes is then expressed as follows:

$\mathrm{sign}(\psi^T \phi(v) + b) = \mathrm{sign}\left( \sum_{i=1}^{s} y_i \alpha_i k(v_i, v) + b \right)$. (17)

Using this one-versus-one SVM classifier, the denoised data can be classified into fault groups.
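The kernel of Eq. (14) and the one-versus-one pairing can be illustrated with a short numpy sketch. The pairwise SVM training itself (a quadratic program) is omitted and would normally be delegated to a library; the sample data, the $\gamma$ value, and the class names below are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def rbf_gram(V, gamma=0.5):
    """Gram matrix of Eq. (14): K[i, j] = exp(-gamma * ||v_i - v_j||^2)."""
    sq = np.sum(V ** 2, axis=1)
    d2 = np.clip(sq[:, None] + sq[None, :] - 2.0 * V @ V.T, 0.0, None)
    return np.exp(-gamma * d2)

fault_classes = ["suction reed broken", "suction leakage",
                 "discharge reed broken", "discharge leakage",
                 "piston-rods settling"]
pairs = list(combinations(fault_classes, 2))   # one binary SVM per class pair
print(len(pairs))                              # l(l-1)/2 = 10 for l = 5

V = np.random.default_rng(0).standard_normal((6, 7))   # 6 samples, 7 features
K = rbf_gram(V)
print(bool(np.allclose(K, K.T)), bool(np.allclose(np.diag(K), 1.0)))  # True True
```

At prediction time, each of the ten pairwise classifiers casts a vote and the class with the most votes is chosen, which is the standard one-versus-one decision rule.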

3.5. Real data analysis and diagnosis

Traditional methods of fault diagnosis, like human inspection, are difficult to detect the potential faults. Although the
potential faults do not always have obvious features, they are related to some types of monitored data from reciprocating
compressor. The potential faults are usually the results of interactions of different data. Only one type of data cannot locate

Fig. 6. Sparse representation of reciprocating compressor data.

Table 1
Normal and abnormal working range.

Status                      | Frame vib. freq. | Bearing vib. freq. | Suction valve temp. | Discharge valve temp. | Noise amp. | Cylinder pressure | Rod position
Normal status               | 10,000–12,000 Hz | 250–300 Hz         | 50–60 °C            | 90–100 °C             | 100–110 dB | 300–320 MPa       | 0–±3 cm
Suction valve reed broken   | 11,000–13,500 Hz | 250–350 Hz         | 55–70 °C            | 90–100 °C             | 115–145 dB | 300–325 MPa       | 0–±3 cm
Suction valve leakage       | 10,000–12,500 Hz | 230–320 Hz         | 35–45 °C            | 90–100 °C             | 105–120 dB | 240–270 MPa       | 0–±3 cm
Discharge valve reed broken | 11,500–14,500 Hz | 250–330 Hz         | 50–60 °C            | 105–125 °C            | 125–155 dB | 330–370 MPa       | 0–±3 cm
Discharge valve leakage     | 11,000–13,000 Hz | 265–320 Hz         | 50–60 °C            | 75–93 °C              | 130–155 dB | 240–280 MPa       | 0–±3 cm
Piston-rods settling        | 14,000–16,500 Hz | 350–420 Hz         | 55–66 °C            | 95–108 °C             | 140–190 dB | 320–355 MPa       | 2–±5.5 cm

the potential faults; machine learning techniques are therefore applied to the monitored data to explore hidden faults. For instance, when a potential fault exists, the working frequency and temperature may change.
Table 1 lists part of the monitored data to show the normal and abnormal working ranges based on statistics. The proposed system mainly analyzes five types of faults: suction/discharge valve reed broken, suction/discharge valve leakage, and piston-rods settling. The details of each fault type are given in Section 5.
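As an illustration only (the proposed system uses learned classifiers, not fixed thresholds), the normal-status row of Table 1 can be encoded as a simple range check. The reading below is a hypothetical measurement resembling the piston-rods settling row.

```python
# Normal-status ranges transcribed from Table 1 (Hz, Hz, °C, °C, dB, MPa, cm).
NORMAL = {
    "frame_vib_hz":     (10_000, 12_000),
    "bearing_vib_hz":   (250, 300),
    "suction_temp_c":   (50, 60),
    "discharge_temp_c": (90, 100),
    "noise_db":         (100, 110),
    "cyl_pressure_mpa": (300, 320),
    "rod_position_cm":  (-3, 3),
}

def out_of_range(reading):
    """Return the monitored parameters that leave the normal working range."""
    return [k for k, (lo, hi) in NORMAL.items()
            if not lo <= reading[k] <= hi]

reading = {"frame_vib_hz": 15_000, "bearing_vib_hz": 380,
           "suction_temp_c": 58, "discharge_temp_c": 99,
           "noise_db": 160, "cyl_pressure_mpa": 340,
           "rod_position_cm": 4.2}          # resembles piston-rods settling
print(out_of_range(reading))
# ['frame_vib_hz', 'bearing_vib_hz', 'noise_db', 'cyl_pressure_mpa', 'rod_position_cm']
```

Such threshold checks can flag obvious deviations, but the interacting-fault cases described above are exactly where they fail, motivating the learned approach of this paper.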

4. Robustness of fault diagnosis

Potential faults are identified relative to the normal working status of each component in the reciprocating compressor. The proposed solution uses three steps (denoising, dictionary learning, and classification) to identify potential faults. Sparse coding is applied to decouple the denoised data, and the obtained eigenvalues, which describe the working status accurately, are used for classification.
However, the working status of a reciprocating compressor is closely related to the external environment, such as temperature, hydrologic conditions, atmospheric pressure, wind speed, currents, and crustal movement. The data of reciprocating compressors differs between sea regions, and even the data of the same compressor differs between seasons. This means that even when a reciprocating compressor is in normal working status, its raw data still fluctuates.
The fault-diagnosis system must ensure that the normal working status and the faulty status of a reciprocating compressor can be reliably differentiated: when a compressor works properly, its normal working status must not be identified as faulty, and vice versa. To maintain high fault-identification accuracy, these external interferences must be treated as factors in the dictionary learning process, and the robustness of fault diagnosis must be considered so that the system correctly identifies the normal working status of a reciprocating compressor in different time periods.
The training data often includes noise. If the noise is not handled properly, it can mislead the training process and its results. The proposed model adaptively learns the corresponding dictionary from returned compressor data as the environment changes; a different model is learned in each environment to achieve accurate fault diagnosis.
Least squares support vector machines (LS-SVM) scale to large problems through adaptive learning. According to the dynamic features and self-features of the data, the least squares method minimizes the sum of the squared errors over every single equation and reduces the effect of disturbances.
Compared with the standard SVM classifier, LS-SVM uses target values instead of threshold values in the constraints, simplifies the problem via equality constraints and least squares, and converges to minimum mean squared error solutions. The LS-SVM solution involves solving a set of linear equations, which is easier to implement than the SVM's quadratic programming problem with linear inequality constraints.
LS-SVM maps training samples to a kernel space, where a hyperplane is fitted to the points. The training results of LS-SVM are not sparse: unlike SVM, LS-SVM incorporates all training vectors in the result. However, sparseness can be imposed using pruning techniques. Weighted LS-SVM, a sparse modification, obtains a more robust estimate by reducing the effect of non-Gaussian noise (e.g., outliers). Starting from an unweighted standard solution, it computes one or more weighted LS-SVM solutions in a bottom-up manner. The influence of each data point on the estimated fit is adjusted to an appropriate level by a weight that depends on the point's distance from the fitted hyperplane: the farther away the point, the less weight it obtains. According to these weights, the hyperplane is fitted to the bulk of the data with the least squares approach, minimizing the effect of outliers.
Given a training set of n samples $\{x_i, y_i\}_{i=1}^{n}$ with input data $x_i \in \mathbb{R}^d$ and output values $y_i \in \mathbb{R}$, the objective function of LS-SVM is expressed as follows:

$\min_{w,b,e} F(w, b, e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{n} v_i e_i^2$, (18)

subject to

$y_i - [w^T \varphi(x_i) + b] = e_i, \quad i = 1, 2, \ldots, n$, (19)

where $\gamma$ is a regularization parameter that controls the tradeoff between training-error minimization and smoothness of the estimated function, $w$ is the normal to the hyperplane, $v_i$ is the weighting factor, $e_i$ is the error of the ith sample, $\varphi(x)$ is a nonlinear function that maps $x$ to a high-dimensional feature space, and $b$ is a bias term.
In terms of the Lagrangian function, the objective function of LS-SVM can be transformed into the following system of linear equations:

$\begin{bmatrix} 0 & \mathbf{1}^T \\ \mathbf{1} & K + \frac{1}{\gamma} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}$, (20)

where $y = (y_1, \ldots, y_n)^T$, $\mathbf{1} = (1, \ldots, 1)^T$, and $K$ is the kernel matrix with $K_{ij} = \varphi^T(x_i)\varphi(x_j) = K(x_i, x_j)$. In this paper, the radial basis function (RBF) kernel is used:

$k(x, x_c) = \exp\left( -\frac{\|x - x_c\|^2}{2\sigma^2} \right)$. (21)

The error variables $\hat{e}_i = \hat{\alpha}_i / \gamma$ from the unweighted LS-SVM (the case $v_i = 1, \forall i$) are weighted by factors $v_i$ according to Eq. (22), where $\hat{s} = 1.483\,\mathrm{MAD}(\hat{e}_i)$ is a robust estimate of the standard deviation based on the median absolute deviation (MAD), $c_1$ and $c_2$ are set to 2.5 and 3 respectively, and $\varphi : \mathbb{R}^d \to \mathbb{R}^{n_h}$ is the feature map to the high-dimensional feature space of the standard SVM.

$v_i = \begin{cases} 1, & |\hat{e}_i/\hat{s}| \le c_1; \\ (c_2 - |\hat{e}_i/\hat{s}|)/(c_2 - c_1), & c_1 < |\hat{e}_i/\hat{s}| \le c_2; \\ 10^{-8}, & \text{otherwise}. \end{cases}$ (22)
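The weighted procedure can be sketched in numpy: solve the unweighted system of Eq. (20), convert the residuals $e_i = \alpha_i/\gamma$ into weights via Eq. (22), and re-solve with the weighted diagonal. The 1-D toy data, the injected outlier, and the values of $\gamma$ and $\sigma$ are illustrative assumptions, not from the paper.

```python
import numpy as np

def lssvm_solve(K, y, gamma, v=None):
    """Solve the linear system of Eq. (20); v are the weights of Eq. (22)
    (all ones recovers the unweighted LS-SVM)."""
    n = len(y)
    v = np.ones(n) if v is None else v
    top = np.concatenate(([0.0], np.ones(n)))
    body = np.hstack((np.ones((n, 1)), K + np.diag(1.0 / (gamma * v))))
    sol = np.linalg.solve(np.vstack((top, body)), np.concatenate(([0.0], y)))
    return sol[0], sol[1:]                    # bias b, multipliers alpha

def robust_weights(e, c1=2.5, c2=3.0):
    """Eq. (22): shrink the influence of outlying residuals."""
    s_hat = 1.483 * np.median(np.abs(e - np.median(e)))   # robust std via MAD
    r = np.abs(e / s_hat)
    return np.where(r <= c1, 1.0,
                    np.where(r <= c2, (c2 - r) / (c2 - c1), 1e-8))

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 40)
y = np.sin(x) + 0.05 * rng.standard_normal(40)
y[10] += 3.0                                  # inject one gross outlier
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 0.5 ** 2))   # Eq. (21)

gamma = 50.0
b, alpha = lssvm_solve(K, y, gamma)           # unweighted first pass
w = robust_weights(alpha / gamma)             # e_i = alpha_i / gamma
b2, alpha2 = lssvm_solve(K, y, gamma, w)      # weighted, outlier suppressed
print("outlier weight:", w[10])
```

The outlier's residual dominates the robust scale estimate, so its weight collapses toward $10^{-8}$ and the second fit follows the bulk of the data.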
Since the dimension of the reciprocating compressor data is not large, a single-layer stacked auto-encoder (SAE) can extract enough information for analysis while remaining efficient enough for real-time processing. To fit the temperature and humidity of the given sea areas of the drilling platform, the single-layer SAE shown in Fig. 7 is proposed:

$H = g(W_1 I + B_1)$, (23)

Fig. 7. Structure of proposed single-layer SAE for self-adaptive learning.

where $I = \{i_1, i_2, \ldots, i_m\}$, $i_n \in \mathbb{R}^N$, $n \in \{1, 2, \ldots, m\}$, and $g(z) = 1/(1 + \exp(-z))$ is the non-linear sigmoid activation function applied component-wise to the vector $z$. $H \in \mathbb{R}^m$, $W_1 \in \mathbb{R}^{m \times m}$, and $B_1 \in \mathbb{R}^m$ are the hidden-layer value vector, the hidden-layer weights, and the hidden-layer bias vector, respectively; $W_1$ holds the coefficients between the input layer and the hidden layer. In the output layer,

$O = g(W_2 H + B_2)$, (24)

where $O \in \mathbb{R}^m$ is the output value vector, and $W_2 \in \mathbb{R}^{m \times m}$ and $B_2 \in \mathbb{R}^m$ are the output-layer weights and bias vector, respectively.
The training of the single-layer SAE aims at finding the optimal $W_1, B_1, W_2, B_2$ by solving the following optimization problem:

$\min(J) = \min\left( J_{re} + J_{wd} + \beta \sum_{j=1}^{m} \mathrm{KL}(\rho \,\|\, \hat{\rho}_j) \right)$, (25)

where $J$ is the cost function of the single-layer SAE model. The squared reconstruction error $J_{re}$ is

$J_{re} = \frac{1}{2} \sum_{i=1}^{m} \| x_i - \tilde{x}_i \|^2$, (26)

and $J_{wd}$ is a weight decay term,

$J_{wd} = \frac{1}{2} \lambda \left( \|W_1\|_F^2 + \|W_2\|_F^2 \right)$, (27)

where $\|\cdot\|_F$ is the Frobenius norm of a matrix and $\lambda$ is the weight decay parameter. $\mathrm{KL}(\rho \,\|\, \hat{\rho}_j)$ is the sparse penalty term:

$\mathrm{KL}(\rho \,\|\, \hat{\rho}_j) = \rho \log(\rho / \hat{\rho}_j) + (1 - \rho) \log((1 - \rho)/(1 - \hat{\rho}_j))$, (28)


where $\hat{\rho}_j$ is the average activation of the hidden-layer coefficients $H = \{h_1, h_2, \ldots, h_j\}$, and $\beta$ controls the weight of the sparse penalty term. Solving this problem yields the weight matrix $W_2$.
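The forward pass and the cost of Eqs. (23)-(28) can be sketched directly. The weights, sizes, and hyper-parameter values below are illustrative assumptions, and the training loop (gradient descent on $J$) is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sae_cost(X, W1, B1, W2, B2, lam=1e-4, beta=3.0, rho=0.05):
    """Cost J of Eq. (25): reconstruction (26) + weight decay (27)
    + beta times the sum of the KL sparsity penalties (28)."""
    H = sigmoid(X @ W1.T + B1)                # Eq. (23), one row per sample
    O = sigmoid(H @ W2.T + B2)                # Eq. (24)
    J_re = 0.5 * np.sum((X - O) ** 2)         # Eq. (26)
    J_wd = 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))   # Eq. (27)
    rho_hat = H.mean(axis=0)                  # average hidden activations
    kl = (rho * np.log(rho / rho_hat)
          + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))   # Eq. (28)
    return J_re + J_wd + beta * np.sum(kl)    # Eq. (25)

rng = np.random.default_rng(0)
m = 7                                         # seven monitored parameters
X = rng.uniform(0.0, 1.0, (100, m))           # 100 normalized samples
W1 = 0.1 * rng.standard_normal((m, m)); B1 = np.zeros(m)
W2 = 0.1 * rng.standard_normal((m, m)); B2 = np.zeros(m)
J = sae_cost(X, W1, B1, W2, B2)
print(J > 0.0)                                # True: every term is non-negative
```

The KL term is a Bernoulli divergence, so it is non-negative and vanishes only when the average activation of every hidden unit equals the target sparsity $\rho$.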

5. Experiments and analyses

Reciprocating compressors are positive-displacement machines in which a piston, driven by a crankshaft, reciprocates within a compressor cylinder as the compressing and displacing element, delivering fluids/gases at high pressure. Every compression chamber has at least one suction valve and one exhaust (discharge) valve, and each valve opens and closes with every cycle of the piston. For example, a valve in a compressor operating at 1200 r.p.m. for 24 hours a day and 365 days a year opens and closes 72,000 times per hour, 1,728,000 times per day, and 630,720,000 times per year.
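The cycle counts above follow directly from the stated 1200 r.p.m., with one valve cycle per revolution:

```python
rpm = 1200                           # valve cycles per minute (one per revolution)
per_hour = rpm * 60
per_day = per_hour * 24
per_year = per_day * 365
print(per_hour, per_day, per_year)   # 72000 1728000 630720000
```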
This section briefly introduces the three basic components of a reciprocating compressor.

• Suction valve: It is used to allow fluid flow into the compressor. When it is open during the recession of piston, fluid
can flow into the compressor from outside. When it is closed, no fluid can flow into the compressor. It is a non-return
valve in a pump suction to prevent the pump draining or depriming when not in service.
• Exhaust valve (Discharge valve): It is used to evacuate fluid from the compressor. It also regulates the fluid flow to
respond to different situations that require a change in the volume or speed of that flow.
• Piston rod: It is a connecting rod that transmits power to or is powered by a piston. It connects a rotating wheel to a
reciprocating shaft.

The working statuses of the valves and piston rods directly affect the performance of the reciprocating compressor, and any fault in them may cause a disaster. Thus, the valves and piston rods are expected to be efficient, durable, and quiet during compressor operation. Efficiency demands include aerodynamic flow efficiency and volumetric efficiency. A durable valve should provide maintenance-free operation over several thousand hours, plus relative ease of service and repair. As mentioned, each valve opens and closes a very large number of times in one year, which makes faultless operation over a full year difficult. It is therefore important to monitor the working statuses of valves and piston rods to ensure that they work properly and that potential faults are detected in advance.
The proposed fault-diagnosis system is used to analyze the data of suction valve, exhaust valve, and piston rod. Experi-
ments are done to evaluate the proposed system by using data from CNOOC. The data is collected from operating compres-
sors in offshore oilfields, and contains fifteen million groups of data from different operation conditions. These conditions
contain normal operation conditions and five faulty conditions: (1) suction valve reed broken, (2) suction valve leakage, (3)
exhaust valve reed broken, (4) exhaust valve leakage, and (5) piston-rods settling. The size of each group is 21 KB, and the total size of the fifteen million groups is about 300 TB. All the data is split into three groups for simulation; each group has five million groups of raw data and a size of about 100 TB.
All experiments are done using Matlab 2014a and Visual Studio 2013 Community Edition together in a private cloud with 300 servers. Each server uses an 8-core, 16-thread Intel Xeon E5-2670 CPU with 16 GB of memory. The accuracy rate of fault diagnosis, the most important criterion, is used for evaluation.
There are two steps in the experiment. The first step is data pre-processing, and the second step is fault diagnosis. In
data pre-processing, the original data is divided into normal and faulty data. Then the faulty data is randomly picked as
training and testing data. The proposed solution does fault analysis on training and testing data respectively. The analyzed
results of testing data are used to verify the analyzed results of training data. Only verified results are used in dictionary
learning of training data.

5.1. Data pre-processing

In data pre-processing, the original data is classified into normal and faulty-condition data. Only faulty-condition data
is used to do fault diagnosis. According to the obtained operating data, experiments are used to verify the validity and
effectiveness of data pre-processing.
In data pre-processing, a half of normal condition and faulty condition data is used as training data, and the rest of
the data is used as testing data to test the proposed system. Before doing fault diagnosis, the proposed system differenti-
ates normal condition and faulty condition by using SVM classifier. The data pre-processing is simulated on the mentioned
three groups of raw data respectively. The training and testing data is randomly selected from normal and faulty data in
each group. For each group, the same experiment is repeated ten times by using different training and testing data. The
accuracy of fault condition discrimination is important to the dictionary learning process. High accuracy of fault condition
discrimination can ensure that it obtains the high quality dictionary.
For the three simulations of data pre-processing, the accuracy rates of ten attempts are shown in Tables 2–4 respectively.
In the first simulation, the second attempt has the highest accuracy rate 100%, and the 5th attempt has the lowest accuracy
rate 96.9%. The average accuracy rate is 97.97%. Almost 98% of faults can be identified. In the second simulation, the 5th
attempt has the highest accuracy rate 99.9%, and the 6th attempt has the lowest accuracy rate 97.3%. The average accuracy
rate is 98.68%. More than 98.6% of faults can be identified. In the last simulation, the third attempt has the highest accuracy

Fig. 8. Fault diagnosis results in first simulation.



Fig. 9. Accuracy rate of fault diagnosis in first simulation.

Table 2
Accuracy of fault condition discrimination in first simulation.

Attempt     | 1    | 2     | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10
Accuracy(%) | 99.3 | 100.0 | 98.3 | 97.5 | 96.9 | 97.0 | 97.5 | 97.5 | 97.0 | 98.7

Table 3
Accuracy of fault condition discrimination in second simulation.

Attempt     | 1    | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10
Accuracy(%) | 99.5 | 98.4 | 98.6 | 98.5 | 99.9 | 97.3 | 98.2 | 98.4 | 98.7 | 99.3

Table 4
Accuracy of fault condition discrimination in third simulation.

Attempt     | 1    | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10
Accuracy(%) | 98.6 | 98.2 | 99.7 | 97.5 | 99.1 | 98.1 | 97.2 | 97.9 | 98.2 | 99.2

rate 99.7%, and the 7th attempt has the lowest accuracy rate 97.2%. The average accuracy rate is 98.37%. More than 98.3% of
faults can be identified.
The average accuracy rates of fault condition discrimination are 97.97%, 98.68%, and 98.37% for three simulations re-
spectively. The average accuracy rate of three simulations is 98.34%. The high accuracy of data identification increases the
confidence in diagnosis.
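These averages can be checked directly against the values in Tables 2-4:

```python
# Accuracy rates of the ten attempts, transcribed from Tables 2-4.
sim1 = [99.3, 100.0, 98.3, 97.5, 96.9, 97.0, 97.5, 97.5, 97.0, 98.7]
sim2 = [99.5, 98.4, 98.6, 98.5, 99.9, 97.3, 98.2, 98.4, 98.7, 99.3]
sim3 = [98.6, 98.2, 99.7, 97.5, 99.1, 98.1, 97.2, 97.9, 98.2, 99.2]

means = [round(sum(s) / len(s), 2) for s in (sim1, sim2, sim3)]
overall = round(sum(means) / 3, 2)
print(means, overall)   # [97.97, 98.68, 98.37] 98.34
```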

5.2. Fault diagnosis

The proposed system is applied to identify potential faults after data pre-processing. For each group of raw data, half of the data (about 50 TB) is used for training and the rest (about 50 TB) for testing. In both procedures, RPCA performs the de-noising operation first, followed by dictionary learning; finally, the SVM classifier is applied to the learned data. Training results are used in the testing process to obtain the final results.

5.2.1. First simulation


The test results of ten attempts are shown in Fig. 8 (1)–(10). Test results are demonstrated as corresponding accuracy
rates of fault diagnosis. In the first attempt, 92.9% of suction valve reed broken faults are successfully identified. The re-
maining 7.1% of suction valve reed broken faults are identified as exhaust valve reed broken faults by mistake. 100% of
suction valve leakage and exhaust valve reed broken faults are successfully identified. Similarly, 92.9% of exhaust valve leak-
age faults are diagnosed, and the remaining 7.1% are diagnosed as suction valve leakage faults. In this attempt, piston-rods settling faults have the lowest diagnosis accuracy: only 71.4% of piston-rods settling faults are identified, while 7.1%,
7.1% and 14.3% of piston-rods settling faults are diagnosed as suction valve reed broken faults, suction valve leakage faults
and exhaust valve reed broken faults respectively.

Fig. 10. Fault diagnosis results in second simulation.



Fig. 11. Accuracy rate of fault diagnosis in second simulation.

In the second attempt, all exhaust valve reed broken faults are 100% identified. Only 61.5% of piston-rods settling faults
are diagnosed. The accuracy rates of fault diagnosis in suction valve reed broken faults, suction valve leakage faults, and
exhaust valve leakage are 76.9%, 92.3%, and 92.3% respectively. The fourth attempt has the best performance: only 83.3% of suction valve leakage faults are identified, with the remaining 16.7% diagnosed as piston-rods settling faults. The other four types of faults are fully identified.
Fig. 9 shows the results of the ten attempts in the first simulation, with the accuracy rate for each fault marked in a different color. In most cases, more than 85% of faults can be identified, and some fault types reach a 100% accuracy rate. Only two fault cases drop as low as the 60-70% range. The high accuracy rates in the first simulation demonstrate that the proposed fault diagnosis system can efficiently and automatically diagnose faults in large-scale data.

5.2.2. Second simulation


Again, test results of ten attempts are shown in Fig. 10 (1)–(10). In the first attempt, 88.9% of suction valve reed broken
faults are successfully identified. The remaining 11.1% of suction valve reed broken faults are identified as piston-rods settling
faults by mistake. 100% of exhaust valve leakage faults are successfully identified. Similarly, 88.9% of suction valve leakage
and piston-rods settling faults are diagnosed respectively; the remaining 11.1% are diagnosed as piston-rods settling and suction valve leakage faults respectively. In this attempt, exhaust valve reed broken faults have the lowest diagnosis accuracy: only 66.7% are identified, and the other 33.3% are diagnosed as piston-rods settling faults.
In the second attempt, all suction valve leakage and exhaust valve reed broken faults are 100% identified. 88.9% of suction
valve reed broken, exhaust valve leakage, and piston-rods settling faults are successfully identified. The 7th attempt has the best performance: only 88.9% of exhaust valve reed broken faults are identified, with the remaining 11.1% diagnosed as exhaust valve leakage faults. The other four types of faults are 100% identified.
Fig. 11 shows the results of the ten attempts in the second simulation, with the accuracy rate for each fault marked in a different color. In most cases, more than 85% of faults can be identified, and some fault types reach a 100% accuracy rate. Only three fault cases drop as low as the 65-70% range. The high accuracy rates in the second simulation demonstrate that the proposed fault diagnosis system can efficiently and automatically diagnose faults in large-scale data.

5.2.3. Third simulation


The test results of ten attempts in the third simulation are shown in Fig. 12 (1)–(10). In the first attempt, 88.9% of suction
valve reed broken, suction valve leakage, and exhaust valve leakage faults are successfully identified. The remaining 11.1% of
corresponding faults are identified as exhaust valve reed broken, piston-rods settling, and suction valve reed broken faults
respectively by mistake. 100% of exhaust valve reed broken and piston-rods settling faults are successfully identified.
In the second attempt, all suction valve leakage, exhaust valve reed broken, and piston-rods settling faults are 100%
identified. 88.9% of suction valve reed broken and 78.8% of exhaust valve leakage faults are successfully identified. The
corresponding 11.1% and 22.2% faults are all identified as piston-rods settling faults. Both the 4th and 7th attempts have the
best performance. Only 88.9% of suction valve leakage faults are identified, and the remaining 11.1% of faults are diagnosed
as exhaust valve leakage faults. The other four types of faults are 100% identified.
Fig. 13 shows the results of the ten attempts in the third simulation, with the accuracy rate for each fault marked in a different color. In most cases, more than 85% of faults can be identified, and some types of faults reach a 100% accuracy rate of

Fig. 12. Fault diagnosis results in third simulation.



Fig. 13. Accuracy rate of fault diagnosis in third simulation.

Fig. 14. Analysis of fault diagnosis results in first simulation.

fault diagnosis; even the lowest cases remain above 75%. The high accuracy rates in the third simulation demonstrate that the proposed fault diagnosis system can efficiently and automatically diagnose faults in large-scale data.

5.2.4. Analysis of three simulations


Figs. 14-16 show the analysis results of the three fault diagnosis simulations respectively. The expected fault identification rate for the five faults is 70%. In the first simulation, only the identification of piston-rods settling in the 2nd attempt and of exhaust valve reed broken in the 6th attempt falls below this expectation; all other results are better than expected. Of the 50 analyses in total (10 per fault), 25 (50%) have a 100% fault identification rate. In the second simulation, only the identification of exhaust valve reed broken in the 1st attempt, piston-rods settling in the 8th attempt, and suction valve reed broken in the 9th attempt falls below expectation; all other results are better than expected. Of the 50 analyses, 22 (44%) have a 100% fault identification rate. The third simulation has the best performance: all results are better than expected, and 25 of the 50 analyses (50%) have a 100% fault identification rate.
Across the three simulations, 50%, 44%, and 50% of faults are identified with 100% accuracy, respectively. Two and three fault cases in the first and second simulations (about 4% and 6% of all analyses respectively) fall below the expected 70% identification rate, while all faults exceed 70% in the third simulation. These results show that the proposed system performs fault diagnosis efficiently.

Fig. 15. Analysis of fault diagnosis results in second simulation.

Fig. 16. Analysis of fault diagnosis results in third simulation.

In Fig. 8 (2), the exhaust valve reed broken fault is 100% detected and identified as exhaust valve reed broken, not as other faults: based on the training data, the potential exhaust valve reed broken fault is detected by the proposed solution, and the diagnosis result is verified by the testing data. Similarly, the accuracy rate of suction valve leakage detection is 92.3%, with the remaining 7.7% of suction valve leakage faults mistakenly diagnosed as piston-rods settling faults. Since a 100% accuracy rate of fault diagnosis is unusual in the real world, the 100% detection rates observed in our simulation do not mean that the proposed solution can detect and identify all potential faults in practice; they hold only for the corresponding data in the particular attempt and cannot be guaranteed to extend to the real world.
In this paper, we do not model any type of data specifically; we build a single model for the data collected from reciprocating compressors. The proposed model mainly analyzes seven parameters, a small number relative to the number of observations, and the training dataset is large enough to offset the increase in learning complexity. Before the learning process, the raw data is denoised to counteract minor fluctuations in the training dataset, which keeps the model structure conformable with the data shape. The model therefore does not suffer from overfitting.

6. Conclusion

This paper proposed a fault-diagnosis system for reciprocating compressors using big data collected in real oil operations.
A machine learning method was used to obtain a pattern to recognize working state and faulty state automatically. The
proposed fault-diagnosis method had three main steps. First, RPCA was applied to the source data from compressors to
reduce spike noise. Second, the data features were extracted from a part of denoised operating data and sparse coded by
online dictionary learning. Finally, a SVM (Support Vector Machine) classifier trained the obtained sparse coefficients. The
trained classifier was used to classify sparse coefficients from the rest of denoised source data. As each sparse coefficient
corresponded to a source data vector, when the sparse coefficients were classified, the data was classified. The potential
faults can be detected from the classified data.
The proposed system was evaluated using 300 TB operational data from oilfields. 50% of original data was processed by
de-noising, coding, and SVM classification for training. The remaining data was used for testing. Considering that the work-
ing environments of reciprocating compressors were variable, LS-SVM was applied to the data training process to ensure the robustness of the proposed system. It can accurately identify the normal working status of a reciprocating compressor in different environments.
A two-level data analysis framework was used to process the large-scale data. The source data was analyzed in a dis-
tributed cloud environment to make sure the data can be processed concurrently and timely. The analyzed data from differ-
ent servers was merged to form the fault-analysis results. The results indicated that the system was reliable and can identify
most of potential faults automatically with more than 80% accuracy.
Although the proposed solution performed well in fault diagnosis of reciprocating compressors, an 80% fault identification rate still leaves considerable room for improvement, and many optimizations are worth pursuing in follow-up research. To increase the accuracy of data classification, the de-noising and dictionary learning processes will be improved, and the performance of different approaches will be investigated to find the most appropriate solution for fault diagnosis of reciprocating compressors. The robustness of the proposed system will also be improved to ensure that the normal working status of a reciprocating compressor can be accurately identified in any environment.

Acknowledgments

We would like to thank the support of Chongqing Natural Science Foundation Grant cstc2016jcyjA0428, National Natural Science Foundation of China Grants No. 61703347, 61633005, and 61773080, Fundamental Research Funds for the Central Universities Grant XDJK2017C071, Student Innovation Training Program of Chongqing University of Posts and Telecommunications A2017-30, and Graduate Scientific Research Starting Foundation of Chongqing University of Posts and Telecommunications A2017-13.

References

[1] Y. Wang, C. Xue, X. Jia, X. Peng, Fault diagnosis of reciprocating compressor valve with the method integrating acoustic emission signal and simulated
valve motion, Mech. Syst. Signal Process. 56–57 (2015) 197–212.
[2] Wikipedia, Piper alpha, (https://en.wikipedia.org/wiki/Piper_Alpha/) (2016).
[3] Wikipedia, Deepwater horizon oil spill, (https://en.wikipedia.org/wiki/Deepwater_Horizon_oil_spill#Explosion/) (2016).
[4] W. Wu, W. Tsai, C. Jin, G. Qi, J. Luo, Test-algebra execution in a cloud environment, in: Proceedings of 8th IEEE International Symposium on Service
Oriented System Engineering, SOSE 2014, Oxford, United Kingdom, April 7–11, 2014, 2014, pp. 59–69.
[5] G. Qi, W.-T. Tsai, W. Li, Z. Zhu, Y. Luo, A cloud-based triage log analysis and recovery framework, Simul. Modell. Pract. Theory 77 (2017) 292–316.
[6] W.T. Tsai, J. Luo, G. Qi, W. Wu, Concurrent test algebra execution with combinatorial testing, in: 2014 IEEE 8th International Symposium on Service
Oriented System Engineering, 2014, pp. 35–46.
[7] B.N.S.S. Manepatil, G. Yadava, Modeling and computer simulation of reciprocating compressor with faults, J. Inst. Eng. 81 (2000) 108–116.
[8] M. Elhaj, F. Gu, A. Ball, A. Albarbar, M. Al-Qattan, A. Naid, Numerical simulation and experimental study of a two-stage reciprocating compressor
forcondition monitoring, Mech. Syst. Signal Process 22 (2008) 374–389.
[9] J. Sun, Y. Hu, Y. Chai, R. Ling, H. Zheng, G. Wang, Z. Zhu, L-Infinity event-triggered networked control under time-varying communication delay with
communication cost reduction, J. Franklin Inst. 352 (11) (2015) 4776–4800.
[10] J. Kolodziej, H. González-Vélez, H.D. Karatza, High-performance modelling and simulation for big data applications, Simul. Modell. Pract. Theory 76 (2017) 1–2.
[11] Q. Zuo, M. Xie, G. Qi, H. Zhu, Tenant-based access control model for multi-tenancy and sub-tenancy architecture in software-as-a-service, Front.
Comput. Sci. 11 (3) (2017) 465–484.
[12] Keerqinhu, G. Qi, W. Tsai, Y. Hong, W. Wang, G. Hou, Z. Zhu, Fault-diagnosis for reciprocating compressors using big data, in: Proceedings of IEEE
International Symposium on Big Data Service 2016, Oxford, UK, 2016.
[13] J. Sun, Y. Chai, C. Su, Z. Zhu, X. Luo, BLDC motor speed control system fault diagnosis based on LRGF neural network and adaptive lifting scheme, Appl.
Soft Comput. 14 (2014) 609–622.
[14] J. Sun, H. Zheng, Y. Chai, Y. Hu, K. Zhang, Z. Zhu, A direct method for power system corrective control to relieve current violation in transient with
UPFCs by barrier functions, Int. J. Electr. Power Energy Syst. 78 (1) (2016) 626–636.
[15] Z. Zhu, J. Sun, G. Qi, Y. Chai, Y. Chen, Frequency regulation of power systems with self-triggered control under the consideration of communication
costs, Appl. Sci. 7 (7) (2017) 688.
[16] W.T. Tsai, G. Qi, DICB: dynamic intelligent customizable benign pricing strategy for cloud computing, in: 2012 IEEE Fifth International Conference on
Cloud Computing, 2012, pp. 654–661.
[17] W.-T. Tsai, G. Qi, Z. Zhu, Scalable SaaS indexing algorithms with automated redundancy and recovery management, Int. J. Softw. Inform. 7 (1) (2013)
63–84.
[18] W. Tsai, G. Qi, Integrated fault detection and test algebra for combinatorial testing in TaaS (Testing-as-a-Service), Simul. Modell. Pract. Theory 68 (2016)
108–124.
[19] W. Tsai, G. Qi, Integrated adaptive reasoning testing framework with automated fault detection, in: Proceedings of IEEE International Symposium on
Service-Oriented System Engineering, SOSE 2015, San Francisco Bay, CA, USA, March 30, - April 3, 2015, 2015, pp. 169–178.
[20] W. Tsai, G. Qi, K. Hu, Autonomous decentralized combinatorial testing, in: Proceedings of 12th IEEE International Symposium on Autonomous Decen-
tralized Systems, ISADS 2015, Taichung, Taiwan, March 25–27, 2015, 2015, pp. 40–47.
[21] J. Wright, A. Ganesh, S. Rao, Y. Peng, Y. Ma, Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimiza-
tion, in: Proceedings of 23rd Annual Conference on Neural Information Processing Systems, 2009, pp. 2080–2088.
[22] K.M. Lee, Y. Matsushita, J.M. Rehg, Z. Hu (Eds.), Computer Vision – ACCV 2012, Lecture Notes in Computer Science, vol. 7724, Springer, 2013.
[23] G. Qi, J. Wang, Q. Zhang, F. Zeng, Z. Zhu, An integrated dictionary-learning entropy-based medical image fusion framework, Future Internet 9 (4) (2017)
61.
[24] J. Mairal, F.R. Bach, J. Ponce, G. Sapiro, Online learning for matrix factorization and sparse coding, J. Mach. Learn. Res. 11 (2010) 19–60.
[25] Z. Zhu, G. Qi, Y. Chai, Y. Chen, A novel multi-focus image fusion method based on stochastic coordinate coding and local density peaks clustering,
Future Internet 8 (4) (2016) 53.
[26] M. Biba, F. Xhafa, F. Esposito, S. Ferilli, Stochastic simulation and modelling of metabolic networks in a machine learning framework, Simul. Modell.
Pract. Theory 19 (9) (2011) 1957–1966.
[27] A.F.K. Morales, I. Mejía-Guevara, Evolutionary training of SVM for multiple category classification problems with self-adaptive parameters, in: Proceed-
ings of 10th Ibero-American Conference on AI, 2006, pp. 329–338.
[28] C. Chang, C. Lin, LIBSVM: a library for support vector machines, ACM TIST 2 (3) (2011) 27.
[29] B. Lin, Q. Li, Q. Sun, M. Lai, I. Davidson, W. Fan, J. Ye, Stochastic coordinate coding and its application for drosophila gene expression pattern annotation,
CoRR (2014). arXiv:1407.8147.
[30] Z. Zhu, H. Yin, Y. Chai, Y. Li, G. Qi, A novel multi-modality image fusion method based on image decomposition and sparse representation, Inf. Sci.
(2017).
[31] H. Abdi, L.J. Williams, Principal component analysis, Wiley Interdiscip. Rev. 2 (4) (2010) 433–459.
[32] K. Wang, G. Qi, Z. Zhu, Y. Chai, A novel geometric dictionary construction approach for sparse representation based image fusion, Entropy 19 (7) (2017)
306.
[33] Z. Zhu, G. Qi, Y. Chai, P. Li, A geometric dictionary learning based approach for fluorescence spectroscopy image fusion, Appl. Sci. 7 (2) (2017) 161.
[34] J. Wright, A. Ganesh, S. Rao, Y. Peng, Y. Ma, Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimiza-
tion, in: Advances in Neural Information Processing Systems, 2009, pp. 2080–2088.
[35] Z. Lin, M. Chen, Y. Ma, The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices, (2010), arXiv:1009.5055.
[36] I. Rish, G. Grabarnik, Sparse Modeling: Theory, Algorithms, and Applications, first ed., CRC Press, Inc., Boca Raton, FL, USA, 2014.
[37] Y.-W. Chang, C.-J. Hsieh, K.-W. Chang, M. Ringgaard, C.-J. Lin, Training and testing low-degree polynomial data mappings via linear svm, J. Mach. Learn.
Res. 11 (2010) 1471–1490.