Professional Documents
Culture Documents
Data
Xiaorong Cheng I ,Tianqi Li I
INorth China Electric Power University, Hebei Baoding, China
xiaor_ cheng@163.com, Itqwhy@163.com
1231
III. TRUSTED ANALYSIS MODEL FOR SMART GRID data or behaviors provided by between data sources exceeds a
DATA certain threshold, we believe that between data sources have a
local credibility. It is composed of the credibility of direct
Aiming at the characteristics of big data' s "3V", "3E" and
context interaction and the credibility of the similarity
"HDC", this paper presents a model of big data' s credibility
between data sources. Its notation is denoted as
measure of dynamic construction, which is divided into three
LocalTrust) B,t) , the meaning behind which is the local
parts - the trusted measurement model between data sources,
the trusted measurement model of data sources and the trusted credibility of data source A relative to data source B at the "t"
measurement model of data. The credibility between data moment, as is shown below in formula (2).
sources is restricted by the credibility of data sources, the o
ROlidOm or 0 I= 0 (2)
credibility of the data sources is restricted by the credibility of LocoITruSI, (B. I) = { LocoITrusl, (B,I- I) fle(t ) , tolllexl(A, B,I) = 0
data and the credibility of the data source, the credibility of [a , . DirTrllsl(A, B, COlllexl(A, B,I),I ) + /3, .Accepl (A, B,t)]. A, (I) , olher
data is restricted by the credibility between data sources and
the credibility of data sources, they are interrelated and Notes:
restricted each other and constitute a whole. a) The initial value is a random number or 0, which
indicates that data source A has some trust or no trust for data
A. Related notion source B.
b) f-l L(t) is the time decay factor at the "t" moment. If
In order to understand the model, the relevant defmitions of
the method are given in this paper, which are used to explain the local credibility of data source A is the same as that of data
the basic issues in the analysis of smart grid big data' s source B at the t and t-l moment, then it is punished by the
credibility.
time decay factor. Thereinto, f-l L(t) = 1- ~ , 0:::; f-lL(t):::; I.
Data source: It refers to the provider of data in the smart t -to
grid's environment. !}.t is the time difference of calculation between two times. to is
Data: It refers to the characteristics of multiple attributes. the starting moment of the current calculation, t is the current
Its notation is denoted as data= {dJ,d 2,d 3, ... ,dn }. Thereinto, di moment.
refers to the "i" attribute of the data. c) !}'Context(A, B ,t) is whether between data source A
and data source B has a new context of direct interaction at the
Trusted network: It refers to a network composed of data
sources and directed links between them. "t" moment.
!-"Context(A, B,t) = Context(A, B,t) - Context(A, B, t-1) .
B. A trusted measurement model between data sources d) DirTrust(A, B,Context(A, B, t), t) is the trusted value
In the process of calculating the credibility of the smart of data source A relative to data source B in the circumstances
grid, when there is a direct context interaction among data of context interaction at the "t" moment.
sources or the similarity of data or behaviors provided by e) Accept(A, B ,t) is the recognition value of the
between data sources exceeds a certain threshold. The direct
similarity of data source A relative to data source B at the "t"
link of the directed graph can be established among data
moment.
sources. With the expansion of the network size, the trusted
network is becoming more and more stable. If it finds out that LSim(dataa, data b )
a data source is not trusted, the model can also quickly impose Accept (A , B,t) =
da/aaEDa/a(A)
_da-''''
''ED
- _a.".:
'a(.".:
/J) _ _ _ _ __
the penalty factor on the credibility of the provider (data Data(A) n Data(B)
source), making it less reliable for the provider for a period of
time. But as time goes on, if the data source can continue to
Data(A) is a data set provided by data source A. data is
provide reliable data, its credibility will be restored. If data a data provided by data source. Sim(datan, data b ) is the
sources have no new context in a calculation interval in the similarity degree between data a and data b
network model of the credibility analysis, time penalty is
imposed on them. Data(A) nData(B) is the number of the same theme in
the data sets provided by data source A and B.
Definition 1. Credibility between data sources: It is
f) AL(t) is the penalty coefficient of local credibility of
composed of the local credibility and the global credibility
between data sources. Its notation is denoted as TrustA(B,t ) , the the model at the "t" moment.
meaning behind which is the comprehensive credibility of data I , MocaITrust 4 (B, t);:O: 0
source A relative to data source B at the "t" moment, as is ~W= { . .
0:::; x < 1 , MocaITrustA(B ,t) < 0
shown below in formula (1).
MocaITrustA(B, t) is whether the local credibility of
T rust A (B,t) = a , . L ocalTrustA (B, t) + /3, .GlobalTrustA(B, t) (1) data source A relative to data source B has changed at the
"t" moment.
Thereinto, a l + /31 = 1 .
M ocaITrustA(B, t) = LocaITrust)B,t)-LocaITrustA(B, t-1)
Definition 2. Local credibility: When there is a direct
context interaction between data sources or the similarity of g) a 2 + /32= 1.
1232
Definition 3. Global credibility: It refers to the credibility each layer of data sources relative to the objective data source
of data source in the trusted network, that is, the credibility of A. Thereinto, it is a n* 1 dimensional vector, the first element
data source. Its notation is denoted as GlobaITrust A(B,t) , the of which is the expected value of recommendation credibility
meaning behind which is the global credibility of data source of all data sources of the first layer, and the like, each vector
A relative to data source B at the "t" moment, as is shown element is the expected value for the corresponding layer. The
below in formula (3). average number of layer is set according to the accuracy and
needs, the greater the number of layer is, the greater the
GlobalTrust A (B, t) = Trust(B, t) (3)
amount of calculation is, and the credibility of the
corresponding data is more accurate.
C. A trusted measurement model of data source The recommendation credibility of a data source relative to
Definition 4. Credibility of data source: It is composed data source A for the "i" layer of trusted network is calculated
of the expected value of the credibility of all historical data by the formula (5), as is shown below in formula (5).
provided by data source and the recommendation credibility of
Re commend ( X " A,t) = Trust(X" t) Trust" (Neighbor=' ( X ,- > A),t) (5)
data sources of each layer in the whole trusted network. Its
notation is denoted as Trust(A,t) , the meaning behind which is Thereinto, Xi is a data source X of the "i" layer.
the credibility of data source A at the "t" moment, as is shown Neighbor"'"' (X,- > A) is a data source with the largest
below in formula (4). credibility adjacent to Xi on the "i-I " layer.
The expected value of the recommendation credibility of
(4) data sources relative to data source A for the "i" layer is
RandomO o r0 .t = 0
Trllst(A,t) = Trllst(A,t- l) fJ , (t) ,!lContext(A,B,t) = 0
calculated by the formula, as is shown below in formula (6).
L,Trllst(doto, ,t )
[ a , ' "",D",,,,,
1
+/3, (r, Recommend , (A,t)) ' ,;(, (t) ,other
L: Recommend (X , A,I)
(6)
Sllm(Data(A Recommend(A,t)" = .::.:
X'= CT'='.-','-''''Al---,-,=--=-----,-,_
, Sum(Circle, (A
1233
DirTrust(A, data, t) = Trust(A, t) (9) other part of data is to verify the stability and accuracy of the
model.
Definition 9. Indirect credibility of data provided by a
data source : It refers to the credibility of this data B. Design of simulation experiment
recommended by adjacent data sources with high credibility.
As mentioned above, fustly, the credibility of an entity
Its notation is denoted as InDirTrust(A, data,t) , the meaning
relative to other entities is calculated, starting from the
behind which is the indirect credibility of data recommended formula (1) to calculate the credibility between data sources.
by data sources associated with the data source A at the "t" According to the formula (2) and formula (3), the contents of
moment, as is shown below in formula (10).
the two aspects are calculated, on the one hand, the formula
I
Trust(A , X ,t ) Trust(X, data, t) needs to calculate the local credibility. If the entity has a
InDirTrust( A, data , t) = ",
X,::::
Nag"""""
",,,,,
(A,-) -------- (10) context interaction (condition 1) or a new behavior (condition
n
2), the local credibility need updating, if there is no new
Thereinto, Neighboy,,(A) is n data sources with high behavior, time penalty is imposed on it. If the entity meets the
credibility adjacent to data source A. condition 1 in the calculation process, or the entity not only
From the above defmition, the relationship is shown in meets the condition 2, the similarity of data or behaviors
Figure I. provided by between entities but also exceeds a system
threshold, the link of the directed graph can be established
among entities, thereinto, the weight of the link is the value of
local credibility. On the other hand, the formula needs to
calculate the global credibility.
Secondly, the credibility of an entity is calculated by the
Loca] credibility Globa] credibility
formula (4). If the expected value of the credibility of all
historical data provided by the entity or the recommendation
credibility of entities of each layer changes, the credibility of
DcfS the entity is updated. If the credibility is not changed, time
Dcr4 penalty is imposed on it.
Dcf7
Finally, the formula (7) calculates the credibility of data
Recommendation depending on a theme by using the probability of
credibility
complementary events. The formula (8) gives the true
credibility of data provided by the entity. Meanwhile, the
Direct credibility
formula (9) gives the direct credibility of data provided by the
entity, and the formula (10) gives the indirect credibility of
data provided by the adjacent entities. If an entity provides
Dcr6
some malicious and false data in the experiment, the entity
will be severely punished, so that it can be a very low value in
the trusted network. If its behavior is always normal, the
Fig. I . The correlation of the credibility definition between data sources, credibility will be improved with the increase of their credit.
data, data source
C. Experimental results and analysis
IV. CASE AN ALYSIS AND VERIFICATION Combined with 4.2 section, data are imported into the
algorithm to verify the feasibility. In the process of experiment,
In order to validate the practicability of the proposed we artificially set a power equipment's data, which are mainly
method in this paper, the object of the simulation experiment to verify the model for the detection of false data, processing
is the data of the real time operation of a power company's capacity, using the formula (1), formula (4), formula (7) for
dispatching and communication center, which includes the calculating the credibility, and observe the change of its
local power grid data of three power plants and four credibility with time. As shown in Figure 2.
substations. In this system, the total amount of each group is
obtained from 108 measurements, including 12 node voltages, I ---+- Credibilityvalue I
08
to active and reactive powers of generation units, 15 load 0.7
and 10 line power flow values. Each measure value is assigned ~ O.5
to each label to facilitate further analysis. ~ 0.4
~ O.3
A. Design ofsimulation experiment .n1
5
ji
1234
Fig. 2. Artificial power equipment's trusted value changing with time trend
References
From Figure 3, we can find that the equipment's [II Hao Jinping, Piechocki Robert J, Kaleshi Dritan, Chin Woon Hau, Fan
Zhong. "Sparse Malicious False Data Injection Attacks and Defense
credibility presents a rising trend at the T 0- T 30 moment, but the
Mechanisms in Smart Grids", IEEE Transactions on Industrial
equipment's credibility has a slow downward trend at the T Ir Informatics, vol. 11 , no. 05 , pp. 1198-1209, 2015 .
TI8 moment, which is mainly due to the absence of new [21 K. L. GAO, Y. ZH. XlN, ZH. LI, et al. "Development and Process of
behavior and imposes time penalty on its credibility. Cybersecurity Protection Architecture for Smart Grid Dispatching and
As a result of the equipment to make an unreliable Control Systems", Automation of Electric Power Systems, vol. 39, no.
01 , pp. 48-52, 2015.
behavior at the T31 moment, it has been punished, resulting in
[31 J. L. Duran-Paz, F .Perez-Hidalgo, M. J. Duran-Martinez. "Bad Data
its credibility dropped to O.l. After the T 32 moment, the Detection of Unequal Magnitudes in State Estimation of Power
equipment's behavior is normal and the credibility starts Systems", IEEE Power Engineering Review, vol. 121 , no. 05 , pp. 57-60,
recovering, but the trend is relatively slow. 2002.
The trusted network topology diagram of hierarchical data [41 Shyh-Jier Huang, Jeu-Min Lin. "Enhancement of Anomalous Data
sources and the transitive diagram of the credibility of Mining in Power System Predicting-Aided State Estimation", IEEE
Transaction on Power System, vol. 19, no. 01 , pp. 610-619, 2004.
hierarchical data sources relative to a data are shown in Figure
5 at a certain moment. [51 SH. R. WANG, L. ZHANG and H. S. LI. " Evaluation Approach of
Subjective Trust Based on Cloud Model", Journal of Software, vol. 21 ,
By definition 2, the formula (2) is used to calculate the no.06, pp.1341-1352, 2010.
local credibility between data sources in smart grid, and the [61 W. TANG and ZH. CHEN. " Research of Subjective Trust Management
Model Based on the Fuzzy Set Theory", Journal ofSofiware, vol. 14, no.
trusted network can be constructed. As shown in Figure 3 (a),
08, pp. 1401-1408, 2003.
a partial network topology graph is given, as shown in Figure
[71 T. ZHANG, M. Y.ZHAI, H. B. ZHANG, et al. Substation Topology
3 (b), the transitive credibility diagram is given for certain Error Identification Based on Uncer-tainty Reasoning, Automation of
data. We can draw from the fact that a data not only has direct Electric Power Systems, vol. 38, no. 06, pp. 49-54, 2014.
contact with the provider, but also is surrounded by a lot of [81 L. ZHANG, J. W. LIU, R. CH. WANG and H. Y. WANG. "Trust
data sources which are directly or indirectly linked to the data, evaluation model based on improved D-S evidence theory", Journal on
forming a small trusted network, which can greatly improve Communications, vol. 34, no. 07, pp. 167-173, 2013.
the accuracy of a data credibility evaluation. [91 Q. Y. ZHAO, W. L. ZUO, ZH. SH. TIAN and Y. WANG. "A Method
for Assessment of Trust Relationship Strength Based on the Improved
D-S Evidence Theory", Chinese Journal of Computers, vol. 37, no. 04,
v. CONCLUSIONS pp. 873-883, 2014.
In this paper, the typical characteristics and attributes of [10] J. ZHOU, Q. WANG, C. C. HUNG and X. J. Yi. "Credibilistic
Clustering: The Model and Algorithms", INTERNATIONAL
smart grid big data are analyzed in detail with combination of JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-
between the bad data identification models and the credibility BASED SYSTEMS, vol. 23, no. 04, pp. 545-564, 2015.
analysis' models of general data. Based on the hierarchical [II] He Kemeng, Sun Yushan,Qin Zhang, Sun Yuqiang and Gu Yuwan.
model, it gives the analysis model of smart grid big data's "The research of trusted attribute based on model-driven of MDA",
credibility measurement. In the case of the large amount of Open Electrical and Electronic Engineering Journal, vol. 08, no. 01 , pp.
data provided by data sources, the model can accurately 273-277, 2014.
analyze the credibility of data, and it is better to satisfy the [12] W. Yu, S. J. Li, S. Yang, Y. H. Hu, J. Liu, Y. G. Ding and H. Du,
"Automatically Discovering of Inconsistency Among Cross-Source Data
requirement of Big Data. A simple instance is selected to Based on Web Big Data", Journal of Computer Research and
verify the feasibility of the model. But the method of building Development, vol. 52, no. 02, pp. 295-308, 2015.
the credibility analysis' network still needs to be improved. [13] Hsiao-Ling Wu and Chin-Chen Chang, " A Robust Image Encryption
Above three points will be the focus in further research work. Scheme Based on RSA and Secret Sharing for Cloud Storage Systems",
Journal of Information Hiding and Multimedia Signal Processing, vol. 6,
no. 02, pp. 288-296, 2015 .
-- Credibility path
.....- Transitive path
~0.8?-~~
O."? f
0.69,0
0.45 0.65 0.23
~ ~
/ 0.34. 0.52
~.\k*.32~0.69 t:0~'~~06~'5J~07:~
0.5 0.34 ..-0.74. 0.80
1235