
The Journal of Systems and Software 122 (2016) 496–506


A fuzzy-based credibility model to assess Web services trust under uncertainty

Zohra Saoud a,∗, Noura Faci a, Zakaria Maamar b, Djamal Benslimane a

a Boulevard Niels Bohr, 69622 Villeurbanne Cedex, France
b Zayed University, PO Box 19282, Dubai

Article info

Article history:
Received 15 October 2014
Revised 24 July 2015
Accepted 30 September 2015
Available online 6 November 2015
Keywords:
Web service
Trust
Credibility

Abstract
This paper discusses the assessment of Web services trust. This assessment is undermined by the uncertainty that arises from end-users' ratings, which can be questioned, and from variations in Web services performance at run-time. To tackle the first uncertainty, a fuzzy-based credibility model is suggested so that the gap between end-users (known as strict) and the current majority is reduced. To deal with the second uncertainty, two trust approaches (i.e., deterministic and probabilistic) are proposed so that trust levels for future interactions with WSs are made available to users. The deterministic approach takes into account end-users' credibility values, and the probabilistic one is built upon probabilistic databases and a fuzzy-based credibility model. A series of experiments are carried out to validate the suggested credibility model and these trust approaches. The results show that the probabilistic approach significantly improves trust quality and is more robust compared to the deterministic one. Future work consists of incorporating several credibility models into a single probabilistic trust model.

© 2015 Elsevier Inc. All rights reserved.

1. Introduction

It is largely accepted that current Web services (WSs) selection approaches rely on either non-functional properties (aka Quality of Service (QoS)) that providers announce publicly or on collecting qualitative/quantitative values that end-users share with respect to past experiences of using these Web services. Qualitative/quantitative values permit establishing feedback/ratings that indicate the satisfaction of end-users with the overall performance of WSs. However, a complete reliance on both providers and end-users raises trustworthiness concerns among future potential end-users due to biases such as beefing-up a WS's QoS and/or undermining a WS's performance, both done purposely. To address these biases, two types of trust models are reported in the literature. The first model uses end-users' feedback/ratings to compute a trust value (e.g., Xiong and Liu, 2004). The second model observes the behaviors of WSs over a period of time to compute a trust value (e.g., Wang and Singh, 2007). We are particularly interested in the first trust model. Indeed, end-users with either limited or non-existent experience of using WSs cannot provide adequate trust values. When establishing trust, these end-users wrestle with two kinds of uncertainties:

∗ Corresponding author. Tel.: +33 783060937.
E-mail address: saoud.zohra@gmail.com (Z. Saoud).

http://dx.doi.org/10.1016/j.jss.2015.09.040
0164-1212/© 2015 Elsevier Inc. All rights reserved.

• Uncertainty (U1) over feedback/rating. U1 arises from the lack of consistent ratings that end-users provide over time. Credibility should help tackle U1 when aggregating end-users' feedback/ratings into a common trust value (e.g., Xiong and Liu, 2004; Selcuk et al., 2004).
• Uncertainty (U2) over the capacity of a WS in fulfilling the QoS that its provider announces and thus, satisfying end-users' requests. U2 arises from the inconsistency that affects the assessed QoS values due to a WS's dynamic nature and/or malicious behavior. Trust should help tackle U2 (e.g., Kim and Kim, 2005).
Feedback/ratings concurrently mitigate and introduce uncertainty. Uncertainty arises due to factors like end-users' subjectivity and providers' reliability. We should assess trust despite these factors. Bordens and Horowitz (2001) decomposed credibility into two components: (i) expertise, which stems from end-users' knowledge, background, notoriety, etc.; and (ii) trustworthiness, which refers to the audience's assessment of the communicator's character as well as his or her motives for delivering the message. Credibility-based trust approaches given by Malik and Bouguettaya (2009) and Noor et al. (2013) assume that end-users have good expertise and/or are untrustworthy. When end-users disagree on a certain feedback/rating on a WS, a consensus needs to be reached using the majority opinion. End-users' ratings close to the majority opinion are more credible than distant ratings. However, these approaches do not


consider end-users who are both expert and trustworthy. We refer to such end-users as strict (severe) experts. They usually do not have any interest (e.g., making extra income) in aligning themselves with the majority. For the sake of achieving consensus, a fuzzy clustering technique would reduce the gap between strict experts' feedback/ratings and the current majority opinion.
There are a number of credibility-based trust approaches that assess trust as a scalar value (e.g., Wang and Singh, 2007; Ries et al., 2011; Jøsang, 2001). However, these approaches struggle with establishing trust based on direct end-users' experiences and/or peers' feedback/ratings. A scalar value fails to represent, first, the uncertainty over possible trust values and, second, the lack of consistency across different feedback/ratings. The obtained trust value is subject to ambiguous interpretations by end-users.
Feedback/rating inconsistencies lead to disagreement amongst end-users' opinions. Troffaes (2006) shows that probabilities can address this disagreement. As stated earlier, end-users' credibility helps tackle uncertainty over feedback/ratings (U1). Therefore we associate credibility with probabilities. Let us assume three end-users, u1, u2, and u3, who have experienced WSj, and let S be the following statement: "ui has correctly observed that WSj satisfies his requests". The uncertainty here reflects the probability that S happens. This probability can be estimated by computing ui's credibility (Cri). Let e1, e2, and e3 denote, respectively, the events that u1, u2, and u3 each state that WSj satisfies their requests. Combining e1, e2, and e3 when computing trust raises issues like: what is the probability that u1, u2, and u3 jointly state that WSj satisfies their requests, and what is the probability that only u1 and u2 state that WSj satisfies their requests? Probabilistic databases permit representing these kinds of events by associating an occurrence (or existence) probability with each statement (Dalvi and Suciu, 2007). These databases can also support developing complex queries that combine selection criteria (e.g., only end-users who provide at least n ratings).
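The event combination above can be sketched as follows, assuming each credibility value Cri is taken as the probability of event ei and the events are independent; the numerical values are illustrative, not from the paper's dataset:

```python
# Sketch: combining independent end-user events e1, e2, e3, assuming each
# credibility value Cr_i is used as P(e_i) (hypothetical values).
cr = {"u1": 0.9, "u2": 0.8, "u3": 0.6}  # Cr_i = P(e_i)

# Probability that u1, u2, and u3 jointly state WSj satisfies their requests.
p_all = cr["u1"] * cr["u2"] * cr["u3"]

# Probability that only u1 and u2 state it (u3 does not).
p_u1_u2_only = cr["u1"] * cr["u2"] * (1 - cr["u3"])

print(round(p_all, 3))         # 0.432
print(round(p_u1_u2_only, 3))  # 0.288
```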
Our contributions include: (i) modeling end-users' credibility based on a fuzzy clustering technique so that strict end-users' ratings are taken into account; (ii) developing strategies for establishing the majority opinion; (iii) assessing trust under uncertainty using probabilistic databases; (iv) building a distributed trust assessment framework based on the proposed credibility model; and (v) developing a system that measures the quality of trust.
The remainder of this paper is organized as follows. Section 2 identifies some work related to WS trust assessment. Section 3 motivates the use of fuzzy clustering that underlies our credibility model and describes how end-users' credibility is established. Section 4 depicts two approaches for trust assessment; the first consolidates end-users' ratings taking into account end-users' credibility, while the second relies on probability theory coupled with possible worlds semantics. Section 5 gives details on the proposed trust assessment framework and discusses experiments. Finally, concluding remarks and future work are reported in Section 6.


2. Related work

Uncertainty like U1 and U2 reported in Section 1 impacts the way WS trust is established. In the following we discuss two research streams for tackling U1 and U2, respectively: deterministic (credibility-based) and probabilistic.

2.1. Deterministic trust

Deterministic trust approaches rely on end-users' experiences (i.e., feedback/ratings) built upon former interactions. They assess end-users' credibility as a degree of uncertainty that a WS will successfully satisfy a request.

Xiong and Liu (2004) developed Peertrust, a credibility-based trust framework in the context of P2P networks. In Peertrust, a peer's feedback means the satisfaction of this peer about the participation of others in joint operations. A peer may send others false feedback/ratings because of some malicious motives, for example. The feedback from peers with higher credibility has more weight than that from peers with lower credibility. The authors use two metrics to compute the credibility value of a peer (pi): QoS provided by pi and feedback similarity between pi and pj.

Whitby et al. (2004) looked into biased feedback. Such feedback often has a different statistical pattern compared to unbiased feedback. The authors proposed a beta distribution-based filtering technique as a statistical pattern for feedback representation. This technique applies the majority rule to exclude biased feedback by tagging feedback as biased when it is distant from the majority's referrals. This technique is only effective when the majority of ratings are unbiased.

Weng et al. (2005) examined unfair ratings in online Bayesian rating systems. These systems collect sellers' behaviors over past transactions so that future transactions' life cycles are predicted. The authors use entropy (i.e., a measure of uncertainty in information (Cover and Thomas, 1991)) to evaluate the quality of ratings. Entropy excludes a particular buyer's rating from the majority opinion if this rating significantly either improves or degrades the quality of the already-aggregated majority opinion (i.e., above or below a certain threshold).

Malik and Bouguettaya (2009) discussed trust for WSs selection and composition. They propose several decentralized trust assessment techniques to ensure a better accuracy of the feedback collected over time. Malik and Bouguettaya consider that feedback of highly credible end-users is more trusted than that of end-users with low credibility. To this end, they examine the feedback based on the distance from the majority opinion using K-means clustering and group similar feedback into clusters in order to define this majority. The most highly populated (i.e., most dense) cluster is the majority cluster, whose centroid represents the majority feedback. Along with the majority principle, the authors' trust model takes into account other social metrics such as end-users' feedback history, personalized reputation evaluation using end-users' personal preferences, and temporal sensitivity. These metrics help adjust the credibility value when the number of end-users with biased ratings is above the number of those with unbiased ratings and the majority rule no longer holds.

Noor et al. (2013) proposed a credibility model that distinguishes credible from misleading feedback in a cloud context. This model uses factors such as majority consensus and feedback density. To measure how close a cloud end-user's feedback is to the majority's feedback, Noor et al. use the standard (i.e., root-mean-square) deviation. Feedback density overcomes the problem of misleading feedback from end-users who give multiple feedback on a certain cloud service in a short period of time.

2.2. Probabilistic trust

In existing probabilistic trust management approaches (e.g., Teacy et al., 2006; Zhou and Hwang, 2007; Yu and Singh, 2002), peers rely on direct use experiences with services or on feedback/ratings that other peers share. False feedback/ratings are handled through a suitable filtering mechanism. In the following we describe three relevant probabilistic approaches.

TRAVOS is a trust model used in open agent systems (Teacy et al., 2006). An agent trusts a peer based on previous direct interactions. Interaction outcomes use a binary rating to express a successful/unsuccessful interaction. The obtained binary ratings are then used to form the probability-density function that models the probability of a successful interaction with an agent. If there are not enough direct experiences, the model uses other agents' experiences to compute the trust value. The model determines the credibility of agents


to filter feedback/ratings provided by agents that are inaccurate due to their limited knowledge or malicious behaviors.
Powertrust (Zhou and Hwang, 2007) is a trust system for P2P networks. Initially, nodes rate individual interactions and compute local trust values using a Bayesian learning technique (Buchegger and Boudec, 2004). These local trust values are then used to evaluate a global trust value. This value is updated periodically using the Look-ahead Random Walk (LRW) algorithm (Mihail, 2007). Along with a distributed ranking module, LRW identifies the nodes that assess the reputation of providers.
Yu and Singh (2002) proposed a probabilistic trust management scheme that relies on the Dempster–Shafer theory (Kyburg, 1987). This theory combines evidence from different sources in order to reach a degree of belief that takes into account all the available evidence. This scheme extends probability theory so that uncertainty is modeled. It is worth noting that there is no direct relationship between a possible outcome and its negation. Since the sum of the possible outcomes' probabilities is not necessarily equal to 1, the remaining probability is treated as a state of uncertainty. The proposed scheme represents two kinds of beliefs: agent A believes that interacting with agent B will be successful; and agent A believes that agent B will fail to act as expected. Direct experiences are a priori used to assess trust. When there is a lack of direct experiences, an agent takes into account other peers' feedback/ratings.
To wrap up this section, the above probabilistic approaches overlook the rating interpretation in the presence of uncertainty over feedback/ratings (U1) and/or uncertainty over the capacity of WSs (U2). Indeed, each rating is true to some extent and false to another extent. Hence, trust computation tends to be irrelevant and inaccurate. Peers' feedback/ratings (or experiences) reduce uncertainty, but unfortunately introduce additional uncertainty. We thus propose to model user ratings as a probabilistic database, interpret ratings in terms of possible worlds, and compute WS trust as a query evaluation over a probabilistic database. Contrary to existing probabilistic approaches that consider direct experience of the same user, our approach relies on both direct experience and recommendations. Indeed, the set of feedback/ratings provided by different users (recommendations) can be treated as an experience of using a WS over time by a virtual user.
3. The credibility model

This section discusses the appropriateness of using fuzzy clustering for establishing and formalizing end-users' credibility. Then, it presents how credibility is assessed.

3.1. Basics

Credibility has two components (Bordens and Horowitz, 2001): expertise and trustworthiness. In this work we recall that we target strict end-users who are known for their strong expertise and trustworthiness in a certain community. These end-users stick to their ratings regardless of the majority for reasons listed by Schum and Morris (2007), including veracity (they tell the truth), objectivity (their ratings are based on evidence), and accuracy (they estimate their ratings well). Several studies in social psychology (e.g., Lesko, 1997; Sternthal et al., 1978) evaluate the impact of source credibility on belief and attitude changes. These studies demonstrate that credible sources are persuasive and can affect existing beliefs (e.g., ratings) and attitudes more than non-credible sources. Therefore, strict end-users can push a majority to question (even review) their ratings. To study how this happens we rely on Yager's participatory learning paradigm (Yager, 2004). It represents situations where the current ratings are correct, but not necessarily accurate (resp., wrong) and only require a limited (resp. significant) tuning by the majority members. Our proposal is to reduce the gap between strict end-users' ratings and the current majority's rating so that a consensus is reached. As strict end-users can be in several groups, they can affect groups' beliefs in different manners (e.g., strongly and weakly).

3.2. Credibility assessment

Strong and weak membership terms are fuzzy; they are not well-defined (i.e., uncertain) and/or their semantics are dependent on domains and/or user preferences. To deal with uncertainty in group membership and derive overlapping groups we adopt fuzzy clustering. Consensus clustering algorithms like K-means (Kanungo et al., 2002) and fuzzy C-means (Bezdek, 1981) generate robust clusters, detect unusual ones, and handle noise and outliers (Nguyen and Caruana, 2007). Existing credibility-based trust approaches given by Malik and Bouguettaya (2009) and Noor et al. (2013) rely on K-means to compute the majority ($M_K$) consensus as a centroid of the most populated cluster. We use Bezdek's Algorithm 1 (discussed by Bezdek (1981)) to reduce the gap between strict end-users' ($u_i$) ratings and the $M_K$ consensus. Each $u_i$ provides a set of ratings ($X_i$) on a set of common WSs.

Let $ME = \{ME_{i,j}\}$ be a membership matrix where $ME_{i,j}$ represents the membership degree of $X_{i=1,n}$ in cluster $C_j$, and $\|\cdot\|$ corresponds to a similarity measure.

Since Algorithm 1 generates a number of clusters ($Nb_{cluster}$) with fuzzy boundaries, a new Majority Cluster ($C_{Maj}$) needs to be identified, taking into account that each end-user's rating has a degree of membership per cluster. We provide three strategies to decide on $C_{Maj}$. They rely on qualitative values of membership degree in a fuzzy cluster: weak, moderate, and strong. The weak strategy keeps all clusters' current sizes and selects the most populated $C_j$ as $C_{Maj}$. Eq. (1) identifies the weak majority cluster ($C^{weak}_{Maj}$):

$C^{weak}_{Maj} = C_j,\; |C_j| = \max_{k=1,\ldots,Nb_{cluster}} (|C_k|),\; ME_{i,k} > 0,\ i \in [1, n]$   (1)

This strategy could turn out inappropriate when membership degrees are very small.

The moderate strategy retains the ratings in a cluster with a membership degree exceeding a fixed threshold and selects the most populated $C_j$ as $C_{Maj}$. Eq. (2) identifies the moderate majority cluster ($C^{moderate}_{Maj}$):

$C^{moderate}_{Maj} = C_j,\; |C_j| = \max_{k=1,\ldots,Nb_{cluster}} (|C_k|),\; ME_{i,k} > threshold,\ i \in [1, n]$   (2)

Last but not least, the strong strategy selects the cluster with the highest membership degree of ratings as $C_{Maj}$. Eq. (3) identifies the

Algorithm 1: Fuzzy C-means (Bezdek, 1981).

Input: $X_{i=1,n}$, $Nb_{cluster}$, $m$ (fuzzification coefficient), $\epsilon$ (termination criterion)
Output: $ME$, $C_{j=1,Nb_{cluster}}$
1. Initialize $ME \leftarrow ME^{(0)}$ and $k \leftarrow 0$
2. Calculate $centroid(C_j) = \sum_{i=1}^{n} (ME_{ij}^{m} \cdot X_i) \, / \, \sum_{i=1}^{n} ME_{ij}^{m}$
3. Update $ME^{(k)} \rightarrow ME^{(k+1)}$:
   $ME_{ij} = 1 \, / \, \sum_{p=1}^{Nb_{cluster}} \left( \frac{\|X_i - centroid(C_j)\|}{\|X_i - centroid(C_p)\|} \right)^{\frac{2}{m-1}}$
4. if $\| ME^{(k)} - ME^{(k+1)} \| < \epsilon$ then stop
   else $k \leftarrow k + 1$, return to step 2
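The iteration in Algorithm 1 can be sketched in Python; this is a minimal 1-D sketch under the standard fuzzy C-means update rules, not the authors' implementation, and the ratings, cluster count, and parameter values are illustrative:

```python
# Sketch of Algorithm 1 (fuzzy C-means) for 1-D ratings in [0, 1].
# All names and data are illustrative, not from the paper's experiments.
import numpy as np

def fuzzy_c_means(ratings, nb_cluster, m=2.0, eps=1e-5, seed=0):
    x = np.asarray(ratings, dtype=float)          # X_{i=1..n}
    rng = np.random.default_rng(seed)
    me = rng.random((len(x), nb_cluster))         # ME^{(0)}
    me /= me.sum(axis=1, keepdims=True)           # each row sums to 1
    while True:
        w = me ** m
        # Step 2: membership-weighted centroids.
        centroids = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        # Step 3: update memberships from distances to centroids.
        dist = np.abs(x[:, None] - centroids[None, :]) + 1e-12
        new_me = 1.0 / ((dist[:, :, None] / dist[:, None, :])
                        ** (2.0 / (m - 1.0))).sum(axis=2)
        # Step 4: stop when the membership matrix stabilizes.
        if np.linalg.norm(me - new_me) < eps:
            return new_me, centroids
        me = new_me

me, centroids = fuzzy_c_means([0.1, 0.15, 0.2, 0.8, 0.85, 0.9], nb_cluster=2)
print(me.round(2))  # each row: membership degrees across the two clusters
```

Unlike K-means, every rating keeps a graded membership in every cluster, which is what the majority-cluster strategies below exploit.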

strong majority cluster ($C^{strong}_{Maj}$):

$C^{strong}_{Maj} = C_j,\; \sum_{i \in [1,n]} ME_{i,j} = \max_{k=1,\ldots,Nb_{cluster}} \Big( \sum_{i \in [1,n]} ME_{i,k} \Big)$   (3)

Once $C^{strategy \in \{weak, moderate, strong\}}_{Maj}$ is established, the next step is to compute the credibility of end-users in this cluster. Eq. (4) identifies $u_i$'s credibility as a distance from his rating to the majority opinion represented by the centroid of $C_{Maj}$. This credibility is computed using the normalized Euclidean distance $\|\cdot\|_N$ as the similarity measure:

$CR^j_i = 1 - \|X_i - centroid(C^{strategy}_{Maj})\|_N,\quad strategy \in \{weak, moderate, strong\}$   (4)

In Section 5.2 we experiment with these three strategies in order to study their impact on the trust value and also their suitability for Web services selection. The next step in our approach is to assess WSs' trust according to end-users' credibility values.
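The three strategies and the credibility computation of Eq. (4) can be sketched as follows; the membership matrix, ratings, and threshold value are illustrative (1-D ratings, so the normalized distance reduces to an absolute difference), not the paper's data:

```python
# Sketch of the weak/moderate/strong majority-cluster strategies and Eq. (4).
# Memberships, ratings, and the 0.5 threshold are hypothetical values.
import numpy as np

me = np.array([[0.9, 0.1],   # ME_{i,j}: membership of rating i in cluster j
               [0.8, 0.2],
               [0.3, 0.7],
               [0.6, 0.4]])
ratings = np.array([0.9, 0.85, 0.2, 0.7])  # X_i (1-D for simplicity)

def majority_cluster(me, strategy, threshold=0.5):
    if strategy == "weak":        # count every rating with nonzero membership
        sizes = (me > 0).sum(axis=0)
    elif strategy == "moderate":  # count ratings above a fixed threshold
        sizes = (me > threshold).sum(axis=0)
    else:                         # strong: highest total membership degree
        sizes = me.sum(axis=0)
    return int(np.argmax(sizes))

c_maj = majority_cluster(me, "strong")
# Centroid of the majority cluster, weighted by membership degrees.
centroid = (me[:, c_maj] * ratings).sum() / me[:, c_maj].sum()
# Eq. (4): credibility = 1 - normalized distance to the majority centroid
# (distances already lie in [0, 1] for these 1-D ratings).
cr = 1 - np.abs(ratings - centroid)
print(c_maj, cr.round(2))
```

Ratings close to the majority centroid get credibility near 1; distant (but possibly strict) ratings still receive a graded, nonzero credibility.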
4. Trust model

In this section, we describe two trust approaches (i.e., deterministic and probabilistic) that are used for assessing trust. The first consolidates end-users' ratings taking into account end-users' credibility, and the second relies on probability theory coupled with possible worlds semantics.
4.1. Deterministic trust assessment

Let $L_i$ be the set of end-users ($u_k$, $k \neq i$) who send $u_i$ ratings ($X^j_k$) about WS$_j$'s performance. $u_i$ estimates WS$_j$'s trust ($T^j_{X_{L_i}}$) according to both $X^j_k$ and $CR_k$. Eq. (5) establishes $T^j_{X_{L_i}}$ as a weighted average of $X^j_k$:

$T^j_{X_{L_i}} = \dfrac{\sum_{k \in L_i} (CR_k \times X^j_k)}{\sum_{k \in L_i} CR_k}$   (5)

$T^j_{X_{L_i}}$ is a good trust measure but does not allow claiming that WS$_j$ is the best trusted, due to the limited knowledge about WS$_j$ amongst $L_i$. As a solution, $u_i$ can consult an additional set of peers ($L'_i$) who have already established trust values for WS$_j$ using other ratings given by $u_{m \neq k}$. Therefore trust can also be measured using WS$_j$'s reputation ($REP^j_i$). Sabater and Sierra (2002) define reputation as the opinion (or view) of someone about something. Ramchurn et al. (2004) refine Sabater and Sierra's definition by stating that this view can be mainly derived from an aggregation of opinions of members of the community about one of them. Ramchurn et al. distinguished between trust and reputation; the former is derived from direct interactions and the latter is mainly acquired from the environment or other agents and ultimately allows establishing trust. Eq. (6) assesses $REP^j_i$:

$REP^j_i = \dfrac{1}{|L'_i|} \sum_{m \in L'_i} T^j_{L_m}$   (6)

Finally, Eq. (7) combines both $T^j_{X_{L_i}}$ and $REP^j_i$ with $\alpha$ and $\beta$ as $u_i$'s preference scores, respectively:

$T^i_j = \alpha \cdot T^j_{X_{L_i}} + \beta \cdot REP^j_i$   (7)

4.2. Probabilistic trust assessment

To keep the paper self-contained, we briefly review probabilistic databases. Readers are referred to the work of Cavallo and Pittarelli (1987) for more details. Furthermore, we discuss how our probabilistic database is structured using a tuple-independent uncertainty-model and how trust is assessed based on query evaluation.

4.2.1. Probabilistic databases in brief

Formally, a Probabilistic DataBase $ProbDB = (S, T, prob)$ is a triple consisting of a database Schema ($S$), a finite set of Tuples ($T$), and a function $prob$ that assigns a probability value to each tuple $t \in T$. $S$ defines Probabilistic Relations $ProbR$ represented as $ProbR(A_1, \ldots, A_m, p)$, where $A_1, \ldots, A_m$ denote a finite set of Attributes and $p$ denotes the probability value attached to $t$ in a relation instance of $ProbR$. $prob(t)$ represents the confidence that the tuple exists in the database, i.e., a higher value of $prob(t)$ means a higher confidence that $t$ is valid. The Semantics ($Sem$) of $ProbDB$ is defined through the possible worlds model (Dalvi and Suciu, 2007). Cavallo and Pittarelli (1987) define $Sem(ProbDB)$ as a discrete probability space over a finite number ($n$) of database instances. They refer to the various alternative states of $ProbDB$ as possible worlds ($pwd_k$). A $ProbDB$ with $n$ tuples can include $2^n$ possible worlds, i.e., one for each subset of tuples. Possible worlds express the following uncertainty: one of the possible worlds is true, but we do not know which one, and the probabilities represent degrees of belief in the various possible worlds (Huang et al., 2009). Formally, $Sem(ProbDB) = (PWD, P)$ where $PWD = \{pwd_1, \ldots, pwd_n\}$ and $P: PWD \rightarrow [0, 1]$ such that $\sum_{j=1,n} P_j = 1$.

Different data models exist to handle uncertainty in databases. For instance, tuple-level uncertainty models reported by Fuhr and Rölleke (1997) and Sarma et al. (2006) associate existence probabilities with tuples. These models are attractive in data integration for multiple reasons (Sen and Deshpande, 2007): they typically result in first-normal form relations; they provide simple and intuitive querying semantics; and they are easier to store and manipulate compared to attribute-level uncertainty models. The independent tuple-level uncertainty model (e.g., Dalvi and Suciu, 2007) is commonly used for data integration and information extraction in probabilistic data management. In this model, $ProbDB$ is an ordinary relational database where each tuple is associated with a probability of being true regardless of any other tuple.

4.2.2. Our probabilistic data-model

Our trust approach aims at designing $ProbDB$ in order to assess trust. Let us consider the following tuple $t$: "$u_i$ has correctly observed that WS$_j$ satisfies his requests". The uncertainty here reflects the probability ($prob(t)$) that $t$ occurs. Therefore, $prob(t)$ means the extent to which this observation is true. When $prob(t)$ is equal to 1 (resp. 0), $t$ is valid (resp. is not) in all cases. A probability $prob(t) \in\, ]0, 1[$ means that $t$ can occur in some cases only. We model this uncertainty by $CR_i$.

To design $ProbDB$ we first pre-process a traditional relational DataBase ($DB$) that contains, on top of collected ratings, additional information on service providers and evaluation periods. To obtain $ProbDB$ we extract relevant views from $DB$ for trust assessment and add extra details, such as credibility values obtained by the credibility model in Section 3.2, to these views. Thus, $DB$ is built upon an extended schema compared to $ProbDB$.
For illustration purposes we assume a database that contains one probabilistic relation ProbR(service, end-user, rating, p), where service, end-user, and rating denote the Web service's identifier, the end-user's name, and the satisfaction degree of the end-user in this service (Fig. 1a). ProbR consists of three tuples t1, t2, and t3 with probabilities 0.12, 0.84, and 0.88, respectively. The latter correspond to credibility values computed by using our credibility model on a random dataset.

Fig. 1b shows the possible worlds $pwd_k$ for $ProbDB$ and their associated probabilities ($P_k$). Each $pwd_k$ contains a subset of the tuples present in $ProbDB$. $P_k$ is calculated using the independence assumption (multiply together the existence probabilities of tuples present in $pwd_k$ and the non-existence probabilities of tuples not present in $pwd_k$). For example, $P_2$ for $pwd_2 = \{t1, t2\}$ is computed as $0.12 \times 0.84 \times (1 - 0.88) = 0.01$.


Fig. 1. Probabilistic database illustration.

Possible world interpretation is highly intuitive and offers a concise semantics for query evaluation over probabilistic databases. Let $\sigma_{service=WS_1}$ be a query that looks for WS$_1$ in certain tuples in $ProbDB$. This query is evaluated against each $pwd_k$ separately ($\sigma(pwd_k)$). The probability associated with $\sigma(pwd_k)$ corresponds to $P_k$. The result is in $\bigcup_{k=1,8} \sigma(pwd_k)$, which contains a set of tuples $t'$. Eq. (8) assesses $Prob(t')$:

$Prob(t') = \sum_{k,\ t' \in \sigma(pwd_k)} P_k$   (8)

Fig. 2a shows the results of executing $\sigma_{service=WS_1}$ on $pwd_{k=1,8}$. $\sigma(pwd_6)$ and $\sigma(pwd_8)$ result in an empty set but with non-zero probabilities. Although the set is empty, it could be relevant for an end-user as this indicates that the existing data are not sufficient to infer relevant answers. Fig. 2b also shows the final probability computation. Data inaccuracy leads to a large number of answers with low probabilities and thus low precision. End-users would appreciate receiving answers with high probabilities.
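Eq. (8) can be sketched as a selection query evaluated world by world; the tuples and their attribute values are illustrative, with probabilities mirroring the running example:

```python
# Sketch of Eq. (8): Prob(t') as the sum of P_k over the worlds whose
# selection result contains t'. Data are illustrative, not the paper's.
from itertools import product

# (service, end-user, rating) -> existence probability
tuples = {
    ("WS1", "u1", 0.9): 0.12,
    ("WS1", "u2", 0.7): 0.84,
    ("WS2", "u3", 0.5): 0.88,
}

def answer_probability(predicate):
    """Evaluate a selection over every possible world and sum P_k per answer."""
    result = {}
    for bits in product([True, False], repeat=len(tuples)):
        world = [t for t, b in zip(tuples, bits) if b]
        p_world = 1.0
        for t, pr in tuples.items():
            p_world *= pr if t in world else (1 - pr)
        for t in world:
            if predicate(t):
                result[t] = result.get(t, 0.0) + p_world
    return result

res = answer_probability(lambda t: t[0] == "WS1")
for t, p in res.items():
    print(t, round(p, 2))
```

For a simple selection over a tuple-independent database this recovers each matching tuple's own existence probability, which is exactly what Eq. (8) predicts.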
We note that ProbR contains tuples linked to end-users who provide ratings for different services. These end-users can be constant (i.e., always credible or not) or inconsistent (i.e., swing back and forth from credible to uncredible) in their evaluations. Indeed, some end-users are more credible than others and provide correct ratings, while others are less credible and do the opposite. Let us consider two tuples t1 and t2 related to u1. If t1 is false, then it is false because u1 is wrong, and t2 is likely to be false, too. Thus, if one tuple is false, the probability that the other tuple is false increases as well. Therefore, the proposed probabilistic data-model does not comply with the independent tuple model (e.g., Suciu et al., 2011), in which each tuple is associated with a probability that needs to be independent from the rest of the tuples. It is worth noticing that representing probabilistic databases is straightforward only when all tuples represent independent events. However, more complex probabilistic databases can sometimes be decomposed into tuple-independent relations and then be normalized (Suciu et al., 2011).
Fig. 3 shows how we normalize $ProbDB$ (into $ProbDB_N$) as two tuple-independent probabilistic relations $PEER$ and $ProbR_1$. $PEER$ stores all end-users along with their respective credibility values. Since $PEER$ should often be updated, we treat it as a view instead of a table. $u_i$ is credible about WS$_j$ if his ratings are consistent. Eq. (9) assesses $u_i$'s credibility ($CR_i$) over the ratings he provided in the past:

$CR_i = \prod_j CR^j_i$   (9)

From $ProbR$ we compute $CR_1$ as $0.12 \times 0.84 = 0.1$. As $u_3$ provides only one rating, $CR_3$ remains the same in $PEER$. $ProbR_1$ stores all tuples, which are now independent subject to the end-user credibility.
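Eq. (9) and the resulting PEER view can be sketched as follows; the per-service credibility values mirror the running example, and the variable names are illustrative:

```python
# Sketch of Eq. (9): an end-user's overall credibility is the product of the
# per-service credibility values CR_i^j attached to his ratings.
# Values mirror the running example (u1 rated two services, u3 one).
from math import prod

ratings_cr = {            # end-user -> per-service credibility values CR_i^j
    "u1": [0.12, 0.84],
    "u3": [0.88],
}

peer = {u: prod(crs) for u, crs in ratings_cr.items()}  # the PEER view
print(round(peer["u1"], 2))  # 0.1  (= 0.12 * 0.84)
print(peer["u3"])            # 0.88 (single rating: unchanged)
```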
4.2.3. Trust assessment as a query evaluation

To establish WSs' trust from $ProbDB_N$ we develop specific queries. An end-user trusts WS$_j$ if it has successfully satisfied a large number of end-users' requests. As mentioned earlier, WSs' trust establishment consists of aggregating end-users' ratings into one probabilistic value. This can be expressed using a SQL query SELECT AVG to obtain the rating average value from $ProbR_1$. Intuitively, applying this query on $pwd_k$ means that the end-users in $pwd_k$ jointly


Fig. 2. Query evaluation in ProbDB .

Fig. 3. ProbDB normalization.

Fig. 4. Query evaluation on ProbDB N .

observe that WS$_j$ satisfies their requests with probability $P_k$. Let $F_{AVG(rating)}(\sigma_{service=WS_1})$ be the following SQL query:

SELECT AVG(rating) FROM ProbDB_N WHERE service = WS1;

$ProbDB_N$ is interpreted as $2^5 = 32$ $pwd_k$. Fig. 4a shows $pwd_1$'s content. Fig. 4b shows that $F_{AVG(rating)}(\sigma_{service=WS_1})$'s evaluation returns four possible answers for the trust value (0.585, 0.2, 0.97, and the empty set), ordered by existence probability. Jayram et al. (2007) represent $F_{AVG()}()$'s result over probabilistic databases as a weighted average of the possible answers for the trust value.
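The aggregate evaluation above can be sketched as follows: each possible world yields one candidate AVG answer, and the final trust value is the probability-weighted average of the non-empty answers, in the spirit of Jayram et al.; tuples and probabilities are illustrative, not the paper's dataset:

```python
# Sketch: AVG(rating) under possible-worlds semantics. Each world yields one
# candidate answer; the trust value is the probability-weighted average of
# the non-empty answers (illustrative data, not the paper's dataset).
from itertools import product

# (end-user, rating) tuples for WS1 with existence probabilities
tuples = [("u1", 0.9, 0.12), ("u2", 0.7, 0.84), ("u3", 0.4, 0.88)]

answers = {}  # candidate AVG value (None for the empty set) -> probability
for bits in product([True, False], repeat=len(tuples)):
    world = [t for t, b in zip(tuples, bits) if b]
    p = 1.0
    for (_, _, pr), b in zip(tuples, bits):
        p *= pr if b else (1 - pr)
    avg = round(sum(r for _, r, _ in world) / len(world), 3) if world else None
    answers[avg] = answers.get(avg, 0.0) + p

# Probability-weighted average over the non-empty answers.
mass = sum(p for a, p in answers.items() if a is not None)
trust = sum(a * p for a, p in answers.items() if a is not None) / mass
print(sorted(((a, round(p, 3)) for a, p in answers.items()),
             key=lambda x: -x[1]))
print(round(trust, 3))
```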

Despite the simplicity of possible worlds semantics, it raises some serious computational concerns even for the simple query operations given by Dalvi and Suciu (2007). Many studies have shown that the query evaluation problem is #P-hard, and several algorithms (e.g., Dalvi and Suciu, 2007; Jayram et al., 2007) are provided to handle complex queries over massive data streams. In our work we adopt Jayram et al.'s algorithm (Jayram et al., 2007). Contrary to existing probabilistic approaches (e.g., Teacy et al., 2006; Zhou and Hwang, 2007), we propose SQL queries customized according to the end-users' preferences for trust assessment. Moreover, different types of


information can be extracted from ProbDB N . We list hereafter four


query variants for trust assessment:
1. Q1 returns the trust of WSj as an average of the ratings provided
about WSj :
Q1 : SELECT AVG(rating)
FROM ProbDB N
WHERE service = W S j ;
2. Q2
considers
that
we
have
a
predened
list
{u1 , . . . , uk }
of
credible
end-users
known
a
priori
and assesses trust as an average of their ratings:
Q2 : SELECT AVG(rating)
FROM ProbDB N
WHERE service = W S j
AND end-user IN (u1 , . . . , uk );
3. Q3 returns the trust of WSj as an average of the ratings provided about that WS from a given date:
Q3 : SELECT AVG(rating)
FROM ProbDB N
WHERE service = W S j
AND date 2014 01 01;
4. Q4 returns the trust of a provider (provi) as an average of the ratings provided about all of provi's WSs:

Q4: SELECT AVG(rating)
FROM ProbDBN
WHERE provider = provi;
5. The trust assessment framework
Our credibility model relies on a fuzzy clustering technique to assess end-users' credibility. Two credibility-based trust approaches (i.e., deterministic and probabilistic) are discussed. In this section we present the design and development of a trust assessment framework for WSs built upon these credibility and trust models. We also discuss the performance and robustness of this framework so that the quality of trust is established using both the deterministic and probabilistic approaches.
5.1. Framework design
Our framework includes three main components: feedback & trust
collector, credibility evaluator, and trust evaluator. For performance
purposes we suggest hosting these components on the client-side.
Fig. 5 illustrates a Web service-based environment that supports the
distributed trust assessment framework.
Upon subscription to the framework, trust managers are deployed on end-users' platforms. After each transaction, the end-user sends his trust manager feedback/ratings about the experience with a WS. These feedback/ratings are stored in the feedback database. A prospective end-user queries his trust manager for some specific WSs. Upon receipt of the request, the feedback & trust collector gathers a set of feedback/ratings either from end-users in the same community, which stems from different social networks, or from other trust managers. Then, the credibility evaluator computes each end-user's credibility based on Eq. (4). The end-users' credibility values are then used to generate the probabilistic feedback database. In this database, each tuple is associated with a probability value that represents the credibility value of the end-user who provided that feedback. The trust evaluator assesses deterministic trust upon the end-user's request. To this end, it uses other end-users' feedback/ratings (Eq. (5)). When end-users have strict security requirements, the trust evaluator estimates probabilistic trust as per Section 4. This provides end-users with a refined risk analysis. The trust evaluator also exchanges trust information about WSs, stored in the trust repository, with other end-users in the same community (Eq. (6)). Finally, the trust evaluator aggregates this information as per Eq. (7) with the end-user's preferences. Trust is assessed as a query executed over the probabilistic feedback database. Finally, the trust manager sends the end-user the most trusted Web service. In the rest of this section the components of the framework are explained.

Feedback & trust collector supports queries from prospective end-users about trust values in a given context. In our approach, both feedback/ratings and trust values are collected. Brokers are deployed over different social networks to make the collected information available when needed. These brokers dynamically update feedback/ratings after each transaction completion and trust values after each trust request. Therefore, an end-user receives the response from the trust manager more quickly compared to other approaches (Malik and Bouguettaya, 2009; Noor et al., 2013), where feedback/ratings and/or trust values are collected upon request. Moreover, the end-user is informed about any transaction and/or trust request involving a certain WS.
Credibility evaluator checks whether all feedback/ratings communicated by either end-users without trust managers or other trust managers are currently valid. To this end, the credibility evaluator screens the feedback repository to look for feedback/ratings from peers with social links, like friendship and supervision, with the prospective end-user. As per Section 3.2, three strategies establish the majority based on fuzzy clusters' characteristics, like the number of end-users with low and/or high membership degrees. Therefore, an end-user should specify his own low and high fuzzy values as membership functions according to the WS's context of use (e.g., critical).
Trust evaluator is an editor that allows end-users to define the preference scores used in Eq. (7) for trust specification. Certain trust queries require that the trust evaluator evaluate trust with constraints, such as restricting the evaluation to specific periods or to the collected feedback/ratings only.
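The interplay of the three components can be sketched as a single per-user pipeline. Class and method names below are ours, and the credibility stand-in (closeness to the median rating) merely occupies the slot of the fuzzy-clustering model of Eq. (4) and of the trust formulas of Eqs. (5)-(7):

```python
class TrustManager:
    """Sketch of the per-user trust manager pipeline (illustrative only)."""

    def __init__(self, feedback_db):
        # feedback & trust collector's store: (end_user, service, rating)
        self.feedback_db = feedback_db

    def collect(self, service):
        # feedback & trust collector: gather ratings for the requested WS
        return [(u, r) for u, s, r in self.feedback_db if s == service]

    def credibility(self, ratings):
        # credibility evaluator (stand-in): end-users whose ratings are
        # close to the median are deemed more credible
        values = sorted(r for _, r in ratings)
        median = values[len(values) // 2]
        return {u: max(0.0, 1.0 - abs(r - median)) for u, r in ratings}

    def trust(self, service):
        # trust evaluator: credibility-weighted average (deterministic case)
        ratings = self.collect(service)
        if not ratings:
            return None
        cred = self.credibility(ratings)
        den = sum(cred[u] for u, _ in ratings)
        return sum(cred[u] * r for u, r in ratings) / den if den else None

tm = TrustManager([("u1", "WS1", 0.8), ("u2", "WS1", 0.75), ("u3", "WS1", 0.2)])
print(tm.trust("WS1"))  # ≈ 0.667: the outlying rating 0.2 is down-weighted
```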

5.2. Prototype and experiments


We implemented the trust assessment framework in Java using the Eclipse IDE and PostgreSQL for feedback/ratings and trust storage, respectively. First, we developed different graphical user interfaces to cater for end-users' requirements like QoS preferences and trust level. The experiments analyze the quality of trust from the perspective of the framework's robustness and performance. Robustness is an important quality attribute when end-users heavily rely on the framework for executing critical applications. It is defined by the IEEE standard glossary of software engineering terminology as: "The degree to which a system or component can function correctly in the presence of invalid inputs or stressful environmental conditions" (IEEE, 1990). To interfere with the framework's execution, we purposely inject invalid feedback/ratings from malicious end-users.
Our experiments use the WS-Dream dataset1, which provides real-world QoS evaluation results from 339 end-users on 5,825 WSs. It also provides information about users in the format (user ID, user IP address, country, longitude, latitude) and WSs in the format (WS ID, WSDL address, provider name, country name). The QoS criteria are response time and throughput. In order to adapt the dataset to our work, we convert the QoS results into ratings. We first assess the appreciation of each end-user regarding the obtained QoS value as the ratio between the obtained and the desired QoS values. For instance, the desired value could be the best performance of the WS for that QoS (minimum or maximum depending on the QoS). Then we aggregate the appreciations of each QoS into one value that we consider as the end-user's rating.
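The conversion above can be sketched as follows. The cap at 1 and the plain mean used to aggregate the two appreciations are our assumptions, since the paper does not fix the aggregation function:

```python
def appreciation(obtained, desired, lower_is_better):
    """Appreciation of one QoS value as the ratio between the obtained
    and the desired value, capped at 1 (the cap is an assumption)."""
    if lower_is_better:                      # e.g., response time
        return min(desired / obtained, 1.0)
    return min(obtained / desired, 1.0)      # e.g., throughput

def to_rating(resp_time, best_resp_time, throughput, best_throughput):
    # aggregate the per-criterion appreciations into one rating; the
    # paper leaves the aggregation open, a plain mean is used here
    return (appreciation(resp_time, best_resp_time, True)
            + appreciation(throughput, best_throughput, False)) / 2.0

print(to_rating(0.4, 0.2, 50.0, 100.0))  # 0.5: half the desired performance
```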
The open-source library Apache Mahout was also used in the experiments for the fuzzy C-means algorithm (Section 3.2). The number of clusters c and the termination criterion ε were fixed to 3 and 0.05, respectively.
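For reference, the core updates of fuzzy C-means (Bezdek, 1981) on one-dimensional ratings can be sketched as below. This is a toy re-implementation, not the Mahout code used in the experiments, and the clusters' placement depends on the random initialization:

```python
import random

def fuzzy_c_means(points, c=3, m=2.0, eps=0.05, max_iter=100):
    """Toy 1-D fuzzy C-means: returns (centroids, membership matrix u),
    where u[k][i] is point k's membership degree in cluster i."""
    centers = random.sample(points, c)
    u = [[0.0] * c for _ in points]
    for _ in range(max_iter):
        # membership update: u_ki = 1 / sum_j (d_ki / d_kj)^(2/(m-1))
        for k, x in enumerate(points):
            d = [abs(x - v) + 1e-9 for v in centers]
            for i in range(c):
                u[k][i] = 1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0))
                                    for j in range(c))
        # centroid update: weighted mean of points with weights u^m
        new_centers = []
        for i in range(c):
            weights = [u[k][i] ** m for k in range(len(points))]
            new_centers.append(sum(w * x for w, x in zip(weights, points))
                               / sum(weights))
        converged = max(abs(a - b) for a, b in zip(new_centers, centers)) < eps
        centers = new_centers
        if converged:
            break
    return centers, u

random.seed(7)
ratings = [0.1, 0.12, 0.5, 0.52, 0.9, 0.92]
centers, u = fuzzy_c_means(ratings)
print(sorted(round(v, 2) for v in centers))
```

Unlike K-means, every rating keeps a graded membership in every cluster, which is what lets strict end-users retain a non-zero degree in the majority cluster.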

1 http://www.wsdream.net.

Fig. 5. Trust assessment framework.

5.2.1. Parameter setting


The first stage of the experiments consists of altering a variable ratio of the existing end-users' feedback/ratings in the dataset while preserving the feedback/ratings of the remaining end-users, known as normal. The former become either malicious or strict. This makes it possible to simulate attacks against the framework and analyze how it behaves in the presence of altered end-users.
Since malicious end-users tell the opposite of what they initially perceive, their satisfaction is reversed. Strict end-users usually expect from a WS exactly the published QoS. We alter these end-users' ratings by setting a lower desired value for the response time and a higher desired value for the throughput. Hence, their ratings will be more restrictive than those of normal end-users.
In our experiments we use three metrics. The first is the Root-Mean-Squared Error (RMSE). We use this metric to compare the trust values obtained before and after altering end-users' ratings. The other two are the recall and precision metrics used for evaluating search strategies. To assess these metrics we need to differentiate between malicious and non-malicious end-users. We considered that non-malicious end-users are those who belong to the majority cluster, and we singled out the malicious ones. This distinction also helps to differentiate between strict end-users' ratings, which can deviate from the majority, and those submitted by malicious ones. Singling out malicious end-users was driven by the inconsistency of their ratings over a certain time frame. The recall and precision metrics are relevant for checking, first, the success of our credibility model in finding the most relevant ratings, which come from non-malicious end-users, and second, the lack of some important ratings (i.e., whether it differentiates between true and false non-malicious end-users). Let A be the set of non-malicious end-users and B the set of end-users that belong to the majority cluster. Recall measures the capacity of the credibility model to find non-malicious users' ratings as per Eq. (10). Precision measures the capacity of our credibility model to reject malicious end-users' ratings as per Eq. (11). Both metrics are expressed as a percentage.

Recall = |A ∩ B| / |A|    (10)

Precision = |A ∩ B| / |B|    (11)
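Eqs. (10) and (11), together with RMSE, translate directly into code; the two sets below are illustrative:

```python
import math

def recall(A, B):
    # Eq. (10): share of non-malicious end-users found in the majority cluster
    return 100.0 * len(A & B) / len(A)

def precision(A, B):
    # Eq. (11): share of the majority cluster that is truly non-malicious
    return 100.0 * len(A & B) / len(B)

def rmse(actual, estimated):
    # trust error between values obtained before and after altering ratings
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, estimated))
                     / len(actual))

A = {"u1", "u2", "u3", "u4"}   # non-malicious end-users (illustrative)
B = {"u2", "u3", "u4", "u5"}   # majority cluster returned by the model
print(recall(A, B), precision(A, B))  # 75.0 75.0
```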

5.2.2. Experimental results


Experiments on credibility models. We conducted four experiments to evaluate the impact of the credibility model and the different strategies defined in Section 3.2 on improving the framework's performance and robustness.
The first experiment computes trust values based on two parameters: (i) the ratio of altered end-users in the dataset; and (ii) the choice of a predefined trust model Mi. We consider three trust models: M1 uses unweighted ratings (i.e., without considering end-users' credibility values), M2 uses the K-means clustering-based credibility model, and M3 uses the C-means clustering-based credibility model. Then, we assess the trust error using the RMSE metric. Fig. 6(a) shows that M3 returns a smaller trust error, which means more accurate trust values (i.e., closer to the actual ones) than the models relying on K-means clustering. Therefore, M3 helps improve the framework's robustness.
The second experiment compares the trust values obtained with M2 and M3 in the presence of strict end-users. Fig. 6(b) shows that the trust values obtained with M3 are always lower than those given by M2. This shows that strict end-users' ratings are properly taken into account by M3.
The third experiment analyzes the impact of the different strategies (i.e., weak, moderate, and strong) on the framework's performance in achieving realistic trust values. This experiment establishes the majority opinion using the different strategies, assesses end-users' credibility values and, finally, compares the obtained trust values. Fig. 7 shows that the moderate and strong strategies give a smaller and less oscillating trust error and thus the most accurate and consistent results. These strategies hold steady even when end-users' ratings are altered. The weak strategy's results are very unstable, and the trust error significantly increases when the ratio of strict end-users exceeds 20%. The weak strategy detects another cluster as a potential majority cluster and selects it as MC. The latter proves to be inappropriate.

Fig. 7. Strategy performance.

Fig. 6. Quality of trust.
The fourth experiment evaluates the capacity of our credibility model to find non-malicious end-users' ratings and reject malicious end-users' ratings using the recall and precision metrics. We consider two credibility models: the K-means clustering-based credibility model (CM1) and the C-means clustering-based credibility model (CM2). Fig. 8(a) and (b) show the improvement in recall and precision achieved using our credibility model. Both recall and precision have increased. Even when the recall given by CM2 was lower at some points, this did not impact the precision. Based on the experiment, the average precision has increased by 43% and the recall by 1.3%. The increase in precision is quite impressive and has even reached 100% in some tests. This shows the efficiency of CM2

Fig. 8. Precision and recall.

Fig. 9. Quality of deterministic versus probabilistic trust.

in rejecting malicious end-users' ratings. As for recall, the increase is neither very high nor stable, but it is still acceptable since we are only interested in including strict end-users in the majority cluster and avoiding malicious ones.
Experiments on deterministic versus probabilistic trust. Fig. 9 depicts the comparison between the deterministic and probabilistic trust approaches defined in Section 4. The experiment analyzes the performance of both approaches in achieving realistic trust values when altering the ratio of malicious end-users, using RMSE. It can be seen that the deterministic trust error oscillates more than the probabilistic trust error. The probabilistic approach always gives a smaller trust error. Beyond a ratio of 50%, both approaches present a much bigger trust error.
6. Conclusion
In this paper we proposed a new credibility-based model for assessing Web services trust. In this model the focus is on strict end-users who have no interest in aligning themselves with the majority opinion. Unfortunately, existing trust approaches exclude them from this assessment. To address this unfair exclusion, credibility is used as a selective factor to help improve trust assessment. Fuzzy clustering was used to determine an end-user's credibility. Two novel trust approaches, deterministic and probabilistic, are introduced for assessing Web services trust under the uncertainty that arises from the lack of consistent ratings that end-users provide over time and the inconsistency of the assessed QoS values. In the deterministic approach, two trust measures are proposed: end-users' feedback/ratings and Web services' reputation. The second trust approach relies on probabilistic databases that stem from probability theory coupled with possible worlds semantics. Our probabilistic database is structured around the tuple-independent uncertainty model. Trust is assessed by using specific queries applied to the probabilistic database. A trust assessment framework implements the proposed trust approaches. Finally, several experiments have been conducted to evaluate the impact of the credibility model on trust quality and to compare the trust results obtained with the deterministic versus the probabilistic approaches. The experiments demonstrated that trust quality substantially improves when using the credibility model. Even more stable results are obtained using probabilistic databases. As future work, we will explore the possibility of incorporating several credibility models into one trust model, using the block-independent uncertainty model to structure our probabilistic database.

References

Bezdek, J., 1981. Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer
Academic Publishers.
Bordens, K., Horowitz, I., 2001. Social Psychology. Psychology Press.
Buchegger, S., Boudec, J.-Y. L., 2004. A robust reputation system for peer-to-peer and
mobile ad-hoc networks. In: Proceedings of the Second Workshop on Economics
of Peer-to-peer Systems (P2P Econ). Cambridge, USA.
Cavallo, R., Pittarelli, M., 1987. The theory of probabilistic databases. In: Proceedings of
the Thirteenth International Conference on Very Large Data Bases. Brighton, England.
Cover, T., Thomas, J., 1991. Elements of Information Theory. Wiley.
Dalvi, N., Suciu, D., 2007. Efficient query evaluation on probabilistic databases. Very Larg. Data Bases (VLDB) J. 16 (4), 523–544.
Fuhr, N., Rölleke, T., 1997. A probabilistic relational algebra for the integration of information retrieval and database systems. ACM Trans. Inf. Syst. (TOIS) 15 (1).
Huang, J., Antova, L., Koch, C., Olteanu, D., 2009. MayBMS: a probabilistic database management system. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data. New York, USA.
IEEE, 1990. Standard Glossary of Software Engineering Terminology. Technical Report.
IEEE Computer Society Press.
Jayram, T.S., Kale, S., Vee, E., 2007. Efficient aggregation algorithms for probabilistic
data. In: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms.
New Orleans, USA.
Jøsang, A., 2001. A logic for uncertain probabilities. Int. J. Uncertain. Fuzziness Knowl.
Based Syst. 9 (3).
Kanungo, T., Mount, D., Netanyahu, N., Piatko, C., Silverman, R., Wu, A., 2002. An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 24 (7).
Kim, Y., Kim, D., 2005. A study of online transaction self-efficacy, consumer trust,
and uncertainty reduction in electronic commerce transaction. In: Proceedings of
the Annual Hawaii International Conference on System Sciences (HICSS). Hawaii,
USA.
Kyburg, H.E., 1987. Bayesian and non-Bayesian evidential updating. Artif. Intell. 3 (1).
Lesko, W., 1997. Readings in Social Psychology: General, Classic and Contemporary Selections. Boston: Allyn & Bacon.
Malik, Z., Bouguettaya, A., 2009. RateWeb: reputation assessment for trust establishment among web services. Very Larg. Data Bases (VLDB) J. 18 (4).
Mihail, P.M., Saberi, A., 2007. Random walks with lookahead in power law random graphs. Internet Math. 1 (1).
Nguyen, N., Caruana, R., 2007. Consensus clusterings. In: Proceedings of the Seventh
IEEE International Conference on Data Mining. Omaha, USA.
Noor, T., Sheng, Q., Ngu, A., Alfazi, A., Law, J., 2013. Cloud armor: a platform for
credibility-based trust management of cloud services. In: Proceedings of the ACM
Conference on Information and Knowledge Management (CIKM).
Ramchurn, S., Huynh, D., Jennings, N., 2004. Trust in multi-agent systems. Knowl. Eng.
Rev. 19 (1).
Ries, S., Habib, S., Mühlhäuser, M., Varadharajan, V., 2011. CertainLogic: A Logic for Modeling Trust and Uncertainty. Lecture Notes in Computer Science, 6740. Springer.
Sabater, J., Sierra, C., 2002. Reputation and social network analysis in multi-agent systems. In: Proceedings of the First International Joint Conference on Autonomous
Agents and Multiagent Systems: Part 1. Bologna, Italy.
Sarma, A., Benjelloun, O., Halevy, A., Widom, J., 2006. Working models for uncertain
data. In: Proceedings of the Twenty-Second International Conference on Data Engineering (ICDE). Atlanta, USA.
Schum, D., Morris, J., 2007. Assessing the competence and credibility of human sources
of intelligence evidence: contributions from law and probability. Law Probab. Risk
6 (1).
Selcuk, A., Uzun, E., Pariente, M., 2004. A reputation-based trust management system
for p2p networks. In: Proceedings of the International Symposium on Cluster Computing and the Grid. Chicago, USA.
Sen, P., Deshpande, A., 2007. Representing and querying correlated tuples in probabilistic databases. In: Proceedings of the International Conference on Data Engineering
(ICDE). Istanbul, Turkey.
Sternthal, B., Phillips, L., Dholakia, R., 1978. The persuasive effect of source credibility:
a situational analysis. Public Opin. Q. 42 (3).
Suciu, D., Olteanu, D., Koch, C., 2011. Probabilistic Databases. Synthesis Digital Library
of Engineering and Computer Science.
Teacy, W.T., Patel, J., Jennings, N.R., Luck, M., 2006. Travos: trust and reputation in the
context of inaccurate information sources. Auton. Agents and Multi Agent Syst. 12
(2).
Troffaes, M., 2006. Generalizing the conjunction rule for aggregating conflicting expert
opinions. Int. J. Intell. Syst. 21 (3).
Wang, Y., Singh, M., 2007. Formal trust model for multiagent systems. In: Proceedings
of the International Joint Conference on Artificial Intelligence. Hyderabad, India.
Weng, J., Miao, C., Goh, A., 2005. Protecting online rating systems from unfair ratings.
In: Trust, Privacy, and Security in Digital Business. In: Lecture Notes in Computer
Science, 3592.
Whitby, A., Jøsang, A., Indulska, J., 2004. Filtering out unfair ratings in Bayesian reputation systems. In: Proceedings of the Workshop on Trust in Agent Societies held at the Autonomous Agents and Multi Agent Systems Conference.
Xiong, L., Liu, L., 2004. Peertrust: supporting reputation-based trust for peer-to-peer
electronic communities. IEEE Trans. Knowl. Data Eng. 16 (7).
Yager, R.R., 2004. Participatory learning: a paradigm for building better digital and human agents. Law Probab. Risk 3 (1).


Yu, B., Singh, M.P., 2002. An evidential model of distributed reputation management.
In: International Joint Conference on Autonomous Agents and Multi-Agent Systems. Bologna, Italy.
Zhou, R., Hwang, K., 2007. Powertrust: a robust and scalable reputation system for
trusted peer-to-peer computing. IEEE Trans. Parallel Distrib. Syst. 18 (4).
Zohra Saoud is a PhD student at Université Lyon 1. She received a Master's degree in Databases and Web Technologies from the Université de Poitiers. Her main research interests include Web services, trust, and related fields. Contact her at zohra.saoud@liris.cnrs.fr.
Noura Faci is an associate professor in the Department of Computer Science at Université Lyon 1, France. Her research interests include fault tolerance, trust, service-oriented computing, social networks, and Enterprise 2.0. Faci received a PhD in Computer Science from Reims University, France. Contact her at noura.faci@liris.cnrs.fr.

Zakaria Maamar is a full professor in the College of Information Technology of Zayed University, Dubai, United Arab Emirates. His research interests include Web services, social networks, and context-aware computing. He has a PhD in computer science from Laval University, Quebec City, Canada.
Djamal Benslimane is a full professor of computer science at Lyon 1 University. His research interests include distributed information systems and Web services. He has published several papers in well-known journals (e.g., Communications of the ACM, ACM Transactions on Internet Technology, ACM Transactions on Software Engineering and Methodology, IEEE Transactions on Services Computing, IEEE Transactions on Systems, Man, and Cybernetics, IEEE Internet Computing, Data & Knowledge Engineering).
