1 Introduction
R. Setchi et al. (Eds.): KES 2010, Part II, LNAI 6277, pp. 591–600, 2010.
© Springer-Verlag Berlin Heidelberg 2010
592 H. Alvarez et al.
2 Related Work
There are different kinds of communities. Kim et al. [4] organize Social Web
Communities by describing the kind of users, the uses and the features needed
for every kind of community. Important missions for a Community of Knowledge
are sharing user-created content and retaining users, especially key members.
In this context, a VCoP is also a Community of Knowledge; therefore, it should
accomplish such missions, but it must also ensure that community members
fulfill their goals (or purpose) when using the VCoP.
Concept-Based SNA on VCoP 593
classified experts' opinion. The advantage of this work is that experts are
identified, making it a good example of expert-oriented enhancements.
Unfortunately, this approach assumes that we already know who the experts are.
In both cases we are concerned about who the experts are and how they are
obtained. For that reason, this work focuses on finding a new approach to
discover key members (experts or core members). The better we identify them,
the more relevant the results obtained by other techniques will be. Thus, it
will be possible to obtain better enhancements for the whole community.
A VCoP discusses specific main topics, and one of its concerns is that the
community does not deviate from them. Also, VCoPs have purposes (or goals) that
administrators want to accomplish. Therefore, it is important for a VCoP to
supervise its goals through time and analyse their evolution. Enhancements will
then surface based on this temporal analysis of purpose evolution [8].
Ríos et al. [8] define the VCoP goals of a website in collaboration with
community members. When goals are well established and defined, concept-based
text mining can be applied to evaluate the accomplishment of the community's
goals. Concept-based text mining uses fuzzy logic theory to assign a goal score
to every forum in the community. This goal score shows how aligned the text
inside a community forum is with this specific goal.
With these scores, VCoP administrators can evaluate goal accomplishment and
make the proper enhancements. For example, if two forums have similar scores
for a specific goal, an enhancement could be to merge them. On the other hand,
if a category has very high scores in two specific goals, it is possible to
split the forum into two independent forums, each closer to one goal.
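The merge and split heuristics described above can be sketched as follows. This is only an illustration: the function name, the forum and goal names, and the three thresholds (`merge_tol`, `merge_min`, `split_min`) are assumptions made for the example and do not come from [8].

```python
from itertools import combinations


def suggest_enhancements(scores, merge_tol=0.05, merge_min=0.5, split_min=0.8):
    """scores: {forum: {goal: score in [0, 1]}}. All thresholds are hypothetical."""
    merges, splits = [], []
    # Merge candidates: two forums with close (and non-trivial) scores for a goal.
    for (fa, sa), (fb, sb) in combinations(sorted(scores.items()), 2):
        for goal in sorted(set(sa) & set(sb)):
            close = abs(sa[goal] - sb[goal]) <= merge_tol
            if close and min(sa[goal], sb[goal]) >= merge_min:
                merges.append((fa, fb, goal))
    # Split candidates: one forum scoring very high on two or more goals.
    for forum, s in sorted(scores.items()):
        high = sorted(g for g, v in s.items() if v >= split_min)
        if len(high) >= 2:
            splits.append((forum, high))
    return merges, splits


# Hypothetical forums and goal scores, for illustration only.
example_scores = {
    "hardware": {"g1": 0.82, "g2": 0.10},
    "devices": {"g1": 0.80, "g2": 0.30},
    "general": {"g1": 0.90, "g2": 0.85},
}
merges, splits = suggest_enhancements(example_scores)
print(merges, splits)
```

The `merge_min` guard is a design choice of this sketch: without it, two forums that both barely address a goal would also be flagged as similar.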
In that work, the objective was to evaluate the goal accomplishment of a
VCoP's forums, but this does not help to evaluate how users contribute to the
accomplishment of the community's purpose, which makes it difficult to find the
VCoP's key members. However, the use of concept-based mining will improve the
search accuracy.
The main question of the present work is how to enhance key member discovery.
This question has no simple answer. The first step is to obtain a graph
representation of the inner social community. The second step is to apply a
core-member algorithm (such as HITS) to this representation. As a result of
applying the algorithm, we obtain a ranking of all community members, with the
expert (core or key) members at the top.
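The second step can be sketched with a small power-iteration version of HITS on a weighted directed graph. This is a minimal sketch, not the implementation used in the experiments: ranking members by authority score, the arc direction (replier to author) and the example arcs are all assumptions.

```python
def hits(arcs, iterations=50):
    """arcs: {(source, target): weight}. Returns (hub, authority) score dicts."""
    nodes = sorted({n for arc in arcs for n in arc})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of nodes pointing to it.
        auth = {n: 0.0 for n in nodes}
        for (src, dst), w in arcs.items():
            auth[dst] += w * hub[src]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub update: weighted sum of authority scores of nodes it points to.
        hub = {n: 0.0 for n in nodes}
        for (src, dst), w in arcs.items():
            hub[src] += w * auth[dst]
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth


def rank_members(arcs):
    """Rank members by authority score (an assumption), ties broken by name."""
    _, auth = hits(arcs)
    return sorted(auth, key=lambda n: (-auth[n], n))


# Hypothetical reply arcs: (replier, author replied to) -> weight.
example_arcs = {("U2", "U1"): 1.0, ("U3", "U1"): 2.0, ("U4", "U3"): 1.0}
print(rank_members(example_arcs))
```

In this toy network U1 receives the most reply weight and therefore ends up at the top of the ranking.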
We distinguish a key member from an expert member, since a key member has
several characteristics that define him/her. Firstly, a key member may or may
not be an expert in a field. Secondly, he/she may increase the interaction in
the community because he/she asks interesting questions, which elicit answers
from the experts in the field. This means that the questions are very specific
to a field; therefore,
only experts are able to answer them. In other words, a key member is a person
totally aligned with the VCoP's goals and topics, and thus produces content
that is very relevant to other members' interests. The only way to measure a
key member as we define him/her is to use a hybrid approach of SNA combined
with semantic-based text mining. Likewise, the mining process must always
include the different purposes of the community. This is why we chose the
concept-based text mining approach [8].
We can see a typical forum structure in Fig. 1, and Fig. 2 shows how the forum
is converted into a graph. In Fig. 2, arcs represent members' replies and nodes
represent the users who made the posts. In our first approach, the weight of an
arc is a count of how many times a member replies to another. The problem is
that we are not considering whether a member's reply is in accordance with the
community purpose (for any of these configurations). We have to filter noisy
posts. This will be done by applying concept-based text mining to the text of
the posts. Of course, the resulting networks will differ from those obtained
before because, in order to draw an arc, we now compute the similarity between
the concepts in a post and in its reply. If a post and a given reply are
sufficiently close, we say they are similar; therefore, post and reply are
relevant to a specific topic (concept).
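The first, count-based weighting can be sketched as below. The arc direction (from the replier to the author of the post being replied to), the decision to ignore self-replies and the example pairs are assumptions made for illustration.

```python
from collections import Counter


def reply_network(replies):
    """replies: iterable of (replier, author_replied_to) pairs.

    Returns {(replier, author): number of replies}, i.e. count-weighted arcs.
    """
    weights = Counter()
    for replier, author in replies:
        if replier != author:  # assumption: self-replies are not drawn as arcs
            weights[(replier, author)] += 1
    return dict(weights)


# Hypothetical replies, for illustration only.
example_replies = [("U2", "U1"), ("U3", "U1"), ("U3", "U1"), ("U4", "U3")]
print(reply_network(example_replies))
```

This version counts every reply, including noisy ones, which is the limitation that the concept-based filtering is meant to address.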
Fig. 1. A typical forum structure; the circles show the users who posted
(Forum A: Post 1 by U1, Post 2 by U2, Post 3 by U3, Post 4 by U4, Post 5 by U1,
Post 6 by U3)
[Fig. 2: two graphs over the users U1, U2, U3 and U4, with weighted arcs]
score, we use fuzzy logic to evaluate how much a goal is contained in a single
post. Then, we have a post vector whose components are the goal accomplishment
scores of the post.
The idea is to compare two members' posts using Euclidean distance; if the
distance is over a certain threshold, there is interaction between them. We
believe this will help us to avoid, filter or erase irrelevant interactions.
For example, in a VCoP with k goals, let $p_j^{u}$ be post j of user u, which
is a reply to post i of user u' ($p_i^{u'}$). The distance between them is
calculated with Eq. (1).

\[
d(p_i^{u'}, p_j^{u}) = \frac{\sum_k g_{ik}\, g_{jk}}
                            {\sqrt{\sum_k g_{ik}^2}\,\sqrt{\sum_k g_{jk}^2}}
\tag{1}
\]

where $g_{ik}$ is the score of goal k in post i. The distance exists only if
$p_j^{u}$ is a reply to $p_i^{u'}$. After that, we calculate the weight of arc
(u, u'), denoted $w_{uu'}$, with Eq. (2).

\[
w_{uu'} = \sum_{i,j} d(p_i^{u'}, p_j^{u})
\tag{2}
\]
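A minimal sketch of Eqs. (1) and (2) follows. It reads Eq. (1) as a normalized dot product of the two goal-score vectors, which assumes square roots in the denominator and has the form of a cosine similarity. The threshold value, the input layout and the example vectors are hypothetical.

```python
import math


def d(g_i, g_j):
    """Eq. (1), read as a normalized dot product of two goal-score vectors."""
    dot = sum(a * b for a, b in zip(g_i, g_j))
    norm = math.sqrt(sum(a * a for a in g_i)) * math.sqrt(sum(b * b for b in g_j))
    return dot / norm if norm else 0.0


def arc_weights(replies, threshold=0.5):
    """Eq. (2): sum d over every (post, reply) pair between two users.

    replies: list of (replier, reply_vector, author, post_vector) tuples.
    Pairs scoring below the (hypothetical) threshold are treated as noise.
    """
    w = {}
    for u, g_j, u_prime, g_i in replies:
        score = d(g_i, g_j)
        if score >= threshold:
            w[(u, u_prime)] = w.get((u, u_prime), 0.0) + score
    return w


# Hypothetical goal-score vectors for a VCoP with two goals.
example = [
    ("U2", [0.9, 0.1], "U1", [0.8, 0.2]),  # reply aligned with the post
    ("U3", [0.0, 1.0], "U1", [1.0, 0.0]),  # reply on an unrelated goal
]
print(arc_weights(example))
```

With these vectors only the arc from U2 to U1 is kept, since the second reply shares no goal with the post it answers.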
User        Note
user2       Administrator
user1254
user37      Administrator
user808     he is not participating lately
user4
user1825
user999
user210     he is not participating lately
user240
user874
user321
user234     participation occasionally
user33
approach. We used PAJEK² to draw the networks. Afterwards, HITS was applied
to every network topology. These results are shown in Table 2.
Finally, we summarize all results in Table 3. This table shows the intersection
between the key members extracted using an algorithm and the key members from
the survey in Table 1. We mark users in both lists with an X. We can observe
that the concept-based approach discovered one additional key member in every
topology used.
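The comparison between an algorithm's ranking and the survey list can be sketched as a simple set intersection. The function name, the `top_n` cut-off and the user names below are invented for illustration and are not the experimental data.

```python
def mark_known_key_members(ranked, survey, top_n=10):
    """Mark with an X the top_n ranked users that also appear in the survey.

    Returns (rows, hits), where rows is a list of (user, mark) pairs.
    """
    known = set(survey)
    rows = [(user, "X" if user in known else "") for user in ranked[:top_n]]
    hits = sum(1 for _, mark in rows if mark)
    return rows, hits


# Hypothetical lists, for illustration only.
survey_list = ["alice", "bob", "carol"]
ranked_list = ["carol", "dave", "alice", "erin"]
rows, hits = mark_known_key_members(ranked_list, survey_list, top_n=3)
print(rows, hits)
```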
Concept-based text mining is used to discover new knowledge; therefore, we
wondered what kind of users are those not marked with an X in the table. To
find out, we once more asked the community expert. This time we showed him
the list of key members gathered from every algorithm (Table 2). Surprisingly,
² http://vlado.fmf.uni-lj.si/pub/networks/pajek/
he recognized that most of the users on the lists of key members were in fact
key members. He had forgotten many of them, since we are using data from 2002;
however, when he saw them on the list he remembered them, validating almost
every user as a key member. We marked with a () sign in Table 2 those members
who are not key members. We can observe that hits-cb-creator discovered 100%
key members. Unfortunately, hits-cb-reply was worse than hits-reply at
detecting key members.
5 Conclusion
We propose combining traditional SNA with data mining techniques in order to
produce results closer to reality and to gather useful knowledge for VCoP
enhancement.
We applied two network topologies to represent the VCoP: creator-oriented and
last reply-oriented networks. We used Plexilandia.cl, a VCoP with more than
2100 active members out of a base of 2500 members.
We showed that SNA combined with the concept-based text mining approach
outperforms SNA alone at discovering a VCoP's key members in the case of a
creator-oriented network topology. However, in the case of the last
reply-oriented network, SNA alone outperformed SNA plus the concept-based
approach.
We think the results are promising, since we used the whole history to perform
the analysis. Besides, we did not take ranking position into consideration,
which seems to be much closer to reality in concept-based SNA approaches. Thus,
we need more experimentation in order to show the real impact of our proposal.
Acknowledgments
The authors would like to thank the continuous support of Instituto Sistemas
Complejos de Ingeniería (ICM: P-05-004-F, CONICYT: FBO16); Initiation into