© 2010 IEEE.
Multidimensional Ontology modeling of Human Digital Ecosystems
affected by Social Behavioural Data Patterns
Abstract— Relational and hierarchical data modeling studies are carried out using a simple and explicit comparison-based ontology. The comparison is performed on relationally and hierarchically structured data entities/dimensions. This methodology is adopted to understand the human ecosystem as affected by human behavioural and social disorder data patterns. For example, comparisons may be made among human systems: between male and female, fat and slim, disabled and normal (physical impairment), normal and abnormal (psychological), smokers and non-smokers, and among different age group domains. There can be different hierarchies in which super-type dimensions are conceptualized into several sub-type dimensions and integrated by connecting their interrelated common data attributes. Domain ontologies are built on known-knowledge mining, and unknown relationships affected by social behaviour data patterns are thereby modeled. This study is useful in understanding human situations, behavioural patterns and social ecology, and can assist health and medical practitioners, social workers and psychologists in treating their patients and clients.

Index Terms— Ontology, digital ecosystems, social behavioural patterns, health and medicine.

I. INTRODUCTION
Human ecosystems do not exist independently, but interact in a complex web of human and ecological relationships connecting all (human) ecosystems to make up the bio- [...] ing such concepts and relationships that exist among humans. Groups or classifications can channel into a particular ecosystem based on the entities, dimensions and attributes.

II. PROBLEM STATEMENT
Data integration is a significant issue when integrating a wide variety of multidimensional data types. Human and social ecosystems possess multidimensional data attributes along with their instances. Connecting these systems and extracting knowledge that can be interpreted by health and medical professionals and social workers is a real problem.
Families interact with the environment to form an ecosystem. Families make good use of biological sustenance and economic maintenance, for themselves as well as for society, and balance psycho-social functions. All individuals and families as a group, irrespective of their identity, are interdependent, especially in their use of the earth's resources. Consequently, a balance is sought between cooperation and integration of the ecosystem: one that satisfies the demands and needs of individuals while respecting the family institution as a social system/entity, with its autonomy and freedom. Whether as an individual or as a group of individuals, human behaviour is connected to different dimensions of society. It is important to understand and model these interdependent human situations from data patterns (Fig. 1).

Fig. 1: Process model of domain ontology
[Diagram labels: Domain Ontology Modeling; Similarity, Comparative, Differential, Parallel; Specialization; Generalization; example dimensions: body size, food, emotions, habits, skin, greedy, living conditions, hobbies, human ecology, gender, human ecosystems, working conditions, age, human anatomy, traffic, anger, shapes, happy, sorrow]
4th IEEE International Conference on Digital Ecosystems and Technologies (IEEE DEST 2010)
Human behavioural patterns are also unpredictable. In the ecologically integrated environment, these behavioural patterns (eco-patterns) have definite impacts on an individual or among individuals. Humans express or react with a variety of emotions and feelings. Medical practitioners, social workers and psychologists treat numerous people with abnormal behavioural patterns daily. These patients are entities, described as conceptualizations of several dimensions with several attributes, based on their sex, body posture, skin colour, habits (living styles, including food habits) and age. There may be volumes of data items and instances associated with these dimensions for several people and their associated behaviour or emotion attributes. The behaviour, emotion, greed, anger, sorrow and happiness dimensions, which contribute to ecology, are interconnected with other dimensions and associated within the human ecosystem. Each character or property that is scalable is considered a dimension. All these dimensions are grouped into generalization and specialization categories. Manually handling these volumes of data and accessing them from different web and on-line/off-line sources are tedious processes. The authors propose ontology based data warehousing and mining to analyze these data for several categories and behavioural data mining. Domain ontologies are designed for knowl- [...]

[...] other engineering applications have been investigated by [5] and [8], demonstrating several multidimensional data structuring and data integration issues.

IV. METHODOLOGY AND FRAMEWORK
Data integration is a characteristic solution in a digital system framework. Several attributes are conceptualized in these diversified multidimensional data structures of different domain ontologies. A digital ecosystem, in the context of the ecology surrounding the human being, essentially creates value by making connections between the human being and the ecology of the social behaviour domain. These domains could be generalizations of gender, human body posture, skin colour, food habits and age and, in the context of social ecology, any specific activity/action or business (different domains) within which the human being is connected. These are all embedded in a networked society which can be simulated in different collaborative IT frameworks. The relationships between human being and ecology are conceptualized in multiple scalable dimensions, each of which has definite interactions with other dimensions and attribution to different scales. In the context of multidimensional ecology, time and space are significant scalable dimensions.
Relational and hierarchical ontologies are proposed. As demonstrated in Fig. 1, several concepts are drawn from entities and dimensions based on comparison, differential, similarity and parallelism to develop relational and hierarchical ontologies. The comparison based approach is more specific and explicit: several properties of different data dimensions are compared. In other words, this approach compares data instances of attributes of multiple dimensions taken from several relational and hierarchical data structural models warehoused in an integrated environment (Fig. 2). In the differential concept, different instances (possibly differing in unique attributes) are drawn based on attribute/property strengths and sizes. The similarity concept is drawn from the same data attribute property strengths and magnitudes. In the parallelism concept, attributes are drawn that are comparable to each other. These concepts may have several permutations and combinations among themselves, depending on the logic, scale and context in which attributes and dimensions are conceptualized.

Fig. 2: An integrated ontology framework for building Human Ecosystem
[Diagram labels: Data Mining; Human Ecology Data-instances Process Model; data extraction; Human Ecosystem data warehouse; Exchange process; Data Instances DB; Data Warehouse; RDBMS]

A. Components, Objectives & Dimensions of Ecosystems
Adaptation is a continuing process in an integrated system (ecosystem): each and every element contributing to the integrated ecosystem “respond, change, develop, and act-on and modify its environment.” Measures and scales are significant indicators of social attainment and contentment. For example, there are socially defined and achievable goals which possess social values, such as doctors treating patients, or need and desire fulfilment, rather than being outputs from the ecosystems. Social workers provide services to a variety of customers to fulfil their needs and attain their survival goals. Entities inherited from an integrated and holistic approach preserve bio-diversity from the genetic level (generalization, in a top/down hierarchy) to the community level (specialization). These are open, dynamic and complex systems. The components of this system focus on dynamic interrelations, including social, biological, political, economic and physical features. Humans are parts of ecosystems with dynamic interrelations. Broad spatial and temporal dimensions are attributable to scaling these ecosystems. Several institutions, individuals and groups are parts of these ecosystems.
Male and female entities and their attributes can be 'in common' and 'uncommon'. Humans of 'skinny' and 'fatty' postures are other entities. Human emotions and feelings are behavioural attributes with definite measures and scales. Acquiring and documenting these dimensions and attributes and organizing them logically are part of the current scope of work. All these dimensions and attributes are hierarchical (Fig. 3) and relational in nature.

[Figure, caption not recovered. Diagram title: "Integrating Human anatomy - Ecosystem domain ontology structures from Lateral (Geography) and Longitudinal (Periodic) Dimensions"; labels: Hierarchical Ontologies; Datasets from different sources (geography/periodic dimensions); Integration; Reference; Ontology storage; Human ecosystem domain ontology DB; Behaviour ecosystem domain ontology DB]

[...] postures, skin, habits and age attributes of human ecosystems. For example, similar sex-matching attributes modeled as comparable form a composite ontology in which the conceptualized multiple dimensions and their associated attributes are inherited. A short or tall person has a scalable attribute too.

[Figure, caption not recovered. Diagram steps: Identify, conceptualize and review data dimensions; Determine key dimensions; Check if fact is a dimension; Check if dimension is a fact; side labels: dimension conceptualization; climbing up the concept hierarchy; example dimensions: age, category, process, constraint, location (city, state, region), type, number, status]

[...] grained data structuring approach as shown in Fig. 4. In Fig. 5, the data connectivity between two different eco-systems is conceptualized, thus establishing unknown relationships. As demonstrated in Figs. 6-7, star schemas narrate the age-behaviour eco-system models, based on deduced and validated hierarchies of multiple data dimensions (from Fig. 4). Key data attributes and their instances have been used to model human eco-systems, with some instances shown in Table 1.

Fig. 5: Logically connected two ecosystems
[Diagram title: Connectivity between two ecosystems; labels: Human Ecosystem; Behaviour Ecosystem; Gender; Age; Emotion; Anger; Occupation; nodes S11, S21, S22, T11, T21]

Table 1: Hierarchies for each dimension (with instances) of the current data warehouse
Period-dimension:
  hour < day < month < year
location-dimension:
  location-probe < city < state < region < country
age-dimension:
  (infant, teen, adult, old) ∈ all (age)
  (1, 2, …, 4, 5, …, 10) ∈ infant
  (13, 14, …, 19) ∈ teen
  (20, 21, 22, …, 30) ∈ adult
  (45, …, 50, 55) ∈ old
gender-dimension:
  (male, female) ∈ all (gender)
  (10%, 20%, … 35%, 40% ... 55%) ∈ male
  (50%, …, 45%, 60%) ∈ male
  (35%, 23%, 45%, … 35%) ∈ male
  (11%, 35%, … 25%) ∈ female
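The concept hierarchies of Table 1 lend themselves to a small worked example. The following is a minimal Python sketch, assuming the age ranges listed in the table are inclusive and that ages outside every listed range stay unclassified. The names AGE_HIERARCHY, PERIOD_LEVELS, LOCATION_LEVELS, age_category and roll_up are hypothetical and do not come from the original study.

```python
from typing import Optional

# Age ranges as listed in Table 1 (assumed inclusive; ages that fall
# outside every listed range are left unclassified).
AGE_HIERARCHY = {
    "infant": range(1, 11),   # 1..10
    "teen": range(13, 20),    # 13..19
    "adult": range(20, 31),   # 20..30
    "old": range(45, 56),     # 45..55
}

# Ordered levels, finest first, as in Table 1.
PERIOD_LEVELS = ["hour", "day", "month", "year"]
LOCATION_LEVELS = ["location-probe", "city", "state", "region", "country"]


def age_category(age: int) -> Optional[str]:
    """Climb one level up the age hierarchy: age value -> category."""
    for category, ages in AGE_HIERARCHY.items():
        if age in ages:
            return category
    return None


def roll_up(levels: list, level: str) -> Optional[str]:
    """Return the next coarser level, or None at the top of the hierarchy."""
    i = levels.index(level)
    return levels[i + 1] if i + 1 < len(levels) else None
```

Under these assumptions, age_category(15) yields "teen" and roll_up(LOCATION_LEVELS, "city") yields "state", which corresponds to climbing one step up a concept hierarchy.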
Fig. 6: Age domain Multidimensional Ontology connecting two different fact tables
[Diagram title: Age domain - multidimensional schema; fact tables: Age Facts, Gender Facts; dimensions: Infant, Teen Age, Adult Age, Old Age, Period, Location, Male, Female; attributes include ID, Name, Type, Number, Units, Measures, Area, IQ]

Fig. 7: Multidimensional star schema model narrating attribute dimensions
[Diagram: Behaviour Ecosystem Facts table; dimensions: Emotion, Sorrow, Anger, Greedy, Happyness, Period, Age, Gender; attributes include ID, Name, Type, Area, Number, Units, Measures]

Factors affecting the human ecosystem are scalable and measurable, both qualitatively and quantitatively. These factors influence human anatomy [8] and the surrounding ecosystem, which is in turn interconnected through integrated domain ontologies. The knowledge mapped from these in- [...] operations can be deduced [13]. In the data preparation stage, human anatomy ontologies are integrated with ecosystems by means of their heterogeneous data instances, which appropriately guide the selection of data views to be mined. During the data mining stage, domain knowledge is supplied as constraints that direct the data mining procedures, for example by narrowing the search. During the interpretation stage (Fig. 8), domain experts of human anatomy and ecosystems validate and visualize the extracted data views.

Fig. 8: Knowledge Discovery from Multidimensional Data warehouse
[Diagram labels: Data Sources (Human Ecology, Social Ecology, Human Anatomy), each with a Wrapper; Data Cleaning; Data Integration; Data Transformation; Data Reduction; Initial Data; Reduced Data; Dataset Selection; Data Mining; Data Interpretation; Data processing and Interpretation; Multidimensional Ontology; Data Instances; Dimension and Fact Tables; Data Warehouse; Knowledge Base]

As a part of the knowledge building process, it is important to understand the logically organized data dimensions. A constellation schema is prepared with human eco-system data instances, as demonstrated in Fig. 9, interpreting multiple data instances.

Knowledge-base structure models: A humanecosystem_socialecosystem structure model Z for L (logic) is a 5-tuple (POSITION, AGE, GENDER, SKIN_COLOR, FOOD_HABIT, [...] (Logic1) is a 5-tuple (POSITION, EMOTIONAL, GREEDY, ANGER, SORROW, HAPPYNESS, PERIOD). Here S = U{(POSITION, AGE) R1} and U{(AGE, GENDER) R2} are domains of the Z structure, which consists of the union of two mutually disjoint sets, (POSITION, AGE) R1 and (AGE, GENDER) R2. (POSITION, AGE) is a set of individual entities of S, and R is a set of relationships between (POSITION, AGE) and (AGE, GENDER) entities. R is partitioned as the designer wanted, into R1 and R2, since the prior entity combinations are logically related.

Fig. 9: A constellation schema connecting human and behaviour ecosystems
[Diagram title: Connecting Dimensions; dimensions: Gender (Gender ID, Type, Perc, IQ), Period (Period ID, Month, Day, Year), Age (Age ID, Category), Location (Location ID, St, City, State, Country), Emotion (Emotion ID, Type, Response); sample dimension data instances as recovered: Teen; Dec 30 2001; Jan 1 2008; 1 Male; 2 Female; 50%, 35%, 75%; Er St, Sun, WA, AUS; 1 Inferior, Yes; 2 Superior, No]

If this 2-tuple is to be interconnected, it could be done through relational or hierarchical relationships, such as:
S = U{(POSITION, AGE) R1} and U{(EMOTIONAL, GREEDY) R2} are other domains of the Z1 structure and consists [...]
R is partitioned as the designer wanted, into R1 and R2, since the priority entity combinations are logically related.
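One way to read the structure models above is as a set S formed by the union of two mutually disjoint relation sets whose entities can then be interconnected. The following Python sketch illustrates that reading only. The sample instances and the names Human, Behaviour, R1, R2, S and connect are hypothetical, and joining on a shared POSITION value is an assumed way of interconnecting the two relations, not a method stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Human:            # a subset of the fields named in structure Z
    position: int
    age: int
    gender: str


@dataclass(frozen=True, order=True)
class Behaviour:        # a subset of the fields named in structure Z1
    position: int
    emotional: str
    greedy: bool


# Hypothetical instances; the values are illustrative only.
R1 = {Human(1, 15, "male"), Human(2, 47, "female")}
R2 = {Behaviour(1, "anger", False), Behaviour(2, "sorrow", True)}

# S is the union of the two relation sets, which must be disjoint.
assert R1.isdisjoint(R2)
S = R1 | R2


def connect(r1, r2):
    """Pair entities of the two ecosystems that share a POSITION value."""
    by_position = {b.position: b for b in r2}
    return [
        (h, by_position[h.position])
        for h in sorted(r1)
        if h.position in by_position
    ]


pairs = connect(R1, R2)
```

Frozen dataclasses are used here so that instances are hashable and instances of different classes never compare equal, which keeps the disjointness check meaningful.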
[...] tem.

Anatomy = (A1, A2, A3, … Ai); i = anatomy unit numbers; A = anatomy units
Am = U {(L1D, L2D … LmD) + (L1S, L2S … LmS)};
Lm.n = U {(Pm, (n, n+1, n+2) Pm, (n+3, n+4, n+5) Pm, (n+6, n+7, n+8) .)};
U = union; L = location; P = point; D = detection point; S = source point; m = number of treatment lines and n = number of infection/treatment points on each treatment line.

Data preparation, or pre-processing, is aimed at quality control of the data, data cleaning, transformation, reduction and data integration. Each has an impact on the others when data inconsistencies are detected and rectified. Data integration is done using the data warehousing approach, combined with ontology based multidimensional domain ontologies. Logically warehoused data are normalized (or denormalized for fine-grained structures), integrated and smoothed [9], making them ready for processing by data mining algorithms. Different data relationships are conceptualized [5] and interpreted among several data attributes of human ecosystems. These ontologies are also responsible for manipulating the conceptualization and contextualization of data transformation processes. Ontology facilitates a logical description of data, in a way that reduces the number of dimensions without altering the integrity of the original datasets.
Strategies for data reduction include data cube aggregation, dimension reduction, data discretization, and data selection.
Data cube aggregation produces data cubes for storing multidimensional aggregated data (e.g. extracted from a data warehouse) for OLAP (On-Line Analytical Processing) analysis [6]. For example, data on human relationships and socio-behavioural data held on millions of items are aggregated into each specific domain ontology.
[...] reduced format, with or without loss with respect to the original data set. For example, similarity dimension analysis is used for dimensionality reduction that applies to projections of initial data onto a space of similar dimensions. Again, these similar dimensions are segregated as per their scale and magnitude.

Fig. 10: Data views representation from multidimensional human and behavioral ecosystems
[Diagram: multidimensional data views along three axes: Monthly (Jan to Dec), Behaviour (Emotions), Category (Human Ecosystem: Posture, Occupation, Disability, Age, Gender, Poverty, Colour, FoodHabit); example view labels: Month = Oct; Behaviour = Emotion; Age = 30%; Behaviour = anger; Category = poverty; Gender = 45% (male)]

Data discretization is used to minimize the number of dimensions of each entity (according to the logical organization of their attributes, which can be classified into meaningful groups) and consequently helps the interpretation of mining results. These classifications could be made as per scalable dimensions. In such cases, the range of an attribute can be divided into several intervals (by means of histograms), which can further be iteratively aggregated into larger intervals. Scales are data dependent. If the user has a good understanding of the data attributes, an appropriate scale can be defined.
Data selection aims at identifying appropriate subsets among the initial set of attributes. This operation can be performed with the help of heuristic methods based on tests of significance or entropy-based attribute evaluation [13] measures such as the information gain. Data selection is one of the data reduction methods.

V. RESULTS AND DISCUSSION
In the digital world, the human ecosystem is essentially about creating value by making connections between human and social ecosystems in different domains, supported by different forms of collaborative IT frameworks such as integrated data warehousing and data mining approaches. Dimensions are subsetted, discretized and reduced to a finer, more specialized level.

Fig. 11a: Data View showing similarity and scalable data properties in different regions (age vs. literacy)
[Bubble plot; axis: X Coordinate (Easting)]

The dimension data instances plotted in bubble plots have different scales, similarities and also dis-similarities. As demonstrated in Fig. 11, different attribute combinations have been plotted, such as age vs. literacy, occupation vs.
poverty and poverty vs. emotion. Different groups and subsets of dimensions have been identified with respect to geographic dimensions.
Similar, dis-similar and scalable bubbles represent several dimensional magnitudes and the aggregations of these groups of dimensional attributes.

Fig. 11b: Data view showing similarity and scalable data properties in different regions (occupation vs. poverty)
[Bubble plot; axis: X Coordinate (Easting)]

Fig. 11c: Data view showing similarity and scalable data properties in different regions (poverty vs. emotion)
[Bubble plot; axes: X Coordinate (Easting), Y Coordinate (Northing)]

In recent years, the use of high dimensional data occupying a large number of database tables with millions of rows/columns has become popular. Also, competitive demand for rapidly building and deploying data-driven analysis is increasing, along with several data mining algorithms and solutions. A third trend is presenting analysis results to end-users in a form that can be easily interpreted to improve decision making. Ontology based data mining can emphasize scalable, reliable, fully automated and interpretable data structures that can address these data analysis challenges.

VI. CONCLUSIONS AND RECOMMENDATIONS
Ontology based data warehousing appears to be highly effective, due to fine grained data structuring, effective data integration and interoperability of data dimensions in different application scenarios. Ontology handles conceptualization, contextualization and semantics among conceptualized data attribute relationships. Knowledge building from these approaches is effective and efficient. Data models described in the present case study are more flexible. The main emphasis in model building will be on developing data mining approaches which are highly automated, scalable and reliable. Analysis of domain ontologies, and thus interpreting domain knowledge, are more challenging issues.

VII. SCOPE AND FUTURE WORK
[...]

VIII. REFERENCES
[1] Anderson, E.N. (1996). Ecologies of the Heart: Emotion, Belief, and the Environment. New York and Oxford: Oxford University Press.
[2] Gruber, T. (2007). Collective Knowledge Systems: Where the Social Web meets the Semantic Web. Web Semantics: Science, Services and Agents on the World Wide Web, doi:10.1016/j.websem.2007.11.011; http://tomgruber.org/
[3] Gruber, T.R. (1993). A translation approach to portable ontologies. Knowledge Acquisition, 5(2):199-220; http://ksl-web.stanford.edu/KSL_Abstracts/KSL-92-71.html
[4] Machiis, G.E., Force, J.E. and Dalton, S.E. (1994). Ecosystem management. Technical paper submitted to the Interior Columbia River Basin Project, University of Idaho, Moscow, Idaho 83843.
[5] Nimmagadda, S.L. and Dreher, H. (2007). Ontology-Based Data warehousing and Mining Approaches in Petroleum Industries. Chapter XI in Negro, H.O., Cisaro, S.E.G. and Xodo, D.H. (Eds.), Data Mining with Ontologies: Implementation, Findings and Frameworks, pp. 211-236. Information Science Reference, IGI Global, Hershey, PA, USA. http://www.igi-pub.com/reference/details.asp?ID=6844
[6] Pujari, A.K. (2002). Data Mining Techniques. University Press (India).
[7] Cadez, I., Smyth, P. and Mannila, H. (2001). Probabilistic Modeling of Transaction Data with Applications to Profiling, Visualization, and Prediction. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2001), pp. 37-46. ACM Press, New York, NY.
[8] Nimmagadda, S.L., Nimmagadda, S.K. and Dreher, H. (2008). Ontology based data warehouse modeling and managing ecology of human body for disease and drug prescription management. Technical paper presented and published at the international conference IEEE-DEST, Phitsanulok, Thailand.
[9] Hoffer, J.A., Presscot, M.B. and McFadden, F.R. (2005). Modern Database Management, Sixth Edition. Prentice Hall.
[10] Shawkat Ali, A.B.M. and Wasimi, S.A. (2007). Data Mining: Methods and Techniques. Thomson, p. 299.
[11] Nimmagadda, S.L., Dreher, H. and Rudra, A. (2005). Ontology of Western Australian petroleum exploration data for effective data warehouse design and data mining. Paper presented and published in the proceedings of the 3rd International IEEE Conference on Industrial Informatics, Perth, Australia, August 2005.
[12] Plastria, F., Bruyne, S.D. and Carrizosa, E. (2008). Dimensionality reduction for classification: Comparison of techniques and dimension choice. 4th International Conference, ADMA 2008, Chengdu, China, October 2008.
[13] Coulet, A., Smaïl-Tabbone, M., Benlian, P., Napoli, A. and Devignes, M.-D. (2008). Ontology-guided data preparation for discovering genotype-phenotype relationships. BMC Bioinformatics, 9(Suppl 4):S3.