
DOMAIN ONTOLOGY HEALTH INFORMATICS SYSTEM












INTRODUCTION

Health informatics combines computer science, information technology and medical
documentation. It integrates information science, computer technology and medical history
to collect health-related data and deliver health-related services. In addition, operational
engineering and health facilities are both needed to study and develop systems that manage the
growth of available information. This now mature discipline fulfils a crucial function in the
collection, storage, retrieval, synthesis and communication of newly discovered knowledge about
symptoms and diseases held in patient medical records.
In this project, textual medical data from patient records was transformed through information
extraction and feature selection, classified by disease based on ICD-10 diagnoses using a
decision tree algorithm, and stored in a medical database. The C4.5 machine learning algorithm
was used to construct pathways or provide mappings for existing medical or health informatics
ontologies.
This project is concerned with health care related knowledge: patients' diseases, symptoms and
diagnoses are among the primary fields that describe the application domain. These areas and
their meanings collectively define the ontology for health care. The emphasis is on acquiring
knowledge and supporting preventive care services, so the management of diagnoses requires both
the integration of knowledge and a data retrieval mechanism. The derived database can then be
used to diagnose health problems in terms of their implications and to provide better healthcare
services. Every method that supports this purpose is presented in the Domain Ontology Health
Informatics Classification (DOHIC) framework.









PROJECT CATEGORY : DATA MINING
Data mining (sometimes called data or knowledge discovery) is the process of analyzing
data from different perspectives and summarizing it into useful information: information that
can be used to increase revenue, cut costs, or both. Data mining software is one of a number
of analytical tools for analyzing data. It allows users to analyze data from many different
dimensions or angles, categorize it, and summarize the relationships identified. Technically,
data mining is the process of finding correlations or patterns among dozens of fields in large
relational databases.
Data mining consists of five major elements:
1. Extract, transform, and load transaction data onto the data warehouse system.
2. Store and manage the data in a multidimensional database system.
3. Provide data access to business analysts and information technology professionals.
4. Analyze the data by application software.
5. Present the data in a useful format, such as a graph or table.









SYSTEM ANALYSIS
EXISTING SYSTEM

The existing system introduces an ontology for patient medical records in healthcare
organizations. It includes a language structure that can be adopted by any kind of agent
application on multi-agent platforms, by stand-alone medical systems, or by independent
applications that follow the HL7 and FIPA standards. The ontology was constructed from a patient
medical record formatted as an HL7 message and wrapped according to the FIPA message structure
standard. The complete message structure can be summarized according to the provider's request
through a user interface built by the sender.
PROPOSED SYSTEM

A domain ontology can represent the particular meanings of terms as they apply to that domain,
derived from medical data. The meanings and usage of terms help provide information and
knowledge for a better health informatics service. In this paper, the proposed framework and
method for building an ontology from textual medical data is called Domain Ontology Health
Informatics Classification (DOHIC).












USE CASE DIAGRAMS

Fig 5.1 Use case diagram for server

Fig 5.1 Use case diagram for client

SYSTEM DESIGN

SYSTEM ARCHITECTURE

This section explains the proposed framework and method for building an ontology from classified
textual medical data, entitled Domain Ontology Health Informatics Classification (DOHIC).
The initial approach was to classify symptoms into groups of diseases. Textual data was
processed in three phases. In the first phase, data was extracted to classify the disease: terms
were extracted, and then a sliding window and TF-IDF were used to build the matrix of term
weights and frequencies. Groups of diseases were classified using the ICD-10 healthcare standard.
This healthcare base was also used for building the diagnostic system. C4.5 was applied to
classify documents using domain knowledge.
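
The term extraction and weighting step of the first phase can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `TermWeightSketch`, the window size of two tokens, the sample symptom sentences, and the TF-IDF variant (raw term count multiplied by ln(N/df)) are all assumptions, since the report does not state which variant was used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: sliding-window term extraction followed by TF-IDF weighting.
final class TermWeightSketch {

    // Slide a window of `size` tokens over the text, advancing one token at a time.
    static List<String> slidingWindow(String text, int size) {
        String[] tokens = text.toLowerCase().trim().split("\\s+");
        List<String> terms = new ArrayList<String>();
        for (int i = 0; i + size <= tokens.length; i++) {
            StringBuilder sb = new StringBuilder(tokens[i]);
            for (int j = 1; j < size; j++) {
                sb.append(' ').append(tokens[i + j]);
            }
            terms.add(sb.toString());
        }
        return terms;
    }

    // Term frequency: how often `term` occurs in one document's term list.
    static int termFrequency(String term, List<String> docTerms) {
        int count = 0;
        for (String t : docTerms) {
            if (t.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // Assumed variant: tf * ln(N / df); returns 0 if no document contains the term.
    static double tfIdf(String term, List<String> docTerms, List<List<String>> corpus) {
        int df = 0;
        for (List<String> doc : corpus) {
            if (doc.contains(term)) {
                df++;
            }
        }
        if (df == 0) {
            return 0.0;
        }
        return termFrequency(term, docTerms) * Math.log((double) corpus.size() / df);
    }

    public static void main(String[] args) {
        // Made-up symptom sentences, used only to exercise the methods.
        List<List<String>> corpus = new ArrayList<List<String>>();
        corpus.add(slidingWindow("high fever and dry cough", 2));
        corpus.add(slidingWindow("dry cough and chest pain", 2));
        corpus.add(slidingWindow("chest pain and nausea", 2));
        for (String term : corpus.get(0)) {
            System.out.println(term + " -> " + tfIdf(term, corpus.get(0), corpus));
        }
    }
}
```

Terms that occur in every document receive a weight of zero under this variant, which is why common phrases contribute little to the term matrix.
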
In the second phase, the classification and the relational database were converted to XML/OWL,
thereby translating them into the formal language of the defined ontological structure, and
relationships were identified in terms of word meaning.
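
One possible shape for the XML/OWL conversion is sketched below. It assumes the classification result is available as a child-to-parent map and writes each pair as an `owl:Class` with an `rdfs:subClassOf` link in RDF/XML. The class name `OwlExportSketch`, the concept names, and the assumption that identifiers contain only letters, digits and underscores (so no XML escaping is needed) are illustrative and not taken from the project.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: serialize a child -> parent classification as OWL classes in RDF/XML.
final class OwlExportSketch {

    // Each key becomes an owl:Class declared as a subclass of its mapped value.
    // Assumes names are XML-safe identifiers; no escaping is performed.
    static String toOwl(Map<String, String> childToParent) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\"?>\n");
        sb.append("<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n");
        sb.append("         xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\"\n");
        sb.append("         xmlns:owl=\"http://www.w3.org/2002/07/owl#\">\n");
        for (Map.Entry<String, String> e : childToParent.entrySet()) {
            sb.append("  <owl:Class rdf:about=\"#").append(e.getKey()).append("\">\n");
            sb.append("    <rdfs:subClassOf rdf:resource=\"#").append(e.getValue()).append("\"/>\n");
            sb.append("  </owl:Class>\n");
        }
        sb.append("</rdf:RDF>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up concept names, for illustration only.
        Map<String, String> hierarchy = new LinkedHashMap<String, String>();
        hierarchy.put("DryCough", "Cough");
        hierarchy.put("Cough", "RespiratorySymptom");
        System.out.print(toOwl(hierarchy));
    }
}
```
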
In the third phase, classification was made by domain names, using predefined keywords or
terms for the search query in the concept domain ontology. Finally, we converted the
classification structure to XML format to build the diagnostic system, which uses the symptoms
and searches for data by keyword and concept of interest. Fig. 4.1 below shows diagrammatically
how the Domain Ontology Health Informatics Classification (DOHIC) is derived.
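
The keyword search of the third phase might look something like the sketch below, which keeps a set of predefined keywords per concept and returns the concepts that contain every keyword in the query. The class `ConceptSearchSketch`, its method names, the all-keywords matching rule, and the sample concepts are assumptions made for illustration; the report does not describe the actual query mechanism.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: look up ontology concepts by predefined keywords.
final class ConceptSearchSketch {

    private final Map<String, Set<String>> keywordsByConcept =
            new LinkedHashMap<String, Set<String>>();

    // Register a concept with its predefined keywords (stored in lower case).
    void addConcept(String concept, String... keywords) {
        Set<String> set = new HashSet<String>();
        for (String k : keywords) {
            set.add(k.toLowerCase());
        }
        keywordsByConcept.put(concept, set);
    }

    // Return, in insertion order, the concepts whose keywords include every query term.
    // An empty query matches all concepts.
    List<String> search(String... query) {
        List<String> hits = new ArrayList<String>();
        for (Map.Entry<String, Set<String>> e : keywordsByConcept.entrySet()) {
            boolean matchesAll = true;
            for (String q : query) {
                if (!e.getValue().contains(q.toLowerCase())) {
                    matchesAll = false;
                    break;
                }
            }
            if (matchesAll) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up concepts and keywords, for illustration only.
        ConceptSearchSketch index = new ConceptSearchSketch();
        index.addConcept("Influenza", "fever", "cough");
        index.addConcept("Migraine", "headache", "nausea");
        System.out.println(index.search("Fever", "cough"));
    }
}
```
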




Fig.4.1 system architecture


MODULE DIAGRAM
On the server side, the server prepares useful medical data from the ICD-10 healthcare standard
dataset and other text documents. After the medical data has been prepared, the server passes
it to the classification module, where diseases are classified on the basis of symptoms. Once
classification is finished, the module forwards the data to the data management module. From
the data management module the server moves to the implementation module to process user
requests.
On the client side, the user starts the application by making a connection with the server.
The user management module checks user authentication against the user data management module.
The presentation module processes the result returned by the server and displays it.











Fig.4.2 Server Module












Fig.4.3 Client Module

[Diagram labels from Fig. 4.2 and Fig. 4.3: Start server, user, connection, Data Preparation,
Data Classification, Updation, Data management, Implementation, User management,
User data management, Presentation]

DATABASE DESIGN

Table Name: Login

Name        Type      Description        Constraints
Id          INT       User id            Primary key
Username    Varchar   Username           NOT NULL
Password    Varchar   Password of User   NOT NULL
Type        Varchar   Type of user       NOT NULL
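
On the Java side, a row of the Login table could be mirrored by a small value class that enforces the same NOT NULL constraints before anything is written to the database. The class name `LoginRowSketch` and its constructor are hypothetical; the report does not show how rows are represented in code.

```java
// Hypothetical sketch: in-memory counterpart of one row of the Login table.
final class LoginRowSketch {

    final int id;           // Id: primary key
    final String username;  // Username: NOT NULL
    final String password;  // Password: NOT NULL
    final String type;      // Type: NOT NULL

    LoginRowSketch(int id, String username, String password, String type) {
        // Mirror the NOT NULL constraints of the table definition.
        if (username == null || password == null || type == null) {
            throw new IllegalArgumentException("username, password and type must not be null");
        }
        this.id = id;
        this.username = username;
        this.password = password;
        this.type = type;
    }

    public static void main(String[] args) {
        // Made-up values, for illustration only.
        LoginRowSketch row = new LoginRowSketch(1, "alice", "secret", "admin");
        System.out.println(row.id + " " + row.username + " " + row.type);
    }
}
```
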












MODULE DESCRIPTION
SERVER MODULES

There are five server modules.

1. Data Management: This module is concerned with the management of data, i.e. managing the
details about diseases and symptoms.
2. Data Preparation: In this module the data is prepared to fulfil the user's needs. The
data is prepared through data mining techniques.
3. Classification: The mined data is classified using the C4.5 classification algorithm. The
disease is classified based on symptoms.
4. Data Updation: New diseases and symptoms are added to the database.
5. Implementation: In this module the server processes the search query sent by the user and
returns the result.
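
For the Classification module, C4.5 chooses the attribute to split on by its gain ratio, that is, the information gain divided by the split information. The sketch below computes that quantity for a single categorical attribute. It is not a full C4.5 implementation (no tree building, pruning or handling of continuous attributes), and the class name `GainRatioSketch` and the sample symptom data are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the gain-ratio criterion used by C4.5 to rank attributes.
final class GainRatioSketch {

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Shannon entropy, in bits, of a list of class labels.
    static double entropy(List<String> labels) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String label : labels) {
            Integer c = counts.get(label);
            counts.put(label, c == null ? 1 : c + 1);
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / labels.size();
            h -= p * log2(p);
        }
        return h;
    }

    // Gain ratio of splitting the records by one categorical attribute.
    // attribute[i] and labels[i] describe the same record.
    static double gainRatio(String[] attribute, String[] labels) {
        Map<String, List<String>> parts = new LinkedHashMap<String, List<String>>();
        List<String> all = new ArrayList<String>();
        for (int i = 0; i < labels.length; i++) {
            List<String> part = parts.get(attribute[i]);
            if (part == null) {
                part = new ArrayList<String>();
                parts.put(attribute[i], part);
            }
            part.add(labels[i]);
            all.add(labels[i]);
        }
        double remainder = 0.0;  // weighted entropy after the split
        double splitInfo = 0.0;  // entropy of the partition sizes
        for (List<String> part : parts.values()) {
            double w = (double) part.size() / labels.length;
            remainder += w * entropy(part);
            splitInfo -= w * log2(w);
        }
        if (splitInfo == 0.0) {
            return 0.0; // a single attribute value cannot split the data
        }
        return (entropy(all) - remainder) / splitInfo;
    }

    public static void main(String[] args) {
        // Made-up records: does the symptom "fever" separate the two diseases?
        String[] fever = {"yes", "yes", "no", "no"};
        String[] disease = {"flu", "flu", "cold", "cold"};
        System.out.println("gain ratio = " + gainRatio(fever, disease));
    }
}
```
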

CLIENT MODULES

This project contains two client modules.
1. Data Management: This module manages the user details. It also contains the
login process.
2. Presentation: This module displays the result to the client or user.




SOFTWARE TESTING
Testing in general is conducted to verify that the software meets specific criteria and
satisfies the requirements of the end user or customer. The high-level goal of testing is to
identify defects in the application, thereby permitting the prevention, detection and subsequent
removal of defects and the creation of a stable system. The primary goal of testing is to detect
as many defects as possible, increasing the probability that the application under test will
behave correctly under all circumstances and will meet the defined requirements, thus satisfying
the end user.

MODULE TESTING
Unit testing is concerned with testing the individual modules of the system. It focuses
verification effort on the smallest unit of software design, the module, and is therefore also
known as module testing. The modules of the system are tested separately, including the
validation checks on their fields. Unit testing focuses first on the modules, independently
of one another, to locate errors. This enables the tester to detect errors in coding and logic
that are contained within that module alone. Errors resulting from the interaction between
modules are initially set aside. Unit testing can be performed from the bottom up or from the
top down.

For each module in bottom up testing, a short program executes the module and provides
the needed data, so that the module is asked to perform the way it will when embedded within
the larger system. When bottom level modules are tested, attention turns to those on the next
level that use the lower level ones.
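
A bottom-up driver of the kind described above can be as small as the sketch below: a short program that feeds a bottom-level module the inputs the larger system would supply and reports any mismatch. The module under test, `normalizeSymptom`, is a hypothetical stand-in and not one of the project's modules; the test inputs are likewise made up.

```java
// Hypothetical sketch: a driver program for bottom-up testing of one low-level module.
final class BottomUpDriverSketch {

    // Stand-in bottom-level module: trims, lower-cases and collapses whitespace.
    static String normalizeSymptom(String raw) {
        return raw.trim().toLowerCase().replaceAll("\\s+", " ");
    }

    // Driver: each case is {input, expected}. Returns the number of failed cases.
    static int runDriver(String[][] cases) {
        int failures = 0;
        for (String[] c : cases) {
            String actual = normalizeSymptom(c[0]);
            if (!actual.equals(c[1])) {
                System.out.println("FAIL: input=\"" + c[0] + "\" expected=\"" + c[1]
                        + "\" actual=\"" + actual + "\"");
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"  Dry   Cough ", "dry cough"},
            {"FEVER", "fever"},
        };
        System.out.println("failures: " + runDriver(cases));
    }
}
```

Once the driver reports no failures for the bottom-level module, the same pattern is repeated for the modules on the next level that call it.
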
In this method of testing we will test the smallest units of the software, called modules. We
will test all the important paths to find any errors within the boundary of a module, so
white box testing is applied. We will be testing parts of the software rather than the entire
software. The test cases are as follows:

1. Test case for login
2. Test case for homepage
3. Test case for information extraction
4. Test case for classification
5. Test case for sending queries
6. Test case for evaluation
7. Test case for presentation
VALIDATION TESTING
Validation testing is the process of confirming that the software provides the processing
capability it is required to provide. Here we make sure that the software produces exactly the
result it is intended to produce. We will refer to the software requirements document in case of
conflict or misunderstanding with the client regarding software components. We will perform
black box testing once the software is complete, testing all the software components together.
We will prepare several sets of input or test data for which we derive the expected results
ourselves. We will enter this data into the software, obtain its results, and compare them with
the results that we derived. In this way we will check the validity of the software.
If there are problems with the software, we will create a deficiency list and record all
the problems there. We will test all the components and subcomponents of the software to
perform the validation test.

SYSTEM TESTING
System testing includes verification and validation, which can be classified as static
and dynamic. In the static method, the behavior of the system is not observed by executing
the system. In the dynamic method, the behavior of the system is observed by executing the
system. We have tested all the modules in our project separately, and they ran successfully.

TEST CASE AND THEIR PURPOSE
Each of the software items will be the test case for the integration testing. So each form
as a whole will be a test case. We will be testing each and every form for all the errors that
logically can occur in it. We will install the software and will run it.





REQUIREMENTS
HARDWARE SPECIFICATION
Operating system : Linux/Windows with JDK 6
Memory : 256 MB RAM, about 20 MB storage space
Internet : NIC (for PC), GPRS/EDGE/3G (for mobile)


SOFTWARE SPECIFICATION
Front end : Java (Server)
Operating system : Platform independent
IDE : NetBeans 7.0













CONCLUSION AND FUTURE ENHANCEMENT

The project extracted information from text documents and used the ICD-10 code standard to
identify a disease from the documented symptoms, with the C4.5 data mining technique as the
classifier. The sliding window technique was also used to extract information. The weight and
frequency of the terms used for symptoms were computed with TF-IDF. The extracted information
was stored in a database labelled medical data. The classifier achieved a result of 98.61%. A
number of steps, including converting the medical data to XML/OWL, were carried out to create an
ontology specific to our purposes. The rationale for applying the ontological method to the
disease ontology system was to represent subsumptive symptom relationships in ICD-10. The domain
ontology was used to show meaningful relationships among the symptoms. These relationships are
valuable for a health informatics service as they continually help discover and provide new
knowledge.
