
INFORMATION STORAGE AND RETRIEVAL TECHNIQUES FOR MOBILE HEALTHCARE

G. Anderson, S.D. Asare, A.O. Eyitayo, O.T. Eyitayo, D. Mpoeleng, T.Z. Nkgau, H. Nyongesa, and F. Ogwu Department of Computer Science University of Botswana Private Bag UB 00704 Gaborone, Botswana { andersong, asaresd, eyitayo, eyitayoo, mpoeleng, nkgautz, nyongesa, ogwufj } @mopipi.ub.bw
Abstract-This paper proposes an architecture for the storage and retrieval of multimedia information for mobile healthcare. Our coverage of storage mechanisms for multimedia information is centered on images, audio, video and textual contents, while the coverage of retrieval techniques is centered on audio requests, Internet connection, SMS and keypad queries. The paper highlights key issues and factors for technical consideration in the design of storage structure and retrieval mechanisms of mobile health information. The paper ends with the IHISM-SRA (Integrated Healthcare Information System through Mobile Telephony Storage and Retrieval Architecture): a proposed structure for the storage and retrieval of health information in the context of a less developed country, using mobile telephony.

Keywords-Digital inclusion, mobile healthcare, data storage and retrieval

1. Introduction
The IHISM project (Integrated Healthcare Information Service through Mobile Telephony) is sponsored by Microsoft and aims to take advantage of the penetration of mobile phones, especially cell phones, in Botswana to improve healthcare by minimizing patient visits to hospitals and periods of hospitalization, improving patient access to information, and enabling patients to manage their conditions better. The medical condition focused on is HIV/AIDS, since this is one of the largest problems in Botswana. One of the outputs of the project is an Information Service. Major challenges include the presentation and storage of the information needs of individual patients (including illiterate and semi-literate patients); the acquisition, translation and presentation of multimedia content in local languages; and the consolidation of these items in an information storage and retrieval tool. This paper discusses the relevant issues and suggests a framework for such a tool.

This research was made possible by a Microsoft Research grant, through its "Digital Inclusion Through Mobile and Wireless Telephony" programme (Reference Number 5703800404).

"Telemedicine can be broadly defined as the use of telecommunications technologies to provide medical information and services" [1]. It involves delivering healthcare over a distance using telecommunications equipment such as telephones (fixed and mobile), fax machines, and computers, and includes the transmission of medical information for the purpose of treatment and diagnosis. Telemedicine is a great asset for managed care of patients, especially in cases where the patients are dispersed over a relatively wide area. Patients may be unable to travel to see an appropriate doctor because of their medical condition, distance, or unwillingness to lose time at work, or they may be prisoners, making it very expensive to transport them. Telemedicine could involve synchronous communication (real-time consultation) as well as asynchronous communication, where data is stored for later use, for example for diagnosis at a later point in time.

The IHISM research project involves the construction of a healthcare information system. As far as current mobile healthcare services in Botswana go, at least one healthcare service provider is using a healthcare system accessible through telephony. A Universal Electronic Patient Record type of system could be developed (electronically stored health information about a patient uniquely identified using an identifier) [2]. Such a patient record system involves capturing, storing, processing and transmitting patient-specific healthcare data such as clinical and administrative data. This means a fixed format has to be developed.

Healthcare is the largest service industry in the world [3]. It requires not just the curing of illnesses but also their prevention. We believe that IT can go a long way towards helping this industry in Botswana, and that, because of the penetration of mobile devices (cell phones), mobile and wireless solutions are the most appropriate. This approach has the same design as m-commerce (mobile commerce) [4]. M-commerce involves viewing the service as being made up of three components: the customer (the patient), the producer (infrastructure providers), and management (of the system, e.g. healthcare workers). Based on work done by INET International Inc. (a Canadian e-business organization), Goldberg and Wickramasinghe [4] showed that

ISBN 978-89-5519-131-8 93560

1096 -

Feb. 12-14, 2007 ICACT2007

mobile/wireless solutions for healthcare can achieve four critical goals:
* Improve Patient Care (for example, avoiding wastage and providing timely information)
* Reduce Transaction Costs (a very big advantage, realized through the use of a central information source and the leveraging of existing infrastructure such as mobile networks)
* Increase Healthcare Quality (rapid deployment of new applications)
* Enhance Teaching and Research (medical data is easily accumulated and can be easily analyzed)

The IHISM project will construct a Healthcare Information System. Multimedia information will be stored in its various forms (for example, text for SMS and Web/WAP access, audio for telephone access, images for pictures of medication and symptoms, video for educational information), and therefore some sort of multimedia database management system will be required. Some of the data to be stored does not conform to the formats normally used in DBMS searches and retrieval, so specialized techniques have to be used, as discussed later in the paper. This paper is organized as follows: Section 2 discusses some systems that address issues similar to those we are trying to address. Section 3 discusses data storage requirements for mobile healthcare data that are practical enough for the data to be captured, processed, stored and disseminated efficiently. Section 4 discusses information retrieval techniques useful for mobile healthcare data. Section 5 contains the architecture we produced based on our research, and Section 6 provides a conclusion.

2. Related Work
The BEANISH project [5] is a collaboration between European and African governments, universities, private sectors, and NGOs to support the application and sharing of IST development in healthcare. It involves development of a system to support knowledge exchange between healthcare managers, providers and the community. One of the goals is to develop an Information System usable by healthcare workers with mobile devices. This would involve use of standardized protocols and services for great interoperability, enabling workers to collect data with mobile devices and transmit it to fixed infrastructure. This phase has not yet progressed very far.

aAqua (almost All QUestions Answered) is a portal for the support of the Indian agricultural community, handling multimedia information [6]. Farmers can ask questions, supplying information such as crop names, pesticides and fertilizers used, dosages, farm size, etc. Agri-experts and people with similar experiences post answers and may ask the farmer more questions. A follow-up telephone conversation takes place a week later to find out from the farmer whether the answer solved the problem, providing for feedback. Questions and answers in the online forums are stored in Universal Networking Language (UNL) [7], so answers can be given in a language different from the one the question was asked in. This allows for Semantic-Based Search techniques. Experts can also suggest keywords while looking into answers in more detail, and users can access discussion threads based on the keywords for possible solutions to their problems. Because the farmers require precise answers, it would not be a good idea to fully automate the system (human responses are needed), especially since questions may have errors in them and may include digital images. This leads us to the discussion on data storage requirements in the next section.

3. Data Storage Requirements

A mobile healthcare information system may require multimedia data, and therefore the following data types, similar to those proposed by Adjeroh and Nwosu [8]:
* Text
* Images. These could be color or black and white, e.g. photographs.
* Graphic Objects, e.g. ordinary drawings, sketches, and illustrations, or 3D objects.
* Animation Sequences, e.g. sequences of images or graphic objects.
* Video. Also a sequence of images (called frames), but usually recording a real-life event and usually produced by a video recorder.
* Audio. This is generated from an aural recording device.
* Composite Multimedia. This is formed from a combination of two or more of the data types listed above, e.g. video with a textual annotation.

There are two alternatives for storing such data: a file system with a custom set of programs for access, or a Database Management System (DBMS). The first option is not realistic because it does not allow for flexible queries and data manipulation, and lacks features such as concurrency control, recovery management, and facilities to produce reports, forms, and database statistics [9]. The mobile healthcare information system (IHISM) therefore requires a Multimedia Database Management System to store the data. The different data types require special methods for optimal storage, access, indexing, and retrieval. High-level abstractions should be provided to manage the different data types, and suitable presentation should be possible. Video, audio and animation sequences have temporal requirements, i.e. their presentation should not be distorted in terms of time (certain events must happen at given points in time relative to other points in time). A certain amount of data must be presented to the user within a given time for the presentation to seem natural to the user. Images, graphics, and video data have spatial constraints in terms of their content. For example, objects in an image or a video frame have some spatial relationship between them. Multimedia information also takes up huge volumes of storage space. There are also issues to do with storing multimedia information in a way that facilitates searches, because video, for example, is difficult to describe textually.
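One way to make hard-to-describe media searchable is to store a short textual annotation alongside each item, in the spirit of the composite type "video with a textual annotation" listed above. The following is a minimal sketch under that assumption, not the storage model adopted by IHISM. The names `MediaType`, `MediaItem` and `find_by_keyword`, the file paths, and the keywords are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class MediaType(Enum):
    """Data types listed in this section (after Adjeroh and Nwosu [8])."""
    TEXT = "text"
    IMAGE = "image"
    GRAPHIC = "graphic"
    ANIMATION = "animation"
    VIDEO = "video"
    AUDIO = "audio"
    COMPOSITE = "composite"


@dataclass
class MediaItem:
    """Hypothetical catalogue entry: the media is stored elsewhere (path),
    and a set of keywords acts as its textual annotation."""
    item_id: str
    media_type: MediaType
    path: str
    keywords: Set[str] = field(default_factory=set)


def find_by_keyword(items, keyword, media_type=None):
    """Return items annotated with `keyword`, optionally filtered by type."""
    wanted = keyword.lower()
    return [
        item for item in items
        if wanted in item.keywords
        and (media_type is None or item.media_type is media_type)
    ]


# Illustrative data only; none of these files exist in the project.
catalogue = [
    MediaItem("m1", MediaType.VIDEO, "videos/adherence.mp4", {"medication", "adherence"}),
    MediaItem("m2", MediaType.IMAGE, "images/tablet.jpg", {"medication", "tablet"}),
    MediaItem("m3", MediaType.AUDIO, "audio/clinic_hours.wav", {"clinic"}),
]

matches = find_by_keyword(catalogue, "Medication")
video_matches = find_by_keyword(catalogue, "medication", MediaType.VIDEO)
```

Keyword annotation is only one option; it trades retrieval quality for simplicity, since the search sees the annotation rather than the media content itself.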


Data in a multimedia database could be stored in main memory, online devices (like magnetic disks and optical disks, which store data currently being used), near-line devices (like optical storage and jukeboxes) and off-line devices (like magnetic tapes and optical storage, used to archive data), in order of decreasing speed, cost and probability of access, and increasing permanence, capacity, and access time [8].

Data models are very important for multimedia databases, so that the data can be stored and accessed uniformly. Data models for traditional databases include network, relational, semantic and object-oriented. Multimedia data is modeled either by using a multimedia data model on top of a relational or object-oriented database system or by developing a model from scratch, although the object-oriented route is generally considered the better one [8]. Data compression may be required for storage and communication if the amount of information becomes too big. The major issues around storing multimedia data are limited available storage, the bandwidth limits of the storage system and communication channel, and the multimedia data type's availability rate (the minimum amount of data per unit time required for the data to be considered of acceptable quality for presentation) [8].

A Medical Information System will be created, including a medical database. Both theoretical and practical issues have to be considered [3]. Theoretical issues include database evolution control, temporal/intelligent reasoning, and scalability. Practical issues include handling of multimedia data, data integration, mobile access to data, and cost. Also very important is the ease of use of the system.

Since the research project involves using mobile devices, another area of concern is data storage on mobile devices. Yu and Yu [10] concluded that user interface elements like text boxes should be avoided, and that option (radio) buttons, check boxes, and drop-down list boxes should be used instead. This implies that response data will be short, not long as might be the case in open-ended software. Healthcare organizations may be interested in questionnaires, and data stored in XML files is ideal for sharing between mobile device applications and full versions of questionnaire software on the desktop or server.

Liang et al. [9] investigated the use of relational and object-relational database management systems (RDBMS and ORDBMS) for storing CDA (Clinical Document Architecture) data. CDA is an XML-based markup standard consisting of a hierarchical structure for clinical documents, for ease of information exchange. A CDA document is a complete information object that can exist on its own, outside of any other description, and can include text, images, sounds, and other multimedia content. The document structure has three levels, each defined by a DTD (Document Type Definition). Level One is the root, and each additional level specifies more constraints on the document. The research concluded by recommending ORDBMS for prototypes and RDBMS for final versions of databases, since RDBMSs have better query performance and support ad-hoc queries, although they are more difficult to design.

Several information retrieval techniques have been employed in the literature.
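The short, fixed-choice questionnaire responses described above lend themselves to a compact XML representation. The sketch below uses Python's standard `xml.etree.ElementTree` module to write and read such a response. The element and attribute names (`questionnaireResponse`, `answer`, `patientId`, `questionId`) are assumptions made for illustration; they are not taken from CDA or from any IHISM schema.

```python
import xml.etree.ElementTree as ET


def build_response(patient_id, answers):
    """Serialize fixed-choice answers (question id -> chosen option) to XML."""
    root = ET.Element("questionnaireResponse", {"patientId": patient_id})
    for question_id, choice in answers.items():
        answer = ET.SubElement(root, "answer", {"questionId": question_id})
        answer.text = choice
    return ET.tostring(root, encoding="unicode")


def parse_response(xml_text):
    """Recover the patient id and the answers from the XML text."""
    root = ET.fromstring(xml_text)
    answers = {a.get("questionId"): a.text for a in root.findall("answer")}
    return root.get("patientId"), answers


# Hypothetical questions with radio-button style answers.
xml_text = build_response(
    "P-0001",
    {"took_medication_today": "yes", "side_effects": "none"},
)
patient_id, answers = parse_response(xml_text)
```

Because every answer comes from a fixed list of options, the same document can be produced on a handset and consumed unchanged by desktop or server software.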

4. Retrieval Techniques For Mobile Services


Information retrieval techniques range from knowledge-based techniques, probabilistic techniques, neural network techniques, and symbolic learning to genetic algorithms [11]. These are classified as machine-learning techniques. The aim of employing machine learning is to improve the effectiveness of the information retrieval system, and in particular to adapt the retrieval system to individual users. For this reason, machine learning techniques offer an approach for generating and adapting user models in IR (Information Retrieval) applications. In this section, we discuss three of these retrieval techniques. Our aim is to compare them and then adopt the best technique for use in mobile services.
4.1. Symbolic Machine Learning
According to [11], inductive learning (learning by example) appears to be the most promising symbolic machine learning technique for knowledge discovery or data analysis. In this technique, positive and negative examples are input to an algorithm implementing the technique, and knowledge (the algorithm output) is represented in the form of symbolic descriptions (production rules or concept hierarchies). Usually, the input data (examples) is composed of knowledge about documents, users and queries. It is pointed out in [11] that application of symbolic machine learning techniques is sparse. However, [12] presents a system which uses inductive learning techniques for information retrieval. A major drawback, according to [12], seems to be the size of the training data required.

4.2. Neural Networks
A neural network is composed of a large number of interconnected processing elements (called neurons) working in tandem. Knowledge is acquired through learning and is remembered by this network of interconnected neurons, weighted synapses and threshold logic units. In information retrieval, the neurons correspond to documents, queries and keywords. As a result, neural networks use the vector space model: each document (and each query) is represented by an n-dimensional feature vector (usually binary). Their greatest strength is that they are able to represent both linear and nonlinear relationships and to learn these complex relationships from input data. Major drawbacks to using neural networks include:
* They usually require a large training set
* They are computationally intensive
* Results can sometimes be hard to interpret
Reference [13] presents a system which is used to both retrieve and learn about documents. An application of neural networks in medical information retrieval is discussed in [14]. The reader is referred to [15] for more on neural networks and their application.

4.3. Genetic Algorithms
Though genetic algorithms are primarily used to solve optimization problems, their use for information retrieval is gaining ground. They also use the vector space model. Genetic processes (mutations, crossovers and selection) are then applied to the initial document population. After several evolutions, the document population at that time is selected as the response to the query. In reference [16], genetic algorithms are used to adapt various document matching functions in an attempt to improve document retrieval. Maleki-Dizaji [17] uses genetic algorithms for user-modeling of adaptive and exploratory behavior in an information retrieval system. In [17], the choice of the underlying genetic operators is mentioned as a major drawback in the use of genetic algorithms.
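The following sketch illustrates, under stated assumptions, how the vector space model and the genetic processes described above could fit together. Documents are binary keyword vectors, the initial population is the document collection itself, and selection, one-point crossover and mutation evolve a keyword vector that best matches the documents a user marked as relevant. The vocabulary, the four documents, the Jaccard-based fitness function and all parameter values are illustrative choices of ours; the systems in [16] and [17] are not reproduced here.

```python
import random

# Hypothetical vocabulary; each document is a binary vector over it.
VOCAB = ["hiv", "arv", "dosage", "diet", "clinic", "malaria"]
DOCS = {
    "d1": [1, 1, 1, 0, 0, 0],
    "d2": [1, 1, 0, 1, 0, 0],
    "d3": [0, 0, 0, 1, 1, 0],
    "d4": [0, 0, 0, 0, 1, 1],
}


def jaccard(a, b):
    """Jaccard similarity between two binary vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def fitness(chromosome, relevant):
    """Mean similarity of a keyword vector to the user-marked documents."""
    return sum(jaccard(chromosome, DOCS[d]) for d in relevant) / len(relevant)


def evolve(relevant, generations=30, mutation_rate=0.1, seed=0):
    """Evolve the document population towards the relevant documents."""
    rng = random.Random(seed)
    population = [list(v) for v in DOCS.values()]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, relevant), reverse=True)
        history.append(fitness(population[0], relevant))
        parents = population[: len(population) // 2]  # selection
        children = [list(parents[0])]                 # elitism: keep the best
        while len(children) < len(population):
            mum, dad = rng.choice(parents), rng.choice(parents)
            cut = rng.randrange(1, len(VOCAB))        # one-point crossover
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
    best = max(population, key=lambda c: fitness(c, relevant))
    return best, history


def rank(query):
    """Rank documents by similarity to the evolved keyword vector."""
    return sorted(DOCS, key=lambda d: jaccard(query, DOCS[d]), reverse=True)


best_query, history = evolve(relevant=["d1", "d2"])
ranking = rank(best_query)
```

Keeping the best individual unchanged in every generation (elitism) means the best fitness never decreases, which makes the sketch easy to check. The choice of operators and rates is arbitrary here, which reflects the drawback noted in [17].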

4.4. Our Recommendation
Based on the fact that there is far more available knowledge and experience in using neural networks and genetic algorithms than symbolic machine learning, we believe that these two techniques would be appropriate for our information retrieval system. Our recommendation drives the architecture presented in the next section.

5. The IHISM-SRA (IHISM Storage And Retrieval Architecture)


Fig. 1 shows the proposed architecture. The IHISM-SRA was driven by data requirements for IHISM. Basic data types were discussed in Section 3, but the compound types were not. They include:
* Fax: The ability to store data from which a fax can be generated and sent.
* SMS (Short Message Service): The ability to store received SMS data from patients and healthcare workers.
* FAQ (Frequently Asked Questions): The ability to store FAQs and other collections of questions and answers.
* MMS (Multimedia Messaging Service): The ability to store video and graphical image MMS data, which could include photos of medications or wounds, for example.
* Documents: For example, PDF or Microsoft Office documents, which could contain suitable information in a format usable by healthcare workers and patients.

The SRA API is used by applications running within the boundary of IHISM. These are therefore trusted applications. The job of the Storage Manager (SM) is to store data either in a regular database (DB) or in Custom Storage (storage in Custom Storage is not shown in the diagram, for clarity). It does this through a Database Management System (DBMS) (for the DB) or using sources like the Internet (for Custom Storage). The IR Manager manages the searching for and retrieval of data, either from the DB using the DBMS, or using custom Neural Network (NN) or Genetic Algorithm (GA) techniques, or a combination of the approaches. The custom algorithms are always required for retrieving data from Custom Storage, and the DBMS is always required for retrieving data from the DB. The architecture supports various DBMSs. The data stores (DB and Custom Storage) also include the metadata required for the higher layers to "learn" the various storage formats used.

Security is provided for by the Storage and IR Managers and by DBMS access control. This is very important because personalized systems need (and record) information about users. This leads to privacy concerns and legal issues [18]. Samaras and Panayiotou [19] mention that this could affect user trust.
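The division of labour between the two managers can be sketched in a few lines. This is only an illustration of the routing described above, under several assumptions: an in-memory SQLite database stands in for the DBMS, a dictionary stands in for Custom Storage, and a plain keyword matcher stands in for the NN/GA retrieval algorithms. The class and method names are hypothetical, and security and metadata handling are omitted.

```python
import sqlite3


class StorageManager:
    """Stores data either in the DB (through the DBMS) or in Custom Storage."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")  # stand-in for the DBMS
        self.db.execute(
            "CREATE TABLE items (id TEXT PRIMARY KEY, kind TEXT, body TEXT)"
        )
        self.custom = {}  # stand-in for Custom Storage

    def store(self, item_id, kind, body, custom=False):
        if custom:
            self.custom[item_id] = {"kind": kind, "body": body}
        else:
            self.db.execute(
                "INSERT INTO items VALUES (?, ?, ?)", (item_id, kind, body)
            )
            self.db.commit()


class IRManager:
    """Searches the DB through the DBMS and Custom Storage through a custom matcher."""

    def __init__(self, storage, custom_matcher):
        self.storage = storage
        self.custom_matcher = custom_matcher

    def search(self, term, use_db=True, use_custom=True):
        results = []
        if use_db:  # the DBMS is always used for the DB
            rows = self.storage.db.execute(
                "SELECT id FROM items WHERE body LIKE ? ORDER BY id",
                (f"%{term}%",),
            )
            results += [row[0] for row in rows]
        if use_custom:  # custom algorithms are always used for Custom Storage
            results += sorted(
                item_id
                for item_id, record in self.storage.custom.items()
                if self.custom_matcher(term, record["body"])
            )
        return results


def keyword_matcher(term, body):
    """Placeholder for the NN/GA retrieval algorithms."""
    return term.lower() in body.lower()


sm = StorageManager()
sm.store("sms-1", "SMS", "When should I take my medication?")
sm.store("faq-1", "FAQ", "Take medication with food.")
sm.store("web-1", "Document", "Medication adherence leaflet", custom=True)

ir = IRManager(sm, keyword_matcher)
hits = ir.search("medication")
db_hits = ir.search("medication", use_custom=False)
```

Passing the matcher in as a parameter keeps the IR Manager independent of the retrieval technique, so a neural network or genetic algorithm component could replace the keyword matcher without changing the storage side.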

6. Conclusion
This paper has discussed various issues surrounding mobile healthcare information systems, with a focus on information storage, search, and retrieval. After discussing mobile healthcare and related information systems, the paper discussed data storage requirements and retrieval techniques and presented the IHISM-SRA (IHISM Storage and Retrieval Architecture), which the authors believe is suitable for any mobile healthcare information system similar to the one being developed for IHISM.

Fig. 1: IHISM-SRA (IHISM Storage and Retrieval Architecture). Layers labeled in the figure: SRA Interface; Management; Storage and Retrieval Processing; Storage Media.

REFERENCES
[1] D.A. Perednia and A. Allen, "Telemedicine Technology and Clinical Applications," in T.L. Huston and J.L. Huston (2000), "Is Telemedicine A Practical Reality?" Communications of the ACM, Vol. 43 No. 6, pp. 91-95, 1995.
[2] W. Raghupathi, "Health Care Information Systems," Communications of the ACM, Vol. 40 No. 8, pp. 81-82, August 1997.
[3] J.C.G. Ramirez, L.A. Smith and L.L. Peterson, "Medical Information Systems: Characterization and Challenges," SIGMOD Record, Vol. 23 No. 3, pp. 44-53, September 1994.
[4] S. Goldberg and N. Wickramasinghe, "21st Century Healthcare - The Wireless Panacea," Proceedings of the 36th Hawaii International Conference on System Sciences (HICSS'03), 2002.
[5] HISP, "BEANISH Annex I - 'Description of Work'," Retrieved from: (November 2006) ments, September 2006.
[6] K. Ramamritham, A. Bahuman, and S. Duttagupta, "aAqua: A Database-backended Multilingual, Multimedia, Community Forum," Proceedings of the SIGMOD 2006 International Conference on Management of Data, Chicago, Illinois, USA, pp. 784-786, 2006.
[7] UNL Centre, "The Universal Networking Language (UNL) System," UNDL Foundation. Accessed from: (July 2006)
[8] D.A. Adjeroh and K.C. Nwosu, "Multimedia Database Management Requirements and Issues," IEEE Multimedia, pp. 24-33, July-September 1997.
[9] Z. Liang, P. Bodorik, and M. Shepherd, "Storage Model for CDA Documents," Proceedings of the 36th Hawaii International Conference on System Sciences (HICSS'03), 2002.
[10] P. Yu and H. Yu, "Lessons Learned from the Practice of Mobile Health Application Development," Proceedings of the 28th Annual International Computer Software and Applications Conference (COMPSAC'04), 2004.
[11] H. Chen, "Machine Learning for Information Retrieval: Neural Networks, Symbolic Learning, and Genetic Algorithms," Journal of the American Society for Information Science, John Wiley & Sons, Inc., Vol. 46 No. 3, pp. 194-216, 1995.
[12] H. Kammoun, J.C. Lamirel, and M.B. Ahmed, "Neural-Symbolic Machine-Learning for Knowledge Discovery and Adaptive Information Retrieval," Transactions on Engineering, Computing and Technology, World Enformatika Society, Vol. 6, June 2005.
[13] R.K. Belew, "Adaptive Information Retrieval: Using a Connectionist Representation to Retrieve and Learn About Documents," ACM SIGIR Forum, Proceedings of the 12th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '89), Volume 23 Issue SI, pp. 11-20, 1989.
[14] V. Dasigi, "An Experiment in Medical Information Retrieval," Proceedings of the 1998 ACM Symposium on Applied Computing, pp. 477-481, 1998.
[15] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd Edition, Prentice Hall, 1998.
[16] P. Pathak, M. Gordon, and W. Fan, "Matching Functions Adaptation," Department of Computer and Information Systems, University of Michigan Business School. Accessed (November 2006) from:
[17] S. Maleki-Dizaji, "Evolutionary Learning Multi-Agent Based Information Retrieval Systems," PhD Thesis, Sheffield Hallam University, October 2003.
[18] E. Volokh, "Personalization and Privacy," Communications of the ACM, Vol. 43 No. 8, pp. 84-88, August 2000.
[19] G. Samaras and C. Panayiotou, "Personalized Portals for the Wireless User Based on Mobile Agents," Proceedings of the 2nd International Workshop on Mobile Commerce, pp. 70-74, 2002.
