
International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976-6480 (Print),
ISSN 0976-6499 (Online), Volume 5, Issue 6, June (2014), pp. 08-14, IAEME











BUILDING AGGREGATES IN THE DATA WAREHOUSE: A CASE STUDY
OF BIRTH, DECEASED AND PROPERTY REGISTRATION
E-GOVERNANCE DATA


Pushpal Desai¹

¹(M.Sc. (I.T.) Programme, VNSGU, Surat, India)



ABSTRACT

In this paper, the concept of aggregates in the data warehouse is discussed. The proposed
method to create aggregates in a data warehouse and its implementation using Microsoft SQL Server
Integration Services are discussed. The results obtained from the aggregates are presented. The results
indicate that querying aggregates can be very efficient compared to querying data from the base fact
table of the data warehouse.

Keywords: Aggregates, Data Warehouse, Microsoft SQL Server Integration Services.

I. INTRODUCTION

An aggregate is a supplementary data structure that helps make things go faster in the data
warehouse [3]. Aggregates are a very important part of any data warehouse implementation. An
aggregate is a number that is calculated from amounts in many detail records. An aggregate is often
the sum of many numbers, although it can also be derived using other arithmetic operations or even
from a count of the number of items in a group [1]. An aggregate is a value formed by combining
values from a given dimension or set of dimensions to create a single value [1]. By implementing
aggregates in the data warehouse, we can store summarized data derived from the detailed data that are
available in the OLTP systems. Once we create different aggregates in the data warehouse, retrieving
information from an aggregate is much more efficient compared to retrieving it from the detailed data [1].
There are several advantages of creating aggregates in a data warehouse. Typically, aggregates contain
fewer rows than the base tables. Therefore, when an end user executes a query against an aggregate fact
table instead of the data warehouse base fact table, the response time is much shorter. So, aggregates are
very effective in improving query performance in a data warehouse [2]. Typically, a data warehouse contains
a large amount of data with millions of records. In a data warehouse environment, several users try to
execute complex queries against the data warehouse, and these may take a lot of time. The use of
pre-calculated aggregates can greatly improve the query execution time and efficiency of the data
warehouse [4].

II. METHODOLOGY

The Aggregate transformation allows us to combine information from multiple records of
the source data and convert it into a single value [1].


Figure 1: The proposed methodology to create Aggregates

To create an aggregate, we first need to specify the source data and then select the input columns
from the source data. We then need to specify operations on the input columns. The possible operations
on input columns are group by, minimum, maximum, sum, average, count, count
distinct, etc. After specifying these settings, we can create aggregates in the data warehouse and store
them for future analysis tasks by the management. The proposed methodology to create aggregates is
depicted in Figure 1. The aggregate transformations are implemented on different data by
considering the common business requirements.
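
The steps above (choose source rows, choose input columns, choose an operation) can be illustrated outside SSIS. The following is a minimal sketch in plain Python, not the SSIS implementation used in this paper. The function name `aggregate`, the `OPERATIONS` table, and the sample rows are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Operations named in the text, mapped to plain Python callables.
OPERATIONS = {
    "minimum": min,
    "maximum": max,
    "sum": sum,
    "average": mean,
    "count": len,
    "count distinct": lambda values: len(set(values)),
}


def aggregate(rows, group_by, column, operation):
    """Group rows by the group_by columns and reduce `column` with `operation`."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in group_by)
        groups[key].append(row[column])
    reduce_values = OPERATIONS[operation]
    return {key: reduce_values(values) for key, values in groups.items()}


# Hypothetical source rows, for illustration only.
rows = [
    {"year": 2010, "sex": "F", "weight": 2.9},
    {"year": 2010, "sex": "F", "weight": 3.1},
    {"year": 2010, "sex": "M", "weight": 3.4},
    {"year": 2011, "sex": "M", "weight": 3.0},
]

# One aggregated value per (year, sex) group, and a row count per year.
average_weight = aggregate(rows, ["year", "sex"], "weight", "average")
row_counts = aggregate(rows, ["year"], "weight", "count")
```

Each distinct combination of the group-by columns yields exactly one output value, which is why an aggregate usually has far fewer rows than its source.
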
SQL Server Integration Services provides the Aggregate transformation to develop various
aggregates [1]. For example, in the Birth Data, an aggregate based on the RegistrationYear, ReligionID
and Sex fields was developed. Based on these fields, an aggregate of Average Birth Weight was
developed. Figure 2 shows the settings for this Aggregate transformation in SQL Server
Integration Services.
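
In SQL terms, an aggregate of this kind corresponds to a GROUP BY query whose result is stored as its own table. The sketch below assumes an in-memory SQLite database accessed through Python's `sqlite3` module instead of SQL Server and SSIS. Only the RegistrationYear, ReligionID and Sex field names come from the text. The table names `BirthFact` and `BirthWeightAggregate`, the `BirthWeight` column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base fact table holding one row per registered birth.
conn.execute(
    "CREATE TABLE BirthFact ("
    "RegistrationYear INTEGER, ReligionID INTEGER, Sex TEXT, BirthWeight REAL)"
)
conn.executemany(
    "INSERT INTO BirthFact VALUES (?, ?, ?, ?)",
    [
        (2010, 1, "F", 2.8),
        (2010, 1, "F", 3.2),
        (2010, 1, "M", 3.5),
        (2011, 2, "M", 3.0),
    ],
)

# Group by the three fields and store the average weight as a new table.
conn.execute(
    "CREATE TABLE BirthWeightAggregate AS "
    "SELECT RegistrationYear, ReligionID, Sex, "
    "AVG(BirthWeight) AS AvgBirthWeight "
    "FROM BirthFact "
    "GROUP BY RegistrationYear, ReligionID, Sex"
)

aggregate_rows = conn.execute(
    "SELECT * FROM BirthWeightAggregate "
    "ORDER BY RegistrationYear, ReligionID, Sex"
).fetchall()
```

Here four detail rows collapse into three aggregate rows, one per distinct combination of year, religion and sex.
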


Figure 2: Average Birth Weight Aggregate transformation using SSIS

Similarly, an aggregate for Average Deceased Age was developed on the Registration Year,
Deceased Religion and Deceased Sex fields. The settings for the Deceased Age
Aggregate transformation are shown in Figure 3.


Figure 3: Average Deceased Age Aggregate Transformation using SSIS

Similarly, an aggregate for the Property Database was developed that gives the average Property Age
across the various wards and property types. The settings for the Property Age Aggregate transformation
are shown in Figure 4.


Figure 4: Property Age Aggregate transformation using SQL Server Integration Services

III. RESULTS

Executing the SQL Server Integration Services package on the Birth Data source records
generated 151 aggregate records. The execution flow and the result are shown in Figure 5 and
Figure 6 respectively.


Figure 5: Execution flow of Child Birth Weight Aggregate transformation

This aggregate summarizes data for the Average Child Birth Weight attribute. It considers
the Gender, Year and Religion fields. Hence, whenever Average Child Birth Weight
data is required, the query can be executed against the aggregate. Such a query is very
efficient because the aggregate contains only 151 records and its execution does not touch the base fact table.
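
To make the query-time benefit concrete, the sketch below answers the same question twice: once from detail rows and once from a pre-calculated aggregate. It assumes an in-memory SQLite database via Python's `sqlite3` module as a stand-in for the data warehouse. The table names, the `BirthWeight` column, and the synthetic rows are hypothetical and are not the paper's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BirthFact ("
    "RegistrationYear INTEGER, ReligionID INTEGER, Sex TEXT, BirthWeight REAL)"
)

# Synthetic, deterministic detail rows (hypothetical data).
detail_rows = [
    (2010 + i % 3, i % 4, "F" if (i // 12) % 2 == 0 else "M", 2.5 + (i % 10) / 10)
    for i in range(1000)
]
conn.executemany("INSERT INTO BirthFact VALUES (?, ?, ?, ?)", detail_rows)

# Pre-calculate the aggregate once; this stands in for the Aggregate transformation.
conn.execute(
    "CREATE TABLE BirthWeightAggregate AS "
    "SELECT RegistrationYear, ReligionID, Sex, AVG(BirthWeight) AS AvgBirthWeight "
    "FROM BirthFact GROUP BY RegistrationYear, ReligionID, Sex"
)

base_count = conn.execute("SELECT COUNT(*) FROM BirthFact").fetchone()[0]
aggregate_count = conn.execute(
    "SELECT COUNT(*) FROM BirthWeightAggregate"
).fetchone()[0]

group = (2010, 0, "F")

# Answer computed from the detail rows: reads the base fact table.
from_base = conn.execute(
    "SELECT AVG(BirthWeight) FROM BirthFact "
    "WHERE RegistrationYear = ? AND ReligionID = ? AND Sex = ?",
    group,
).fetchone()[0]

# Same answer looked up in the aggregate: reads only the small table.
from_aggregate = conn.execute(
    "SELECT AvgBirthWeight FROM BirthWeightAggregate "
    "WHERE RegistrationYear = ? AND ReligionID = ? AND Sex = ?",
    group,
).fetchone()[0]
```

Both queries return the same average, but the second reads a table with one row per group instead of one row per birth, and it leaves the base fact table untouched.
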


Figure 6: Result of Child Birth Weight Aggregate transformation

Similarly, we executed the SSIS package that creates the Average Deceased Age aggregate. The
execution flow and its result are shown in Figure 7 and Figure 8 respectively.


Figure 7: Execution of Deceased Age Aggregate transformation

This aggregate considers other fields such as Gender, Religion and Year. This aggregate can
be used efficiently whenever Average Deceased Age information is required.



Figure 8: Result of Deceased Age Aggregate transformation

The execution of the SSIS package for Average Property Age resulted in 768 rows from the
1,471,859 records stored in the base fact table.


Figure 9: Execution of Property Age Aggregate transformation

This aggregate contains other important fields such as Property Type and Ward Number. So,
this aggregate can be used very efficiently whenever Average Property Age is required, as the query
executes against only 768 records.
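
The size of this reduction can be worked out directly from the two row counts reported above. The short calculation below is only arithmetic on those figures. It reads the base fact table count as 1471859 (the digits as printed in the text), which is an assumption about the intended grouping of digits.

```python
base_rows = 1471859    # base fact table records, digits as printed in the text
aggregate_rows = 768   # rows produced by the Property Age aggregate

# How many base records each aggregate row summarizes on average.
reduction_factor = base_rows / aggregate_rows

# Size of the aggregate as a percentage of the base fact table.
aggregate_share = aggregate_rows / base_rows * 100

print(f"About {reduction_factor:.0f} base records per aggregate row")
print(f"Aggregate is about {aggregate_share:.3f}% of the base table size")
```

Under that reading, each aggregate row stands for roughly nineteen hundred detail records.
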

Figure 10: Result of Property Age Aggregate transformation

IV. CONCLUSION

The results clearly indicate that aggregates are a crucial part of any data warehouse
deployment. The deployment and use of aggregates greatly improves the efficiency of data
warehouse queries. The practical implementation indicates that queries executed against aggregates
are highly efficient because aggregates contain far fewer records compared to base fact tables.

V. ACKNOWLEDGEMENT AND LIMITATIONS

All results are based on data provided by the municipal corporation for the research purpose
only. Hence, results may change if data warehouse concepts are applied to actual data sets.

VI. REFERENCES

(1) Brian Larson, Delivering Business Intelligence with Microsoft SQL Server 2008.
(2) Paulraj Ponniah, Data Warehousing Fundamentals: A Comprehensive Guide for IT
Professionals, Wiley India Edition.
(3) Christopher Adamson, The Complete Reference: Star Schema, Tata McGraw-Hill Edition.
(4) Ashok Kumar Verma, Effect of cube on query performance in data warehouse, International
Journal of Advanced Research in IT and Engineering, Vol. 2, No. 6, June 2013, ISSN:
2278-6244.
(5) Kuldeep Deshpande and Dr. Bhimappa Desai, A Critical Study of Requirement Gathering
and Testing Techniques for Datawarehousing, International Journal of Information
Technology and Management Information Systems (IJITMIS), Volume 5, Issue 1, 2014,
pp. 60 - 71, ISSN Print: 0976-6405, ISSN Online: 0976-6413.
(6) Prof. Manas Kumar Sanyal, Sudhangsu Das and Sajal Bhadra, Cloud Computing-A New
Way to Roll Out E-Governance Projects in India, International Journal of Computer
Engineering & Technology (IJCET), Volume 4, Issue 2, 2013, pp. 61 - 72, ISSN Print:
0976-6367, ISSN Online: 0976-6375.