
Parallelizing Insurance Processing Application using Condor

March 2011

Parallelizing Insurance Processing Application using Condor | March 2011

TABLE OF CONTENTS

Abstract
Business Problem
Solution
Benefits
Common Issues
Conclusion
Reference
Author Info

2011, HCL Technologies. Reproduction Prohibited. This document is protected under Copyright by the Author, all rights reserved.


Abstract
This white paper presents an approach to increasing the utilization
of available hardware when processing long-running insurance
applications, using the workload management software Condor.
Processing time falls almost linearly with the number of nodes/cores
in the cluster.

Business Problem
Insurance applications typically consist of several batch programs
executed daily, weekly, monthly and quarterly. The batch programs,
which are Java applications, execute for several hours depending upon
the data volumes. Thus, there is a need to reduce the execution time,
especially for the daily batches.

Solution
The solution approach is to identify the parallel execution paths
within the batch program and schedule them to run in parallel on
different nodes/cores.
The implementation involves setting up the cluster and installing the
Condor software and the application-specific dependencies, including
database access. Condor is a workload management system that provides
job queuing, scheduling, resource monitoring and management.
The user has to transform the parallel execution paths into Condor
jobs and submit them to Condor as job files. Condor schedules the jobs
in parallel, taking care of inter-job dependencies.
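As an illustration of what such job files might look like, the sketch below generates a vanilla-universe submit description for one sub job and a DAGMan input file that expresses a dependency. The white paper does not show its actual job files, so the jar name `batch.jar`, the class `BatchMain`, the record-range arguments and the final `verify` job are all hypothetical.

```python
def submit_description(job_name, first, last):
    """Build the text of a Condor submit description for one sub job.

    Assumption: the batch program is started as
    `java -cp batch.jar BatchMain <first> <last>` (hypothetical names).
    """
    lines = [
        "universe   = vanilla",
        "executable = /usr/bin/java",
        f"arguments  = -cp batch.jar BatchMain {first} {last}",
        f"output     = {job_name}.out",
        f"error      = {job_name}.err",
        f"log        = {job_name}.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


def dag_description(sub_jobs, final_job):
    """Build a DAGMan input file: every sub job must finish before final_job."""
    lines = [f"JOB {name} {name}.sub" for name in list(sub_jobs) + [final_job]]
    lines.append(f"PARENT {' '.join(sub_jobs)} CHILD {final_job}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(submit_description("subjob1", 1, 100000))
    print(dag_description(["subjob1", "subjob2"], "verify"))
```

Under these assumptions, a single submit file would be handed to `condor_submit`, while a set of jobs with dependencies would be submitted through `condor_submit_dag`.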
Proper execution must then be verified to make sure that all the
steps completed correctly.
This approach was applied to a large insurance application.
The application reads information from a database, computes or
analyzes it, and creates an output file in a predefined format.
A two-node cluster was set up along with the application
dependencies and database access. The setup included installing
Condor and configuring one node as the manager node and the other as
the worker node. Jobs would be executed on both nodes.


The batch job identified for parallelization processed 400,000
records sequentially. This job was split into four Condor sub jobs,
named Sub Job1 to Sub Job4, each processing around 100,000 records.
The four sub jobs were scheduled on the two nodes so that each core
ran one sub job.
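The split into sub jobs can be sketched as a simple range partition. This is a minimal sketch that assumes records can be addressed by a contiguous sequence number from 1 to the total; the white paper does not say how the records were actually divided, and the function name is hypothetical.

```python
def partition_records(total, parts):
    """Split record numbers 1..total into `parts` contiguous inclusive ranges.

    Any remainder is spread over the first ranges, so range sizes differ
    by at most one record.
    """
    base, extra = divmod(total, parts)
    ranges = []
    start = 1
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    # Four sub jobs over 400,000 records, as in the scenario described here.
    for number, (first, last) in enumerate(partition_records(400_000, 4), 1):
        print(f"Sub Job{number}: records {first} to {last}")
```

Each resulting pair could then be passed to one sub job as its first and last record number.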

Figure 1: Schematic representation of nodes with Condor and sub job execution

The results were compared by checking the number of records
processed, the output logs and the error logs.
Time taken to process 400,000 records, single process execution: 12 hours
Time taken to process 400,000 records, Condor-based parallel job execution on two dual-core nodes: 3.5 hours
Speed-up realized using the available hardware configuration: 3.4

A notable attribute of this insurance application is that record
processing is embarrassingly parallel. An application is
embarrassingly parallel if its subtasks rarely or never have to
communicate. Hence the scale-out is uniform and has yielded a nearly
linear speed-up (3.4 on four cores). With two additional nodes in the
cluster, the execution time is expected to fall to 1.5 to 2 hours,
which would improve testing effectiveness significantly.
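The arithmetic behind these figures can be checked with a short calculation. The sketch assumes that the two additional nodes would also be dual-core (eight cores in total) and that the parallel efficiency observed on four cores would carry over; neither assumption is stated in the measurements above.

```python
def speedup(serial_hours, parallel_hours):
    """Speed-up is the serial run time divided by the parallel run time."""
    return serial_hours / parallel_hours


def efficiency(speedup_value, cores):
    """Parallel efficiency is the speed-up divided by the number of cores."""
    return speedup_value / cores


SERIAL_HOURS = 12.0    # single process execution
PARALLEL_HOURS = 3.5   # four sub jobs on two dual-core nodes

s = speedup(SERIAL_HOURS, PARALLEL_HOURS)   # about 3.43
e = efficiency(s, 4)                        # about 0.86

# Projection for eight cores (assumed: two more dual-core nodes).
ideal_8 = SERIAL_HOURS / 8                  # perfectly linear scaling
projected_8 = SERIAL_HOURS / (8 * e)        # same efficiency as measured

if __name__ == "__main__":
    print(f"speed-up {s:.2f}, efficiency {e:.2f}")
    print(f"8 cores: {ideal_8:.2f} h ideal, {projected_8:.2f} h projected")
```

Both projected values fall inside the 1.5 to 2 hour range quoted above.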

Benefits
Business benefits of the Condor-based solution (reduced execution
time):

- Reduced testing time, leading to reduced turnaround time for
  defect fixes
- Increased utilization of available hardware
- Scaling out with ordinary desktop systems is less expensive than
  scaling up with high-end servers

Common Issues
A major challenge in parallelizing long-running applications is
identifying the parallel paths and their inter-dependencies.
Once these are identified, Condor helps in scheduling such
applications to execute in a cluster environment.
Apart from identifying parallel execution paths, it is important to
avoid race conditions with respect to the data source.
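One way to reduce the risk of two sub jobs touching the same records is to check, before submission, that the record ranges assigned to them are disjoint and complete. The sketch below is only an illustration of that idea; the white paper does not describe how race conditions were actually avoided, and the function name and range representation are hypothetical.

```python
def covers_exactly_once(ranges, total):
    """Return True if the inclusive (first, last) ranges cover 1..total
    with no overlap and no gap."""
    expected_first = 1
    for first, last in sorted(ranges):
        # An overlap or a gap shows up as a range that does not start
        # exactly where the previous one ended.
        if first != expected_first or last < first:
            return False
        expected_first = last + 1
    return expected_first == total + 1


if __name__ == "__main__":
    good = [(1, 100000), (100001, 200000), (200001, 300000), (300001, 400000)]
    bad = [(1, 100000), (100000, 200000)]   # record 100000 assigned twice
    print(covers_exactly_once(good, 400000))
    print(covers_exactly_once(bad, 200000))
```

A check of this kind only covers the partitioning of input records; shared output files or shared database rows would need their own safeguards.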

Conclusion
Insurance processing typically includes long-running applications.
Systematic parallel execution of such programs in a cluster using the
workload management software Condor can provide significant speed-up.
This can be used to gain business benefits in production and
development environments. A similar technique can be employed for
large code builds and automated testing.

Reference
Condor home page: http://www.cs.wisc.edu/condor/
Parallel computing: http://en.wikipedia.org/wiki/Parallel_computing

Author Info
Siddesh is an Associate Technical
Manager in the HPC Lab. His interests
include Cluster Programming, Auto
Parallelization and Grid Technologies.


Hello, I'm from HCL's Engineering and R&D Services. We enable
technology-led organizations to go to market with innovative products &
solutions. We partner with our customers in building world class
products & creating the associated solution delivery ecosystem to help
build market leadership. Right now, 14500+ of us are developing
engineering products, solutions and platforms across Aerospace and
Defense, Automotive, Consumer Electronics, Industrial Manufacturing,
Medical Devices, Networking & Telecom, Office Automation,
Semiconductor, Servers & Storage for our customers.
For more details contact eootb@hcl.com
Follow us on Twitter http://twitter.com/hclers and our blog
http://ers.hclblogs.com/
Visit our website http://www.hcltech.com/engineering-services/

About HCL
About HCL Technologies
HCL Technologies is a leading global IT services company, working
with clients in the areas that impact and redefine the core of their
businesses. Since its emergence on the global landscape after its IPO in
1999, HCL has focused on 'transformational outsourcing', underlined by
innovation and value creation, and offers an integrated portfolio of services
including software-led IT solutions, remote infrastructure management,
engineering and R&D services and BPO. HCL leverages its extensive
global offshore infrastructure and network of offices in 26 countries to
provide holistic, multi-service delivery in key industry verticals including
Financial Services, Manufacturing, Consumer Services, Public Services
and Healthcare. HCL takes pride in its philosophy of 'Employee First',
which empowers our 72,267 transformers to create real value for
customers. HCL Technologies, along with its subsidiaries, had
consolidated revenues of US$ 3.1 billion (Rs. 14,101 crores), as on
31st December 2010 (on LTM basis). For more information, please visit
www.hcltech.com
About HCL Enterprise
HCL is a $5.7 billion leading global technology and IT enterprise
comprising two companies listed in India - HCL Technologies and HCL
Infosystems. Founded in 1976, HCL is one of India's original IT garage
start-ups. A pioneer of modern computing, HCL is a global
transformational enterprise today. Its range of offerings includes
product engineering, custom & package applications, BPO, IT
infrastructure services, IT hardware, systems integration, and
distribution of information and communications technology (ICT)
products across a wide range of focused industry verticals. The HCL
team consists of over 79,000 professionals of diverse nationalities, who
operate from 31 countries including over 500 points of presence in
India. HCL has partnerships with several leading Global 1000 firms,
including leading IT and technology firms. For more information, please
visit www.hcl.com