
Indexes and Database Optimization

02/02/2015

TELECOM/NASCP/ASU

Kalyani Das

Assurance Services Unit

kalyani.das@tcs.com

Confidentiality Statement
Confidentiality and Non-Disclosure Notice
The information contained in this document is confidential and proprietary to TATA
Consultancy Services. This information may not be disclosed, duplicated or used for any
other purposes. The information contained in this document may not be released in
whole or in part outside TCS for any purpose without the express written permission of
TATA Consultancy Services.

Tata Code of Conduct


We, in our dealings, are self-regulated by a Code of Conduct as enshrined in the Tata
Code of Conduct. We request your support in helping us adhere to the Code in letter and
spirit. We request that any violation or potential violation of the Code by any person be
promptly brought to the notice of the Local Ethics Counselor or the Principal Ethics
Counselor or the CEO of TCS. All communication received in this regard will be treated
and kept as confidential.

Table of Contents

1. Database
2. Database Optimization
3. Indexes
3.1 B*-Tree Indexes
3.2 Bitmap Indexes
3.3 Bitmap Join Indexes
3.4 Index Organized Table

1. Database

When raw data is organized in a structured way it becomes usable, and a database is needed to manage and keep
track of it.

A database is a collection of data arranged so that its contents are easily accessible, manageable and updatable.
It can be maintained in three different ways: in paper or cabinet files, in a spreadsheet, or in a DBMS.
One advantage of database management is that it makes figures such as the assets and liabilities or the profit and
loss of an organisation readily available.

Databases are managed using special software called a database management system (DBMS). A DBMS acts as an
interface between users, applications and the databases.

Based on how the DBMS structures its databases, there are four types of DBMS:
Network database management system
Hierarchical database management system
Relational database management system
Object-oriented database management system

Database processing happens at four different layers:

Application Layer: Application code sends requests to the database, and the database responds to these requests
with return codes and result sets.
Database Code Layer: The database parses the SQL and performs overhead operations such as security,
scheduling and transaction handling before executing the statement.
Memory Layer: The database processes requests using the buffer cache (data blocks) and other shared memory
caches. Memory is also used for sorting and hashing operations.

Disk Layer: If the required data is not present in memory, it must be retrieved from disk, resulting in
physical I/O operations.
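
The interplay between the memory layer and the disk layer can be illustrated with a small sketch. The BufferCache class below is a hypothetical model written for this document, not how any real database implements its cache: it keeps a fixed number of blocks in memory with a least-recently-used policy and counts how many requests had to fall through to disk.

```python
from collections import OrderedDict


class BufferCache:
    """Tiny LRU cache standing in for the memory layer (hypothetical model)."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict: block id -> contents (the disk layer)
        self.capacity = capacity      # maximum number of blocks held in memory
        self.blocks = OrderedDict()   # cached blocks, least recently used first
        self.logical_reads = 0        # every block request
        self.physical_reads = 0       # requests that had to go to disk

    def read(self, block_id):
        self.logical_reads += 1
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # cache hit: refresh LRU position
            return self.blocks[block_id]
        self.physical_reads += 1                # cache miss: physical I/O
        data = self.disk[block_id]
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict the least recently used block
        return data


disk = {n: f"block-{n}" for n in range(10)}
cache = BufferCache(disk, capacity=3)
for block_id in [1, 2, 1, 3, 1, 4, 2]:
    cache.read(block_id)

print(cache.logical_reads, cache.physical_reads)
```

Repeated requests for block 1 are served from memory, while the final request for block 2 goes back to disk because the block was evicted in the meantime.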

2. Database Optimization

Optimization means achieving the best possible results in the shortest possible time. As the volume of data
grows, a database becomes increasingly hard to handle efficiently, and this is where database optimization
techniques come into play. If a database-backed application performs poorly, the usual cause is that its data
access routines (queries and stored routines) are not optimized or are badly written.

In the early 1990s, performance tuning was not as well established as it is today. It was mostly limited to a
handful of well-known rules of thumb. The most notorious were ratio-based techniques such as the 'buffer cache
hit ratio' and the 'latch hit ratio'. Although these ratios reflected some measure of internal efficiency, they were
often only loosely associated with the performance experienced by an application using the database.

After ratio-based tuning, wait-based tuning came into existence. In this approach, the time a session spends
waiting for events such as a lock becoming available or a disk I/O operation completing is taken into
consideration. Performance tuners concentrate on these wait events to target their tuning efforts.
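
The two approaches described above can be contrasted in a short sketch. The function names and all the numbers below are assumptions made up for illustration; they are not taken from a real system or from any database's own views. The first function computes a buffer cache hit ratio, and the second ranks wait events by the share of session time they consume, which is the starting point for wait-based tuning.

```python
def buffer_cache_hit_ratio(logical_reads, physical_reads):
    """Fraction of block requests that were served from memory."""
    if logical_reads == 0:
        return 0.0
    return 1.0 - physical_reads / logical_reads


def rank_wait_events(waits):
    """Sort (event, seconds) pairs so the biggest time consumers come first.

    Returns (event, seconds, percent_of_total) tuples.
    """
    total = sum(seconds for _, seconds in waits)
    if total == 0:
        return []
    ranked = sorted(waits, key=lambda pair: pair[1], reverse=True)
    return [(event, seconds, round(100.0 * seconds / total, 1))
            for event, seconds in ranked]


# Hypothetical figures for one session.
ratio = buffer_cache_hit_ratio(logical_reads=10_000, physical_reads=500)
session_waits = [("disk read", 12.0), ("row lock", 30.0),
                 ("network", 6.0), ("latch", 2.0)]
ranked = rank_wait_events(session_waits)

print(f"hit ratio: {ratio:.2f}")
for event, seconds, percent in ranked:
    print(f"{event:10s} {seconds:6.1f}s {percent:5.1f}%")
```

In this made-up session the hit ratio looks healthy, yet most of the elapsed time is spent waiting on a row lock, which the ratio alone would never reveal.
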

The most effective way to optimize a database is to tune the performance of each layer in turn.
Problems in one layer can often be fixed by configuring the layer above it. Database tuning follows a few
logical steps:
1. Reduce application demand to its logical minimum by tuning the SQL, tuning the PL/SQL, or changing the
physical design of the database.
2. Maximize concurrency by minimizing contention for locks, latches, buffers and other resources at the
database code level.
3. Optimize physical I/O and the I/O subsystem.

3. Indexes

If a heap-organized table has no indexes, the database must perform a full table scan to locate the required
records.
Indexes are schema objects that organize references to the records so that they can be found quickly. An index
is not created on the whole table; it is created on the column or columns used for searching the data.
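
The difference between a full table scan and an indexed lookup can be observed with a small experiment. The sketch below uses SQLite through Python's standard sqlite3 module purely as a convenient stand-in; it is not Oracle, and the employee table, its columns and the index name are hypothetical. It prints the query plan for the same query before and after an index is created on the searched column.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(i, f"name{i}", f"dept{i % 5}") for i in range(1000)],
)

query = "SELECT * FROM employee WHERE name = ?"
before = plan(conn, query, ("name42",))   # no index yet: the whole table is scanned

conn.execute("CREATE INDEX idx_employee_name ON employee (name)")
after = plan(conn, query, ("name42",))    # the index on the searched column is used

print(before)
print(after)
```

The first plan reports a scan of the table, while the second reports a search that uses idx_employee_name.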

Several types of index are available:

1. B*-Tree indexes
2. Bitmap indexes
3. Bitmap join indexes
4. Index organized tables

3.1 B*- Tree Indexes

The B*-Tree (balanced tree) index is the default index structure. It has a hierarchical tree structure. There is a header
block at the top of the tree. The header block contains pointers to the appropriate branch blocks for a given range
of key values. A branch block in turn points either to the appropriate leaf block for a more specific range or to
another branch block. A leaf block contains a list of key values together with pointers (ROWIDs) to the
corresponding rows in the table.

The figure below shows how a database traverses an index to find data.

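Figure: B*-Tree index structure

Header block:          A-L | L-Z
Branch block (A-L):    A-E | F-H | I-K
Branch block (L-Z):    L-P | Q-R | S-Z

Leaf blocks:
A-E:   Ashok, Balaji, Chinni
F-H:   Firoz, Harish
I-K:   Jacob, Joe, Kalyani
L-P:   Lohit, Manoj, Naresh
Q-R:   Ramesh, Rohit
S-Z:   Sita, Tom, Uday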

Because every leaf block is at the same depth, locating any row of the table requires the same number of index
reads. The performance of an index lookup is therefore predictable.
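
The traversal shown in the figure can be sketched as follows. This is a simplified model written for illustration, not Oracle's on-disk format: the blocks are plain Python dictionaries keyed by ranges of first letters, and because the figure labels the header ranges A-L and L-Z, the sketch assumes that names starting with L are routed to the second branch.

```python
# Leaf blocks keyed by the range of first letters they cover (names from the figure).
LEAVES = {
    ("A", "E"): ["Ashok", "Balaji", "Chinni"],
    ("F", "H"): ["Firoz", "Harish"],
    ("I", "K"): ["Jacob", "Joe", "Kalyani"],
    ("L", "P"): ["Lohit", "Manoj", "Naresh"],
    ("Q", "R"): ["Ramesh", "Rohit"],
    ("S", "Z"): ["Sita", "Tom", "Uday"],
}
# Branch blocks point to leaf ranges; the header block points to branch blocks.
# Assumption: the first branch is treated as A-K so that L is unambiguous.
BRANCHES = {
    ("A", "K"): [("A", "E"), ("F", "H"), ("I", "K")],
    ("L", "Z"): [("L", "P"), ("Q", "R"), ("S", "Z")],
}
HEADER = [("A", "K"), ("L", "Z")]


def pick(ranges, key):
    """Return the (low, high) range whose first-letter span contains key."""
    letter = key[0].upper()
    for low, high in ranges:
        if low <= letter <= high:
            return (low, high)
    raise KeyError(key)


def lookup(name):
    """Walk header -> branch -> leaf and return (found, blocks_read)."""
    blocks_read = 1                          # header block
    branch = pick(HEADER, name)
    blocks_read += 1                         # branch block
    leaf = pick(BRANCHES[branch], name)
    blocks_read += 1                         # leaf block
    return name in LEAVES[leaf], blocks_read


# Every indexed name is reached after reading the same number of blocks.
depths = {name: lookup(name)[1] for names in LEAVES.values() for name in names}
print(lookup("Kalyani"), set(depths.values()))
```

Whether the name is found or not, the lookup reads exactly one header block, one branch block and one leaf block.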

The selectivity of a column or group of columns is the common measure of the usefulness of an index on those
columns. To make an index efficient, one should choose columns with a large number of unique values, that is,
few duplicate values. For example, given the two columns Date of birth and Gender, the index should be created
on Date of birth rather than on Gender.
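
Selectivity can be estimated as the number of distinct values divided by the number of rows. The sketch below applies that ratio to a handful of made-up sample rows; the data and the selectivity function are assumptions for illustration only.

```python
def selectivity(values):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    return len(set(values)) / len(values)


# Hypothetical sample rows: (date_of_birth, gender)
rows = [
    ("1985-03-14", "M"), ("1990-07-02", "F"), ("1978-11-30", "M"),
    ("1992-01-19", "F"), ("1985-03-14", "F"), ("1988-09-09", "M"),
    ("1995-12-25", "M"), ("1981-06-06", "F"),
]

dob_selectivity = selectivity([dob for dob, _ in rows])
gender_selectivity = selectivity([gender for _, gender in rows])

print(dob_selectivity, gender_selectivity)
```

With seven distinct dates in eight rows, the date of birth column is far more selective than the gender column, which has only two distinct values.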

The B*-Tree index is the default index structure for an Oracle database.

B*-Tree indexes are usually of three types:

1. Unique Index

2. Implicit Index

3. Explicit Index

Unique Index: These indexes are used primarily to prevent duplicate values in a column rather than to improve
database performance. They are highly selective, because each key value points to exactly one row in the table.

Implicit Index: This type of index is created automatically by the database server to enforce a primary key or
unique constraint.

Explicit Index: This type of index is created by the user, usually on columns that contain non-unique values.
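
The distinction between implicit and explicit indexes can be seen in a short experiment. The sketch below again uses SQLite via Python's sqlite3 module as a stand-in, not Oracle, and the account table and index name are hypothetical. Declaring a UNIQUE constraint causes an index to appear without any CREATE INDEX statement, and that index rejects duplicate values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint makes SQLite build an index implicitly.
conn.execute("CREATE TABLE account (acc_id INTEGER, email TEXT UNIQUE, city TEXT)")
# An explicit index, created by the user on a column with repeating values.
conn.execute("CREATE INDEX idx_account_city ON account (city)")

# In SQLite's catalog, implicitly created indexes have no stored CREATE statement.
indexes = dict(conn.execute(
    "SELECT name, sql FROM sqlite_master "
    "WHERE type = 'index' AND tbl_name = 'account'"
).fetchall())
implicit = [name for name, sql in indexes.items() if sql is None]
explicit = [name for name, sql in indexes.items() if sql is not None]

conn.execute("INSERT INTO account VALUES (1, 'a@example.com', 'Pune')")
try:
    conn.execute("INSERT INTO account VALUES (2, 'a@example.com', 'Delhi')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True   # the unique index refused the second row

print(implicit, explicit, duplicate_rejected)
```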

Explicit indexes are in turn of two types:

a) Single index or column-wise index

b) Composite index

A single index is explicitly created on one chosen column, whereas a composite index is explicitly created on
multiple columns.
A composite index is more selective than a single-column index, because a combination of column values points to
fewer rows than an index on any of the individual columns.

A composite index that contains all the columns referenced in the WHERE clause of a SQL statement is usually very
effective. If more than one column of a table is frequently queried together, creating a composite index on those
columns is a good practice.
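
A composite index that covers the columns in a WHERE clause can be tried out as follows. As before, SQLite through Python's sqlite3 module stands in for the database, and the employee table, its data and the index name are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (last_name TEXT, first_name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(f"last{i % 50}", f"first{i % 20}", f"dept{i % 5}") for i in range(1000)],
)
# One index on both columns that the query filters on.
conn.execute("CREATE INDEX idx_emp_last_first ON employee (last_name, first_name)")

query = "SELECT * FROM employee WHERE last_name = ? AND first_name = ?"
params = ("last7", "first7")

# Column 3 of each EXPLAIN QUERY PLAN row holds the human-readable plan step.
detail = " | ".join(
    row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, params)
)
matches = conn.execute(query, params).fetchall()

print(detail)
print(len(matches))
```

The plan shows the query being resolved through idx_emp_last_first, so both conditions are applied inside the index before any table row is read.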

An index can be dropped in two ways: explicitly, using the DROP INDEX command, or implicitly, when the table it
belongs to is dropped.
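
Both ways of dropping an index can be demonstrated in a few lines. This sketch uses SQLite via Python's sqlite3 module as a stand-in for the database; the table and index names are hypothetical.

```python
import sqlite3


def index_names(conn):
    """List the names of all indexes currently recorded in the catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT)")
conn.execute("CREATE INDEX idx_name ON employee (name)")
conn.execute("CREATE INDEX idx_dept ON employee (dept)")
at_start = index_names(conn)

conn.execute("DROP INDEX idx_name")       # explicit drop of one index
after_drop_index = index_names(conn)

conn.execute("DROP TABLE employee")       # remaining index goes with the table
after_drop_table = index_names(conn)

print(at_start, after_drop_index, after_drop_table)
```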

3.2 Bitmap Indexes

In a bitmap index, a bitmap is created for each unique value in the column. Each bitmap contains a single bit (0 or 1)
for every row in the table. A 1 indicates that the row has the value represented by the bitmap and a 0 indicates that
it does not. The database server scans these bitmaps to find all rows matching the specified criteria.

Bitmap indexes are mostly used for columns that have a limited number of distinct values and are often queried in
combination.

The figure below shows how a bitmap index works on an imaginary table called employee_info.

The table employee_info contains four columns: Gender, MaritalStatus, AnnualIncome and HomeOwner.
Bitmap indexes are created on the Gender, MaritalStatus and HomeOwner columns.

Row   Gender   MaritalStatus   AnnualIncome   HomeOwner
1     M        Single          100000         N
2     F        Married         200000         Y
3     M        Single          150000         N

Bitmaps for these rows (one bit per row):

Row   Male   Female   Married   Single   Divorced   Y   N
1     1      0        0         1        0          0   1
2     0      1        1         0        0          1   0
3     1      0        0         1        0          0   1

Now consider a query such as:

    SELECT * FROM employee_info WHERE Gender = 'M' AND MaritalStatus = 'Single' AND HomeOwner = 'Y'

[Figure: the Male, Single and HomeOwner bitmaps are combined bit by bit with AND. A row satisfies the query
when its resulting bit is 1.]
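
The bitwise combination can be sketched in a few lines. The code below is an illustrative model only, using plain Python lists of 0 and 1 rather than a real bitmap index; the helper names are hypothetical. It uses the three employee_info rows listed in the table above. With just those three rows, no row is Male, Single and a home owner, so the sketch also evaluates the same query with HomeOwner = 'N' to show a combination that does match.

```python
# Rows of employee_info: (Gender, MaritalStatus, AnnualIncome, HomeOwner)
ROWS = [
    ("M", "Single", 100000, "N"),
    ("F", "Married", 200000, "Y"),
    ("M", "Single", 150000, "N"),
]


def build_bitmaps(rows, column):
    """Map each distinct value in a column to a list of bits, one per row."""
    bitmaps = {}
    for position, row in enumerate(rows):
        bitmaps.setdefault(row[column], [0] * len(rows))[position] = 1
    return bitmaps


def bitmap_and(*bitmaps):
    """AND the bitmaps position by position."""
    return [int(all(bits)) for bits in zip(*bitmaps)]


gender = build_bitmaps(ROWS, 0)
marital = build_bitmaps(ROWS, 1)
owner = build_bitmaps(ROWS, 3)

# Gender = 'M' AND MaritalStatus = 'Single' AND HomeOwner = 'Y'
owners = bitmap_and(gender["M"], marital["Single"], owner["Y"])
# Same query with HomeOwner = 'N'
non_owners = bitmap_and(gender["M"], marital["Single"], owner["N"])
matching_rows = [i for i, bit in enumerate(non_owners) if bit]   # zero-based positions

print(owners, non_owners, matching_rows)
```

Only the bitmaps are read to decide which rows qualify; the table itself is visited afterwards, and only for the positions whose result bit is 1.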

Bitmap indexes are best suited to columns with few distinct values, that is, unselective columns.

The major drawback of a bitmap index is that it cannot be used for applications with high transaction rates. A
database server cannot lock a single bit, so updating a bitmap-indexed column can result in locks being
applied to a large number of rows.

3.3 Bitmap Join Indexes

A bitmap join index is an index that identifies the rows in one table that have matching values in a second
table. It can be used to avoid joining the two tables to resolve a query, and it reduces the number of rows the
database server must scan to retrieve the results. Bitmap indexes merge more easily than B*-Tree indexes,
which makes this kind of index effective when several conditions are combined, although a well-chosen
composite index still generally performs better.
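
The idea behind a bitmap join index can be modelled in a few lines. Everything in the sketch below is hypothetical, including the customers and sales tables and the function name, and it is a conceptual model rather than a description of the database's internal structures. The join between the two tables is performed once, when the index is built, and each bitmap records which sales rows belong to customers in a given region.

```python
# Hypothetical tables: customers (customer_id -> region) and sales rows.
customers = {1: "East", 2: "West", 3: "East"}
sales = [(101, 1), (102, 2), (103, 3), (104, 2), (105, 1)]   # (sale_id, customer_id)


def build_bitmap_join_index(sales, customers):
    """For each region, build a bitmap over the sales rows.

    The lookup into customers (the join) happens here, at build time,
    so queries that filter sales by region no longer need it.
    """
    index = {}
    for position, (_, customer_id) in enumerate(sales):
        region = customers[customer_id]
        index.setdefault(region, [0] * len(sales))[position] = 1
    return index


join_index = build_bitmap_join_index(sales, customers)

# "Sales made to customers in the East region", answered from the bitmap alone.
east_sales = [sales[i][0] for i, bit in enumerate(join_index["East"]) if bit]

print(join_index, east_sales)
```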

3.4 Index Organized Table

An index organized table (IOT) looks like a normal table, but its rows are stored internally in a B*-Tree index
structure.

An IOT is organized as a B*-Tree index built on its primary key. The primary key and the remaining
columns are all stored in the index. Storing every column in the leaf blocks can, however, degrade the
structure of the index. To avoid this, one can use the INCLUDING clause to specify which columns are stored
in the leaf blocks. Columns that follow the column named in this clause are stored in an overflow
segment. The overflow segment can also be placed in a different tablespace, which can improve
the physical design of the table.
An IOT has several advantages: it avoids storing the same data in both an index and a table; searches are
faster than on a normal table because the required data is held in the index leaf block itself; and range
scans, and sometimes foreign key lookups, become more efficient because rows with consecutive key values
are stored together.
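
A loosely comparable structure can be tried out in SQLite, whose WITHOUT ROWID tables are also stored as a B-tree keyed on the primary key. The sketch below uses it, through Python's sqlite3 module, only as an analogy: it is not Oracle's IOT implementation, SQLite has no counterpart to the INCLUDING clause or the overflow segment, and the order_line table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# WITHOUT ROWID stores the rows inside the primary key B-tree itself.
conn.execute(
    "CREATE TABLE order_line ("
    " order_id INTEGER, line_no INTEGER, item TEXT,"
    " PRIMARY KEY (order_id, line_no)"
    ") WITHOUT ROWID"
)
conn.executemany(
    "INSERT INTO order_line VALUES (?, ?, ?)",
    [(2, 1, "pen"), (1, 2, "ink"), (3, 1, "pad"), (1, 1, "cap"), (2, 2, "nib")],
)

range_query = (
    "SELECT order_id, line_no, item FROM order_line "
    "WHERE order_id BETWEEN ? AND ? ORDER BY order_id, line_no"
)
# Column 3 of each EXPLAIN QUERY PLAN row holds the human-readable plan step.
detail = " | ".join(
    row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + range_query, (1, 2))
)
rows = conn.execute(range_query, (1, 2)).fetchall()

print(detail)
print(rows)
```

The range scan is resolved directly through the primary key structure, and rows with consecutive key values come back together even though they were inserted out of order.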

Thank You

Contact

For more information, contact gsl.cdsfiodg@tcs.com (Email Id of ISU)

About Tata Consultancy Services (TCS)

Tata Consultancy Services is an IT services, consulting and business solutions organization that delivers real results to
global business, ensuring a level of certainty no other firm can match. TCS offers a consulting-led, integrated portfolio of
IT and IT-enabled infrastructure, engineering and assurance services. This is delivered through its unique Global
Network Delivery Model(TM), recognized as the benchmark of excellence in software development. A part of the Tata
Group, India's largest industrial conglomerate, TCS has a global footprint and is listed on the National Stock Exchange
and Bombay Stock Exchange in India.

For more information, visit us at www.tcs.com.

IT Services
Business Solutions
Consulting

All content / information present here is the exclusive property of Tata Consultancy Services Limited (TCS). The content / information contained here is correct at the time
of publishing. No material from here may be copied, modified, reproduced, republished, uploaded, transmitted, posted or distributed in any form without prior written
permission from TCS. Unauthorized use of the content / information appearing here may violate copyright, trademark and other applicable laws, and could result in
criminal or civil penalties. Copyright 2011 Tata Consultancy Services Limited
