
UNIT 1:

Asymptotic Notations:
• Asymptotic Notations are the expressions that are used to represent the complexity of an
algorithm.
• Best Case: The analysis of an algorithm for the input on which it takes the least time or
space.
• Worst Case: The analysis of an algorithm for the input on which it takes the most time or
space.
• Average Case: The analysis of an algorithm's expected time or space over all inputs of a
given size; it lies between the best and worst case.
• Types of Data Structure Asymptotic Notation

1. Big-O Notation (O) – Big-O notation describes an upper bound on growth and is most often used for the worst case scenario.

• Consider function f(n) the time complexity of an algorithm and g(n) is the most
significant term. If f(n) <= C g(n) for all n >= n0, for some constants C > 0 and n0 >= 1, then we can
represent f(n) as O(g(n)).
f(n) = O(g(n))

Consider the following graph drawn for the values of f(n) and C g(n) for input (n) value
on X-Axis and time required is on Y-Axis

In the above graph, after a particular input value n0, C g(n) is always greater than f(n), which
indicates the algorithm's upper bound.

2. Omega Notation (Ω) – Omega (Ω) notation describes a lower bound on growth and is most often used for the best case scenario.

Consider function f(n) the time complexity of an algorithm and g(n) is the most significant
term. If f(n) >= C g(n) for all n >= n0, for some constants C > 0 and n0 >= 1, then we can
represent f(n) as Ω(g(n)).
f(n) = Ω(g(n))
Consider the following graph drawn for the values of f(n) and C g(n) for input (n) value on
X-Axis and time required is on Y-Axis

In the above graph, after a particular input value n0, C g(n) is always less than f(n), which indicates
the algorithm's lower bound.
3. Theta Notation (Θ) – This notation describes a tight bound: f(n) is bounded both above and below by constant multiples of the same g(n).

Consider function f(n) the time complexity of an algorithm and g(n) is the most significant
term. If C1 g(n) <= f(n) <= C2 g(n) for all n >= n0, for some constants C1, C2 > 0 and n0 >= 1, then we can
represent f(n) as Θ(g(n)).
f(n) = Θ(g(n))

Consider the following graph drawn for the values of f(n), C1 g(n) and C2 g(n) for input (n) value on
X-Axis and time required is on Y-Axis

In the above graph, after a particular input value n0, C1 g(n) is always less than f(n) and C2 g(n) is
always greater than f(n), which indicates the algorithm's tight bound.
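As a small worked example of the three definitions, take the hypothetical running time f(n) = 3n + 2 with g(n) = n. The constants below are one valid choice, not the only one.

```latex
\begin{align*}
f(n) &= 3n + 2, \qquad g(n) = n \\
3n &\le 3n + 2 \le 4n \quad \text{for all } n \ge 2 \\
\Rightarrow\ f(n) &= \Theta(n) \quad \text{with } C_1 = 3,\ C_2 = 4,\ n_0 = 2
\end{align*}
```

The right-hand inequality alone gives f(n) = O(n), and the left-hand inequality alone gives f(n) = Ω(n).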
One Dimensional array
• An array is a collection of fixed number of values of a single type. For example: if you
want to store 100 integers in sequence, you can create an array for it.
int data[100];

Multi Dimensional array


• C supports multidimensional arrays. The simplest form of the multidimensional array is
the two-dimensional array.
Two-dimensional Arrays
• A two-dimensional array is, in essence, a list of one-dimensional arrays. To declare a
two-dimensional array of size [x][y], you would write something as follows −

type arrayName [ x ][ y ];

pointer arrays
• You can generate a pointer to the first element of an array by simply specifying the array
name, without any index.
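A minimal sketch of this idea, assuming a hypothetical helper named sum_array: the caller passes the array name, which the function receives as a pointer to the first element and then walks with pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n integers by walking a pointer across the array.
   sum_array is an illustrative name, not part of any library. */
int sum_array(const int *first, size_t n)
{
    assert(first != NULL);
    int total = 0;
    for (const int *p = first; p != first + n; ++p)
        total += *p;   /* *p is the element the pointer currently refers to */
    return total;
}
```

With `int data[4] = {1, 2, 3, 4};`, the call `sum_array(data, 4)` passes the address of `data[0]`, and `*(data + 2)` refers to the same element as `data[2]`.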

Single linked list


• A linked list is a sequence of data structures, which are connected together via links.
• A singly linked list is a linear data structure in which each node contains only one link field.
The following diagram illustrates the singly linked list.

Operations:
The following are the operations of single linked list
1.Traversing
2.Searching
3.Insertion
4.Deletion
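The four operations can be sketched as follows. The names (snode, sll_insert_front and so on) are assumptions made for this example, and insertion is shown only at the front of the list.

```c
#include <stdlib.h>

struct snode {
    int data;
    struct snode *next;   /* the single link field */
};

/* Insertion: add a node at the front; returns the new head
   (or the old head if malloc fails). */
struct snode *sll_insert_front(struct snode *head, int value)
{
    struct snode *n = malloc(sizeof *n);
    if (n == NULL) return head;
    n->data = value;
    n->next = head;
    return n;
}

/* Traversing: visit every node once and count them. */
int sll_length(const struct snode *head)
{
    int count = 0;
    for (; head != NULL; head = head->next) count++;
    return count;
}

/* Searching: return the first node holding value, or NULL. */
struct snode *sll_search(struct snode *head, int value)
{
    while (head != NULL && head->data != value) head = head->next;
    return head;
}

/* Deletion: unlink and free the first node holding value; returns the head. */
struct snode *sll_delete(struct snode *head, int value)
{
    struct snode **link = &head;            /* points at the link to rewrite */
    while (*link != NULL && (*link)->data != value)
        link = &(*link)->next;
    if (*link != NULL) {
        struct snode *victim = *link;
        *link = victim->next;
        free(victim);
    }
    return head;
}
```

Every function takes the current head and, where the list may change, returns the head to use afterwards.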
Double linked list

• Double linked list is a sequence of elements in which every element has links to its
previous element and next element in the sequence.
• In double linked list, the first node must be always pointed by head.
Always the previous field of the first node must be NULL.
Always the next field of the last node must be NULL
Circular linked list
• In single linked list, every node points to its next node in the sequence and the last node
points NULL.
• But in circular linked list, every node points to its next node in the sequence but the last
node points to the first node in the list
• Circular linked list is a sequence of elements in which every element has link to its next
element in the sequence and the last element has a link to the first element in the
sequence.
Operations
In a circular linked list, we perform the following operations...
• Insertion
• Deletion
• Display

Circular Double linked list


• Circular Doubly Linked List has properties of both doubly linked list and circular linked
list in which two consecutive elements are linked or connected by previous and next
pointer and the last node points to first node by next pointer and also the first node points
to last node by previous pointer.
Following is representation of a Circular doubly linked list node in C/C++:

// Structure of the node


struct node
{
int data;
struct node *next; // Pointer to next node
struct node *prev; // Pointer to previous node
};
Insertion in Circular Doubly Linked List
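One possible sketch of insertion at the end of the list, reusing the node structure above. The function name cdll_insert_end is an assumption of this example.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;   /* Pointer to next node */
    struct node *prev;   /* Pointer to previous node */
};

/* Insert value just before head, i.e. at the end of the circle.
   Returns the head of the list (the new node if the list was empty,
   the unchanged head if malloc fails). */
struct node *cdll_insert_end(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL) return head;
    n->data = value;
    if (head == NULL) {          /* a single node points to itself both ways */
        n->next = n->prev = n;
        return n;
    }
    struct node *last = head->prev;
    n->next = head;
    n->prev = last;
    last->next = n;
    head->prev = n;
    return head;
}
```

Because the first node's prev pointer always reaches the last node, the insertion needs no traversal.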

Application of linked lists.


➢ Applications of linked list in computer science –
• Implementation of stacks and queues
• Implementation of graphs : Adjacency list representation of graphs is the most popular, and it
uses linked lists to store adjacent vertices.
• Dynamic memory allocation : We use linked list of free blocks.
• Maintaining directory of names
• Performing arithmetic operations on long integers
• Manipulation of polynomials by storing constants in the node of linked list
• representing sparse matrices
➢ Applications of linked list in real world-
• Image viewer – Previous and next images are linked, hence can be accessed by next and
previous button.
• Previous and next page in web browser – We can access previous and next url searched
in web browser by pressing back and next button since, they are linked as linked list.
• Music Player – Songs in music player are linked to previous and next song. you can play
songs either from starting or ending of the list.
UNIT-2
Stacks
• Stack is a linear data structure in which the insertion and deletion operations are
performed at only one end.
• In a stack, adding and removing of elements are performed at single position which is
known as "top".
• That means, new element is added at top of the stack and an element is removed from the
top of the stack
• In stack, the insertion and deletion operations are performed based on LIFO (Last In First
Out) principle
• Def:- "A Collection of similar data items in which both insertion and deletion operations
are performed based on LIFO principle".

Operations on Stacks
1.Push
2.Pop
3.Display
4.Isempty
5.Isfull
6.Peek
1.Push:-
In a stack, the insertion operation is performed using a function called "push".
2.Pop:-
In a stack, the deletion operation is performed using a function called "pop".
In the figure, PUSH and POP operations are performed at top position in the stack. That means,
both the insertion and deletion operations are performed at one end (i.e., at Top)
3.Display:-
The display operation prints the elements of the stack.
4.Isempty:-
It checks whether the stack is empty. If it is empty, the stack is in underflow, i.e. deletion is not
possible.
5.Isfull:-
It checks whether the stack is full. If it is full, the stack is in overflow, i.e. insertion is not
possible.
6.Peek:-
It is used to get the value of the top element without removing it.
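A minimal array-based sketch of these operations, assuming a fixed capacity STACK_MAX and illustrative function names. Display is omitted because it is a plain loop from top down to index 0.

```c
#include <stdbool.h>

#define STACK_MAX 100   /* assumed fixed capacity */

struct stack {
    int items[STACK_MAX];
    int top;            /* index of the top element, -1 when empty */
};

void stack_init(struct stack *s)          { s->top = -1; }
bool stack_isempty(const struct stack *s) { return s->top == -1; }
bool stack_isfull(const struct stack *s)  { return s->top == STACK_MAX - 1; }

/* Push: returns false on overflow. */
bool stack_push(struct stack *s, int value)
{
    if (stack_isfull(s)) return false;
    s->items[++s->top] = value;
    return true;
}

/* Pop: returns false on underflow, otherwise stores the removed value in *out. */
bool stack_pop(struct stack *s, int *out)
{
    if (stack_isempty(s)) return false;
    *out = s->items[s->top--];
    return true;
}

/* Peek: like pop, but leaves the element on the stack. */
bool stack_peek(const struct stack *s, int *out)
{
    if (stack_isempty(s)) return false;
    *out = s->items[s->top];
    return true;
}
```

Reporting overflow and underflow through the return value, rather than printing a message, is a design choice of this sketch.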
Expression
• In any programming language, if we want to perform any calculation or to frame a
condition etc., we use a set of symbols to perform the task. This set of symbols makes
an expression.
• An expression can be defined as follows...
An expression is a collection of operators and operands that represents a specific value.

• In above definition, operator is a symbol which performs a particular task like arithmetic
operation or logical operation or conditional operation etc.,
• Operands are the values on which the operators can perform the task. Here operand can
be a direct value or variable or address of memory location.
Expression Types
Based on the operator position, expressions are divided into THREE types. They are as follows...
1. Infix Expression
2. Postfix Expression
3. Prefix Expression
Infix Expression
In infix expression, operator is used in between operands.

The general structure of an Infix expression is as follows...


Operand1 Operator Operand2
Example: a + b

Postfix Expression
In postfix expression, operator is used after operands. We can say that "Operator follows the
Operands".

The general structure of Postfix expression is as follows...


Operand1 Operand2 Operator

Example: a b +

Prefix Expression
In prefix expression, operator is used before operands. We can say that "Operands follow the
Operator".

The general structure of Prefix expression is as follows...


Operator Operand1 Operand2
Example: + a b

Any expression can be represented using the above three different types of expressions. And we
can convert an expression from one form to another form like Infix to Postfix, Infix to
Prefix, Prefix to Postfix and vice versa.
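Postfix form is convenient because it can be evaluated left to right with a stack and no parentheses. The sketch below assumes single-digit operands and the four operators + - * /; the name eval_postfix is hypothetical.

```c
#include <ctype.h>

/* Evaluate a postfix expression made of single-digit operands and the
   operators + - * /. Returns 1 and stores the value in *result on success,
   0 if the expression is malformed or divides by zero. */
int eval_postfix(const char *expr, int *result)
{
    int stack[64];
    int top = -1;
    for (const char *p = expr; *p != '\0'; ++p) {
        if (*p == ' ') continue;
        if (isdigit((unsigned char)*p)) {
            if (top == 63) return 0;          /* operand stack overflow */
            stack[++top] = *p - '0';
            continue;
        }
        if (top < 1) return 0;                /* an operator needs two operands */
        int right = stack[top--];
        int left  = stack[top--];
        switch (*p) {
        case '+': stack[++top] = left + right; break;
        case '-': stack[++top] = left - right; break;
        case '*': stack[++top] = left * right; break;
        case '/':
            if (right == 0) return 0;
            stack[++top] = left / right;
            break;
        default:
            return 0;                         /* unknown symbol */
        }
    }
    if (top != 0) return 0;                   /* leftover operands */
    *result = stack[0];
    return 1;
}
```

For instance, the infix expression 2 + 3 * 4 is written 2 3 4 * + in postfix, and eval_postfix stores 14 for it.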
Applications of Stacks
• Stack is used to evaluate prefix, postfix and infix expressions.
• An expression can be represented in prefix, postfix or infix notation.
• Stack can be used to convert one form of expression to another.
• Redo-undo features at many places like editors, photoshop.
• Forward and backward feature in web browsers
• Used in many algorithms like Tower of Hanoi, tree traversals
• Other applications can be Backtracking, Knight tour problem, N queen
problem and sudoku solver
• In Graph Algorithms like Topological Sorting and Strongly Connected Components

Queues
• Queue is a linear data structure in which the insertion and deletion operations are
performed at two different ends.
• In a queue data structure, adding and removing of elements are performed at two
different positions.
• The insertion is performed at one end and deletion is performed at other end.
• In a queue data structure, the insertion operation is performed at a position which is
known as 'rear' and the deletion operation is performed at a position which is known as
'front'.
• In queue data structure, the insertion and deletion operations are performed based
on FIFO (First In First Out) principle.

• In a queue data structure, the insertion operation is performed using a function called
"enQueue()" and deletion operation is performed using a function called "deQueue()".
Queue data structure can be defined as follows...
Queue data structure is a linear data structure in which the operations are performed based on
FIFO principle.
Representations of Queues
Queue data structure can be implemented in two ways. They are as follows...
1. Using Array
2. Using Linked List
Various Queue Structures
There are four types of Queue:
1. Simple Queue
A queue can also be defined as
• "Queue data structure is a collection of similar data items in which insertion and deletion
operations are performed based on FIFO principle".

2. Circular Queue
• In a normal array-based Queue Data Structure, we can insert elements until the rear reaches
the last position.
• After that, we cannot insert the next element, even if deletions have freed positions at the
front, until all the elements are deleted from the queue and it is reset.
DEF: Circular Queue is a linear data structure in which the operations are performed based on
FIFO (First In First Out) principle and the last position is connected back to the first position to
make a circle.
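A minimal sketch of a circular queue on an array, assuming a capacity CQ_SIZE and illustrative names. It tracks front and a count instead of a separate rear index, which is one of several common layouts.

```c
#include <stdbool.h>

#define CQ_SIZE 5   /* assumed capacity */

struct cqueue {
    int items[CQ_SIZE];
    int front;   /* index of the oldest element */
    int count;   /* number of stored elements */
};

void cq_init(struct cqueue *q) { q->front = 0; q->count = 0; }

/* enQueue: insert at the rear; returns false when the queue is full. */
bool cq_enqueue(struct cqueue *q, int value)
{
    if (q->count == CQ_SIZE) return false;
    int rear = (q->front + q->count) % CQ_SIZE;   /* wraps past the last slot */
    q->items[rear] = value;
    q->count++;
    return true;
}

/* deQueue: remove from the front; returns false when the queue is empty. */
bool cq_dequeue(struct cqueue *q, int *out)
{
    if (q->count == 0) return false;
    *out = q->items[q->front];
    q->front = (q->front + 1) % CQ_SIZE;
    q->count--;
    return true;
}
```

After the queue fills and one element is removed, the next insertion lands in slot 0 again, which is the reuse a linear array queue cannot offer.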
Graphical representation of a circular queue is as follows...

3. Deque (Double Ended Queue)


• Double Ended Queue is also a Queue data structure in which the insertion and
deletion operations are performed at both the ends (front and rear).
• That means, we can insert at both front and rear positions and can delete from both
front and rear positions.

Double Ended Queue has TWO restricted variants, which are as follows...
1. Input Restricted Double Ended Queue
2. Output Restricted Double Ended Queue
Input Restricted Double Ended Queue
• In input restricted double ended queue, the insertion operation is performed at only one
end and deletion operation is performed at both the ends.

Output Restricted Double Ended Queue


In output restricted double ended queue, the deletion operation is performed at only one end and
insertion operation is performed at both the ends.

4. Priority Queue
• Priority queue contains data items which have some preset priority. While removing an
element from a priority queue, the data item with the highest priority is removed first.
• In a priority queue, insertion is performed in the order of arrival and deletion is
performed based on the priority.
Applications of Queues.
• Queue is useful in CPU scheduling, Disk Scheduling. When multiple processes require
CPU at the same time, various CPU scheduling algorithms are used which are
implemented using Queue data structure.
• When data is transferred asynchronously between two processes, a queue is used for
synchronization. Examples : IO Buffers, pipes, file IO, etc.
• Queues are used in Breadth First Search in a graph.
• Handling of interrupts in real-time systems: the interrupts are handled in the same order
as they arrive, first come first served.
• In real life, Call Center phone systems will use Queues, to hold people calling them in
an order, until a service representative is free.

UNIT-3
Trees
• Tree is a hierarchical data structure which stores the information naturally in the form of
hierarchy style.
• Tree is one of the most powerful and advanced data structures.
• It is a non-linear data structure compared to arrays, linked lists, stack and queue.
• It represents the nodes connected by edges.

Basic Terminologies

Field            Description
Root             Root is a special node in a tree. The entire tree is referenced through it.
                 It does not have a parent.
Parent Node      Parent node is an immediate predecessor of a node.
Child Node       All immediate successors of a node are its children.
Siblings         Nodes with the same parent are called Siblings.
Path             Path is a number of successive edges from source node to destination node.
Height of Node   Height of a node represents the number of edges on the longest path
                 between that node and a leaf.
Height of Tree   Height of tree represents the height of its root node.
Depth of Node    Depth of a node represents the number of edges from the tree's root node
                 to the node.
Degree of Node   Degree of a node represents the number of children of a node.
Edge             Edge is a connection between one node and another. It is a line between
                 two nodes or a node and a leaf.

Representations of Binary Tree


• In a normal tree, every node can have any number of children.
• Binary tree is a special type of tree data structure in which every node can have
a maximum of 2 children.
• One is known as left child and the other is known as right child.
A binary tree data structure is represented using two methods. Those methods are as follows...
1. Array Representation
2. Linked List Representation
Types of Binary Trees
1. Strictly Binary Tree
• In a binary tree, every node can have a maximum of two children.
• But in strictly binary tree, every node should have exactly two children or none.
• That means every internal node must have exactly two children. A strictly Binary Tree
can be defined as follows...

A binary tree in which every node has either two or zero number of children is called Strictly
Binary Tree

Strictly binary tree is also called as Full Binary Tree or Proper Binary Tree or 2-Tree

Strictly binary tree data structure is used to represent mathematical expressions.


Example
2. Complete Binary Tree
• In a binary tree, every node can have a maximum of two children.
• But in strictly binary tree, every node should have exactly two children or none and in
complete binary tree all the internal nodes must have exactly two children and at every level of
complete binary tree there must be 2^level number of nodes.
• For example at level 2 there must be 2^2 = 4 nodes and at level 3 there must be 2^3 = 8
nodes.

A binary tree in which every internal node has exactly two children and all leaf nodes are at
same level is called Complete Binary Tree.

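Complete binary tree, as defined here, is also called a Perfect Binary Tree. Many texts instead use
"complete" for a binary tree in which every level except possibly the last is full and the last
level is filled from left to right.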

3. Extended Binary Tree


• A binary tree can be converted into Full Binary tree by adding dummy nodes to existing
nodes wherever required.

The full binary tree obtained by adding dummy nodes to a binary tree is called as Extended
Binary Tree.
In above figure, a normal binary tree is converted into full binary tree by adding dummy nodes
(In pink colour).
Binary Search Tree
• In a binary tree, every node can have maximum of two children but there is no order of
nodes based on their values.
• In binary tree, the elements are arranged as they arrive to the tree, from top to bottom and
left to right.
A binary tree has the following time complexities...
1. Search Operation - O(n)
2. Insertion Operation - O(1)
3. Deletion Operation - O(n)
• To enhance the performance of binary tree, we use special type of binary tree known
as Binary Search Tree. Binary search tree mainly focus on the search operation in binary
tree. Binary search tree can be defined as follows...
Binary Search Tree is a binary tree in which every node contains only smaller values in its left
subtree and only larger values in its right subtree.
In a binary search tree, all the nodes in left subtree of any node contains smaller values and all
the nodes in right subtree of that contains larger values as shown in following figure...
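The ordering rule can be sketched in C as follows. The names bst_node, bst_insert and bst_search are assumptions of this example, and ignoring duplicate keys is a choice the notes do not specify.

```c
#include <stdlib.h>

struct bst_node {
    int key;
    struct bst_node *left, *right;
};

/* Insert key and return the (possibly new) root of the subtree.
   Duplicate keys are ignored in this sketch. */
struct bst_node *bst_insert(struct bst_node *root, int key)
{
    if (root == NULL) {
        struct bst_node *n = malloc(sizeof *n);
        if (n != NULL) { n->key = key; n->left = n->right = NULL; }
        return n;
    }
    if (key < root->key)      root->left  = bst_insert(root->left, key);
    else if (key > root->key) root->right = bst_insert(root->right, key);
    return root;
}

/* Search: each comparison discards one whole subtree. */
const struct bst_node *bst_search(const struct bst_node *root, int key)
{
    while (root != NULL && root->key != key)
        root = (key < root->key) ? root->left : root->right;
    return root;
}
```

The search visits one node per level, so its cost is proportional to the height of the tree rather than to the number of nodes.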

Heap Trees
• A Heap is a special Tree-based data structure in which the tree is a complete binary tree.
• Generally, heaps can be of two types:
• Max-Heap: In a Max-Heap the key present at the root node must be the greatest among the
keys present at all of its children. The same property must be recursively true for all
subtrees in that Binary Tree.
• Min-Heap: In a Min-Heap the key present at the root node must be the minimum among the
keys present at all of its children. The same property must be recursively true for all
subtrees in that Binary Tree.

Height Balanced Trees


• AVL tree is a height balanced tree.
• It is a self-balancing binary search tree.
• It was invented by Adelson-Velskii and Landis.
• AVL trees have faster retrieval than unbalanced binary search trees.
• It takes O(log n) time for addition and deletion operations.
• In an AVL tree, the heights of the left and right subtrees of any node cannot differ by more than one.

B-Trees
• B-Tree is a self-balancing search tree.
• In most of the other self-balancing search trees, like AVL and Red-Black Trees, it is
assumed that everything is in main memory.
• To understand the use of B-Trees, we must think of the huge amount of data that cannot
fit in main memory.

Red Black Trees


• Red-Black Tree is a self-balancing Binary Search Tree (BST) in which every node follows
these rules:
• Every node has a color either red or black.
• Root of tree is always black.
• There are no two adjacent red nodes (A red node cannot have a red parent or red child).
• Every path from a node (including root) to any of its descendant NULL node has the
same number of black nodes.
Graphs
• Graph is an abstract data type.
• It is a pictorial representation of a set of objects where some pairs of objects are
connected by links.
• Graph is used to implement the undirected graph and directed graph concepts from
mathematics.
• It represents many real life application. Graphs are used to represent the networks.
Network includes path in a city, telephone network etc.
• It is used in social networks like Facebook, LinkedIn etc.

Graph terminologies
• Vertex
An individual data element of a graph is called a Vertex. A vertex is also known as a node. In the above
example graph, A, B, C, D & E are known as vertices.
• Edge
An edge is a connecting link between two vertices. Edge is also known as Arc. An edge is
represented as (startingVertex, endingVertex). For example, in above graph, the link between
vertices A and B is represented as (A,B). In above example graph, there are 7 edges (i.e., (A,B),
(A,C), (A,D), (B,D), (B,E), (C,D), (D,E)).
Edges are of three types.
1. Undirected Edge - An undirected edge is a bidirectional edge. If there is an undirected
edge between vertices A and B then edge (A , B) is equal to edge (B , A).
2. Directed Edge - A directed edge is a unidirectional edge. If there is a directed edge
between vertices A and B then edge (A , B) is not equal to edge (B , A).
3. Weighted Edge - A weighted edge is an edge with a cost on it.
• Undirected Graph
A graph with only undirected edges is said to be undirected graph.
• Directed Graph
A graph with only directed edges is said to be directed graph.
• Mixed Graph
A graph with undirected and directed edges is said to be mixed graph.
• End vertices or Endpoints
The two vertices joined by an edge are called the end vertices (or endpoints) of the edge.

• Origin
If an edge is directed, its first endpoint is said to be the origin of the edge.
• Destination
If an edge is directed, its second endpoint is said to be the destination of the edge.
• Adjacent
If there is an edge between vertices A and B then both A and B are said to be adjacent. In other
words, Two vertices A and B are said to be adjacent if there is an edge whose end vertices are A
and B.
• Incident
An edge is said to be incident on a vertex if the vertex is one of the endpoints of that edge.
• Outgoing Edge
A directed edge is said to be an outgoing edge on its origin vertex.
• Incoming Edge
A directed edge is said to be incoming edge on its destination vertex.
• Degree
Total number of edges connected to a vertex is said to be degree of that vertex.
• Indegree
Total number of incoming edges connected to a vertex is said to be indegree of that vertex.
• Outdegree
Total number of outgoing edges connected to a vertex is said to be outdegree of that vertex.
• Parallel edges or Multiple edges
Two undirected edges that have the same end vertices, or two directed edges that
have the same origin and the same destination, are called parallel edges or multiple
edges.
• Self-loop
An edge (undirected or directed) is a self-loop if its two endpoints coincide.
• Simple Graph
A graph is said to be simple if there are no parallel and self-loop edges.
• Path
A path is a sequence of alternating vertices and edges that starts at a vertex and ends at a vertex
such that each edge is incident to its predecessor and successor vertex.
Representation of graphs
• Adjacency Matrix
• Incidence Matrix
• Adjacency List

Operations on Graphs
• Depth First Search
• Breadth First Search
Application of Graph Structures
• Graph is used to implement the undirected graph and directed graph concepts from
mathematics.
• It represents many real life application. Graphs are used to represent the networks.
Network includes path in a city, telephone network etc.
• It is used in social networks like Facebook, LinkedIn etc.
Shortest path problem
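• The shortest path problem is about finding a path between 2 vertices in a graph such that
the total sum of the edge weights is minimum.
• When every edge has the same weight (an unweighted graph), this problem can be solved
using BFS; graphs with differing edge weights need a weighted algorithm such as Dijkstra's.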
BFS (Breadth First Search)
• BFS traversal of a graph, produces a spanning tree as final result.
• A spanning tree is a connected subgraph that contains every vertex of the graph and has no cycles.
• We use Queue data structure with maximum size of total number of vertices in the graph
to implement BFS traversal of a graph.
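A minimal sketch of BFS on an adjacency matrix, which also records the number of edges on a shortest path from the source to each vertex. MAX_V and the function name bfs_distances are assumptions of this example.

```c
#define MAX_V 8   /* assumed maximum number of vertices */

/* Breadth-first search from src over an adjacency matrix.
   dist[v] receives the number of edges on a shortest path from src to v,
   or -1 if v is unreachable. */
void bfs_distances(int n, int adj[MAX_V][MAX_V], int src, int dist[MAX_V])
{
    int queue[MAX_V];          /* each vertex is enqueued at most once */
    int front = 0, rear = 0;

    for (int v = 0; v < n; v++) dist[v] = -1;
    dist[src] = 0;
    queue[rear++] = src;

    while (front < rear) {
        int u = queue[front++];
        for (int v = 0; v < n; v++) {
            if (adj[u][v] && dist[v] == -1) {   /* first visit is the shortest */
                dist[v] = dist[u] + 1;
                queue[rear++] = v;
            }
        }
    }
}
```

Numbering the vertices A to E of the example graph as 0 to 4 and starting from A, vertices B, C and D get distance 1 and E gets distance 2. The edges along which vertices are first reached form the BFS spanning tree.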

topological sorting.
• Topological sorting for a Directed Acyclic Graph (DAG) is a linear ordering of vertices
such that for every directed edge uv, vertex u comes before v in the ordering.
• Topological Sorting for a graph is not possible if the graph is not a DAG.
• For example, a topological sorting of the following graph is “5 4 2 3 1 0”. There can be
more than one topological sorting for a graph. For example, another topological sorting
of the following graph is “4 5 2 3 1 0”. The first vertex in topological sorting is always a
vertex with in-degree as 0 (a vertex with no incoming edges).
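The in-degree observation leads to one standard method, Kahn's algorithm, sketched below. The notes do not name a specific algorithm, so this is one possible choice; TS_MAX and topo_sort are names invented for the example.

```c
#define TS_MAX 8   /* assumed maximum number of vertices */

/* Kahn's algorithm: repeatedly output a vertex whose in-degree is 0.
   Returns the number of vertices written to order[]; a result smaller
   than n means the graph has a cycle and no topological order exists. */
int topo_sort(int n, int adj[TS_MAX][TS_MAX], int order[TS_MAX])
{
    int indeg[TS_MAX] = {0};
    int queue[TS_MAX];
    int front = 0, rear = 0, count = 0;

    for (int u = 0; u < n; u++)
        for (int v = 0; v < n; v++)
            if (adj[u][v]) indeg[v]++;

    for (int v = 0; v < n; v++)
        if (indeg[v] == 0) queue[rear++] = v;   /* possible first vertices */

    while (front < rear) {
        int u = queue[front++];
        order[count++] = u;
        for (int v = 0; v < n; v++)
            if (adj[u][v] && --indeg[v] == 0)   /* all predecessors are placed */
                queue[rear++] = v;
    }
    return count;
}
```

For a hypothetical graph with edges 0→1, 0→2, 1→3 and 2→3, the function writes the order 0 1 2 3. Adding the edge 3→0 creates a cycle, and the returned count drops below the number of vertices.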

UNIT-4
Sorting
• Sorting is nothing but arranging the data in ascending or descending order. The
term sorting came into picture, as humans realised the importance of searching quickly.
• There are so many things in our real life that we need to search for, like a particular
record in database, roll numbers in merit list, a particular telephone number in telephone
directory, a particular page in a book etc.
• All this would have been a mess if the data was kept unordered and unsorted, but
fortunately the concept of sorting came into existence, making it easier for everyone to
arrange data in an order, hence making it easier to search.

Sorting Techniques
There are many different techniques available for sorting, differentiated by their efficiency and
space requirements. Following are the sorting techniques covered in this unit.
1. Bubble Sort
2. Insertion Sort
3. Selection Sort
4. Quick Sort
5. Merge Sort
6. Heap Sort
7. Shell Sort
1. Bubble sort:-
• Bubble sort is a simple sorting algorithm.
• This sorting algorithm is comparison-based algorithm in which each pair of adjacent
elements is compared and the elements are swapped if they are not in order.
• This algorithm is not suitable for large data sets as its average and worst case complexity
are of O(n^2) where n is the number of items.
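A minimal sketch of bubble sort for ascending order. The early exit when a pass makes no swap is a common refinement that the notes do not mention.

```c
#include <stdbool.h>
#include <stddef.h>

void bubble_sort(int a[], size_t n)
{
    for (size_t pass = 0; pass + 1 < n; pass++) {
        bool swapped = false;
        /* after each pass the largest remaining element has bubbled to the end */
        for (size_t i = 0; i + 1 < n - pass; i++) {
            if (a[i] > a[i + 1]) {
                int tmp = a[i]; a[i] = a[i + 1]; a[i + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped) break;   /* no swap in a full pass: already sorted */
    }
}
```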

2.Selection Sort

• Selection Sort algorithm is used to arrange a list of elements in a particular order
(Ascending or Descending).
• In selection sort, the first element in the list is selected and it is compared repeatedly with
all the remaining elements in the list. If any element is smaller than the selected element
(for Ascending order), then both are swapped.
• Then we select the element at second position in the list and it is compared with
all the remaining elements in the list.
• If any element is smaller than the selected element, then both are swapped. This
procedure is repeated till the entire list is sorted.
• The time complexity of selection sort is O(N^2).
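The procedure can be sketched as below, following the description literally: the selected position is swapped whenever a smaller element is found. Many textbook versions instead remember the index of the minimum and swap once per pass, which performs fewer swaps but the same number of comparisons.

```c
#include <stddef.h>

void selection_sort(int a[], size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {          /* i is the selected position */
        for (size_t j = i + 1; j < n; j++) {
            if (a[j] < a[i]) {                    /* smaller than the selected element */
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
            }
        }
    }
}
```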

3)Insertion Sort

• Sorting is the process of arranging a list of elements in a particular order (Ascending or
Descending).
• Insertion sort algorithm arranges a list of elements in a particular order.
• In insertion sort algorithm, every iteration moves an element from unsorted portion to
sorted portion until all the elements are sorted in the list.
• In the worst case, each element is compared with all the elements already in the sorted portion.
For N elements, the number of comparisons grows in proportion to N^2. Therefore, the time complexity is O(N^2).
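A minimal sketch of insertion sort for ascending order; the function name is an assumption of this example.

```c
#include <stddef.h>

void insertion_sort(int a[], size_t n)
{
    for (size_t i = 1; i < n; i++) {      /* a[0..i-1] is the sorted portion */
        int key = a[i];                   /* next element of the unsorted portion */
        size_t j = i;
        while (j > 0 && a[j - 1] > key) { /* shift larger elements one slot right */
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}
```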

4. Merge sort:-
• Merge sort is a sorting technique based on divide and conquer technique.
• With worst-case time complexity being O(n log n), it is one of the most respected
algorithms.
• Merge sort first divides the array into equal halves and then combines them in a sorted
manner.
• A list of size N is halved about log N times, and merging all the sublists at each level of
halving takes O(N) time, so the worst case run time of this algorithm is O(N log N).
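A top-down sketch of merge sort that uses one scratch buffer. The helper names and the error-code return value are assumptions of this example.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs a[lo..mid) and a[mid..hi) through tmp. */
static void merge_halves(int a[], int tmp[], size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];   /* ties take the left element */
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof a[0]);
}

static void merge_sort_range(int a[], int tmp[], size_t lo, size_t hi)
{
    if (hi - lo < 2) return;              /* 0 or 1 element is already sorted */
    size_t mid = lo + (hi - lo) / 2;
    merge_sort_range(a, tmp, lo, mid);
    merge_sort_range(a, tmp, mid, hi);
    merge_halves(a, tmp, lo, mid, hi);
}

/* Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int merge_sort(int a[], size_t n)
{
    if (n < 2) return 0;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) return -1;
    merge_sort_range(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```

The extra buffer of N integers is the usual space cost of this approach.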

5.Quick sort
• Quick sort is a highly efficient sorting algorithm and is based on partitioning of array of
data into smaller arrays.
• A large array is partitioned into two arrays one of which holds values smaller than the
specified value, say pivot, based on which the partition is made and another array holds
values greater than the pivot value.
• Quick sort partitions an array and then calls itself recursively twice to sort the two
resulting sub arrays.
• This algorithm is quite efficient for large-sized data sets: its average case complexity is
O(n log n), although its worst case complexity is O(n^2), where n is the number of items.
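A minimal sketch of quick sort. The notes do not say how the pivot is chosen; this example assumes the last element of the range (the Lomuto partition scheme), and the helper names are invented here.

```c
static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Partition a[lo..hi] around the pivot a[hi]; returns the pivot's final index. */
static long partition(int a[], long lo, long hi)
{
    int pivot = a[hi];
    long i = lo;                       /* a[lo..i-1] holds values smaller than pivot */
    for (long j = lo; j < hi; j++) {
        if (a[j] < pivot) {
            swap_int(&a[i], &a[j]);
            i++;
        }
    }
    swap_int(&a[i], &a[hi]);
    return i;
}

/* Sort a[lo..hi], both bounds inclusive. */
void quick_sort(int a[], long lo, long hi)
{
    if (lo >= hi) return;
    long p = partition(a, lo, hi);
    quick_sort(a, lo, p - 1);          /* values smaller than the pivot */
    quick_sort(a, p + 1, hi);          /* values not smaller than the pivot */
}
```

With this pivot choice an already sorted array is the worst case, because every partition leaves one side empty.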
6.Shell sort
• Shell sort is a highly efficient sorting algorithm and is based on insertion sort algorithm.
• This algorithm avoids large shifts as in case of insertion sort, if the smaller value is to the
far right and has to be moved to the far left.
• This algorithm uses insertion sort on widely spaced elements first, and
then sorts the less widely spaced elements. This spacing is termed the interval.
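A minimal sketch of shell sort. The notes do not fix the interval sequence; this example assumes the simple sequence n/2, n/4, ..., 1.

```c
#include <stddef.h>

void shell_sort(int a[], size_t n)
{
    for (size_t gap = n / 2; gap > 0; gap /= 2) {     /* shrink the interval */
        /* insertion sort over elements that are gap positions apart */
        for (size_t i = gap; i < n; i++) {
            int key = a[i];
            size_t j = i;
            while (j >= gap && a[j - gap] > key) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = key;
        }
    }
}
```

The final pass with interval 1 is an ordinary insertion sort, but by then the elements are close to their final positions.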

7.Heap sort

• Heaps can be used in sorting an array.


• In max-heaps, maximum element will always be at the root.
• In min-heaps, minimum element will always be at the root.
• Heap Sort uses this property of heap to sort the array.
• max_heapify has complexity O(log N), build_maxheap has complexity O(N), and we run
max_heapify N-1 times in the heap_sort function; therefore the complexity of the heap_sort
function is O(N log N).
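A sketch of heap sort using the function names from the notes. The build step is written inline rather than as a separate build_maxheap function, and the array is treated as a 0-based max-heap, both of which are assumptions of this example.

```c
#include <stddef.h>

/* Sift a[i] down until the subtree rooted at i is a max-heap again. */
static void max_heapify(int a[], size_t n, size_t i)
{
    for (;;) {
        size_t largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && a[l] > a[largest]) largest = l;
        if (r < n && a[r] > a[largest]) largest = r;
        if (largest == i) return;
        int t = a[i]; a[i] = a[largest]; a[largest] = t;
        i = largest;
    }
}

void heap_sort(int a[], size_t n)
{
    for (size_t i = n / 2; i-- > 0; )       /* build the max-heap bottom-up */
        max_heapify(a, n, i);
    for (size_t end = n; end-- > 1; ) {     /* runs N-1 times */
        int t = a[0]; a[0] = a[end]; a[end] = t;   /* move the maximum to its final slot */
        max_heapify(a, end, 0);             /* restore the heap on the remaining prefix */
    }
}
```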

UNIT-5
Searching

• Search is a process of finding a value in a list of values. In other words, searching is the
process of locating given value position in a list of values.
• Searching Techniques

1. Sequential Search
2. Binary Search
Sequential Searches
• Sequential search is also called as Linear Search.
• Sequential search starts at the beginning of the list and checks every element of the list.
• It is a basic and simple search algorithm.
• Sequential search compares the element with all the other elements given in the list. If
the element is matched, it returns the value index, else it returns -1.

• The above figure shows how sequential search works. It searches the array, element by
element, until the desired element or value is found. If we search for the element 25, it
will go step by step in sequence order. Sequential search is applied on an unsorted or
unordered list, or when there are few elements in a list.
• The following code snippet shows the sequential search operation:
// Return the index of the first element equal to target, or -1 if there is none.
function searchValue(values, target)
{
  for (let i = 0; i < values.length; i++)
  {
    if (values[i] === target)
    {
      return i;
    }
  }
  return -1;
}
// Call the function with the array and the number to be searched
searchValue([10, 5, 15, 20, 25, 35], 25); // returns 4
Binary Search
• Binary Search is used for searching an element in a sorted array.
• It is a fast search algorithm with run-time complexity of O(log n).
• Binary search works on the principle of divide and conquer.
• This searching technique looks for a particular element by comparing the middle most
element of the collection.
• It is useful when there are large number of elements in an array.

• The above array is sorted in ascending order. As we know binary search is applied on
sorted lists only for fast searching.
• The following code snippet shows the binary search operation:

f = 0;
l = size - 1;
m = (f+l)/2;

while (f <= l) {
if (list[m] < sElement)
f = m + 1;
else if (list[m] == sElement) {
printf("Element found at index %d.\n",m);
break;
}
else
l = m - 1;
m = (f + l)/2;
}
if (f > l)
printf("Element Not found in the list.");
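The fragment above can be wrapped in a self-contained function, sketched below. The name binary_search and the half-open search range are choices made for this example; the midpoint is computed as first + (last - first) / 2 so that the sum cannot overflow for very large arrays.

```c
#include <stddef.h>

/* Returns the index of target in the ascending array list[0..size-1], or -1. */
long binary_search(const int list[], size_t size, int target)
{
    size_t first = 0, last = size;               /* search the range [first, last) */
    while (first < last) {
        size_t mid = first + (last - first) / 2;
        if (list[mid] < target)
            first = mid + 1;                     /* discard the left half */
        else if (list[mid] > target)
            last = mid;                          /* discard the right half */
        else
            return (long)mid;
    }
    return -1;
}
```

Each iteration halves the remaining range, which is where the O(log n) running time comes from.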

Hash tables

• Hashing is a technique that is used to uniquely identify a specific object from a group of
similar objects. Some examples of how hashing is used in our lives include:

• In universities, each student is assigned a unique roll number that can be used to retrieve
information about them.
• In libraries, each book is assigned a unique number that can be used to determine
information about the book, such as its exact position in the library or the users it has
been issued to etc.
• In both these examples the students and books were hashed to a unique number.

➢ Hashing is implemented in two steps:


• An element is converted into an integer by using a hash function. This integer can be
used as an index to store the original element, which falls into the hash table.
• The element is stored in the hash table where it can be quickly retrieved using the hashed
key.
hash = hashfunc(key)
index = hash % array_size

Collision Resolutions

Separate chaining (open hashing)

• Separate chaining is one of the most commonly used collision resolution techniques.
• It is usually implemented using linked lists. In separate chaining, each element of the
hash table is a linked list.
• To store an element in the hash table you must insert it into a specific linked list.
• If there is any collision (i.e. two different elements have same hash value) then store both
the elements in the same linked list.
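A minimal sketch of separate chaining for integer keys. TABLE_SIZE, the entry structure and the function names are assumptions of this example, and the hash function is simply the key modulo the table size.

```c
#include <stdlib.h>

#define TABLE_SIZE 10   /* assumed number of slots */

struct entry {
    int key;
    struct entry *next;
};

static struct entry *table[TABLE_SIZE];   /* every slot is the head of a linked list */

static unsigned bucket_of(int key) { return (unsigned)key % TABLE_SIZE; }

/* Insert key at the front of its slot's list; returns 0 on success, -1 on failure. */
int chain_insert(int key)
{
    struct entry *e = malloc(sizeof *e);
    if (e == NULL) return -1;
    e->key = key;
    e->next = table[bucket_of(key)];
    table[bucket_of(key)] = e;
    return 0;
}

/* Returns 1 if key is stored in the table, 0 otherwise. */
int chain_contains(int key)
{
    for (const struct entry *e = table[bucket_of(key)]; e != NULL; e = e->next)
        if (e->key == key) return 1;
    return 0;
}
```

Keys 12 and 22 both hash to slot 2 here, so they end up in the same linked list, and a lookup walks only that list.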

Linear probing (open addressing or closed hashing)

• In open addressing, instead of in linked lists, all entry records are stored in the array
itself.
• When a new entry has to be inserted, the hash index of the hashed value is computed and
then the array is examined (starting with the hashed index).
• If the slot at the hashed index is unoccupied, then the entry record is inserted in slot at the
hashed index else it proceeds in some probe sequence until it finds an unoccupied slot.

index = index % hashTableSize


index = (index + 1) % hashTableSize
index = (index + 2) % hashTableSize
index = (index + 3) % hashTableSize
Quadratic Probing

• Quadratic probing is similar to linear probing and the only difference is the interval
between successive probes or entry slots.
• Here, when the slot at a hashed index for an entry record is already occupied, you must
start traversing until you find an unoccupied slot.
• The interval between slots is computed by adding the successive value of an arbitrary
polynomial in the original hashed index.

index = index % hashTableSize
index = (index + 1^2) % hashTableSize
index = (index + 2^2) % hashTableSize
index = (index + 3^2) % hashTableSize

Double hashing:

• Double hashing is similar to linear probing and the only difference is the interval between
successive probes.
• Here, the interval between probes is computed by using two hash functions.

index = (index + 1 * indexH) % hashTableSize;


index = (index + 2 * indexH) % hashTableSize;

Hash Method
• Define a hashing method to compute the hash code of the key of the data item.

int hashCode(int key){
   return key % SIZE;
}
• Search Method
Whenever an element is to be searched, compute the hash code of the key passed and locate the
element using that hash code as index in the array. Use linear probing to get the element ahead
if the element is not found at the computed hash code.

• Insert Method
Whenever an element is to be inserted, compute the hash code of the key passed and locate the
index using that hash code as an index in the array. Use linear probing for empty location, if an
element is found at the computed hash code.

• Delete Method
Whenever an element is to be deleted, compute the hash code of the key passed and locate the
index using that hash code as an index in the array. Use linear probing to get the element ahead
if an element is not found at the computed hash code. When found, store a dummy item there to
keep the performance of the hash table intact.
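The three methods can be sketched together as follows. SIZE, the slot layout and the ht_ function names are assumptions of this example; the "dummy item" is modelled as a DELETED state, and keys are assumed to be non-negative so that key % SIZE is a valid index.

```c
#define SIZE 20          /* assumed table capacity */

enum slot_state { EMPTY, USED, DELETED };   /* DELETED plays the role of the dummy item */

struct slot {
    int key;
    int value;
    enum slot_state state;
};

static struct slot hash_array[SIZE];   /* zero-initialised, so every slot starts EMPTY */

int hashCode(int key) { return key % SIZE; }

/* Search: returns the index holding key, or -1 if it is absent. */
int ht_search(int key)
{
    int i = hashCode(key);
    for (int probes = 0; probes < SIZE && hash_array[i].state != EMPTY; probes++) {
        if (hash_array[i].state == USED && hash_array[i].key == key) return i;
        i = (i + 1) % SIZE;           /* linear probing: try the next slot */
    }
    return -1;
}

/* Insert: stores (key, value); returns 0 on success, -1 if the table is full. */
int ht_insert(int key, int value)
{
    int i = ht_search(key);           /* an existing key is overwritten in place */
    if (i < 0) {
        int probes = 0;
        i = hashCode(key);
        while (probes < SIZE && hash_array[i].state == USED) {
            i = (i + 1) % SIZE;
            probes++;
        }
        if (probes == SIZE) return -1;
    }
    hash_array[i].key = key;
    hash_array[i].value = value;
    hash_array[i].state = USED;
    return 0;
}

/* Delete: marks the slot DELETED so later probes keep walking past it. */
int ht_delete(int key)
{
    int i = ht_search(key);
    if (i < 0) return -1;
    hash_array[i].state = DELETED;
    return 0;
}
```

Marking the slot instead of emptying it matters: if the slot became EMPTY, a search for a key stored further along the same probe sequence would stop too early.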
Bucket hashing:
• A bucket is simply a fast-access location (like an array index) that is the result of the
hash function.

Applications

• Associative arrays: Hash tables are commonly used to implement many types of in-
memory tables. They are used to implement associative arrays (arrays whose indices are
arbitrary strings or other complicated objects).
• Database indexing: Hash tables may also be used as disk-based data structures and
database indices (such as in dbm).
• Caches: Hash tables can be used to implement caches i.e. auxiliary data tables that are
used to speed up the access to data, which is primarily stored in slower media.
• Object representation: Several dynamic languages, such as Perl, Python, JavaScript, and
Ruby use hash tables to implement objects.
• Hash Functions are used in various algorithms to make their computing faster
