You are on page 1of 31

9/9/2018

Data Structure

Introduction
Divyashikha Sethia
Department of CSE, DTU
divyashikha@dtu.ac.in

Program
Good program defined as a program:
• runs correctly
• easy to read and understand
• easy to debug and
• easy to modify.

Divyashikha Sethia (DTU) 2

1
9/9/2018

Data Structure
• A data structure is basically a group of data
elements that are put together under one
name, and which defines a particular way of
storing and organizing data in a computer so
that it can be used efficiently.

Divyashikha Sethia (DTU) 3

Data Structure…
• The term data means a value or set of values.
It specifies either the value of a variable or a
constant (e.g., marks of students)
• A record is a collection of data items. For
example, the name, address, course, and
marks obtained are individual data items
• A file is a collection of related records. For
example, if there are 60 students in a class,
then there are 60 records of the students

Divyashikha Sethia (DTU) 4

2
9/9/2018

CLASSIFICATION OF DATA STRUCTURES


Primitive and Non-primitive Data Structures
• Primitive data structures: fundamental data
types which are supported by a programming
language. Some basic data types are integer,
real, character, and boolean
• Non-primitive data structures: which are
created using primitive data structures.
Examples of such data structures include
linked lists, stacks, trees, and graphs

Divyashikha Sethia (DTU) 5

Non-primitive data structure


• Linear: elements of data structure are stored in a
linear or sequential order (arrays, linked lists,
stacks, and queues)
- memory representation: linear relationship
between elements by means of sequential
memory locations
- relationship between elements by means of links.
• Non-linear: elements of a data structure are not
stored in a sequential order (trees, graphs)
- relationship of adjacency is not maintained
between elements
Divyashikha Sethia (DTU) 6

3
9/9/2018

Arrays
• collection of similar data elements
• Elements stored in consecutive memory
locations and are referenced by an index (also
known as the subscript).
• want to store large amount of similar type of
data int marks[10]; in C
• Memory representation

Divyashikha Sethia (DTU) 7

Arrays..
Limitations:
• Fixed size.
• Data elements stored in contiguous memory
locations which may not be always available.
• Insertion and deletion of elements can be
problematic because of shifting of elements.

Divyashikha Sethia (DTU) 8

4
9/9/2018

Linked Lists
• Dynamic data structure in which elements (called
nodes) form a sequential list.
• Each node is allocated space as it is added to the list.
Every node in list points to next node in the list
• Each node has two elements
i)Value of the node or any other data that corresponds
to that node
ii)Pointer or link to the next node in the list (last pointer
null for tail)

Divyashikha Sethia (DTU) 9

Linked List…
• Advantage: Easier to insert or delete data
elements
• Disadvantage: Slow search operation and
requires more memory space

Divyashikha Sethia (DTU) 10

5
9/9/2018

Stacks

• Linear data structure : insertion and deletion


of elements are done at only one end, which
is known as the top of the stack.
• Also called a last-in, first-out (LIFO) structure:
last element added to stack is also first
element deleted from stack
• Implementation: arrays or linked lists

Divyashikha Sethia (DTU) 11

Stack…
• Array implementation of a stack

• Top stores address of topmost element of stack.


• Position from where the element will be added or deleted.
• MAX variable: stores maximum number of elements stack
can store
• top = NULL, => stack is empty
• top = MAX–1 => stack is full

Divyashikha Sethia (DTU) 12

6
9/9/2018

Stacks..
Basic operations:
• Push: adds element to top of stack.
• Pop: removes element from top of stack.
• Peep: returns value of topmost element of stack
(without deleting it).

Divyashikha Sethia (DTU) 13

Stacks..
Important condition to check:
• Overflow: try to insert element into stack that is
already full
• Underflow: try to delete an element from stack
that is already empty.

Divyashikha Sethia (DTU) 14

7
9/9/2018

Queues
• First-in, first-out (FIFO) data structure : element inserted
first is also first one to be taken out
• Added at one end called rear and removed from other end
called front
• Implementation: arrays / linked lists

Divyashikha Sethia (DTU) 15

Queues..
• Array representation

• After insertion

• After deletion

Divyashikha Sethia (DTU) 16

8
9/9/2018

Queues..
• MAX –size of the queue
• Conditions
-overflow: insert element into queue that is
already full (rear = MAX – 1)
- underflow: delete element from queue that is
already empty.
If front = NULL and rear = NULL => no element in
queue.

Divyashikha Sethia (DTU) 17

Trees
• Non-linear structure: collection of nodes
arranged in a hierarchical order.
• One of the nodes is designated as root node
• Remaining nodes can be partitioned into
disjoint sets such that each set is a sub-tree of
the root

Divyashikha Sethia (DTU) 18

9
9/9/2018

Binary Trees (simplest)


• Consists of root node and left and
right sub-trees, where both sub-
trees are also binary trees.
• Each node contains a data element,
a left pointer which points to the
left sub-tree, and a right pointer
which points to the right sub-tree.
• The root element is the topmost
node which is pointed by a ‘root’
pointer. If root = NULL then the tree
is empty

Divyashikha Sethia (DTU) 19

Trees..
• Advantage: Provides quick search, insert, and
delete operations
• Disadvantage: Complicated deletion algorithm

Divyashikha Sethia (DTU) 20

10
9/9/2018

Graphs
• Collection of vertices (also called nodes)
and edges that connect these vertices.
• Generalization of the tree structure:
complex relationship between nodes
instead of parent-child relationship
• Example
-Graphs of a city with places as nodes and
road as edges
-Computer network: nodes are workstations
and edges are the network connections
- Searching graph and finding shortest path
between nodes of a graph

Divyashikha Sethia (DTU) 21

Graphs…
• Two nodes are connected via an edge, the two
nodes are known as neighbours
• Advantage: Best models real-world situations
• Disadvantage: Some algorithms are slow and
very complex

Divyashikha Sethia (DTU) 22

11
9/9/2018

OPERATIONS ON DATA STRUCTURES


• Traversing : to access each data item exactly once so that
it can be processed. For example, to print the names of all
the students in a class.
• Searching : find the location of one or more data items
that satisfy the given constraint. Such a data item may or
may not be present in the given collection of data items.
For example, to find the names of all the students who
secured 100 marks in mathematics.
• Inserting: add new data items to the given list of data
items. Eg: add the details of a new student who has
recently joined the course.

Divyashikha Sethia (DTU) 23

OPERATIONS ON DATA STRUCTURES


• Deleting : to remove (delete) a particular data item from
the given collection of data items. Eg: to delete the name
of a student who has left the course.
• Sorting Data : data arranged in some order like ascending
order or descending order
Eg: arranging the names of students in a class in an
alphabetical order, or calculating the top three winners by
arranging the participants’ scores in descending order and
then extracting the top three.
• Merging Lists of two sorted data items : combined to form
a single list of sorted data items.

Divyashikha Sethia (DTU) 24

12
9/9/2018

ABSTRACT DATA TYPE


• Way we look at data structure, focusing on what it does and
ignoring how it does its job.
• Eg: ADT stack and queue: implement both ADTs using array/ list
• Data type : set of values that the variable can take.
- data types in C include int, char, float, and double.
- Data item with certain characteristics and permissible operations on
that data
- Int data with values –32768 to 32767 operators +, –, *, and /
• Abstract The word ‘abstract’ in the context of data structures
means considered apart from the detailed specifications or
implementation
- stack or a queue, user is concerned only with type of data and
operations that can be performed on it. Fundamentals of how data
is stored should be invisible to user.
- users aware that stacks, have push() and pop() functions
Divyashikha Sethia (DTU) 25

Algorithm
• A formally defined procedure for performing
some calculation
• Algorithm provides blueprint to write program to
solve particular problem.
• It is considered to be an effective procedure for
solving a problem in finite number of steps.
• Well-defined algorithm always provides an
answer and is guaranteed to terminate.
• Choice of a particular algorithm must depend on
the time and space complexity of the algorithm.

Divyashikha Sethia (DTU) 26

13
9/9/2018

Algorithm Design
 Modularization
• process of dividing an algorithm into modules (smaller units)
• Advantages
- Makes complex algorithm simpler to design and implement.
- Each module can be designed independently.
- simplifies implementation, debugging, testing, documenting, and
maintenance of the overall algorithm.
• There are two main approaches to design an algorithm—top-down
approach and bottom-up

Divyashikha Sethia (DTU) 27

Top-down approach
• Dividing the complex algorithm into one or
more modules. These modules can further be
decomposed into one or more sub-modules
• stepwise refinement
• start from an abstract design and then at each
step, this design is refined into more concrete
level

Divyashikha Sethia (DTU) 28

14
9/9/2018

Bottom-up approach
• start with designing the most basic or
concrete modules and then proceed towards
designing higher level modules
• sub-modules are grouped together to form a
higher level module. All the higher level
modules are clubbed together to form even
higher level modules. This process is repeated
until the design of the complete algorithm is
obtained.

Divyashikha Sethia (DTU) 29

Comparison
Top-Down Bottom-up
highly appreciated for ease in allows information hiding as it first
documenting the modules, generation of identifies what has to be encapsulated
test cases, implementation of code, and within a module and then provides an
debugging abstract interface to define the module’s
boundaries as seen from the clients

sub-modules are analysed in isolation difficult to be done in a strict bottom-up


without concentrating on their strategy
communication with other modules or on
reusability of components and little
attention is paid to data, thereby ignoring
the concept of information hiding.

Divyashikha Sethia (DTU) 30

15
9/9/2018

CONTROL STRUCTURES USED IN


ALGORITHMS
• Sequence
• Decision
• Repetition

Divyashikha Sethia (DTU) 31

Sequence
• Each step of an algorithm is executed in a
specified order.

Divyashikha Sethia (DTU) 32

16
9/9/2018

Decision
• execution of a process depends on the
outcome of some condition

Divyashikha Sethia (DTU) 33

Repetition
• executing one or more steps for number of
times
• Constructs: while, do–while, and for loops

Divyashikha Sethia (DTU) 34

17
9/9/2018

TIME AND SPACE COMPLEXITY


• Analysing an algorithm means determining amount of resources
(such as time and memory) needed to execute it.
• time complexity of an algorithm: is basically the running time of a
program as a function of the input size.
- number of machine instructions dependent on the input size
• space complexity of an algorithm : is the amount of computer
memory that is required during the program execution as a function
of the input size
- Fixed part: It varies from problem to problem. It includes the
space needed for storing instructions, constants, variables, and
structured variables (like arrays and structures).
- Variable part: It varies from program to program. It includes the
space needed for recursion stack, and for structured variables that
are allocated space dynamically during the runtime of a program.

Divyashikha Sethia (DTU) 35

Time Complexity
• Worst-case running time :
- denotes the behavior of algorithm with respect
to worst possible case of input instance.
- is an upper bound on running time for any
input.
- gives us an assurance that algorithm will never
go beyond this time limit.

Divyashikha Sethia (DTU) 36

18
9/9/2018

Time Complexity..
• Average-case running time :
- is an estimate of running time for ‘average’
input.
- It specifies expected behavior of algorithm
when input is randomly drawn from a given
distribution.
- assumes that all inputs of given size are
equally likely.

Divyashikha Sethia (DTU) 37

Time Complexity..
• Best-case running time
- used to analyse an algorithm under optimal conditions.
- eg: best case for a simple linear search on an array
occurs when the desired element is the first in the list.
- while developing and choosing an algorithm to solve
problem, we hardly base our decision on best-case
performance.
- It is always recommended to improve average
performance and worst-case performance of an
algorithm.

Divyashikha Sethia (DTU) 38

19
9/9/2018

Time Complexity..
• Amortized running time
- refers to the time required to perform a
sequence of (related) operations averaged
over all the operations performed. Amortized
analysis
• guarantees the average performance of each
operation in the worst case.

Divyashikha Sethia (DTU) 39

Time-Space tradeoff
• The best algorithm to solve a particular problem
at hand is no doubt the one that requires less
memory space and takes less time to complete its
execution. But practically, designing such an ideal
algorithm is not a trivial task
• if space is a big constraint, then one might choose
a program that takes less space at the cost of
more CPU time.
• if time is a major constraint, then one might
choose a program that takes minimum time to
execute at the cost of more space

Divyashikha Sethia (DTU) 40

20
9/9/2018

Expressing Time and Space Complexity

• expressed using a function f(n) where n :


input size for given instance of problem being
solved.
• to predict rate of growth of complexity as the
input size of problem increases
• to find the algorithm that is most efficient.
• Big O notation: provides upper bound for
complexity

Divyashikha Sethia (DTU) 41

Algorithmic Efficiency
• function is linear (without any loops or
recursions): efficiency of algorithm or running
time of that algorithm can be given as number
of instructions it contains

Divyashikha Sethia (DTU) 42

21
9/9/2018

Linear Loops

number of iterations is directly proportional to


the loop factor.

Divyashikha Sethia (DTU) 43

Logarithmic Loops
• loop-controlling variable is either multiplied or
divided during each iteration of the loop

executed only 10 times

number of iterations can be given by log 1000

Divyashikha Sethia (DTU) 44

22
9/9/2018

Nested Loops
• determine the number of iterations each loop completes
• total is then obtained as the product of the number of iterations in
the inner loop and the number of iterations in the outer loop
Linear logarithmic loop

•Quadratic loop

Dependent quadratic loop: number of iterations in the inner loop is dependent on the
outer loop

Inner loop number of iterations:


1 + 2 + 3 + ... + 9 + 10 = 55

Divyashikha Sethia (DTU) 45

Big O
• If f(n) and g(n) are the functions defined on a positive integer
number n, then f(n) = O(g(n))
• That is, f of n is Big–O of g of n if and only if positive constants c and
n exist, such that f(n) <= cg(n).
• For large amounts of data, f(n) will grow no more than a constant
factor than g(n).
• g provides an upper bound. function f(n) can do better but not
worse than the specified value
• sorting algorithm performs n2 operations to sort just n elements,
• then that algorithm would be described as an O(n2) algorithm

Divyashikha Sethia (DTU) 46

23
9/9/2018

Big (O)..
 Constant c dependent on
• programming language used,
• quality of compiler or interpreter,
• CPU speed,
• size of main memory and access time to it
• knowledge of programmer, and
• the algorithm itself, which may require simple
but also time-consuming machine instructions

Divyashikha Sethia (DTU) 47

O(n)..
• Example

Divyashikha Sethia (DTU) 48

24
9/9/2018

O(n)
• Best case O describes an upper bound for all
combinations of input (eg sorting a sorted array)
• Worst case O describes a lower bound for worst case
input combinations. It is possibly greater than the best
case. ( eg: sorting an array the worst case is when the
array is sorted in reverse order.
• O, it means same as worst case O

Divyashikha Sethia (DTU) 49

Categories of Algorithms

Divyashikha Sethia (DTU) 50

25
9/9/2018

O(n)
• Ex1:

Divyashikha Sethia (DTU) 51

O(n)

Divyashikha Sethia (DTU) 52

26
9/9/2018

Homework

Divyashikha Sethia (DTU) 53

Divyashikha Sethia (DTU) 54

27
9/9/2018

Limitations of Big O Notation


• Many algorithms are simply too hard to
analyse mathematically.
• There may not be sufficient information to
calculate the behaviour of the algorithm in the
average case.
• Big O analysis only tells us how the algorithm
grows with the size of the problem, not how
efficient it is, as it does not consider the
programming effort

Divyashikha Sethia (DTU) 55

OMEGA NOTATION (Ω)


• Omega notation provides a tight lower bound
for f(n). This means that the function can
never do better than the specified value but it
may do worse
• `

Divyashikha Sethia (DTU) 56

28
9/9/2018

Omega..
• Best case Ω describes a lower bound for all
combinations of input. This implies that the function
can never get any better than the specified value. For
example, when sorting an array the best case is when
the array is already correctly sorted.
• Worst case Ω describes a lower bound for worst case
input combinations. It is possibly greater than best
case. For example, when sorting an array the worst
case is when the array is sorted in reverse order.
• If we simply write Ω, it means same as best case Ω.

Divyashikha Sethia (DTU) 57

• Example:

Divyashikha Sethia (DTU) 58

29
9/9/2018

THETA NOTATION (Θ)


• Theta notation provides an asymptotically
tight bound for f(n).

Divyashikha Sethia (DTU) 59

• Example

Divyashikha Sethia (DTU) 60

30
9/9/2018

THANKS

Divyashikha Sethia (DTU) 61

31

You might also like