
Data Structures I Notes

Philip Hwang
March 30, 2016

These are notes on the Data Structures I course taken at the Oklahoma
School of Science and Mathematics. The primary language used is Java. The
course covers various types of data structures such as BSTs, different ADTs,
Hashing etc. The textbook used is...

1 Stacks
A stack is an Abstract Data Type (ADT) with LIFO (Last In, First Out) behavior: the last
item added is the first one removed. You can push an item onto the top of a stack, peek at
the item on top, and pop the item off the top. We may implement a stack with an ArrayList
or a LinkedList.

1.1 Run Time Complexity


For an ArrayList, the runtime complexity is as follows:
push()
s.add(item); O(1) amortized (adds at the end)
s.add(i,item); O(n) (inserts at index i and shifts the later items)
pop()
s.remove(s.size()-1); O(1)
s.remove(o); O(n) (removes an object, or any position other than the last)
This shows that when using an ArrayList, one should always add and remove at
the end of the list.
For a LinkedList, the runtime complexity is as follows:
push()
s.addFirst(item); O(1)
s.addLast(item); O(1)
pop()
s.removeFirst(); O(1)
s.removeLast(); O(n) for a singly linked list (O(1) for Java's doubly linked LinkedList)
This shows that a LinkedList implementation of a stack should always add and remove at the head.


1.2 ArrayList Implementation


The code for the ArrayList implementation looks like the following:
Code 1.1
import java.util.ArrayList;
import java.util.List;

public class Stack<E>{
    private List<E> s=new ArrayList<E>();
    private int capacity;//maximum number of items the stack may hold
    public Stack(int size){
        capacity=size;
    }
    public boolean isFull(){
        return(s.size()==capacity);
    }
    public boolean isEmpty(){
        return(s.size()==0);
    }
    public boolean push(E e){
        if(isFull()){return false;}
        s.add(e);//the top of the stack is the end of the list
        return true;
    }
    public E pop(){
        if(isEmpty()){return null;}
        return s.remove(s.size()-1);
    }
    public E peek(){
        if(isEmpty()){return null;}
        return s.get(s.size()-1);
    }
}

1.3 LinkedList
Same basic idea as ArrayList.
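A minimal sketch of the LinkedList version, assuming the same method names as Code 1.1 but no capacity limit. The class name LinkedStack is hypothetical and not from the course. Following the complexity notes above, it adds and removes at the head of the list.

```java
import java.util.LinkedList;

// Hypothetical class name; mirrors the method names of Code 1.1.
class LinkedStack<E> {
    private final LinkedList<E> s = new LinkedList<>();

    public boolean isEmpty() { return s.isEmpty(); }

    // Push at the head of the list: O(1).
    public void push(E e) { s.addFirst(e); }

    // Pop from the head: O(1). Returns null when empty, as in Code 1.1.
    public E pop() { return isEmpty() ? null : s.removeFirst(); }

    // Look at the head without removing it.
    public E peek() { return isEmpty() ? null : s.getFirst(); }
}
```

Pushing 1, 2, 3 and then popping three times returns 3, 2, 1, which is the LIFO order.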

1.4 Array
Same basic idea as before
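A minimal sketch of the plain-array version, assuming a fixed capacity as in Code 1.1. The class name ArrayStack and the field names are hypothetical. The top of the stack is the last used slot, so push and pop are both O(1).

```java
// Hypothetical class name; a fixed-capacity stack backed by a plain array.
class ArrayStack<E> {
    private final Object[] items; // Java cannot create a generic array directly
    private int size = 0;         // number of items currently stored

    ArrayStack(int capacity) { items = new Object[capacity]; }

    public boolean isFull() { return size == items.length; }

    public boolean isEmpty() { return size == 0; }

    public boolean push(E e) {
        if (isFull()) return false;
        items[size++] = e; // the top of the stack is the last used slot
        return true;
    }

    @SuppressWarnings("unchecked")
    public E pop() {
        if (isEmpty()) return null;
        E top = (E) items[--size];
        items[size] = null; // drop the reference so it can be garbage collected
        return top;
    }

    @SuppressWarnings("unchecked")
    public E peek() { return isEmpty() ? null : (E) items[size - 1]; }
}
```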

2 Queues
Queues are similar to stacks, but they are FIFO (First In, First Out): items leave in the
order in which they arrived. The following table gives the runtime complexity of the
methods for the ArrayList (AL) and LinkedList (LL) implementations of queues:

Methods      AL           LL
isEmpty()    O(1)         O(1)
isFull()     O(1)         O(1)
enQueue()    O(1)/O(n)    addLast O(1)
deQueue()    O(n)/O(1)    removeFirst O(1)

Note that for the ArrayList implementation, the combined cost of enQueue() and deQueue()
is essentially the same regardless of whether we add at the front of the sequence or at
the end: one of the two operations is O(1) and the other is O(n).
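The LL column of the table can be sketched with java.util.LinkedList: enqueue with addLast and dequeue with removeFirst, both O(1). The class name LinkedQueue is hypothetical, and returning null from an empty queue is an assumption chosen to match Code 1.1.

```java
import java.util.LinkedList;

// Hypothetical class name; a queue that adds at the tail and removes at the head.
class LinkedQueue<E> {
    private final LinkedList<E> q = new LinkedList<>();

    public boolean isEmpty() { return q.isEmpty(); }

    // Add at the tail: O(1).
    public void enQueue(E e) { q.addLast(e); }

    // Remove from the head: O(1). Returns null when the queue is empty.
    public E deQueue() { return isEmpty() ? null : q.removeFirst(); }
}
```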

2.1 ArrayList Implementation


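For the array-backed implementation, we use a modular (circular) queue. One slot is always
left empty so that a full queue can be told apart from an empty one. The code goes as
follows (minus the technicalities):
Code 2.1
public class Queue<E>{
    private E[] q;//backing array, allocated in the constructor (omitted)
    private int front=0, rear=0;//front: next item out; rear: next free slot
    public boolean isEmpty(){
        return front==rear;
    }
    public boolean isFull(){
        return((rear+1)%q.length==front);
    }
    public boolean enQueue(E e){
        if(isFull())
            return false;
        q[rear]=e;
        rear=(rear+1)%q.length;//wrap around to the start of the array
        return true;
    }
    public E deQueue(){
        if(isEmpty())
            return null;
        E temp=q[front];
        front=(front+1)%q.length;
        return temp;
    }
}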

3 Trees
Some terminology:
root node
no parent/predecessor (in-degree is 0)
child node
exactly one parent (in-degree is 1)
out-degree
count of edges that point out of a node (the number of its children)


3.1 ADT BST


structure property
each node has a maximum of two children
order property
a parent's value is greater than its left child's value but less than its
right child's value
Here is some useful review from the regular Java class:
Code 3.1 (Compare)
a.compareTo(b) returns a negative value when a < b, a positive value when
a > b, and 0 when a = b.
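A small check of the sign convention above. CompareDemo and relation are hypothetical names used only for illustration. Only the sign of the result is tested, because compareTo does not promise any particular magnitude.

```java
// Hypothetical helper class illustrating the sign convention of compareTo.
class CompareDemo {
    // Returns "<", ">" or "=" depending on the sign of a.compareTo(b).
    static <T extends Comparable<? super T>> String relation(T a, T b) {
        int result = a.compareTo(b);
        if (result < 0) return "<";
        if (result > 0) return ">";
        return "=";
    }

    public static void main(String[] args) {
        System.out.println("3 " + relation(3, 7) + " 7");
        System.out.println("pear " + relation("pear", "apple") + " apple");
        System.out.println("5 " + relation(5, 5) + " 5");
    }
}
```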

3.2 Implementation
For the recursive methods, we need a separate static nested class within our BST.


Code 3.2
public class BST<E extends Comparable<? super E>>{
    //node type, nested inside BST so the recursive helpers can use it
    private static class BinaryNode<E>{
        E element;
        BinaryNode<E> left, right;
        BinaryNode(E e){
            element=e;
        }
        BinaryNode(E e, BinaryNode<E> lc, BinaryNode<E> rc){
            element=e;
            left=lc;
            right=rc;
        }
    }
    private BinaryNode<E> root;
    public BST(){
        root=null;
    }
    public void makeEmpty(){
        root=null;
    }
    public boolean isEmpty(){
        return root==null;
    }
    public boolean contains(E e){
        return contains(root,e);//delegates to the private recursive helper
    }
    private boolean contains(BinaryNode<E> t, E e){//recursive; an iterative version is also possible
        if(t==null)
            return false;
        int result=e.compareTo(t.element);
        if(result<0)
            return contains(t.left, e);
        else if(result>0)
            return contains(t.right, e);
        else
            return true;
    }
    public E findMin(){//findMax is essentially the same, following right links
        if(root==null)
            return null;
        BinaryNode<E> t=root;
        for( ; ;){
            if(t.left==null)
                return t.element;
            t=t.left;
        }
    }
    //class BST continues in Code 3.3


Code 3.3 (More Code)


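public void insert(E e){
    root=insert(e,root);
}
private BinaryNode<E> insert(E e, BinaryNode<E> t){
    if(t==null)
        return new BinaryNode<E>(e);
    int result=e.compareTo(t.element);
    if(result<0)
        t.left=insert(e,t.left);
    else if(result>0)
        t.right=insert(e,t.right);
    return t;//a duplicate is ignored
}
public void remove(E e){
    root=remove(e,root);
}
private BinaryNode<E> remove(E e, BinaryNode<E> t){
    if(t==null)
        return null;//not found
    int result=e.compareTo(t.element);
    if(result<0)
        t.left=remove(e, t.left);
    else if(result>0)
        t.right=remove(e, t.right);
    else if(t.left!=null&&t.right!=null){//two children
        t.element=findMin(t.right);//copy up the smallest key of the right subtree
        t.right=remove(t.element, t.right);//then delete that key below
    }
    else//zero or one child
        t=(t.left!=null)?t.left:t.right;
    return t;
}
private E findMin(BinaryNode<E> t){//smallest element under a non-null node t
    while(t.left!=null)
        t=t.left;
    return t.element;
}
}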
One thing to note is that recursive methods carry call overhead and use stack space
proportional to the height of the tree, so iterative versions of the methods above are
often preferred.
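As an illustration of the iterative style, here is a minimal sketch of contains and insert written with loops. The class name IterativeBST and the boolean return value of insert are assumptions made for this example, not part of the course code.

```java
// Hypothetical class; contains and insert written with loops instead of recursion.
class IterativeBST<E extends Comparable<? super E>> {
    private static class Node<E> {
        E element;
        Node<E> left, right;
        Node(E e) { element = e; }
    }

    private Node<E> root;

    // Walk down from the root, going left or right by the compareTo result.
    public boolean contains(E e) {
        Node<E> t = root;
        while (t != null) {
            int result = e.compareTo(t.element);
            if (result < 0) t = t.left;
            else if (result > 0) t = t.right;
            else return true;
        }
        return false;
    }

    // Returns false if e was already present (duplicates are ignored).
    public boolean insert(E e) {
        if (root == null) { root = new Node<>(e); return true; }
        Node<E> t = root;
        while (true) {
            int result = e.compareTo(t.element);
            if (result == 0) return false;
            if (result < 0) {
                if (t.left == null) { t.left = new Node<>(e); return true; }
                t = t.left;
            } else {
                if (t.right == null) { t.right = new Node<>(e); return true; }
                t = t.right;
            }
        }
    }
}
```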

4 Hashing
Hashing is a storage/retrieval technique that supports insert, delete, and find
in constant average time.
Differences between a BST and hashing:
Hashing does not support relative order among data.
No findMin, findMax, or printSorted methods.
Hashing works with a set of keys K and a hash function that maps
K -> {0, ..., H.size-1}.

[Figure: data mapped into the table]

K is an inexhaustible supply of keys and the table has a finite number of locations, so in
hash tables collisions (a key mapping to a location that is already used) are not
avoidable. If there is a collision, continue looking until you find a location that is not taken.


Code 4.1
int hash(String s, int hSize){//hSize is the table size (H.SIZE)
    int val=0;
    for(int i=0;i<s.length();i++)
        val+=s.charAt(i);//add up the character codes
    return val%hSize;
}
What have we learned so far?
1. A hash function must map keys over the entire table (only about 10% of the table gets
mapped for the above example).
2. Anagrams of a key map to the same table location.
How do we fix the anagram problem above? Weight each character by its position:
int val=s.charAt(0)*27+s.charAt(1)*(27*27)+s.charAt(2)*(27*27*27);
return val% H.SIZE;
Multiplication is expensive, so we want to make the hash function simple and fast.
Instead, we can shift bits: 8 << 1 shifts the bits of 8 one place to the left, so that
the value is now 16. In the same way, val << 5 multiplies val by 32:
val=(val<<5)+s.charAt(i);
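A minimal sketch comparing the two ideas, assuming the shift line above is applied once per character. The class and method names are hypothetical. The final sign correction is an addition not in the notes: it is needed because repeated shifting can overflow an int and produce a negative value.

```java
// Hypothetical class comparing the additive hash with the shift-based hash.
class ShiftHash {
    // Additive hash as in Code 4.1: anagrams always collide.
    static int additiveHash(String s, int tableSize) {
        int val = 0;
        for (int i = 0; i < s.length(); i++)
            val += s.charAt(i);
        return val % tableSize;
    }

    // Shift-based hash: each step multiplies the running value by 32.
    static int shiftHash(String s, int tableSize) {
        int val = 0;
        for (int i = 0; i < s.length(); i++)
            val = (val << 5) + s.charAt(i); // may overflow and become negative
        val %= tableSize;
        if (val < 0) val += tableSize;      // keep the index inside the table
        return val;
    }
}
```

With a table size of 10007, the anagrams "abc" and "cba" get the same additive hash but different shift-based hashes (2493 and 4539).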
There are two types of hashing: open hashing (separate chaining) and closed hashing (open addressing).

4.1 Open Hashing


Code 4.2
import java.util.LinkedList;
import java.util.List;

public class OpenHashTable<E>{
    private List<E>[] theLists;//one chain (list) per table location
    private int currentSize;//number of keys stored
    @SuppressWarnings("unchecked")
    public OpenHashTable(){
        theLists=new List[101];//a generic array cannot be created directly
        for(int i=0;i<theLists.length;i++)
            theLists[i]=new LinkedList<E>();
    }
    public boolean contains(E x){
        return theLists[myHash(x)].contains(x);
    }
    public void insert(E x){
        List<E> whichList=theLists[myHash(x)];
        if(!whichList.contains(x)){
            whichList.add(x);
            currentSize++;
        }
        if(currentSize>theLists.length)
            reHash();//moves every key into a larger table (not shown)
    }
    public void remove(E x){
        List<E> whichList=theLists[myHash(x)];
        if(whichList.contains(x)){
            whichList.remove(x);
            currentSize--;
        }
    }
    public int myHash(E x){
        int hashVal=x.hashCode();
        hashVal=hashVal%theLists.length;
        if(hashVal<0)//hashCode() may be negative
            hashVal+=theLists.length;
        return hashVal;
    }
}

4.2 Closed Hashing


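Closed hashing stores every key in the table itself, so in addition to the hash function it
requires a collision resolution (probe) function f(i).
Example 4.3 (Linear Probing)
h(x) = (x%10 + f(i)) % 10, for f(i) = i; i = 0, 1, 2, ....
Linear probing is not good because of its clustering effect: occupied locations form long
runs that later probes must walk through. To limit this we watch the load factor
λ = k/s,
where k is the number of keys stored and s is the table size. We want λ ≤ .5.
Choose a prime table size.
Since the target is λ ≤ .5, the table size is chosen as the next prime greater than 2n for n
keys to be stored. Instead of linear probing we can use quadratic probing.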


Example 4.4 (Quadratic Probing)


h(x) = (x%TableSize + f(i)) % TableSize, where f(i) = i^2; i = 0, 1, 2, ....
However, there is a problem. With linear probing, every location is eventually probed, so
the table can always be filled. With quadratic probing, it is possible to miss free
locations and never fill the table. This is why the table must be at most half full.
Theorem 4.5
With quadratic probing and a prime table size, an insertion will never fail for λ ≤ .5.
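The probe sequences above can be explored with a short sketch. ProbeDemo and its method names are hypothetical, and home stands for x%TableSize. With a prime table size of 11, the first 6 quadratic probes hit 6 distinct locations, but the full sequence never reaches the other 5, which is why the table has to stay at most half full.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class for exploring probe sequences.
class ProbeDemo {
    // i-th location tried by linear probing, starting from location home.
    static int linearProbe(int home, int i, int tableSize) {
        return (home + i) % tableSize;
    }

    // i-th location tried by quadratic probing, starting from location home.
    static int quadraticProbe(int home, int i, int tableSize) {
        return (home + i * i) % tableSize;
    }

    // Number of distinct locations among the first `probes` quadratic probes.
    static int distinctQuadratic(int home, int probes, int tableSize) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < probes; i++)
            seen.add(quadraticProbe(home, i, tableSize));
        return seen.size();
    }
}
```

For a non-prime size such as 16 the situation is worse: the first 8 probes from location 0 reach only 4 distinct locations.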
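For linear probing, we have for insertions and unsuccessful searches:
$$I(\lambda) = \frac{1}{2}\left(1 + \frac{1}{(1-\lambda)^2}\right).$$
We have for successful probing:
$$\frac{1}{2}\left(1 + \frac{1}{1-\lambda}\right).$$
We integrate:
$$\frac{1}{\lambda}\int_0^{\lambda} \frac{1}{1-x}\,dx = \frac{1}{\lambda}\ln\frac{1}{1-\lambda}.$$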

This gives the average number of probes for a successful find.
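To get a feel for these formulas, the sketch below evaluates them at a given load factor. The class and method names are hypothetical. The standard linear-probing estimates are assumed: (1/2)(1 + 1/(1-λ)^2) for insertions and unsuccessful searches, and (1/2)(1 + 1/(1-λ)) for successful ones.

```java
// Hypothetical helper class evaluating the expected-probe formulas.
class ProbeCost {
    // Linear probing, insertion or unsuccessful search.
    static double linearUnsuccessful(double lambda) {
        return 0.5 * (1 + 1 / ((1 - lambda) * (1 - lambda)));
    }

    // Linear probing, successful search.
    static double linearSuccessful(double lambda) {
        return 0.5 * (1 + 1 / (1 - lambda));
    }

    // Value of the integral: (1/lambda) * ln(1/(1-lambda)).
    static double integratedSuccessful(double lambda) {
        return Math.log(1 / (1 - lambda)) / lambda;
    }
}
```

At λ = 0.5 the linear probing estimates are 2.5 and 1.5 probes. At λ = 0.9 an insertion is estimated at about 50.5 probes, which is the reason for keeping the table at most half full.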
With quadratic probing there is still clustering, but it is not as bad as with linear
probing. To reduce it further we use double hashing. Here are the hash functions:
Example 4.6 (Double Hashing)
h(x) = x%tableSize + h2(x), and h2(x) = R - (x%R), where R is the largest prime
below the table size. There is room for play when there is a probe collision. You
could look at, for example, h(x) + h2(x) + f(i) for f(i) = i; i = 0, 1, 2, ....
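A minimal sketch of a double hashing probe sequence. It assumes the common variant in which the i-th probe is (x%tableSize + i*h2(x)) % tableSize, which is one possible reading of the "room for play" above and not necessarily the form used in class. All names are hypothetical, and keys are assumed to be non-negative ints.

```java
// Hypothetical helper class for a double hashing probe sequence.
class DoubleHashDemo {
    static boolean isPrime(int c) {
        if (c < 2) return false;
        for (int d = 2; (long) d * d <= c; d++)
            if (c % d == 0) return false;
        return true;
    }

    // Largest prime strictly below n, found by trial division.
    static int largestPrimeBelow(int n) {
        for (int c = n - 1; c >= 2; c--)
            if (isPrime(c)) return c;
        throw new IllegalArgumentException("no prime below " + n);
    }

    // Secondary hash: always between 1 and r, so every probe moves.
    static int h2(int x, int r) { return r - (x % r); }

    // i-th location probed for a non-negative key x (assumed variant, see above).
    static int probe(int x, int i, int tableSize) {
        int r = largestPrimeBelow(tableSize);
        return (x % tableSize + i * h2(x, r)) % tableSize;
    }
}
```

For a table of size 10, R is 7. The key 49 starts at location 9 and has h2(49) = 7, so its next probes are 6 and then 3.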
For deletion, we can use lazy deletion: a deleted entry is marked inactive instead of being
emptied, so that probe sequences passing through it are not broken.
If the number of deleted keys goes above a certain threshold, then we will re-hash.
For too many deletions, create a new table and insert only the active entries into it,
leaving the deleted ones behind. For too many insertions, make a larger table.
Code 4.7 (Inserting)
public void insert(E x){
    int currPos=findPos(x);
    if(isActive(currPos))
        return;//x is already in the table
    table[currPos]=new HashEntry(x,true);
    //rehash once the table is more than half full
    if(++currSize>table.length/2)
        rehash();
}


Code 4.8 (Remove)


public void remove(E x){
int currPos=findPos(x);
if(isActive(currPos))
table[currPos].isActive=false;
}

Code 4.9 (Contains)


public boolean contains(E x){
int currPos=findPos(x);
return isActive(currPos);
}
Here is the findPos method for a quadratic hasher.
Code 4.10 (Find)
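public int findPos(E x){
    int offset=1;
    int pos=myHash(x);
    //stop at an empty location or at the location that holds x
    while(table[pos]!=null&&!table[pos].element.equals(x)){
        pos+=offset;//offsets 1, 3, 5, ... add up to 1, 4, 9, ...
        offset+=2;
        if(pos>=table.length)
            pos-=table.length;
    }
    return pos;
}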

Code 4.11 (Is it active?)


private boolean isActive(int pos){
return table[pos]!=null&&table[pos].isActive;
}

Code 4.12 (My Hash)


private int myHash(E x){
    int pos=x.hashCode()%table.length;
    if(pos<0)//hashCode() may be negative; Math.abs fails for Integer.MIN_VALUE
        pos+=table.length;
    return pos;
}
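Codes 4.7 through 4.12 use a HashEntry type and a rehash() method that are not shown. The sketch below puts the pieces together in one class with guessed versions of both, so everything named QuadraticProbingTable, HashEntry, allocate, occupied, nextPrime and isPrime here is an assumption. The starting size of 11 and the growth to the next prime at or above twice the old size are also assumptions.

```java
// Hypothetical class assembling Codes 4.7-4.12 with guessed HashEntry and rehash().
class QuadraticProbingTable<E> {
    private static class HashEntry<E> {
        E element;
        boolean isActive; // false once the entry has been lazily deleted
        HashEntry(E e, boolean active) { element = e; isActive = active; }
    }

    private HashEntry<E>[] table;
    private int occupied; // non-null slots, including lazily deleted ones

    QuadraticProbingTable() { allocate(11); }

    @SuppressWarnings("unchecked")
    private void allocate(int size) {
        table = new HashEntry[size];
        occupied = 0;
    }

    public boolean contains(E x) { return isActive(findPos(x)); }

    public void insert(E x) {
        int pos = findPos(x);
        if (isActive(pos)) return;          // already present
        if (table[pos] == null) occupied++; // reusing x's deleted slot adds nothing
        table[pos] = new HashEntry<>(x, true);
        if (occupied > table.length / 2) rehash();
    }

    public void remove(E x) {
        int pos = findPos(x);
        if (isActive(pos)) table[pos].isActive = false; // lazy deletion
    }

    private boolean isActive(int pos) {
        return table[pos] != null && table[pos].isActive;
    }

    // Quadratic probing: offsets 1, 3, 5, ... add up to 1, 4, 9, ...
    private int findPos(E x) {
        int offset = 1;
        int pos = myHash(x);
        while (table[pos] != null && !table[pos].element.equals(x)) {
            pos += offset;
            offset += 2;
            if (pos >= table.length) pos -= table.length;
        }
        return pos;
    }

    private int myHash(E x) {
        int pos = x.hashCode() % table.length;
        return pos < 0 ? pos + table.length : pos;
    }

    // Copy only the active entries into a table of the next prime size >= 2 * old size.
    private void rehash() {
        HashEntry<E>[] old = table;
        allocate(nextPrime(2 * old.length));
        for (HashEntry<E> entry : old)
            if (entry != null && entry.isActive) insert(entry.element);
    }

    private static int nextPrime(int n) {
        while (!isPrime(n)) n++;
        return n;
    }

    private static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }
}
```

Counting lazily deleted slots toward the load keeps the table at most half occupied, which is the condition Theorem 4.5 needs for findPos to reach an empty location.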


5 Sorting
5.1 Selection Sort
Selection sort makes (n - 1) swaps plus on the order of n^2 comparisons, so it is O(n^2) overall.
Code 5.1 (Selection Sort)
public static void selectionSort(int[] num){
    int min, temp;
    for(int index=0; index<num.length-1; index++){
        min=index;//position of the smallest item found so far
        for(int scan=index+1; scan<num.length;scan++)
            if(num[scan]<num[min])
                min=scan;
        //swap
        temp=num[min];
        num[min]=num[index];
        num[index]=temp;
    }
}
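To see where the quadratic cost comes from, the sketch below counts the work done by selection sort. The class name and the counting are additions for illustration. A swap is counted only when the minimum is not already in place, so the swap count is at most n - 1, while the comparison count is always n(n-1)/2.

```java
// Hypothetical class that counts the work done by selection sort.
class SelectionSortCount {
    // Sorts num in place and returns {comparisons, swaps}.
    static int[] sortAndCount(int[] num) {
        int comparisons = 0, swaps = 0;
        for (int index = 0; index < num.length - 1; index++) {
            int min = index;
            for (int scan = index + 1; scan < num.length; scan++) {
                comparisons++;
                if (num[scan] < num[min]) min = scan;
            }
            if (min != index) { // swap only when something smaller was found
                int temp = num[min];
                num[min] = num[index];
                num[index] = temp;
                swaps++;
            }
        }
        return new int[] {comparisons, swaps};
    }
}
```

For the reversed array {5, 4, 3, 2, 1} this makes 10 comparisons (5*4/2) and 2 swaps.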

5.2 Insertion Sort


Code 5.2 (Insertion Sort)
public static <T extends Comparable<? super T>> void insertionSort(T[] num){
    for(int index=1; index<num.length; index++){
        T key=num[index];//item to insert into the sorted prefix
        int pos=index;
        //shift larger items to the right
        while(pos>0&&num[pos-1].compareTo(key)>0){
            num[pos]=num[pos-1];
            pos--;
        }
        num[pos]=key;
    }
}

5.3 Bubble/Shaker Sort


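Bubble sort repeatedly swaps adjacent items that are out of order, so on each pass the largest remaining item bubbles up to the end of the list. Shaker sort alternates forward and backward passes.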


Code 5.3 (Bubble Sort)


public static void bubbleSort(int[] list, int size){
int pass=0;
while(pass<size-1){
pass++;
for(int j=0;j<size-pass;j++)
if(list[j]>list[j+1])
swap(list, j, j+1);
}
}
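Code 5.3 calls a swap helper that is not shown. A minimal sketch of it, together with a small run of the sort, might look like this. The class name BubbleSortDemo and the body of swap are assumptions.

```java
import java.util.Arrays;

// Hypothetical wrapper class supplying the swap helper assumed by Code 5.3.
class BubbleSortDemo {
    // Exchange list[i] and list[j].
    static void swap(int[] list, int i, int j) {
        int temp = list[i];
        list[i] = list[j];
        list[j] = temp;
    }

    // Same loop structure as Code 5.3.
    static void bubbleSort(int[] list, int size) {
        int pass = 0;
        while (pass < size - 1) {
            pass++;
            for (int j = 0; j < size - pass; j++)
                if (list[j] > list[j + 1])
                    swap(list, j, j + 1);
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 4, 2, 8};
        bubbleSort(data, data.length);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 4, 5, 8]
    }
}
```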
There is an improved version:
Code 5.4 (Improved Bubble Sort)
public static void bubbleSort(int[] list, int size){
    int pass=0, lastSwapf=size-1, lastSwapb=0;
    boolean exchange=true;
    int LSF=size-1,LSB=0;//items beyond LSF and before LSB are already in place
    while(pass<size-1&&exchange){
        pass++;
        exchange=false;
        for(int j=LSB;j<LSF;j++)//forward motion
            if(list[j]>list[j+1]){
                swap(list, j, j+1);
                exchange=true;
                lastSwapf=j;
            }
        LSF=lastSwapf;
        if(exchange){
            exchange=false;
            for(int j=LSF;j>LSB;j--)//backward motion
                if(list[j]<list[j-1]){
                    swap(list,j,j-1);
                    exchange=true;
                    lastSwapb=j;
                }
            LSB=lastSwapb;
        }
    }
}

6 Merge Sort
Code 6.1 (Merge Sort)
void mergeSort(int a[], int n){
int [] tmpAry=new int[n];
mSort(a, tmpAry, 0, n-1);//the right bound is inclusive
}


Here is the recursive portion of the code above:


Code 6.2 (mSort)
void mSort(int [] a, int[] tmpAry, int left, int right){
int center;
if(left<right){
center=(left+right)/2;
mSort(a, tmpAry, left, center);
mSort(a, tmpAry, center+1, right);
merge(a, tmpAry, left, center+1, right);
}}

Code 6.3 (Merge)


void merge(int[] a, int[] tmpAry, int lBegin, int rBegin, int rEnd){
    int lEnd=rBegin-1;
    int numElements=rEnd-lBegin+1;
    int tmpPos=lBegin;
    while(lBegin<=lEnd&&rBegin<=rEnd)//take the smaller front item of the two halves
        if(a[lBegin]<=a[rBegin])
            tmpAry[tmpPos++]=a[lBegin++];
        else
            tmpAry[tmpPos++]=a[rBegin++];
    while(lBegin<=lEnd)//copy the rest of the left half
        tmpAry[tmpPos++]=a[lBegin++];
    while(rBegin<=rEnd)//copy the rest of the right half
        tmpAry[tmpPos++]=a[rBegin++];
    for(int i=1; i<=numElements; i++, rEnd--)//copy the merged run back into a
        a[rEnd]=tmpAry[rEnd];
}
Each level of the recursion does O(n) work in merge and there are O(log n) levels, so merge sort is an O(n log n) algorithm.
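The bound can be checked with the usual recurrence. As a sketch, assume n is a power of two, that merging n items costs n steps, and that T(1) = 1. Dividing the recurrence by n makes the sum telescope:

```latex
T(n) = 2\,T(n/2) + n, \qquad T(1) = 1
\frac{T(n)}{n} = \frac{T(n/2)}{n/2} + 1 = \frac{T(n/4)}{n/4} + 2 = \cdots = \frac{T(1)}{1} + \log_2 n
T(n) = n \log_2 n + n = O(n \log n)
```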

7 Heap
Heap Tree
structure property: all levels are completely filled except possibly the deepest level, which is filled
from left to right
order property:
(for a max heap) a parent's key is ≥ its children's keys.
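Because of the structure property, a heap can be stored in an array with no gaps. A minimal sketch of the index arithmetic for a 0-based Java array, together with a check of the max heap order property, is given below. The class and method names are hypothetical.

```java
// Hypothetical helper class for a heap stored in a 0-based array.
class HeapIndexDemo {
    static int leftChild(int i) { return 2 * i + 1; }

    static int rightChild(int i) { return 2 * i + 2; }

    static int parent(int i) { return (i - 1) / 2; }

    // True if every key is <= the key of its parent (max heap order property).
    static boolean isMaxHeap(int[] a) {
        for (int i = 1; i < a.length; i++)
            if (a[parent(i)] < a[i]) return false;
        return true;
    }
}
```

For example, {9, 7, 8, 3, 5, 6} satisfies the order property, while {1, 2, 3} does not.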


Code 7.1 (Heap Sort)


void heapSort(int[] a){
    for(int i=a.length/2-1; i>=0; i--)//build heap ("heapify")
        percDown(a, i, a.length);
    for(int i=a.length-1; i>0; i--){
        swap(a,0,i);//move the current maximum behind the heap
        percDown(a,0,i);//restore the heap order on a[0..i-1]
    }
}

Code 7.2 (percDown)

void percDown(int[] a, int i, int n){//arrays are 0-based, so the left child of i is 2*i+1
    int child, tmp;
    for(tmp=a[i]; 2*i+1<n; i=child){
        child=2*i+1;
        //find the larger of two children
        if(child!=n-1&&a[child]<a[child+1])
            child++;
        if(tmp<a[child])
            a[i]=a[child];
        else
            break;
    }
    a[i]=tmp;
}
