Week 12 Hashing

CS221A Data Structures & Algorithms
Hashing
Agenda
Hashing Concepts and Preliminaries

Hash Function
Separate Chaining
Hashing
A technique for performing insertions, deletions and searches in constant average time.
A scenario in which the keys themselves point directly to
records.
Information encoded directly within a key can point us to
its associated record.
Examine the key and simply know where to look.
Hashing
Determine the location of a record by performing an arithmetic computation on its key.
The result of this computation is the location of the record in a table called the hash table.
This computation is referred to as the hash function.
Hashing
A typical hash table is an array of some fixed size, containing the keys.
A key is typically a string with an associated value.
Each key is mapped to some number in the range 0 to TableSize - 1 and placed in the appropriate cell.
The mapping is provided by the hash function.
A hash function should be simple to compute and should ideally ensure that any two distinct keys get different cells.
Hashing
The figure shows an ideal hash table.
All the distinct Greek letter names hash to distinct cells.
Beta hashes to 1.
Theta hashes to 3.
Epsilon hashes to 6.
[Figure: hash table holding the keys Alpha, Beta, Gamma, Theta, Omega, Delta, Epsilon, Pi]
Hashing
In this case the keys are the names of the contacts, and the hash function maps each name to the index of the array where their phone number is stored.
The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be found.
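As a rough illustration of the contacts example, the sketch below stores each phone number in the array slot chosen by hashing the contact's name. The names PhoneBook, name_hash, phonebook_put and phonebook_get, and the size 11, are made up for this example. The hash simply sums character codes, and collisions are deliberately ignored: a later entry overwrites an earlier one that lands in the same slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOOK_SIZE 11          /* number of slots (assumed for the example) */

struct Contact {
    const char *name;         /* key; NULL marks an empty slot */
    const char *phone;        /* value stored in the slot */
};

struct PhoneBook {
    struct Contact slots[BOOK_SIZE];
};

/* Illustrative hash: sum of character codes, reduced to 0..BOOK_SIZE-1. */
unsigned name_hash(const char *name)
{
    unsigned sum = 0;
    while (*name != '\0')
        sum += (unsigned char)*name++;
    return sum % BOOK_SIZE;
}

/* Store the number in the slot the name hashes to (overwrites on collision). */
void phonebook_put(struct PhoneBook *book, const char *name, const char *phone)
{
    struct Contact *slot;
    assert(book != NULL && name != NULL);
    slot = &book->slots[name_hash(name)];
    slot->name = name;
    slot->phone = phone;
}

/* Return the stored number, or NULL if the slot is empty or holds another name. */
const char *phonebook_get(const struct PhoneBook *book, const char *name)
{
    const struct Contact *slot;
    assert(book != NULL && name != NULL);
    slot = &book->slots[name_hash(name)];
    if (slot->name == NULL || strcmp(slot->name, name) != 0)
        return NULL;
    return slot->phone;
}
```

A zero-initialised struct PhoneBook starts out empty. After phonebook_put(&book, "Alice", "555-0100"), a call to phonebook_get(&book, "Alice") returns the stored number, while a name that was never stored yields NULL.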
Hashing
The only issues are choosing the hash function and deciding what to do when two keys hash to the same value (a phenomenon known as a collision).
Hashing
Get the juices flowing, guys:
Problem statement:
Let's assume we have to build an application that supports the customer service department of some company. To simplify the operation for both representatives and customers, how will you store the data?
Hashing
Simple solution:
Key the account records by telephone number. When answering a call, the service representative retrieves the account information by entering the customer's telephone number into the system.
What sort of hash function can you come up with?
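One possible answer, sketched under assumptions: treat the digits of the telephone number as a single decimal number and take it modulo the table size. The function name hash_phone and the choice to skip non-digit characters such as dashes are illustrative, not part of the slides. Reducing after every digit keeps the running value small, so a 10-digit number never has to fit in a machine word.

```c
#include <assert.h>
#include <ctype.h>

/* Hash a phone number given as text: read its digits as one decimal
   number and reduce modulo table_size after every digit. Non-digit
   characters (dashes, spaces) are skipped. */
unsigned long hash_phone(const char *number, unsigned long table_size)
{
    unsigned long h = 0;
    assert(table_size > 0);
    for (; *number != '\0'; number++) {
        if (isdigit((unsigned char)*number))
            h = (h * 10 + (unsigned long)(*number - '0')) % table_size;
    }
    return h;
}
```

With a table of 101 cells, hash_phone("555-0100", 101) gives 49, since 5550100 = 101 * 54951 + 49.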
Hash Function
For integer keys, simply returning Key % TableSize is generally a reasonable strategy for a hash function.
If we have 0 ≤ key ≤ 99 and our table size is 10, what is the worst-case scenario for this hash function?
Answer:
If all the keys end in 0, every key hashes to cell 0.
Hash Function
For situations like this, it is preferable to make the table size a prime number.
When the keys are random integers, this function is effective at distributing the keys evenly.
When the keys are strings, a simple hash function is to add up the ASCII values of the characters and take the result mod the table size.
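The effect of a prime table size on keys that all end in 0 can be checked with a small sketch. The helper max_cell_load is a hypothetical name introduced here. It counts how many keys land in each cell under key % table_size and reports the fullest cell.

```c
#include <assert.h>
#include <stdlib.h>

/* Return the largest number of keys that land in any one cell when each
   key is placed at key % table_size. Returns -1 if allocation fails. */
int max_cell_load(const unsigned *keys, size_t n, unsigned table_size)
{
    int *load;
    int worst = 0;
    size_t i;

    assert(table_size > 0);
    load = calloc(table_size, sizeof *load);   /* one counter per cell */
    if (load == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        unsigned cell = keys[i] % table_size;
        if (++load[cell] > worst)
            worst = load[cell];
    }
    free(load);
    return worst;
}
```

For the ten keys 0, 10, 20, ..., 90, a table of size 10 puts all ten keys in cell 0, while a table of size 11 gives every key its own cell.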
Hash Function
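typedef unsigned int Index;

/* Sum the character codes of Key and reduce modulo the table size. */
Index Hash(const char *Key, int TableSize)
{
    unsigned int HashValue = 0;
    while (*Key != '\0')
    {
        HashValue += (unsigned char)*Key++;  /* add this character, then advance */
    }
    return HashValue % TableSize;
}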
Hash Function
Where is the hash function on the previous slide ineffective?
Hash Function
If the table size is large, the function does not distribute keys well.
Take a larger prime table size, for example 10,007, and suppose all keys are at most 8 characters long. The largest value an ASCII char can have is 127, so 127 * 8 = 1,016 is the largest value the sum can reach.
Only the values 0 to 1,016 are possible, so taking the mod leaves them unchanged and most of the 10,007 cells are never used.
When two or more keys hash to the same value, this is known as a collision. The fewer the collisions, the better your hash function.
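The limitation can be seen in a short sketch. additive_hash restates the sum-of-characters function. poly_hash is an alternative that is not from these slides: a common polynomial scheme that multiplies the running value by a constant (31 is an assumed choice) before adding each character, so that character positions matter and values are not capped near 1,016.

```c
#include <assert.h>

/* The additive hash: sum of character codes, mod table_size. */
unsigned additive_hash(const char *key, unsigned table_size)
{
    unsigned h = 0;
    assert(table_size > 0);
    while (*key != '\0')
        h += (unsigned char)*key++;
    return h % table_size;
}

/* Alternative (assumed, not from the slides): polynomial hash.
   Each step multiplies by 31, so earlier characters carry more weight. */
unsigned poly_hash(const char *key, unsigned table_size)
{
    unsigned h = 0;
    assert(table_size > 0);
    while (*key != '\0')
        h = h * 31u + (unsigned char)*key++;   /* unsigned overflow wraps, which is harmless here */
    return h % table_size;
}
```

With a table size of 10,007, additive_hash maps both "abc" and "cba" to 294, because rearranging letters does not change the sum. poly_hash sends them to different cells (6291 and 8211 on an ASCII system).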
Separate Chaining
Keep a list of all elements that hash to the same value.
Separate Chaining
To perform a Find, we use the hash function to determine which list to traverse. We then traverse the list in the normal manner, returning the position where the item is found.
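A minimal sketch of this two-step Find, assuming unsigned integer keys, key % table_size as the hash function, and plain lists without header nodes. The names Node, chain_find and table_find are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    unsigned key;
    struct Node *next;
};

/* Walk one chain and return the node holding key, or NULL if absent. */
struct Node *chain_find(struct Node *head, unsigned key)
{
    struct Node *p = head;
    while (p != NULL && p->key != key)
        p = p->next;
    return p;
}

/* Pick the chain with key % table_size, then search only that chain. */
struct Node *table_find(struct Node *const *buckets, unsigned table_size, unsigned key)
{
    assert(table_size > 0);
    return chain_find(buckets[key % table_size], key);
}
```

Only the one list selected by the hash is traversed, so the cost of a Find depends on the length of that list and not on the total number of stored keys.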
Separate Chaining
To perform an Insert, we traverse down the appropriate list to check whether the element is already in place.
If duplicates are expected, an extra field is usually kept, and this field is incremented in the event of a match.
If the element turns out to be new, it is inserted either at the front of the list or at the end of the list, whichever is easier or better suits how often the element will be retrieved.
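The insert steps above, including the extra counter field for duplicates, might look like the following sketch. It assumes string keys, a sum-of-characters hash, and insertion at the front of the list. Entry, sum_hash and chain_insert are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Entry {
    char *key;            /* private copy of the key */
    int count;            /* extra field: times this key has been inserted */
    struct Entry *next;
};

/* Sum of character codes, mod table_size. */
unsigned sum_hash(const char *key, unsigned table_size)
{
    unsigned h = 0;
    while (*key != '\0')
        h += (unsigned char)*key++;
    return h % table_size;
}

/* Insert key and return its entry, or NULL if memory runs out. */
struct Entry *chain_insert(struct Entry **buckets, unsigned table_size, const char *key)
{
    unsigned cell;
    struct Entry *e;
    size_t len;

    assert(table_size > 0);
    cell = sum_hash(key, table_size);

    /* Traverse the list to see whether the key is already there. */
    for (e = buckets[cell]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            e->count++;               /* duplicate: bump the counter */
            return e;
        }
    }

    /* New key: build a node and link it in at the front of the list. */
    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    len = strlen(key) + 1;
    e->key = malloc(len);
    if (e->key == NULL) {
        free(e);
        return NULL;
    }
    memcpy(e->key, key, len);
    e->count = 1;
    e->next = buckets[cell];
    buckets[cell] = e;
    return e;
}
```

Front insertion needs no extra traversal, and it keeps recently inserted keys near the head of the list, which helps if those are the keys most likely to be looked up next.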
Hashing Implementation
typedef struct ListNode *Position;
typedef struct HashTbl *HashTable;
typedef Position List;

/* One node of a chain */
struct ListNode
{
    ElementType Element;
    Position Next;
};

/* The table: an array of list headers */
struct HashTbl
{
    int TableSize;
    List *TheLists;
};
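HashTable InitializeTable(int TableSize)
{
    HashTable H;
    int i;

    H = malloc(sizeof(struct HashTbl));
    H->TableSize = NextPrime(TableSize);   /* use a prime table size */
    H->TheLists = malloc(sizeof(List) * H->TableSize);
    for (i = 0; i < H->TableSize; i++)
    {
        /* Each list starts with an empty header node */
        H->TheLists[i] = malloc(sizeof(struct ListNode));
        H->TheLists[i]->Next = NULL;
    }
    return H;
}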
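Position Find(ElementType Key, HashTable H)
{
    Position P;
    List L = H->TheLists[Hash(Key, H->TableSize)];
    P = L->Next;   /* skip the header node */
    while (P != NULL && P->Element != Key)
    {
        /* For string keys, compare with strcmp instead of != */
        P = P->Next;
    }
    return P;      /* NULL if the key is not in the table */
}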
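void Insert(ElementType Key, HashTable H,
            ElementType RecordValue)
{
    Position Pos, NewCell;
    List L;

    Pos = Find(Key, H);
    if (Pos == NULL)   /* Key is not in the table yet */
    {
        NewCell = malloc(sizeof(struct ListNode));
        L = H->TheLists[Hash(Key, H->TableSize)];
        /* Link the new cell in right after the header node */
        NewCell->Next = L->Next;
        /* Find compares Element with Key, so RecordValue must match Key */
        NewCell->Element = RecordValue;
        L->Next = NewCell;
    }
}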
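The routines above call NextPrime and Hash with an ElementType key, but the slides do not show them. The sketch below is one way they could be filled in, assuming ElementType is int, keys are non-negative, and NextPrime returns the smallest prime greater than or equal to its argument. IsPrime is a helper introduced here.

```c
#include <assert.h>

typedef int ElementType;      /* assumption: integer keys */
typedef unsigned int Index;

/* Return 1 if n is prime, 0 otherwise (trial division). */
int IsPrime(int n)
{
    int d;
    if (n < 2)
        return 0;
    for (d = 2; d <= n / d; d++)   /* d <= n / d avoids overflow in d * d */
        if (n % d == 0)
            return 0;
    return 1;
}

/* Smallest prime that is >= n. */
int NextPrime(int n)
{
    while (!IsPrime(n))
        n++;
    return n;
}

/* Integer-key hash: Key % TableSize, for non-negative keys. */
Index Hash(ElementType Key, int TableSize)
{
    assert(Key >= 0 && TableSize > 0);
    return (Index)(Key % TableSize);
}
```

Under these assumptions, InitializeTable(10) would build a table of 11 lists, and a key of 25 would go into list 3.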
