Modulo arithmetic & hash table

Hashing
Reference: pages 783-795, CH

Hash Code
The idea behind hashing is to compute the location of an element in a data structure quickly, without having to do a linear search. This is done by applying an algorithm (known as the **hash algorithm** or hash function) to the object. The result of the hash function is a numeric value, the **hash code**, which determines the location at which the object is stored. The same hash algorithm is applied again when the data needs to be searched for.

A good hash function is designed so that different objects are likely to yield different hash codes.

However, hash codes are not always unique. Two or more distinct objects may result in the same numeric value. When two objects yield the same hash code, the situation is called a **collision**. Keeping collisions to a minimum is the biggest challenge in designing a good hash function. The hash code is used as an array index into a **hash table**.

Efficiency of search

Searching a hash table is O(1) in the best case: when there are no collisions, a lookup takes one step, regardless of the size of the table.

Example:

Conversion of a string to a hash code
1. Take all the letters of the string.
2. Convert each letter to its ASCII value.
3. Add all the ASCII values, then find the remainder when the sum is divided by the table size (the modulo method).
4. The remainder determines the object's location in the table.

Size of table = 7

Ex. "joe" (lowercase) = 106 + 111 + 101 = 318. Spot location = 318 % 7 = **3**
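
The four steps above can be sketched in Python. This is only an illustration: the function name `hash_code` is a made-up choice, not something taken from the reference. Note that 106 is the ASCII value of a lowercase `j`, so the worked sum corresponds to the string `joe`; a capital `J` has the value 74 and gives a different spot.

```python
def hash_code(key: str, table_size: int) -> int:
    """Add the character codes of key, then take the remainder by table_size."""
    total = sum(ord(ch) for ch in key)  # steps 1-3: letters -> codes -> sum
    return total % table_size           # step 4: remainder is the spot location


print(hash_code("joe", 7))  # 106 + 111 + 101 = 318, and 318 % 7 = 3
print(hash_code("Joe", 7))  # 74 + 111 + 101 = 286, and 286 % 7 = 6
print(hash_code("oje", 7))  # same letters, same sum, so it collides with "joe" at 3
```

Because addition ignores the order of the letters, any two strings made of the same letters collide under this method, which is one reason a simple sum is a weak hash function.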

 * **Pros:** Very quick and efficient search.
 * **Cons:** If more than one string has the same spot location, a collision occurs. Too many collisions reduce the search efficiency. Hash tables also require more memory than there are data values to fill them, so empty spots may be left in the table, wasting memory.

**Collision handling**
 * **Linear probing:** If two or more values collide, the colliding value proceeds through the table by sequential search until the next free spot is found, and is stored there.
 * **Overflow table:** An overflow table is a separate table for all values that have collided. If two values collide, one of them is placed in a spot in the overflow table. When a search is done, the hash table is searched first, and then the overflow table is searched sequentially.
 * **Buckets:** Multiple overflow tables (buckets), each assigned to a spot location. When there is a collision at a specific spot, the colliding value fills a slot in the bucket assigned to that spot location. Buckets can take the form of arrays or chains. This is more efficient to search than the overflow table method, but it also requires more memory.
 * **Chaining:** The hash table is an array of pointers to linked lists that store the collided values. Whenever a collision occurs, a node storing the colliding value is linked to the collision spot in the hash table. If more collisions occur at the same spot, more nodes are added to that linked list.
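
Two of these strategies, linear probing and chaining, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the sample keys are invented for the example, the probe wraps around to the start of the table when it reaches the end, and a Python list stands in for the linked list of each chain.

```python
def insert_linear_probing(table, key):
    """Place key at key % size; on a collision, step forward to the next free spot."""
    size = len(table)
    start = key % size
    for step in range(size):
        index = (start + step) % size  # assumption: wrap around at the end of the table
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("hash table is full")


def insert_chaining(chains, key):
    """Append key to the chain stored at key % number_of_chains."""
    index = key % len(chains)
    chains[index].append(key)  # a list stands in for the linked list here
    return index


keys = [10, 17, 24, 5]  # illustrative keys: 10, 17 and 24 all hash to 3 when the size is 7

probe_table = [None] * 7
for k in keys:
    insert_linear_probing(probe_table, k)
print(probe_table)  # [None, None, None, 10, 17, 24, 5]

chain_table = [[] for _ in range(7)]
for k in keys:
    insert_chaining(chain_table, k)
print(chain_table)  # [[], [], [], [10, 17, 24], [], [5], []]
```

With linear probing, the key 5 is pushed from its own spot to spot 6 because earlier collisions already filled spot 5. With chaining, each key stays at its own index and only the chain at index 3 grows.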



Task Set #1
For the questions below use the following values: 66, 47, 87, 90, 126, 140, 145, 153, 177, 285, 393, 395, 467, 566, 620, 75

1. Store the values in a hash table with 20 positions, using the division method of hashing (key % table size). Use the linear probing method to resolve collisions.

2. Store the values into a Hash table of size 10 with buckets, each containing three slots. If a bucket is full, use the next sequential bucket that contains a free slot.

3. Store the values into a Hash table that uses the hash function (key % 7) to determine which of 7 chains to put the value into.
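
The bucket rule described in question 2 can be sketched in Python as a way to check answers worked out by hand. The function name and parameters are hypothetical, the sample keys are made up rather than taken from the task values, and wrapping from the last bucket back to the first is an assumption.

```python
def insert_into_buckets(buckets, key, slots_per_bucket):
    """Store key in bucket key % size; if that bucket is full, try the next one."""
    size = len(buckets)
    home = key % size
    for step in range(size):
        index = (home + step) % size  # assumption: wrap around after the last bucket
        if len(buckets[index]) < slots_per_bucket:
            buckets[index].append(key)
            return index
    raise RuntimeError("all buckets are full")


# Illustrative run: 5 buckets of 3 slots each, with keys that all hash to bucket 2.
buckets = [[] for _ in range(5)]
for key in [12, 22, 32, 42, 7]:
    insert_into_buckets(buckets, key, slots_per_bucket=3)
print(buckets)  # [[], [], [12, 22, 32], [42, 7], []]
```

Changing the number of buckets, the slots per bucket, and the key list lets the same sketch be applied to other table sizes.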