Direct Address Table | Hashing | Probing | Data Structures

NOTE: The main intension of data structures is to store the data. Either to give it as an input to an algorithm or take the output from an algorithm.

Common operations –

  1. Insertion
  2. Searching – dominant operation that can be performed.
  3. Delete

Data sets or data are generally static, ie, once you create the data, it rarely changes.

Many data structure are designed in order to search the data faster.

Worst case complexity for searching in various data structure are –

  1. Unsorted array = O(n)
  2. Sorted array = O(log n)
  3. Linked list = O(n)
  4. Binary Tree = O(n)
  5. Binary Search Tree = O(n)
  6. Balanced Binary Search Tree = O(log n)
  7. Priority Queue (Min. & Max heap) = O(n)

The minimum search time is (log n) in the above searching for DS.

So, in order to minimize the search time, we use hashing. Therefore, hashing is the soln. to minimize the search time. If we use hashing technique, then the avg. time to search an element is O(1), i.e., constant time. 

Before using hashing, Direct Address Tables were used which is similar to arrays.

In DAT, we are going to have an array and we are going to place the elements in order of key value. This method is useful only when we know the no. of keys which are going to be in a small range and the no. of keys which are inserted is very less.

eg. {1, 2, 3,…, 100} -> Key range

Suppose for 100 elements to be inserted into DS. Suppose 1 is to be inserted at index 10. And we have to search that 1 later, we can directly go to the index and access 1 since array has random accessibility over data.

Therefore, search time = O(1), we place elements according to the key value.

This method is useful only when,

  1. You know the no. small range and of keys which are going to be inserted is to be in a small range and
  2. If no. of keys which are inserted is very less.

 

Limitations:-

Since we are going to use key values in array indexes. Therefore, there has to be linear relationship (directly proportional) between no. of keys used and size of arrays.

In case, the key used are large in no. eg 1000000, and the no. of keys are less.

{1000000, 1000001, 1000002,…, 100009}

So, in order to put the keys, we have to define an array whose size is the key. So, if the key size and array elements are close to each other then we can use DAT.

Suppose if the key size is in 100s and no. of elements are also in 100s, then defining one slot for each element is useful in such a way.

If the key size is very large and the total no. of keys to be inserted into the array is very small, then having such an architecture is waste of space.

So, there we are going to use hashing.

 

 

Introduction to Hashing

Hashing is mapping a set of elements from a large set to a smaller set using hashing function.

Hashing
Hashing

Potential candidates to be placed in DS is very large but not all elements will be present in our DS, only few will be present.

Suppose, value of keys range = 0-999, but 10 elements can be placed in our DS at a time.

Therefore, declaring a DS which is of 1000 spots is a waste of space. Instead we declare a space of 10 elements and we could try to map.

The mapping function is called hash function. If we are sure that our data is not going to go beyond 10 elements, then we can use hashing.

We apply a function (hash function) on the data so that the function itself will say where to put the data in constant time. And when we apply search, we use the function to search and go to that element directly in case the element is present. If it is not present, we say it is not present.

Therefore, search time = O(1)

 

More content coming soon…

Related Links

Leave a Reply