Algorithms

The main differences between an algorithm and a program are :
  • An algorithm is the design of a solution to a problem, whereas a program is the implementation of that design in code.
  • An algorithm designer must have domain knowledge, whereas a programmer must have hardware and software knowledge.
  • An algorithm can be written in any language, but a program is written in a high-level language or assembly language.
  • An algorithm gets a priori analysis after design, whereas a program gets a posteriori testing after writing.

Characteristics of an Algorithm :
  1. Zero or more inputs and at least one output
  2. Definiteness : every statement must have a clear and unambiguous meaning.
  3. Finiteness : execution must terminate after a finite number of steps.
  4. Effectiveness : must not contain non-meaningful (unnecessary) statements.

Analyzing an Algorithm

For analyzing an algorithm we mainly consider time and space,
but we can also consider network bandwidth, power consumption, etc.

Frequency Count Method
  1. A For loop condition will execute `(n + 1)` times.
  2. All statements inside the For loop will execute n times.
  3. In a nested For loop, the inner loop condition will execute `n(n + 1)` times.
  4. Statements inside the nested For loop will execute `n xx n = n^2` times.
  5. The time function f(n) is the sum of the iteration counts of all loops and statements.
  6. O(n) is given by the highest-degree term of f(n).
  7. Space complexity for a scalar variable is 1 (eg. i, j, k), for an array it is n (eg. a[10]), and for a 2-D matrix it is `n^2` (eg. a[n][n]).
  8. The space function S(n) is the sum over all variables, arrays and matrices.
  9. A For loop with step (i = i + 2) will execute half as many times, so f(n) = `n/2` but the order is still n.
  10. If For loop 1 runs i over (`0->n`) and the nested For loop 2 runs while `(j < i)`, then the statement inside loop 2 executes `0+1+2+3+.....+n = (n(n+1))/2` times, so the order is `n^2`.
  11. If the loop condition is (p < n) where p accumulates p = p + i, then after k iterations p = `1 + 2 + ..... + k = (k(k+1))/2`. The loop stops when `(k(k+1))/2 > n`, i.e. roughly `k^2 > n`, so k = `root(2)(n)`. Hence the time complexity is O(`root(2)(n)`).
  12. When `i = i xx 2`, then i takes the values `2^1, 2^2, 2^3, ....., 2^k`. The last iteration has `2^k > n => k > log n` (base 2), so the order is `log(n)`.
  13. The general method to find the order : find the value of i at the last iteration, equate it to the loop bound n, then express the total number of iterations in terms of n.
  14. For loop and While loop time complexity is the same; a do-while executes one extra time.
  15. If the loop execution is not straightforward, trace each iteration to get the time complexity.
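The counts in rules 1-4 above can be checked empirically. A minimal Python sketch (the counter names are illustrative, not from the notes) that tallies how often each part of a doubly nested loop runs:

```python
def frequency_count(n):
    """Count statement executions for a doubly nested loop bounded by n."""
    cond_outer = 0   # outer loop condition checks
    cond_inner = 0   # inner loop condition checks
    body = 0         # innermost statement executions
    i = 0
    while True:
        cond_outer += 1          # checked n + 1 times
        if not (i < n):
            break
        j = 0
        while True:
            cond_inner += 1      # checked n + 1 times per outer pass: n(n + 1)
            if not (j < n):
                break
            body += 1            # runs n * n = n^2 times
            j += 1
        i += 1
    return cond_outer, cond_inner, body

print(frequency_count(4))  # (5, 20, 16): n+1, n(n+1), n^2 for n = 4
```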
Types of Time Function
  1. f(n) = 1, 2, 3, 5000 etc. : time complexity is O(1) or constant.
  2. f(n) = log(n) : time complexity is O(logn) or logarithmic.
  3. f(n) = n, 2n, 3n + 5 : time complexity is O(n) or linear.
  4. f(n) = `n^2`, `n^2 + 4`, `n^2 + n` : time complexity is O(`n^2`) or quadratic.
  5. f(n) = `n^3`, `n^3 + n`, `n^3 + n^2` : time complexity is O(`n^3`) or cubic.
  6. f(n) = `2^n`, `3^n`, `n^n` : time complexity is O(`2^n`), O(`3^n`), O(`n^n`) or exponential.
Increasing order of time complexity ,

`1 < log(n) < root(2)n < n < n^2 < n^3 < ....<  2^n < 3^n < ....<  n^n`

Big-Oh : (Upper bound)

`f(n) = O(g(n))` if and only if there exist a positive constant c and an `m >= 1` such that `f(n) <= c*g(n)` for all `n >= m`.

Omega : (Lower bound)

`f(n) = Omega(g(n))` if and only if there exist a positive constant c and an `m >= 1` such that `f(n) >= c*g(n)` for all `n >= m`.

Theta : (Tight / average bound)

`f(n) = Theta(g(n))` if and only if there exist positive constants c1, c2 and an `m >= 1` such that `c1 * g(n) <= f(n) <= c2 * g(n)` for all `n >= m`.

Note : The same function g(n) appears on both sides; the constants c1 and c2 are multiplied to get the lower and upper bounds.

For `n!` we cannot find a tight bound `Theta()`, only the upper bound O(`n^n`) and the lower bound `Omega(1)`.
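These definitions can be illustrated numerically. A sketch (constants chosen by hand; a finite spot-check, not a proof) showing that f(n) = 3n + 2 is O(n) with c = 4, m = 2:

```python
def is_upper_bound(f, g, c, m, upto=1000):
    """Spot-check f(n) <= c*g(n) for all n in [m, upto].
    A finite check of the Big-Oh condition, not a proof."""
    return all(f(n) <= c * g(n) for n in range(m, upto + 1))

f = lambda n: 3 * n + 2
g = lambda n: n
print(is_upper_bound(f, g, c=4, m=2))  # True: 3n + 2 <= 4n for n >= 2
```

The same helper with the inequality reversed would illustrate `Omega`; combining both with two constants gives `Theta`.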

Properties of Asymptotic Notation :
  1. General Property : if f(n) is O(g(n)), `Omega(g(n))` or `Theta(g(n))`, then `a * f(n)` is also O(g(n)), `Omega(g(n))` or `Theta(g(n))` respectively, where a is a constant.
  2. Reflexive : f(n) is O(f(n)); also true for `Omega` and `Theta`.
  3. Transitive : if f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)); also true for `Omega` and `Theta`.
  4. Symmetric : if f(n) is `Theta(g(n))`, then g(n) is `Theta(f(n))`. eg. f(n) = g(n) = `n^2`
  5. Transpose : if f(n) = O(g(n)), then g(n) = `Omega(f(n))`.
  6. Big-oh of a sum of functions `(f(n) + d(n))` is the higher of O(g(n)) and O(h(n)), where f(n) = O(g(n)) and d(n) = O(h(n)).
  7. Big-oh of a product of functions `(f(n) xx d(n))` is O(g(n) xx h(n)).
Comparison of Functions :
  1. Substitute values and check.
  2. Apply log to both sides (after taking logs, whichever is greater is the upper bound).
  3. If one function is greater after a certain value of n, that function is the greater one.
Best Worst and Average Case Analysis :

Disjoint Sets :

Two sets S1 and S2 representing vertices of two different graphs are disjoint if `S1 cap S2 = phi`.
Operation on Disjoint set :

  1. Find (Finding out which set the point belongs to or member of)
  2. Union (connecting two vertices of different sets)

Detecting a cycle :

  1. Number all the edges.
  2. For each edge, join its two vertices into a set; repeat for all edges.
  3. If the vertices of an edge are in different sets, Union them.
  4. If the vertices of an edge belong to the same set, we have detected a cycle; don't join them.

Detecting cycles in a disjoint set using a graph :
  1. Number all the edges.
  2. For each edge, join its vertices, making one the parent and the other the child; repeat for all edges.
  3. If the two vertices of an edge belong to different sets, union them : make the root of the lighter set a child of the root with more weight (weighted union).
Detecting cycles in a disjoint set using an array :
  1. Take an array of length equal to the number of vertices.
  2. Fill the array with -1 : a negative value marks a parent (root), and -1 means a single-node set.
  3. For each edge, set the child's value to its parent's index,
  4. and decrement the parent's value (eg. to -2) to record the number of nodes in the set.
  5. Repeat 3, 4 for every edge :
    1. if the edge's endpoints have different parents, update parent1's value to the index of parent2 and add parent1's node count to parent2.
    2. If the endpoints have the same parent, it's a cycle.
  6. For collapsing find, update the value of every child on the path to point to its root directly.
  7. A linked list can also be used for disjoint sets.
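The array scheme above (negative value = root, magnitude = set size) can be sketched in Python; `find` here uses collapsing find (path compression), and `union` is the weighted union described above:

```python
def find(parent, x):
    # Follow parent links to the root (negative entry), compressing the path.
    root = x
    while parent[root] >= 0:
        root = parent[root]
    while parent[x] >= 0:          # collapsing find: point every node at root
        parent[x], x = root, parent[x]
    return root

def union(parent, u, v):
    """Return True if u, v were in different sets (edge joined),
    False if they share a root (cycle detected)."""
    ru, rv = find(parent, u), find(parent, v)
    if ru == rv:
        return False               # same parent -> cycle
    if parent[ru] > parent[rv]:    # ru holds the smaller set (less negative)
        ru, rv = rv, ru
    parent[ru] += parent[rv]       # weighted union: bigger set absorbs smaller
    parent[rv] = ru
    return True

# Edges of a triangle 0-1, 1-2, 2-0: the last edge closes a cycle.
parent = [-1] * 3
print([union(parent, u, v) for u, v in [(0, 1), (1, 2), (2, 0)]])  # [True, True, False]
```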

Divide and Conquer Algorithm :
  1. If the problem is big, divide it into subproblems,
  2. where each subproblem has the same form as the original problem.
  3. Solve the subproblems,
  4. usually recursively,
  5. then combine the solutions.
Examples :
  1. Binary Search,
  2. Finding maximum and minimum,
  3. Merge sort ,
  4. Quick sort,
  5. Strassen's matrix multiplication
Recurrence Relation Decreasing Function

 Recurrence Relation T(n) = T(n-1) + 1, T(0) = 1 : plain recursion.
  • T(n) = 1 + n, so O(n)
  • The statement executes once per recursive call.
 Recurrence Relation T(n) = T(n-1) + n : recursion plus a for loop with i++.
  • The loop work sums to `n + (n-1) + (n-2) + ... + 2 + 1 = (n(n+1))/2`
  • T(n) = T(0) + `(n(n+1))/2`
  • O(`n^2`)
 Recurrence Relation T(n) = T(n-1) + log n : recursion plus a for loop with i = i*2.
  • The loop work sums to `log n + log(n-1) + ..... + log 2 + log 1 = log(n!)` + T(0)
  • With the upper bound `n! <= n^n`, `log(n!) <= n log n`, therefore O(nlog n)
Note : Use Tree method and substitution method to find T(n) and order of recurrence relations.

Recurrence Relation T(n) = 2T(n-1) + 1 : the function calls itself twice.
  • T(n) = 2T(n-1) + 1
  • T(n) = 2[2T(n-2) + 1] + 1 = `2^2`T(n-2) + 2 + 1
  • T(n) = `2^2`[2T(n-3) + 1] + 2 + 1 = `2^3`T(n-3) + `2^2` + 2 + 1
  • T(n) = `2^k`T(n-k) + `2^(k-1)` + `2^(k-2)` + ... + `2^2` + 2 + 1
  • Assume n - k = 0 => k = n, with base case T(0) = 1
  • T(n) = `2^n`T(0) + 1 + 2 + `2^2` + ... + `2^(n-2)` + `2^(n-1)`
  • T(n) = `2^n` + `(2^n - 1)/(2 - 1)` = `2^n` + `2^n` - 1 = `2*2^n` - 1
  • T(n) = `2^(n+1)` - 1
  • Therefore O(`2^n`)
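The closed form can be spot-checked by evaluating the recurrence directly, a minimal sketch:

```python
def T(n):
    # T(n) = 2*T(n-1) + 1 with T(0) = 1
    return 1 if n == 0 else 2 * T(n - 1) + 1

# Matches the closed form 2^(n+1) - 1 derived above.
print(all(T(n) == 2 ** (n + 1) - 1 for n in range(10)))  # True
```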
Recurrence Relation T(n)=3 T(n-1)+1 : O(`3^n`)

Recurrence Relation T(n)=2 T(n-1)+ n : O(`n2^n`) 

Master Theorem for Decreasing Functions T(n) = aT(n-b) + f(n) : where f(n) = `n^k`
  • if (a < 1) then O(f(n))
  • if (a = 1) then O(`n * f(n)`)
  • if (a > 1) then O(`a^(n/b) * f(n)`)

Recurrence Relation Dividing Function  :

 Recurrence Relation T(n) = T(n/2) + 1 :

  • T(n) = T(n/2) + 1
  • T(n) = `T(n/2^2) + 2`
  • T(n) = `T(n/2^3) + 3`
  • T(n) = `T(n/2^k) + k`
  • For T(1), `n/2^k = 1 => k = log n`
  • T(n) = T(1) + logn = 1 + logn
  • Therefore O(logn)
Recurrence Relation   T(n)=T(n/2)+ n : O(n)

Recurrence Relation T(n) = 2T(n/2) + n : O(nlogn)
  • T(n) = 2T(n/2) + n
  • T(n) = `2^2 T(n/2^2) + n + n`
  • T(n) = `2^3 T(n/2^3) + n + n + n`
  • T(n) = `2^k T(n/2^k) + kn`
  • For T(1), `n/2^k = 1 => 2^k = n => k = log n`
  • T(n) = nT(1) + nlogn
  • Therefore O(nlogn)
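For n a power of two with T(1) = 1, the result T(n) = n + n log n can be checked the same way, a minimal sketch:

```python
def T(n):
    # T(n) = 2*T(n/2) + n with T(1) = 1, n a power of two
    return 1 if n == 1 else 2 * T(n // 2) + n

# Matches n*T(1) + n*log2(n) = n + n*k for n = 2^k.
print(all(T(2 ** k) == 2 ** k + 2 ** k * k for k in range(10)))  # True
```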

Recurrence Relation  root function
T(n) =  `{(1,,n,=,2),(T(sqrt(n)) + 1,,n,>,2):}` ` rarr `O(loglogn)


Master Theorem for Dividing Function

 Recurrence Relation T(n) = aT(n/b) + f(n) : where f(n) = `theta(n^k(logn)^p)`. Let `x = log_(b)a`.
  • case 1 : `x > k` then `theta(n^x)`
  • case 2 : `x = k` and
      • if `p > -1` then `theta(n^k(logn)^(p+1))`
      • if `p = -1` then `theta(n^k loglogn)`
      • if `p < -1` then `theta(n^k)`
  • case 3 : `x < k` and
      • if `p >= 0` then `theta(n^k(logn)^p)`
      • if `p < 0` then O(`n^k`)
Linear Search Algorithm :

  • Also called sequential search, which checks the elements one by one.
  • Best case = O(1) and worst case = O(n)


Binary Search Algorithm :

  • Binary search requires a sorted array or list to find the key element.
  • BinarySearch(l, h, key) `->` l = lowest index, h = highest index, key = element to search
  • For a single-element array `->` if (l == h), compare that element with the key
  • For a larger array `->` while (l <= h)
      • mid = floor`[(l + h)/2]` and compare a[mid] with key
      • if key is larger than a[mid], search the right half : l = mid + 1 `->` T(n/2)
      • if key is smaller than a[mid], search the left half : h = mid - 1 (ascending array) `->` T(n/2)
  • T(n) = `{(1,,n,=,1),(T(n/2) + 1,,n,>,1):}` `rarr T(n) = T(n/2) + 1 rarr` O(logn)
  • Best case = O(1) and worst case = O(logn)
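A minimal iterative sketch of the search described above (ascending array; returns -1 when the key is absent):

```python
def binary_search(a, key):
    """Return the index of key in sorted list a, or -1 if absent."""
    l, h = 0, len(a) - 1
    while l <= h:
        mid = (l + h) // 2           # floor((l + h) / 2)
        if a[mid] == key:
            return mid
        elif key > a[mid]:
            l = mid + 1              # search right half: T(n/2)
        else:
            h = mid - 1              # search left half: T(n/2)
    return -1

print(binary_search([3, 8, 12, 17, 25, 40], 17))  # 3
```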
Array representation of Binary Tree :
  • For an element at index i (1-indexed array) :
      • Left child = 2*i
      • Right child = 2*i + 1
      • Parent = floor`[i/2]`
  • Fill the binary tree array level by level, left to right, top to bottom, marking each missing element with a gap '-'.
  • A binary tree with no missing elements is a full binary tree.
  • A complete binary tree is full up to height (h - 1), and in the last level the nodes are filled from the left with none missing.
  • There should not be any gap in a complete binary tree's array.
  • Max heap is a complete binary tree where every element is greater than or equal to its descendants.
  • Min heap is a complete binary tree where every element is less than or equal to its descendants.
  • For inserting an element in a heap :
      • add the element after the last element of the array, keeping a complete binary tree
      • compare it with its parent `[i/2]` and swap if the parent is smaller (max heap); repeat up the tree
      • complexity = height of tree = O(logn) worst case and O(1) best case
      • for inserting n elements one by one : O(nlogn)
  • For deleting an element from a heap :
      • only the root element is deleted
      • fill the root with the last element of the array
      • compare it with the larger of its descendants (2*i) and (2*i + 1) and swap downwards while a descendant is larger (max heap)
      • after deleting the root, the last position of the array becomes empty
      • fill that empty space with the deleted root : after deleting all the elements this way we end up with a sorted array. This is Heap Sort.
      • complexity of one delete is O(logn)
      • for n elements : O(nlogn)
  • To heapify any array :
      • start from the last index : if an element's descendants (where they exist) are smaller, it is already a heap
      • move towards the top of the array, swapping whenever a descendant is larger than the current element, for all descendants
      • complexity is O(n)
  • A heap is the usual implementation of a priority queue.
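The insert and delete steps above can be sketched on a plain Python list. Note this sketch is 0-indexed, so the children of i are 2i+1 and 2i+2 and the parent is floor((i-1)/2), unlike the 1-indexed formulas in the notes:

```python
def heap_insert(heap, x):
    # Append at the end, then bubble up while the parent is smaller (max heap).
    heap.append(x)
    i = len(heap) - 1
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        p = (i - 1) // 2
        heap[i], heap[p] = heap[p], heap[i]
        i = p

def heap_delete(heap):
    # Remove the root, move the last element to the root, then bubble it down.
    root, last = heap[0], heap.pop()
    if heap:
        heap[0] = last
        i = 0
        while True:
            l, r, big = 2 * i + 1, 2 * i + 2, i
            if l < len(heap) and heap[l] > heap[big]:
                big = l
            if r < len(heap) and heap[r] > heap[big]:
                big = r
            if big == i:
                break                    # both children smaller: heap restored
            heap[i], heap[big] = heap[big], heap[i]
            i = big
    return root

h = []
for x in [10, 20, 15, 30]:
    heap_insert(h, x)
print([heap_delete(h) for _ in range(4)])  # [30, 20, 15, 10]
```

Deleting repeatedly yields the elements in descending order, which is exactly the heap sort idea from the notes.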
Merge Sort :

  • Take two sorted lists or arrays A[], B[] with lengths m, n.
  • For creating Merge() :
      • create a new array C[]
      • keep iterators i, j, k on arrays A[], B[], C[]
      • starting from the first position, compare elements of A[] and B[]
      • insert the smaller into C[] and increment the iterators : C[k++] = A[i++] or C[k++] = B[j++]
      • if one array is exhausted, append the remaining elements of the other array to C[]
      • complexity is O(m + n)
  • In MergeSort() :
      • assume that a single element is a sorted array
      • use recursion to divide the array down to single elements
      • (l < h) is true while there is more than a single element
      • call MergeSort() for [l, mid] and [mid + 1, h]
      • call Merge() for [l, mid, h]
  • Merging two lists at a time is 2-way merging; four at a time is 4-way merging.
  • Steps :
      • the last recursive calls are the ones for which (l < h) fails
      • eg. with l = 0, h = 1 : mergesort(0,0) and mergesort(1,1),
      • then merge(0,0,1) merges the two elements
  • T(n) = `{(1,,n,=,1),(2T(n/2) + n,,n,>,1):}`
  • complexity O(nlogn), in fact `theta(nlogn)`
  • Analysis of Merge sort :
    • Pros :
      1. works well for large arrays
      2. can be used with linked lists
      3. external sorting is possible
      4. stable (relative order of duplicates is preserved, so a second sort does not disturb the first)
    • Cons :
      1. needs O(n) extra space (not an in-place sort)
      2. in practice merge sort is slower for small inputs (`n <= 15`); insertion sort is often used below that threshold since it is also stable and supports linked lists
      3. O(logn) stack space is needed for recursion
      4. total space = O(n + logn)
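The Merge() and MergeSort() steps above, as a minimal sketch; unlike the in-place index version in the notes, this one returns a new list at each level:

```python
def merge(a, b):
    # Merge two sorted lists into a new list c: O(m + n).
    c, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:             # <= keeps the sort stable
            c.append(a[i]); i += 1
        else:
            c.append(b[j]); j += 1
    return c + a[i:] + b[j:]         # append the exhausted side's remainder

def merge_sort(a):
    if len(a) <= 1:                  # a single element is already sorted
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

print(merge_sort([9, 3, 7, 5, 6, 4, 8, 2]))  # [2, 3, 4, 5, 6, 7, 8, 9]
```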
Quick Sort :
  • a Divide and Conquer algorithm
  • For creating Partition() :
      • select a pivot (the first element of the array)
      • set a sentinel `infty` just past the last element of the array
      • take indices i, j starting from the pivot and the `infty` sentinel
      • i++ until we find an element larger than the pivot
      • j-- until we find an element smaller than or equal to the pivot
      • swap() the two elements
      • repeat until i crosses j (i > j)
      • then swap() the pivot with the element at position j
  • QuickSort() :
      • l = first index (pivot), h = last index (`infty` sentinel)
      • first Partition on the pivot, then call QuickSort() on the left and right sides of the partition
      • if (l < h) :
      • j = Partition(l, h)
      • QuickSort(l, j) and QuickSort(j + 1, h)
  • best case complexity = O(nlogn) (partition lands on the middle element)
  • worst case complexity = O(`n^2`) (for a sorted array, partitioning at the edges)
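A sketch of the scheme above with the first element as pivot; instead of an `infty` sentinel it uses an explicit bounds check on i, and partition places the pivot at its final position j:

```python
def partition(a, l, h):
    # First element as pivot; i scans right, j scans left, stop when crossed.
    pivot = a[l]
    i, j = l, h + 1
    while True:
        i += 1
        while i <= h and a[i] < pivot:   # i++ until an element >= pivot
            i += 1
        j -= 1
        while a[j] > pivot:              # j-- until an element <= pivot
            j -= 1
        if i >= j:
            a[l], a[j] = a[j], a[l]      # place pivot at its final position
            return j
        a[i], a[j] = a[j], a[i]

def quick_sort(a, l, h):
    if l < h:
        j = partition(a, l, h)
        quick_sort(a, l, j - 1)          # pivot at j is already in place
        quick_sort(a, j + 1, h)

data = [50, 70, 60, 90, 40, 80, 10, 20, 30]
quick_sort(data, 0, len(data) - 1)
print(data)  # [10, 20, 30, 40, 50, 60, 70, 80, 90]
```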
Matrix multiplication :
  • the matrix multiplication of A and B gives matrix C
  • `C[i,j] = sum_(k=0)^(n-1) A[i,k] * B[k,j]`
  • it takes three nested for loops to compute C
  • complexity = O(`n^3`)
  • a 2 x 2 matrix is considered small, so it is multiplied directly
  • applying Divide and Conquer to square matrices (n x n) where n is a power of 2, we can keep dividing down to 2 x 2 matrices
  • the recurrence `T(n) = 8T(n/2) + n^2` is still `O(n^3)`
  • Strassen's formula :
      • `T(n) = 7T(n/2) + n^2` is `O(n^(log_(2)7)) ~~ O(n^2.81)`
Greedy Algorithm :
  • The solution is built in stages.
  • At each stage, an input is included in the solution if it is feasible for an optimal result.
  • Used to solve optimization problems.

Data Structures

Array : 
Let size of array be n.
Accessing Time: O(1) [This is possible because elements
                      are stored at contiguous locations]   
Search Time:   O(n) for Sequential Search: 
               O(log n) for Binary Search [If Array is sorted]
Insertion Time: O(n) [The worst case occurs when insertion 
                     happens at the Beginning of an array and 
                     requires shifting all of the elements]
Deletion Time: O(n) [The worst case occurs when deletion 
                     happens at the Beginning of an array and 
                     requires shifting all of the elements]
Linked List :

  1. Singly Linked List : ( data + address to next)    1 -> 2 -> 3 -> 4 -> NULL
  2. Doubly Linked List : ( data + address of previous + address of next )     NULL <- 1 <-> 2 <-> 3 -> NULL
  3. Circular Linked List : A circular linked list can be a singly circular linked list or doubly circular linked list. Advantage of this data structure is that any node can be made as starting node.
Accessing time of an element : O(n)
Search time of an element : O(n)
Insertion of an Element : O(1) [If we are at the position 
                                where we have to insert 
                                an element] 
Deletion of an Element : O(1) [If we know address of node
                               previous the node to be 
                               deleted]
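A singly linked list sketch showing why insertion after a known node is O(1): only two pointers are relinked, with no shifting as in an array. The helper names are illustrative:

```python
class Node:
    def __init__(self, data, next=None):
        self.data, self.next = data, next

def insert_after(node, data):
    # O(1): splice a new node in by relinking two pointers.
    node.next = Node(data, node.next)

def to_list(head):
    # O(n) walk: accessing/searching requires following next pointers.
    out = []
    while head:
        out.append(head.data)
        head = head.next
    return out

head = Node(1, Node(2, Node(4)))   # 1 -> 2 -> 4 -> NULL
insert_after(head.next, 3)         # insert 3 after the node holding 2
print(to_list(head))  # [1, 2, 3, 4]
```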

Stack :

  • LIFO (last in, first out)
  • push, which adds an element to the collection, and 
  • pop, which removes the last element that was added
Insertion : O(1)
Deletion :  O(1)
Access Time : O(n) [Worst Case]
Insertion and Deletion are allowed on one end.

Queue : 

  •  FIFO (first in, first out) 
  • Enqueue :The element is added from the rear side and 
  • Dequeue : The element is removed from the front side

Insertion : O(1)
Deletion  : O(1)
Access Time : O(n) [Worst Case]
Binary Tree :

  • Trees are hierarchical data structures
  • Each node has at most two children, which are referred to as the left child and the right child
  • A Binary Tree can be traversed in two ways : 
      • Depth First Traversal: 
        • Inorder (Left-Root-Right),
        • Preorder (Root-Left-Right) and 
        • Postorder (Left-Right-Root)
      • Breadth First Traversal: Level Order Traversal
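The four traversal orders above, as a minimal sketch on a small node class:

```python
from collections import deque

class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right

def inorder(n):    # Left-Root-Right
    return inorder(n.left) + [n.data] + inorder(n.right) if n else []

def preorder(n):   # Root-Left-Right
    return [n.data] + preorder(n.left) + preorder(n.right) if n else []

def postorder(n):  # Left-Right-Root
    return postorder(n.left) + postorder(n.right) + [n.data] if n else []

def level_order(n):  # Breadth First Traversal, level by level
    out, q = [], deque([n])
    while q:
        cur = q.popleft()
        if cur:
            out.append(cur.data)
            q.extend([cur.left, cur.right])
    return out

root = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(inorder(root))     # [4, 2, 5, 1, 3]
print(preorder(root))    # [1, 2, 4, 5, 3]
print(postorder(root))   # [4, 5, 2, 3, 1]
print(level_order(root)) # [1, 2, 3, 4, 5]
```

Each traversal visits every node exactly once, which is the O(n) bound stated below.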

The maximum number of nodes at level l = 2^(l-1).

Maximum number of nodes = 2^h - 1.
Here h is height of the tree. Height is considered
as the maximum number of nodes on a root to leaf path.

Minimum possible height = ceil(log2(n+1))

Time Complexity of Tree Traversal: O(n)
Note : In Binary tree, number of leaf nodes is always one 
more than nodes with two children.
Binary Search Tree :

  • Elements to the right of a node are greater than it, and elements to the left are lesser.
  • Height of BST :
      • min : logn
      • max : n
  • Height depends on the order in which elements are inserted.
  • Perform rotations to change a BST into a balanced BST.
  • Rotations are done on only three nodes at a time.
Search :  O(h)
Insertion : O(h)
Deletion : O(h)
Extra Space : O(n) for pointers

h: Height of BST
n: Number of nodes in BST

If Binary Search Tree is Height Balanced, 
then h = O(Log n) 
Note : Self-Balancing BSTs such as AVL Tree, Red-Black
Tree and Splay Tree make sure that height of BST 
remains O(Log n)
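BST insert and search, as a minimal sketch; both walk a single root-to-leaf path, which is why they cost O(h):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    # O(h): descend one path and attach a new leaf.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)    # smaller keys go left
    elif key > root.key:
        root.right = insert(root.right, key)  # greater keys go right
    return root                               # duplicates are ignored

def search(root, key):
    # O(h): at each node, go left or right, never both.
    while root and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None

root = None
for k in [30, 20, 40, 10, 25]:
    root = insert(root, k)
print(search(root, 25), search(root, 35))  # True False
```

Inserting keys in sorted order would produce the degenerate height-n chain mentioned above, which is what the self-balancing trees below avoid.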
AVL Tree :
  • always a balanced BST
  • balance factor (bf) = height of left subtree - height of right subtree
  • `bf in {-1, 0, 1}` or `|bf| <= 1`
  • searching in an AVL tree is O(logn)
  • Following are the 4 possible arrangements :
      1. Left Left Case (LL)
        • perform R-rotation
      2. Left Right Case (LR)
        • perform L-rotation on the middle node, then R-rotation on the first
      3. Right Right Case (RR)
        • perform L-rotation
      4. Right Left Case (RL)
        • perform R-rotation on the middle node, then L-rotation on the first
Note : AVL trees are more balanced than Red-Black trees, but they may cause more rotations during insertion and deletion. If insertions and deletions are infrequent and search is the more frequent operation, an AVL tree should be preferred over a Red-Black tree.
B-Tree and B+ Tree :
  • In DBMS tables, records are stored in blocks on disk.
  • An index stores key and block number pairs; these indexes are also stored on disk.
  • A sparse index stores only the starting key of each block together with the block number where it is stored.
  • In an m-way search tree, each node can have at most m children; m is the degree.
  • Each node holds n keys, n + 1 child pointers and n record pointers.
  • There is no rule for inserting data in an m-way search tree.
Rules for B-Tree / B+ Tree :
  1. Each node must have at least ceil[m/2] children, where m is the degree.
  2. The root can have a minimum of 2 children.
  3. All leaves are at the same level.
  4. The tree is created bottom-up.
