Learning and Understanding BigO Notation
We’ve all heard of Big(O). It’s something most of us learn in college and promptly forget. We also know it’s something that Top Coders and Googlers are good at and many of us would like to be good at it too!
I can relate – I find many algorithms fascinating and many more intimidating. And for a long time I struggled to get my head around the concept of Big(O). I knew what it was – vaguely – but I had no deep understanding – no intuition for it at all. And I knew that it was important in telling me which algorithms were good and which weren’t.
If you can relate to that then this article is for you – you will be able to understand Big(O) and have the beginnings of an intuition about it.
What is Big(O)?
So what is Big(O)? It simply describes how an algorithm scales with more inputs. Big(O) is an upperlimit on the algorithm ignoring all exceptions, special cases, and complex details and irrelevant constants.
Squint at your algorithm – find it’s important parts (usually the loops) and you’ve trapped the BigO.
And why is this important? Well, let’s consider a common problem: finding a needle in a haystack. Here are two approaches:
(1) Find the needle in a haystack
for(item in haystack) { if(item == needle) return item; }
(2) Find needle in sorted haystack
low = 0 high = sortedhaystack.size  1 while(low <= high) { mid = (low + high) / 2; item = sortedhaystack[mid]; if(item == needle) return item; if(item > needle) high = mid  1; else low = mid + 1; }
Finding Big(O)
If we “squint” at the algorithm looking at the number of times through the main loops and ignoring all the complications we see that (1) walks through every item of the input. This means that it’s O is directly proportional the number of items in the haystack. If we call the number of items `n` then algorithm (1) has:
Squinting at algorithm (2) we see that it also has only one loop where it is going from low to high. Now this range again starts off as the number of items in the input but on every loop it halves the range it has to travel by narrowing to the midpoint. This kind of discarding half a range is terribly common and terribly useful. This constant halving means that the loop only goes through items. Therefore algorithm (2) has:
Because we do this halving thing often, it is useful to remember and its relation to halving in loops.
So now we have some idea about the Big(O)’s. Big deal! How does that help us know which algorithm is better?
Well remember, Big(O) gives us useful information as the algorithm scales. If we have a small haystack then it really doesn’t matter – even if each loop takes a second then a haystack of a few hundred items will finish quickly enough in both cases.
But if we have a million items things get interesting:
Algorithm 1: (1 second per loop)
Algorithm 2: (1 second per loop)
We can see there is a phenomenal difference between the two. So we can see that just knowing the Oclass can really help us figure out how an algorithm scales.
Now the milliondollar question is: what are the common Oclasses and how do each of them scale?
Meet The BigO Classes
Let’s take a look at important the BigO classes now:
The Flash 
Shrinking Violet 
Groot 
Clark Kent 
The Fastest O 
A common way to shrink inputs quickly by discarding halves 
The Rare One – Rarely found in practice 
A Straight Guy – looking at each input leads to linear scale 
1 million items in… 1 second  1 million items in… 20 seconds  1 million items in… 16 minutes  1 million items in… 11 days 
Hisoka the Card Magician 
The Power Sisters 
Wonder Woman 
Link the Traveler 
Often reflects a sort 
Growing exponentially – We can usually see `x` nested loops in the algorithm 
Combination Loops – looking at various subsets in the input 
All possible permutations of the inputs 
1 million items in… 1 year  1 million items in… 32,000 years , 32 billion years  1 million items in… millennia (longer than the universe has been around!) 
1 million items in… millennia (good grief!) 
The Blackest Panther 
The worst scaling! 
1 million items in………wayyyy too long… 
Taking a deeper look
Let us understand each class and build an intuition for them:
: The Fastest O
The holy grail – algorithms always complete in a fixed time irrespective of the size of the input.  
get_first_item(array haystack) { if(haystack.size > 0) { return haystack[0]; } } 

Flash – The Fastest BigO 
Examples: return the head of a list, insert a node into a linked list, pushing/popping a stack, inserting/removing from a queue… 
1 million items in…1 second 
: Divide and Conquer
These algorithms never have to look at all the input. They often halve inputs at each stage and thus have inverse the performance of exponentiation (see the PowerSisters to contrast).  
binary_jump_search(item needle, array sortedhaystack) { n = sortedhaystack.size; jump = n/2; pos = 0; while(jump > 0) { while(pos+jump < n && sortedhaystack[pos+jump] <= needle) { pos += jump; } jump = jump/2; } if(sortedhaystack[pos] == needle) { return pos; } } 

Shrinking Violet – Halving input BigO 
Examples: looking up a number in a phone book, looking up a word in a dictionary, doing a binary search, find element in a binary search tree, deleting from a doublylinked list… 
1 million items in…20 seconds 
: The Rare O
If we notice that in some sense, is in the “middle” of . This type of an algorithm is not very commonly found.  
Groot – The Rare One 
Examples: Grover’s algorithm, The Square Root Trick 
1 million items in…16 minutes 
: A Common O
This is a linear algorithm which scales directly proportional to the input. This is commonly the case because we often have to access an item at least once.  
find_in_unsorted(item needle, array haystack) { for(item in haystack) { if(item == needle) { return item; } } } 

Clark Kent – Just a Straight Guy 
Examples: finding the maximum/minimum of a collection, finding the max sequential sum, traversing a linked list, deleting from a singlylinked list,… 
1 million items in…11 days 
: A Sorting O
Efficient general sorting algorithms work in time, and many algorithms that use sorting as a subroutine also have this time complexity. It can be shown that algorithms that need to compare elements cannot sort faster than this (Algorithms like counting sort and radix sort use other information and can be faster).  
merge_sort(list items) { if(items.size <= 1){ return items; } mid = items.size / 2; left_half = items[0:mid]; right_half = items[mid:items.size]; left_half = merge_sort(left_half) right_half = merge_sort(right_half) while(left_half and right_half has items) { if(first(left_half) <= first(right_half)) { result.append(pop(left_half)); } else { result.append(pop(right_half)); } } if(left_half has items) { result.append(left_half); } else if(right_half has items) { result.append(right_half); } } 

Hisoka – The Card Magician 
Examples: Merge Sort, Quick Sort, Heap Sort 
1 million items in…1 year 
: The Power O’s
These algorithms grow as an polynomial of the input. are known as Quadratic and are known as Cubic algorithms. Higher powers are just known as bad algorithms 🙂 . The powers usually reflect the number of nested loops in the system.Note that whenever there are multiple BigO’s in an algorithm, the biggest class wins out because it usually dominates the scaling. We can see this in the example below which has both (two nested loops) and (three nested loops) but it’s overall BigO is
We can see this by noticing the time that smaller classes like take in comparison with larger classes like 

find_max_seq_sum(array numbers) { max_sum = 0; for(i = 0;i < numbers.size;i++) { curr_sum = 0; for(j = i;j < numbers.size;j++) { curr_sum += numbers[j]; if(curr_sum > max_sum) max_sum = curr_sum; } } // paranoid check  did we get it right? chk_sum = 0; for(i = 0;i < numbers.size;i++) { for(j = i;j < numbers.size;j++) { curr_sum = 0; for(k = i;k < j;k++) { curr_sum += numbers[k]; } if(curr_sum > chk_sum) { chk_sum = curr_sum; } } } assert(curr_sum == max_sum); return max_sum; } 

The Power Sisters – Growing Exponentially 
Examples: : multiplying two ndigit numbers by a simple algorithm, adding two n×n matrices, bubble sort, insertion sort, number of handshakes in a room…
: multiplying two n×n matrices by a naive algorithm… 
1 million items in… 32,000 years 32 billion years 
: Combination O
These are exponential algorithms whose growth doubles with every new addition to the input. Algorithms with are often recursive algorithms that solve a problem of size by recursively solving two problems of size . Another such type is one that iterates over all subsets of a set.If you find it hard to understand how iterating over subsets translates to, imagine a set of switches, each of them corresponding to one element of a set. Now, each of the switches can be turned on or off. Think of “on” as being in the subset and “off” being not included. Now it should be obvious that there are combinations.  
solve_hanoi(N, from_peg, to_peg, spare_peg){ if (N<1) return; if (N>1) { solve_hanoi(N1, from_peg, spare_peg, to_peg); } print "move from " + from_peg + " to " + to_peg; if (N>1) { solve_hanoi(N1, spare_peg, to_peg, from_peg); } } 

Wonder Woman – Combination Loops 
Examples: Tower of Hanoi, Naive Finonacci Calculation, … 
1 million items in… millennia (longer than the universe has been around!) 
: Permutation O
This algorithm iterates over all possible combinations of the inputs.  
travelling_salesman(list cities) { best_path = path(cities); while(get_next_permutation(cities)) { if(path(cities) < min_path) { best_path = path(cities); } } return best_path; } 

Link – The Traveler 
Examples: The traveling salesman problem. 
1 million items in… millennia (good grief!) 
: The Slowest O
This is just included for fun… Such an algorithm will not scale in any useful way.  
The Blackest Panther – Black Hole Black 
Examples: Please don’t find any… 
Did this help you understand BigO better? Which is your favourite BigO class? Let me know what you think!