| title | type | duration | creator | competencies |
| --- | --- | --- | --- | --- |
| Time complexity and Big O | lesson | 2:30 | | Programming |
After this lesson, students will be able to:
- Describe the concepts of time and space complexity
- Compare/contrast common run-times in Big-O notation
- Recognize patterns between algorithms and their run-times
- Jump into the next lecture
Before this lesson, students should:
- Be comfortable reading and writing pseudocode
- Have practice with iterating over arrays and objects
- Have a willingness to touch some mathematics!
Why learn this? Many reasons. Here are the big ones:
- For yourself
  - Be able to quantify and improve your algorithms.
  - You will better understand code (including your own).
  - You will be better able to talk about your code.
- For interviews
  - If you whiteboard during an interview, you will probably be asked what the run-time of your algorithm is.
  - If not, tell them anyway!
- For the money and for the world
  - Learn this stuff and you could win a million dollars and change the world overnight.
  - (This is a stretch goal.)
Time complexity is a measure of how long an algorithm takes to run given the size of its inputs. Big-O is the most common notation for quantifying this complexity.
From Interview Cake:

> Big O notation is the language we use for articulating how long an algorithm takes to run. It's how we compare the efficiency of different approaches to a problem.
>
> With big O notation we express the runtime in terms of—brace yourself—how quickly it grows relative to the input, as the input gets arbitrarily large.
In other words, Big-O represents the (time) complexity of an algorithm as the input size grows. With Big-O, we always consider the worst-case scenario.
The formal definition is very math-y. Let us instead practice understanding run-time with examples.
Given an object `obj`, a key `key`, and a value `val`, check to see if there is a key `key` in `obj` with value `val`.
```
func check_pair(obj, key, val):
    return obj.has_key?(key) && obj[key] == val
```
No matter how large our `obj` is, it takes the same amount of time to check whether it has a key and to check that two values match. (We accept this as given.) Therefore, this algorithm runs in *constant time*. In Big-O notation this is `O(1)`.
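A Python sketch of this (assuming the usual average-case `O(1)` dictionary lookup; the code here is just an illustration, not code from this repo):

```python
def check_pair(obj, key, val):
    # Membership tests and lookups on a dict are O(1) on average,
    # so this takes the same time no matter how large obj is.
    return key in obj and obj[key] == val

print(check_pair({"a": 1, "b": 2}, "b", 2))  # True
print(check_pair({"a": 1, "b": 2}, "b", 3))  # False
```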
```
func check_pair(obj, key, val):
    for_each_key k in obj:
        if(k == key && obj[k] == val):
            return true
    return false
```
Here we have to loop over each key in `obj`. Inside the loop we just do a quick operation (which we will accept as constant time). If there are 5 pairs, we will have to iterate 5 times. If there are 6, then 6. If 7, then 7...

We call this *linear* or `O(n)` (where `n` is the size of our object). With an object of size `n` we will have about `n` operations.
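The same loop as a Python sketch (the name `check_pair_linear` is ours, for illustration):

```python
def check_pair_linear(obj, key, val):
    # One pass over all n keys, doing constant work per key: O(n).
    for k in obj:
        if k == key and obj[k] == val:
            return True
    return False

print(check_pair_linear({"a": 1, "b": 2}, "b", 2))  # True
```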
So which algorithm is "better"? Which has a smaller run-time?
Create a function `largest_element` that takes an array `arr` and returns the largest element.
```
func largest_element(arr):
    for_each_element x in arr:
        return x if is_largest?(x, arr)

func is_largest?(x, arr):
    for_each_element y in arr:
        return false if y > x
    return true
```
First let's figure out the run-time of `is_largest?`. Assume our `arr` has length 5. Then in the worst case we will have to iterate over 5 elements. If there are 6, then 6...

We saw this pattern above. This is *linear* or `O(n)` (where `n` is the length of `arr`).
Now let's figure out the run-time of `largest_element`. Again, assume our `arr` has length 5. Then in the worst case we will have to iterate over 5 elements. For each of those, we have to check `is_largest?(x, arr)`. As shown above, this has a run-time of `O(n)`. This means we will have to do about 5 operations for each of the 5 elements in `arr`.

If `arr` is of length 5, we will have to do 5 * 5 = 5² = 25 operations.
If `arr` is of length 6, we will have to do 6 * 6 = 6² = 36 operations.
If `arr` is of length 7, we will have to do 7 * 7 = 7² = 49 operations.

We call this `O(n²)`.
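A Python sketch of this quadratic approach (with a nested helper standing in for `is_largest?`):

```python
def largest_element(arr):
    # For each of the n elements we scan all n elements again,
    # giving roughly n * n = n^2 comparisons in the worst case.
    def is_largest(x):
        return all(y <= x for y in arr)  # O(n) scan

    for x in arr:
        if is_largest(x):
            return x

print(largest_element([3, 9, 4, 1]))  # 9
```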
```
func largest_element(arr):
    max = arr[0]
    for_each_element x in arr:
        if x > max:
            max = x
    return max
```
In the worst-case scenario we will have to iterate over every element in the array and compare it to just one other element. For an array of length 5, this is 5 operations. For an array of length 6, it is 6. For `n` it is `n`. Thus the run-time of our whole algorithm is `O(n)`.
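A Python sketch of the single-pass version (Python's built-in `max` does essentially this):

```python
def largest_element(arr):
    # One pass, one comparison per element: O(n).
    largest = arr[0]
    for x in arr:
        if x > largest:
            largest = x
    return largest

print(largest_element([3, 9, 4, 1]))  # 9
```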
Given a word `word`, determine if it is possible to re-arrange the characters such that they would form a palindrome.

Ex: `poss_pal?("cracera")` is `true` because `"racecar"` is a palindrome.

Ex: `poss_pal?("stacey")` is `false` because there is no anagram that would yield a palindrome.
Let's assume we already have an `all_anagrams` function.
```
func poss_pal?(word):
    for w in all_anagrams(word):
        if is_pal?(w):
            return true
    return false

func is_pal?(word):
    return word.reverse() == word
```
For a word of length 5, how many anagrams are there? 5 choices for the first letter, 4 for the second, 3 for the third... or 5 * 4 * 3 * 2 * 1 = 5! anagrams. If the word had length 6, there would be 6! anagrams. For a word of length `n`, there are `n!` anagrams.

This means we have `n!` iterations of our loop. In each iteration we have to check `is_pal?(w)`. For a word of length `n`, we would have `n` characters to reverse, so the run-time of `is_pal?` is `O(n)`.

Thus, the over-all run-time is `O(n⋅n!)`. Yuck.
We actually drop the leading `n` because it is dwarfed by the `n!`, leaving us with `O(n!)`.
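A rough Python translation of this brute force, with `itertools.permutations` standing in for `all_anagrams`:

```python
from itertools import permutations

def poss_pal_brute(word):
    # permutations(word) yields all n! orderings; each palindrome
    # check reverses n characters, so the total work is O(n * n!).
    for perm in permutations(word):
        candidate = "".join(perm)
        if candidate == candidate[::-1]:
            return True
    return False

print(poss_pal_brute("cracera"))  # True ("racecar" is one ordering)
```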
Hm. Is there a way we can do better? What do all anagrams have in common?

Let us try to pseudocode an algorithm with a smaller run-time.

Hint: we can do this in `O(n)` time, a lot better than `O(n!)`.
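One way to see it: every character in a palindrome must pair up around the center, so a word can be rearranged into a palindrome exactly when at most one character appears an odd number of times. A minimal Python sketch of that idea (one possible solution, not the only one):

```python
from collections import Counter

def poss_pal(word):
    # Counting characters is O(n); checking the counts adds at most
    # O(n) more work, so the whole thing is O(n).
    odd_counts = sum(1 for c in Counter(word).values() if c % 2 == 1)
    return odd_counts <= 1

print(poss_pal("cracera"))  # True
print(poss_pal("stacey"))   # False
```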
Assume we have a sorted array `sorted` of numbers. Write a function `contains?(sorted, num)` to check if some `num` is in `sorted`.

Ex: `contains?([2,3,5,7,11,13,17], 13)` is `true`

Ex: `contains?([4,8,15,16,23,42], 21)` is `false`
```
func contains?(sorted, num):
    for_each n in sorted:
        if n == num:
            return true
    return false
```
What is the run-time of this algorithm? Can we do even better?
```
func contains?(sorted, num):
    leftIdx = 0
    rightIdx = sorted.length - 1
    while(leftIdx <= rightIdx):
        midIdx = floor((leftIdx + rightIdx) / 2)
        midElem = sorted[midIdx]
        if num < midElem:
            rightIdx = midIdx - 1
        else_if num > midElem:
            leftIdx = midIdx + 1
        else:
            return true
    return false
```
Hm.
Introducing a sexy new run-time: `O(log n)`. It is even slimmer than `O(n)`. It grows very, very slowly. (The base of the log does not matter, as logs with different bases differ only by a constant factor.)
This is a famous algorithm called binary search. It can be adapted to return the index of the element searched for. It can also be done with recursion! (Check out the Python solutions for both)
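As a sketch of what the recursive, index-returning variant might look like in Python (returning -1 when the element is absent; see the repo's solutions for the real thing):

```python
def index_of(sorted_arr, num, left=0, right=None):
    # Each call halves the search range, so the recursion depth
    # (and the run-time) is O(log n).
    if right is None:
        right = len(sorted_arr) - 1
    if left > right:
        return -1
    mid = (left + right) // 2
    if num < sorted_arr[mid]:
        return index_of(sorted_arr, num, left, mid - 1)
    if num > sorted_arr[mid]:
        return index_of(sorted_arr, num, mid + 1, right)
    return mid

print(index_of([2, 3, 5, 7, 11, 13, 17], 13))  # 5
print(index_of([4, 8, 15, 16, 23, 42], 21))    # -1
```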
Given an array `arr` and a number `num`, return a pair of numbers summing to `num`.
```
func two_sum(arr, num):
    for_each n in arr:
        if includes?(arr, num - n):
            return [n, num - n]
    return nothing

func includes?(arr, elem):
    for_each e in arr:
        return true if e == elem
    return false
```
What do we think the run time is? Can we do better?
```
func two_sum(arr, num):
    set = build_set(arr)
    for_each n in arr:
        if set[num - n]:
            return [n, num - n]
    return nothing

func build_set(arr):
    set = {}
    for_each e in arr:
        set[e] = true
    return set
```
What is the run-time of this algorithm?
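For reference, a minimal Python sketch of this hash-set approach (assuming average-case `O(1)` set lookups):

```python
def two_sum(arr, num):
    # Building the set is O(n); each lookup in the loop is O(1) on
    # average, so the whole thing is O(n) instead of O(n^2).
    seen = set(arr)
    for n in arr:
        complement = num - n
        # Like the pseudocode above, this does not check that the
        # complement is a *different* element when complement == n.
        if complement in seen:
            return (n, complement)
    return None

print(two_sum([3, 8, 2, 11], 10))  # (8, 2)
```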
Here are some common run-times from small/fast/good to large/slow/bad:

Polynomial
- `O(1)` - constant. Size of input does not matter. Can't do better than this.
- `O(log n)` - logarithmic. Very fast. Just slower than constant time. See binary search.
- `O(n)` - linear. This is often one or more isolated loops.
- `O(n⋅log n)` - best-case scenario for sorting algorithms.
- `O(n²)` - quadratic (polynomial). This is often a loop within a loop.
- `O(n³)` - cubic (polynomial). This is often a loop within a loop within a loop.
- `O(n⁴)`, `O(n⁵)`... polynomial.

Super-polynomial
- `O(2ⁿ)` - exponential. The number of subsets of a set of size `n`.
- `O(3ⁿ)`, `O(4ⁿ)`... exponential.
- `O(n!)` - factorial. The number of ways you can permute something of size `n`.
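To get a feel for how differently these grow, here is a quick sketch tabulating a few of them:

```python
import math

# Operation counts for a few run-times. Notice how 2^n and n!
# explode while log2(n) barely moves.
for n in (10, 20, 30):
    print(f"n={n:<3} log n={math.log2(n):.1f}  "
          f"n log n={n * math.log2(n):.0f}  n^2={n**2}  "
          f"2^n={2**n}  n!={math.factorial(n)}")
```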
With Big-O, we drop constants and anything smaller than the leading term.
Examples:
- `O(2n)` => `O(n)`
- `O(5n² + 3n + 1)` => `O(n²)`
- `O(n! + 100n²)` => `O(n!)`
Why? We only care about a general idea of how the run-time grows as our input gets very large. When the input is large enough, the constants and lower-order terms just don't mean as much in comparison.
There is a mathematical justification for dropping these. Look at the formal definition if you are really interested.
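A quick numeric check of the intuition (the ratio of `5n² + 3n + 1` to its leading term approaches 1 as `n` grows):

```python
# The lower-order terms 3n + 1 matter less and less as n grows.
for n in (10, 100, 1000, 10000):
    full = 5 * n**2 + 3 * n + 1
    print(n, full / (5 * n**2))  # ratio approaches 1
```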
Let's take a look at `problems.md`. Spend some time thinking about each one. You may look at some example solutions in `/js-solutions` after the lecture today.
- Big-O can also be used to represent space-complexity.
  - This is how much extra memory, not time, is needed to run the algorithm.
- An algorithm with a smaller run-time is not always "better".
  - Other things to consider are code complexity, space-complexity, etc.
- Two algorithms can have the same run-time with one being a lot faster.
  - For example, one algorithm may require `10n` operations while another requires only `2n` of the same operations. The latter is faster but both are `O(n)`.
- Why do we care about the time and space complexity of an algorithm?
- Why do we use Big-O notation? What does it tell us?
- How can we read an algorithm and boil it down to its run-time in Big-O?
Things to make you less confused.
- Math refresher
- Big O basics
  - More example algorithms
  - Also discusses Big O in terms of space complexity
- Beginner's guide to Big O
  - Great explanation and examples of common run-times
- Big O cheatsheet
  - The complexity chart is a good visual to understand common run-times
  - Good information about run time of algorithms in data structures (don't need to know this)
  - Different run times of sorting algorithms are also interesting, but you certainly don't need to know them all. You probably should know that sorting can be done in `O(n⋅log(n))` time.
- Intro to Big-O video
  - From the author of Cracking the Coding Interview (see below)
Things to make you more confused.
- Cracking the Coding Interview
  - This book will help you get a job. Bigly. This is a fabulous resource for practicing these problems. Just fabulous. You look at other books - they're terrible. Just terrible. They're a disaster. Total disaster. They're losers. This is the number one book. It's terrific. Everyone agrees.
- Formal definition of Big O
  - This is very technical and you do not need to know it. Most developers don't.
- P vs NP problem
  - One of the most famous problems in Mathematics and Computer Science
  - Simply find a single poly-time algorithm for any NP-complete problem:
    - you will literally win 1 million dollars (probably more)
    - you will literally change the world