Efficiency - Big O Notation

Nov. 2, 2020 pexels-pixabay-35888.jpg Vuong Huynh

Efficiency is also called complexity.

Think about space and time: How long the code run and how much storage space needed.

Efficiency can come off with a tradeoff between time efficiency and space efficiency.


Notation

O(n)

Big O notation includes big O and parenthesis () includes algebraic expression with valuable n in it.

The only exception is O(1), it is in fact O(0n + 1)

An example with cipher code with 3 letters to map to 3 other letters : (such as A->Y, B->E, and C->S)

We can try this pseudocode as:

function decode(input):
    create output string

    for each letter in input:
        get new_letter from letters location in cipher
        add new_letter to output
    return output

To calculate time efficiency, we start counting the number of lines.

create output string

return output

These two lines will need to run once and we have O(... + 2)

get new_letter from letters location in cipher
add new_letter to output

The two lines in for loop have to run every letter in the input, so we can add 2n into our big O since 2 is two lines of code and n is the number of letters in the input.

O(2n + 2)

If we have 10 letters, n = 10, we will have O(2*10 + 2)

If we have 1 million letters, n = 1000000, we will have O(2*1000000 + 2)

To get the actual efficiency, we can multiply (2*10 + 2) or (2*1000000 + 2) times the amount of time it takes to run one line of code.


Approximation and Worst Case

With the above example, we have some different cases such as O(2n + 2) or O((26+2)n + 2).

We can use approximation and say the efficiency is O(n)

" Some numbers of computation must be performed for EACH letter in the input"

The worst-case scenario from the above example is where we have to check all 26 letters every single time.

But we also have the best case when we find the result in the first try in 26 letters.

So the average case efficiency is 13 tries and we will have O((13+2)n + 2) or O(15n + 2)

Below are some examples of functions in Python. Look at each and take note of the time efficiency.

"""input manatees: a list of "manatees", where one manatee is represented by a dictionary
a single manatee has properties like "name", "age", et cetera
n = the number of elements in "manatees"
m = the number of properties per "manatee" (i.e. the number of keys in a manatee dictionary)"""

def example1(manatees):
    for manatee in manatees:
        print manatee['name']

def example2(manatees):
    print manatees[0]['name']
    print manatees[0]['age']

def example3(manatees):
    for manatee in manatees:
        for manatee_property in manatee:
            print manatee_property, ": ", manatee[manatee_property]

def example4(manatees):
    oldest_manatee = "No manatees here!"
    for manatee1 in manatees:
        for manatee2 in manatees:
            if manatee1['age'] < manatee2['age']:
                oldest_manatee = manatee2['name']
            else:
                oldest_manatee = manatee1['name']
    print oldest_manatee

Example 1

We iterate over every manatee in the manatees list with the for loop. Since we're given that manatees has n elements, our code will take approximately O(n) time to run.

Example 2

We look at two specific properties of a specific manatee. We aren't iterating over anything - just doing constant-time lookups on lists and dictionaries. Thus the code will complete in constant, or O(1), time.

Example 3

There are two for loops, and nested for loops are a good sign that you need to multiply two runtimes. Here, for every manatee, we check every property. If we had 4 manatees, each with 5 properties, then we would need 5+5+5+5 steps. This logic simplifies to the number of manatees times the number of properties, or O(mn).

Example 4

Again we have nested for loops. This time we're iterating over the manatees list twice - every time we see a manatee, we compare it to every other manatee's age. We end up with O(nn), or O(n2) (which is read as "n squared").

Throughout the course, you can reference the Big-O Cheat Sheet to keep track of time complexities for many of the algorithms and data structures we study.