Array Sequences
Python's built-in sequence classes (list, tuple, str) are array-based, and all of them support indexing.
Low-level comp architecture
Referential Arrays
Copying Arrays
```python
backup = list(primes)
```

This produces a shallow copy: a new list that references the same elements as the first list.

- If the contents of the list are of a mutable type, a deep copy, meaning a new list with new elements, can be produced by using the deepcopy function from the copy module.
- primes.extend(extras) will add references to the elements of extras onto the end of the first list.

Review
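To review the distinction, here is a minimal sketch of shallow versus deep copying (the nested `primes` list is an invented example):

```python
import copy

primes = [[2, 3], [5, 7]]      # nested, mutable elements
backup = list(primes)          # shallow copy: new list, same inner lists
deep = copy.deepcopy(primes)   # deep copy: new list, new inner lists

primes[0].append(11)           # mutate an inner list in place
print(backup[0])               # [2, 3, 11] -- the shallow copy sees the change
print(deep[0])                 # [2, 3]     -- the deep copy is unaffected
```

The outer lists are distinct in both cases; only the deep copy also duplicates the inner elements.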
Using amortization, we can show that performing a sequence of such append operations on a dynamic array is actually quite efficient!
Amortized Analysis
Once the array is full as items are being inserted, we have an overflow, and we double the array's capacity.

With amortization, after we have continually doubled the size at each overflow, an insertion without overflow has a cost of 1, whereas an insertion that triggers an overflow has a cost of n (all n existing elements must be copied over).

Amortized Cost = [(1 + 1 + 1 + ...) + (1 + 2 + 4 + ...)] / n <= [n + 2n] / n = 3 = O(1)
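The claim above can be checked with a toy dynamic array that doubles on overflow and counts how many element copies resizing costs. This is a sketch, not Python's actual list implementation; the `DynamicArray` class and its `copies` counter are invented for illustration:

```python
class DynamicArray:
    """Toy dynamic array that doubles on overflow and counts element copies."""
    def __init__(self):
        self.capacity = 1
        self.n = 0
        self.data = [None] * self.capacity
        self.copies = 0  # total elements copied during resizes

    def append(self, item):
        if self.n == self.capacity:              # overflow: double the capacity
            new = [None] * (2 * self.capacity)
            for i in range(self.n):              # copy every existing element
                new[i] = self.data[i]
                self.copies += 1
            self.data = new
            self.capacity *= 2
        self.data[self.n] = item
        self.n += 1

arr = DynamicArray()
for k in range(1024):
    arr.append(k)
# resize copies: 1 + 2 + 4 + ... + 512 = 1023, which is less than 2n
print(arr.copies)  # 1023
```

Total work for n = 1024 appends is n writes plus fewer than 2n copies, matching the amortized bound of 3 per operation.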
What is a stack?
Stack implementation
```python
# Stack() creates a new empty stack
# push(item) adds an item to the top of the stack
# pop() removes and returns the item on the top
# peek() shows you the top item but does not remove it
# isEmpty() returns a bool
# size() returns the number of items

class Stack(object):
    def __init__(self):
        self.items = []

    def isEmpty(self):
        return self.items == []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()

    def peek(self):
        return self.items[len(self.items) - 1]

    def size(self):
        return len(self.items)

s = Stack()
print(s.isEmpty())  # True
s.push('two')
s.peek()            # 'two'
s.push(True)
s.size()            # 2
s.isEmpty()         # False
s.pop()             # True (the item on top)
```
What are Queues?
Queue Implementation
```python
# Queue() creates a new empty queue
# enqueue(item) adds an item to the rear
# dequeue() removes and returns the item at the front
# isEmpty() returns a bool
# size() returns the number of items

class Queue(object):
    def __init__(self):
        self.items = []

    def isEmpty(self):
        return self.items == []

    def enqueue(self, item):
        self.items.insert(0, item)  # insert at 0 for FIFO

    def dequeue(self):
        return self.items.pop()

    def size(self):
        return len(self.items)

q = Queue()
q.size()     # 0
q.enqueue(1)
q.enqueue(2)
q.dequeue()  # 1
```
What is a deque?
Implement a deque
```python
# Deque() creates a deque
# addFront(item)
# addRear(item)
# removeFront()
# removeRear()
# isEmpty()
# size()

class Deque(object):
    def __init__(self):
        self.items = []

    def isEmpty(self):
        return self.items == []

    # rear is index 0
    def addRear(self, item):
        self.items.insert(0, item)

    # front is the end of the list
    def addFront(self, item):
        self.items.append(item)

    def removeFront(self):
        return self.items.pop()

    def removeRear(self):
        return self.items.pop(0)

    def size(self):
        return len(self.items)

d = Deque()
d.addFront('hello')
d.addRear('world')
d.size()  # 2
print(d.removeFront() + ' ' + d.removeRear())  # 'hello world'
d.size()  # 0
```
What is a singly linked list?
doubly linked list
Implementation of a singly linked list
```python
class Node(object):
    def __init__(self, value):
        self.value = value
        self.nextNode = None

a = Node(1)
b = Node(2)
c = Node(3)

# how to link the nodes?
a.nextNode = b
b.nextNode = c
```
What is a doubly linked list?
Each node keeps a next reference to the node that follows it and a prev reference to the node that precedes it.

Implementation of a Doubly Linked List
```python
class Node(object):
    def __init__(self, value):
        self.value = value
        self.nextNode = None
        self.prevNode = None

a = Node(1)
b = Node(2)
c = Node(3)

a.nextNode = b
b.prevNode = a
b.nextNode = c
c.prevNode = b
```
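The prev links let us walk the list in either direction. A small sketch (the `traverse` helper is invented for illustration; the `Node` class mirrors the one above):

```python
class Node(object):
    def __init__(self, value):
        self.value = value
        self.nextNode = None
        self.prevNode = None

def traverse(start, forward=True):
    """Walk the list from start, collecting values in the chosen direction."""
    values, node = [], start
    while node is not None:
        values.append(node.value)
        node = node.nextNode if forward else node.prevNode
    return values

a, b, c = Node(1), Node(2), Node(3)
a.nextNode, b.prevNode = b, a
b.nextNode, c.prevNode = c, b

print(traverse(a))                 # [1, 2, 3]
print(traverse(c, forward=False))  # [3, 2, 1]
```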
What is recursion?
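Recursion is a function calling itself on a smaller subproblem until it reaches a base case. The classic factorial example makes the two parts visible (a standard illustration, not from the notes above):

```python
def factorial(n):
    """n! computed recursively: the base case stops the recursion."""
    if n <= 1:                       # base case
        return 1
    return n * factorial(n - 1)      # recursive case: a smaller subproblem

print(factorial(5))  # 120
```

Every recursive function must make progress toward its base case, or it will recurse forever.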
Tree Section
What are trees?
Nodes in the tree
Recursive Definition of a tree
```python
# what we are aiming for
myTree = ['a',              # root
          ['b',             # left subtree
           ['d', [], []],
           ['e', [], []]],
          ['c', [], []]]    # right subtree
```
Implementing a Tree using a List of Lists
```python
# define the style of tree with a root and left/right empty child lists
def BinaryTree(r):
    return [r, [], []]

def insertLeft(root, newBranch):
    t = root.pop(1)
    if len(t) > 1:
        # push the existing subtree down as the left child of the new branch
        root.insert(1, [newBranch, t, []])
    else:
        root.insert(1, [newBranch, [], []])
    return root

def insertRight(root, newBranch):
    t = root.pop(2)
    if len(t) > 1:
        # reordered compared to above: existing subtree becomes the right child
        root.insert(2, [newBranch, [], t])
    else:
        root.insert(2, [newBranch, [], []])
    return root

def getRootVal(root):
    return root[0]

def setRootVal(root, newVal):
    root[0] = newVal

def getLeftChild(root):
    return root[1]

def getRightChild(root):
    return root[2]

r = BinaryTree(3)
insertLeft(r, 4)   # [3, [4, [], []], []]
insertLeft(r, 5)   # [3, [5, [4, [], []], []], []]
insertRight(r, 6)  # [3, [5, [4, [], []], []], [6, [], []]]
insertRight(r, 7)  # [3, [5, [4, [], []], []], [7, [], [6, [], []]]]
l = getLeftChild(r)
print(l)           # [5, [4, [], []], []]
setRootVal(l, 9)   # [3, [9, [4, [], []], []], [7, [], [6, [], []]]]
```
```python
class BinaryTree(object):
    def __init__(self, rootObj):
        self.key = rootObj
        self.leftChild = None
        self.rightChild = None

    def insertLeft(self, newNode):
        if self.leftChild == None:
            self.leftChild = BinaryTree(newNode)
        else:
            t = BinaryTree(newNode)
            t.leftChild = self.leftChild
            self.leftChild = t

    def insertRight(self, newNode):
        if self.rightChild == None:
            self.rightChild = BinaryTree(newNode)
        else:
            t = BinaryTree(newNode)
            t.rightChild = self.rightChild
            self.rightChild = t

    # these bring back object references (addresses)
    def getRightChild(self):
        return self.rightChild

    def getLeftChild(self):
        return self.leftChild

    def setRootVal(self, obj):
        self.key = obj

    def getRootVal(self):
        return self.key

r = BinaryTree('a')
r.getRootVal()                 # 'a'
print(r.getLeftChild())        # None
r.insertLeft('b')
r.getLeftChild()               # reference to another BinaryTree
r.getLeftChild().getRootVal()  # 'b'
```
3 Main Methods
How to use "Preorder"
Preorder implementation
Preorder can be written as an external function or as a method of the BinaryTree class.

- Must check for the existence of the left and the right children before making the recursive call to preorder
- In this case, it is probably better to implement it as an external function
- The reason is that you rarely just want to traverse the tree
- In most cases you want to accomplish something else during the traversal

```python
def preorder(tree):
    if tree != None:
        print(tree.getRootVal())
        preorder(tree.getLeftChild())
        preorder(tree.getRightChild())

# implementation as a BinaryTree method
# generally not what you will want to do
def preorder(self):
    print(self.key)
    if self.leftChild:
        self.leftChild.preorder()
    if self.rightChild:
        self.rightChild.preorder()
```
Postorder Implementation
```python
def postorder(tree):
    if tree != None:
        postorder(tree.getLeftChild())
        postorder(tree.getRightChild())
        print(tree.getRootVal())
```
Inorder Implementation
```python
def inorder(tree):
    if tree != None:
        inorder(tree.getLeftChild())
        print(tree.getRootVal())  # print root for proof of concept
        inorder(tree.getRightChild())
```
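To see the three orders side by side, here is a sketch using variants that collect values into a list instead of printing (the list-returning style and the tiny three-node tree are my own choices for testability; the node class mirrors the BinaryTree implementation above):

```python
class BinaryTree(object):
    """Minimal node class mirroring the earlier implementation."""
    def __init__(self, rootObj):
        self.key = rootObj
        self.leftChild = None
        self.rightChild = None
    def getRootVal(self):
        return self.key
    def getLeftChild(self):
        return self.leftChild
    def getRightChild(self):
        return self.rightChild

def preorder(tree):
    if tree is None:
        return []
    return ([tree.getRootVal()]
            + preorder(tree.getLeftChild())
            + preorder(tree.getRightChild()))

def inorder(tree):
    if tree is None:
        return []
    return (inorder(tree.getLeftChild())
            + [tree.getRootVal()]
            + inorder(tree.getRightChild()))

def postorder(tree):
    if tree is None:
        return []
    return (postorder(tree.getLeftChild())
            + postorder(tree.getRightChild())
            + [tree.getRootVal()])

# build:   a
#         / \
#        b   c
root = BinaryTree('a')
root.leftChild = BinaryTree('b')
root.rightChild = BinaryTree('c')

print(preorder(root))   # ['a', 'b', 'c']
print(inorder(root))    # ['b', 'a', 'c']
print(postorder(root))  # ['b', 'c', 'a']
```

The root moves from first (preorder) to middle (inorder) to last (postorder).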
What are Priority Queues?
What are Binary Heaps
Implementing a Binary Heap
For a node at index p, the children live at indices 2p and 2p + 1.

- We can use this to make an efficient implementation of the tree
- Index 0 is set to 0 (unused) so that the parent/child arithmetic works

```python
# BinaryHeap() - create a new heap
# insert(k) - adds a new item to the heap
# findMin() - returns the item with the minimum key value, leaving the item in the heap
# delMin() - returns the item with the minimum key value, removing the item from the heap
#   - requires that we take the last position and set it to the root
#   - restore order by pushing down the new root node
#   - swap the new root with its smallest child recursively
# isEmpty()
# size()
# buildHeap(list) - builds a new heap from a list of keys
#   - we can build a heap in O(n) operations

class BinaryHeap(object):
    def __init__(self):
        self.heapList = [0]
        self.currentSize = 0

    def percUp(self, i):
        while i // 2 > 0:
            if self.heapList[i] < self.heapList[i // 2]:
                tmp = self.heapList[i // 2]
                self.heapList[i // 2] = self.heapList[i]
                self.heapList[i] = tmp
            i = i // 2

    def insert(self, k):
        self.heapList.append(k)
        self.currentSize = self.currentSize + 1
        self.percUp(self.currentSize)

    def percDown(self, i):
        while (i * 2) <= self.currentSize:
            mc = self.minChild(i)
            if self.heapList[i] > self.heapList[mc]:
                tmp = self.heapList[i]
                self.heapList[i] = self.heapList[mc]
                self.heapList[mc] = tmp
            i = mc

    def minChild(self, i):
        if i * 2 + 1 > self.currentSize:
            return i * 2
        else:
            if self.heapList[i * 2] < self.heapList[i * 2 + 1]:
                return i * 2
            else:
                return i * 2 + 1

    def delMin(self):
        retVal = self.heapList[1]
        self.heapList[1] = self.heapList[self.currentSize]
        self.currentSize = self.currentSize - 1
        self.heapList.pop()
        self.percDown(1)
        return retVal

    def buildHeap(self, aList):
        i = len(aList) // 2
        self.currentSize = len(aList)
        self.heapList = [0] + aList[:]
        while i > 0:
            self.percDown(i)
            i = i - 1
```
Generally implemented as an array due to the nature of accessing children.
We do not need to check the other child in the end, as the transitive relation
will ensure that the heap property holds.
delMin above is the procedure for deleting the root from the heap.
map abstract data type
Implementation of Binary Search Trees
The bst property refers just to the direct parent - it determines when something becomes part of the left or right subtree.

```python
# order of insert
data = [70, 31, 93, 94, 14, 23, 73]

# put method - check if the tree already has a root
#   - if not, create a new TreeNode and install it as the root of the tree
#   - if a root is in place, call the private helper function _put
#   - start at the root of the tree, searching the tree and comparing the
#     new key to the key in the current node
#   - if less, search left; if more, search right
# get method is easier since it searches the tree recursively until it gets
# to a non-matching leaf node or finds a matching key
#   - when a matching key is found, the value stored in the payload of the node is returned
# delete is more difficult
#   - if the tree has more than one node, we search using the _get method to find
#     the TreeNode that needs to be removed
#   - if it is a single node, remove the root, but we must check that root key == param key
#   - if we find the node, there are 3 cases to consider:
#     - the node to delete has no children:
#       remove the reference from the parent -> set its [left|right] child to None
#     - the node to delete has a single child:
#       slightly more complex -> promote the child to take the parent's place
#     - the node to delete has both children:
#       find the successor, splice it out, and move it into the deleted node's place

class BinarySearchTree:
    def __init__(self):
        self.root = None
        self.size = 0

    def length(self):
        return self.size

    # allows us to call len(<BST object>)
    def __len__(self):
        return self.size

    def __iter__(self):
        return self.root.__iter__()

    def put(self, key, val):
        if self.root:
            self._put(key, val, self.root)
        else:
            self.root = TreeNode(key, val)
        self.size = self.size + 1

    def __setitem__(self, k, v):
        self.put(k, v)

    # _ prefix marks this as a private helper
    def _put(self, key, val, currentNode):
        if key < currentNode.key:
            if currentNode.hasLeftChild():
                self._put(key, val, currentNode.leftChild)
            else:
                currentNode.leftChild = TreeNode(key, val, parent=currentNode)
        else:
            if currentNode.hasRightChild():
                self._put(key, val, currentNode.rightChild)
            else:
                currentNode.rightChild = TreeNode(key, val, parent=currentNode)

    def get(self, key):
        if self.root:
            res = self._get(key, self.root)
            if res:
                return res.payload
            else:
                return None
        else:
            return None

    def _get(self, key, currentNode):
        if not currentNode:
            return None
        elif currentNode.key == key:
            return currentNode
        elif key < currentNode.key:
            return self._get(key, currentNode.leftChild)
        else:
            return self._get(key, currentNode.rightChild)

    def __getitem__(self, key):
        return self.get(key)

    def __contains__(self, key):
        if self._get(key, self.root):
            return True
        else:
            return False

    def delete(self, key):
        if self.size > 1:
            nodeToRemove = self._get(key, self.root)
            if nodeToRemove:
                self.remove(nodeToRemove)
                self.size = self.size - 1
            else:
                raise KeyError('Error, key not in tree')
        elif self.size == 1 and self.root.key == key:
            self.root = None
            self.size = self.size - 1
        else:
            raise KeyError('Error, key not in tree')

    def __delitem__(self, key):
        self.delete(key)

    def remove(self, currentNode):
        if currentNode.isLeaf():  # leaf
            if currentNode == currentNode.parent.leftChild:
                currentNode.parent.leftChild = None
            else:
                currentNode.parent.rightChild = None
        elif currentNode.hasBothChildren():  # interior
            succ = currentNode.findSuccessor()
            succ.spliceOut()
            currentNode.key = succ.key
            currentNode.payload = succ.payload
        else:  # this node has one child
            if currentNode.hasLeftChild():
                if currentNode.isLeftChild():
                    currentNode.leftChild.parent = currentNode.parent
                    currentNode.parent.leftChild = currentNode.leftChild
                elif currentNode.isRightChild():
                    currentNode.leftChild.parent = currentNode.parent
                    currentNode.parent.rightChild = currentNode.leftChild
                else:
                    currentNode.replaceNodeData(currentNode.leftChild.key,
                                                currentNode.leftChild.payload,
                                                currentNode.leftChild.leftChild,
                                                currentNode.leftChild.rightChild)
            else:
                if currentNode.isLeftChild():
                    currentNode.rightChild.parent = currentNode.parent
                    currentNode.parent.leftChild = currentNode.rightChild
                elif currentNode.isRightChild():
                    currentNode.rightChild.parent = currentNode.parent
                    currentNode.parent.rightChild = currentNode.rightChild
                else:
                    currentNode.replaceNodeData(currentNode.rightChild.key,
                                                currentNode.rightChild.payload,
                                                currentNode.rightChild.leftChild,
                                                currentNode.rightChild.rightChild)


class TreeNode:
    def __init__(self, key, val, left=None, right=None, parent=None):
        self.key = key
        self.payload = val
        self.leftChild = left
        self.rightChild = right
        self.parent = parent

    def hasLeftChild(self):
        return self.leftChild

    def hasRightChild(self):
        return self.rightChild

    def isLeftChild(self):
        return self.parent and self.parent.leftChild == self

    def isRightChild(self):
        return self.parent and self.parent.rightChild == self

    def isRoot(self):
        return not self.parent

    def isLeaf(self):
        return not (self.rightChild or self.leftChild)

    def hasAnyChildren(self):
        return self.rightChild or self.leftChild

    def hasBothChildren(self):
        return self.rightChild and self.leftChild

    def spliceOut(self):
        if self.isLeaf():
            if self.isLeftChild():
                self.parent.leftChild = None
            else:
                self.parent.rightChild = None
        elif self.hasAnyChildren():
            if self.hasLeftChild():
                if self.isLeftChild():
                    self.parent.leftChild = self.leftChild
                else:
                    self.parent.rightChild = self.leftChild
                self.leftChild.parent = self.parent
            else:
                if self.isLeftChild():
                    self.parent.leftChild = self.rightChild
                else:
                    self.parent.rightChild = self.rightChild
                self.rightChild.parent = self.parent

    def findSuccessor(self):
        succ = None
        if self.hasRightChild():
            succ = self.rightChild.findMin()
        else:
            if self.parent:
                if self.isLeftChild():
                    succ = self.parent
                else:
                    self.parent.rightChild = None
                    succ = self.parent.findSuccessor()
                    self.parent.rightChild = self
        return succ

    def findMin(self):
        current = self
        while current.hasLeftChild():
            current = current.leftChild
        return current

    def replaceNodeData(self, key, value, lc, rc):
        self.key = key
        self.payload = value
        self.leftChild = lc
        self.rightChild = rc
        if self.hasLeftChild():
            self.leftChild.parent = self
        if self.hasRightChild():
            self.rightChild.parent = self
```
Notes:
Given a binary tree, check whether it's a binary search tree or not.
```python
# Solution 1
tree_vals = []

# an inorder traversal of a BST should yield sorted order
def inorder(tree):
    if tree != None:
        inorder(tree.getLeftChild())
        tree_vals.append(tree.getRootVal())
        inorder(tree.getRightChild())

def sort_check(tree_vals):
    return tree_vals == sorted(tree_vals)

inorder(tree)
sort_check(tree_vals)

# Solution 2
# class Node
# def treeMax - if not node, return -inf
# def treeMin - if not node, return inf
# def verify - if not node, return True
#   - if treeMax(node.left) <= node.val <= treeMin(node.right)
#     and verify(node.left) and verify(node.right), return True
#   - else return False
```
Note: In Python, we can use `in` to check if an element is in a list.

```python
15 in [5, 4, 15]  # True
```
How does this work? What is the best way to search?
Basic searching technique: sequentially go through a data set and compare as you go along.
Eg. traversing an unordered list for 50 and comparing as you go.
If 50 was not present, we still had to check every element in the array.
But what if it was ordered?
If the array was sorted, then we only have to search until we get a match or find something greater than our search target.
Ordered vs Unordered sequential search
For an unsuccessful search, the unordered list requires n comparisons, whereas the ordered list requires n/2 on average.
Unordered List
```python
def seqSearch(arr, el):
    pos = 0
    found = False
    while pos < len(arr) and not found:
        if arr[pos] == el:
            found = True
        else:
            pos += 1
    return found
```
Ordered List
```python
def orderedSeqSearch(arr, el):
    pos = 0
    found = False
    stopped = False
    while pos < len(arr) and not found and not stopped:
        if arr[pos] == el:
            found = True
        else:
            if arr[pos] > el:
                stopped = True  # passed where el would be: stop early
            else:
                pos += 1
    return found
```
If the list is ordered, we can do a binary search!
The search starts from the middle item.
If the target is greater than the middle item, the entire lower half of the list can be ignored.
Then, we can repeat the process with the upper half.
So it uses Divide and Conquer
- divide into smaller pieces, solve the smaller pieces, and then repeat.
```
Comparisons
1: n/2 comparisons left
2: n/4 comparisons left
3: n/8 comparisons left
...
i: n/2^i comparisons left
```
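Solving n/2^i = 1 for i gives at most about log2(n) + 1 comparisons in the worst case. A quick sketch (the `max_comparisons` helper is invented for illustration):

```python
import math

def max_comparisons(n):
    """Worst-case comparisons for binary search on n sorted items."""
    return int(math.log2(n)) + 1 if n > 0 else 0

print(max_comparisons(8))     # 4
print(max_comparisons(1000))  # 10, since 2**10 = 1024 >= 1000
```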
2 Versions - iterative and recursive
Iterative
```python
def binarySearch(arr, el):
    first = 0
    last = len(arr) - 1
    found = False
    while first <= last and not found:
        mid = (first + last) // 2
        if arr[mid] == el:
            found = True
        else:
            if el < arr[mid]:
                last = mid - 1
            else:
                first = mid + 1
    return found

arr = [1, 2, 3, 4]     # must be sorted
binarySearch(arr, 4)   # True
binarySearch(arr, 13)  # False
```
Recursive
Remember: Must always have a base case!
```python
def recBinSearch(arr, el):
    if len(arr) == 0:
        return False
    else:
        mid = len(arr) // 2
        if arr[mid] == el:
            return True
        else:
            if el < arr[mid]:
                return recBinSearch(arr[:mid], el)
            else:
                return recBinSearch(arr[mid + 1:], el)
```
Hashing
We've seen how we can improve search by knowing about structures beforehand.
We can build a data structure that can be accessed in O(1) time - this is hashing!
A hash table
is a collection of items that are stored in such a way that it becomes easy to find them later.
Each position in the table, called a slot, can hold an item and is named by an integer value starting at 0.
The mapping between an item and the slot where that item belongs in the hash table is called the hash function
One hash function
we can use is the remainder method. When presented with an item, the hash function takes the remainder of the item divided by the table size; this is then its slot number.
Example:
h(item)=item%11
Let's see the results!
| Item | Hash Value |
|------|------------|
| 54   | 10         |
| 26   | 4          |
| 93   | 5          |
| 17   | 6          |
| 77   | 0          |
| 31   | 9          |
We have now occupied 6 of the 11 slots.
This ratio is called the load factor
and is commonly denoted by lambda = number of items / table size.
Here, lambda = 6/11.
Our Hash Table has now been loaded.
When we want to search, we just need to use the hash function to compute the slot name for the item and then check the hash table to see if it is present.
Therefore the operation is O(1), since a constant amount of time is required to compute the hash value and then index the hash table at that location.
What if we have two items that have the same location? This is known as a collision
(sometimes a clash
).
We can talk about this resolution soon.
Hash Function
A hash func that maps each item into a unique slot is referred to as a perfect hash function
.
Our goal is to create a hash func that minimizes collisions, so it is easy to compute and evenly distributes the items in the hash table.
Folding Method
This method for constructing hash functions begins by dividing the item into equal-size pieces (last piece may not be of equal size).
These pieces are then added together to give the resulting hash value.
Example:
Given the number 436-555-4601
- we can divide these digits into groups of two (43, 65, 55, 46, 01).
After adding the numbers, we get 210.
If we assume our hash table has 11 slots, we need to perform the extra step of dividing by 11 and keeping its remainder.
210 % 11 is 1, so the phone number 436-555-4601 hashes to slot 1.
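The folding steps above can be sketched directly (the `fold_hash` helper and its parameter names are invented for illustration; the group size and table size are the values from the example):

```python
def fold_hash(phone, table_size=11, group=2):
    """Folding method: split the digits into equal-size groups,
    sum the groups, then take the remainder by the table size."""
    digits = phone.replace('-', '')
    pieces = [int(digits[i:i + group]) for i in range(0, len(digits), group)]
    return sum(pieces) % table_size

print(fold_hash('436-555-4601'))  # 1, since 43+65+55+46+01 = 210 and 210 % 11 = 1
```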
Mid-square method
We first square the item, and then extract some portion of the resulting digits.
For example, if the item were 44, we compute 44^2 = 1936.
By extracting the middle two digits, 93, and performing the remainder step, we get 93%11 = 5
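As a sketch (the `mid_square_hash` helper is invented for illustration; how the "middle" digits are chosen varies, so this just takes the two center digits of the squared value):

```python
def mid_square_hash(item, table_size=11):
    """Mid-square method: square the item, extract the middle two digits,
    then take the remainder by the table size."""
    squared = str(item * item)
    mid = len(squared) // 2
    middle = int(squared[mid - 1:mid + 1])  # two middle digits
    return middle % table_size

print(mid_square_hash(44))  # 5, since 44**2 = 1936 -> middle digits 93 -> 93 % 11 = 5
```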
Non-Integer Elements

We can also create hash funcs for character-based items.
The word cat
can be thought of as a sequence of ordinal values.
If you use Python, you can just use the function ord('a')
and then get the values.
So ord('a')
= 97, and 97 % 11 = 9.
For cat, you can sum up all the ordinal values, get 312 % 11 = 4.
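A sketch of this ordinal-sum hash (the `string_hash` helper is invented for illustration; note that anagrams such as "cat" and "act" collide under this scheme, which weighting each character by position would avoid):

```python
def string_hash(s, table_size=11):
    """Sum the ordinal values of the characters, then take the remainder."""
    return sum(ord(ch) for ch in s) % table_size

print(string_hash('cat'))  # 4, since 99 + 97 + 116 = 312 and 312 % 11 = 4
```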
Collision Resolution
One method for resolving a collision is to look into the hash table and find another open slot to hold the item that caused the collision. This is known as Open Addressing
.
By systematically visiting every spot, we are doing something known as linear probing
.
For example, 77, 44, and 55 all hash to slot 0; 44 and 55 are then moved forward until each finds an open slot.
One way to deal with clustering is to skip slots, thereby more evenly distributing the items that have caused collisions.
rehashing
is the general process of looking for another slot.
quadratic probing
is a variation of the linear probing idea.
Instead of using a constant "skip" value, we use a rehash function that increments the hash value by 1, 3, 5, 7, 9 and so on.
Starting from an initial hash value h
, the successive values are h+1
, h+4
, h+9
, h+16
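The probe sequence can be sketched as follows (the `quadratic_probes` helper is invented for illustration; the offsets are the perfect squares, wrapped by the table size):

```python
def quadratic_probes(h, table_size, count=5):
    """Successive slots visited under quadratic probing: h, h+1, h+4, h+9, ..."""
    return [(h + i * i) % table_size for i in range(count)]

print(quadratic_probes(0, 11))  # [0, 1, 4, 9, 5]
```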
Alternative Option
We can also allow each slot to hold a reference to a collection (or chain) of items.
Chaining
allows many items to exist at the same location in the hash table.
When collisions happen, the item is still placed in the proper slot of the hash table.
As more items hash to the same location, the difficulty for finding the item also increases.
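A minimal sketch of chaining (the `ChainedHashTable` class and its method names are invented for illustration; each slot holds a list of key-value pairs rather than a single item):

```python
class ChainedHashTable:
    """Collision resolution by chaining: each slot holds a list of (key, value) pairs."""
    def __init__(self, size=11):
        self.size = size
        self.slots = [[] for _ in range(size)]

    def put(self, key, value):
        chain = self.slots[key % self.size]
        for i, (k, _) in enumerate(chain):
            if k == key:               # replace an existing key in place
                chain[i] = (key, value)
                return
        chain.append((key, value))     # otherwise extend the chain

    def get(self, key):
        # search only the one chain the key hashes to
        for k, v in self.slots[key % self.size]:
            if k == key:
                return v
        return None

t = ChainedHashTable()
t.put(54, 'a')
t.put(21, 'b')  # 54 % 11 == 21 % 11 == 10, so both land in the same slot
print(t.get(54), t.get(21))  # a b
```

Lookup cost grows with the length of the chain, which is why a low load factor keeps chaining fast.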
Map
The idea of a dictionary used as a hash table to store and retrieve items using keys is often referred to as a map.
```python
# HashTable() - Create a new, empty map. It returns an empty map collection.
# put(key, val) - Add a new key-value pair to the map. If the key is already
#   in the map then replace the old value with the new value
# get(key) - Given a key, return the value stored in the map or None
# del - Delete the key-value pair from the map using a statement of the form del map[key]
# len() - Return the number of key-value pairs stored in the map
# in - Return True for a statement of the form `key in map` if the given key
#   is in the map, False otherwise
#
# Note - you can take this and play around with other hash functions

class HashTable(object):
    def __init__(self, size):
        self.size = size
        self.slots = [None] * self.size  # parallel lists, initially empty
        self.data = [None] * self.size

    def put(self, key, data):
        hashvalue = self.hashfunction(key, len(self.slots))
        if self.slots[hashvalue] == None:
            self.slots[hashvalue] = key
            self.data[hashvalue] = data
        else:
            if self.slots[hashvalue] == key:
                self.data[hashvalue] = data  # replace
            else:
                # linear probing to find the next open (or matching) slot
                nextslot = self.rehash(hashvalue, len(self.slots))
                while self.slots[nextslot] != None and self.slots[nextslot] != key:
                    nextslot = self.rehash(nextslot, len(self.slots))
                if self.slots[nextslot] == None:
                    self.slots[nextslot] = key
                    self.data[nextslot] = data
                else:
                    self.data[nextslot] = data  # replace

    def hashfunction(self, key, size):
        return key % size

    def rehash(self, oldhash, size):
        return (oldhash + 1) % size

    def get(self, key):
        startslot = self.hashfunction(key, len(self.slots))
        data = None
        stop = False
        found = False
        position = startslot
        while self.slots[position] != None and not found and not stop:
            if self.slots[position] == key:
                found = True
                data = self.data[position]
            else:
                position = self.rehash(position, len(self.slots))
                if position == startslot:
                    stop = True
        return data

    def __getitem__(self, key):
        return self.get(key)

    def __setitem__(self, key, data):
        self.put(key, data)

h = HashTable(5)
h[1] = 'one'
h[2] = 'two'
h[3] = 'three'
h[1]  # 'one'
h[2]  # 'two'
```
We've discussed how to search for items, but now we will look at how to sort!
Explanations and Implementations:
Common interview questions consist of being asked to implement a sorting algorithm.
The best way to understand algorithms is to understand two things:
Examples for seeing them:
Sorting Algorithms website
Visualgo
The bubble sort makes multiple passes through a list.

In the visualization of the sort, imagine comparing neighbouring items to see if they are out of order.
If so, you swap and repeat (an exchange); if not (no exchange), move on to compare the next pair.
There can be multiple passes.
```python
def bubbleSort(arr):
    for n in range(len(arr) - 1, 0, -1):
        for k in range(n):
            if arr[k] > arr[k + 1]:
                temp = arr[k]
                arr[k] = arr[k + 1]
                arr[k + 1] = temp

arr = [5, 3, 7, 2]
bubbleSort(arr)
arr  # [2, 3, 5, 7]
```
Selection sort improves on the bubble sort by making only one exchange for every pass through the list.
It looks for the largest value as it makes a pass and, after completing the pass, places it in the proper location.
The process continues and requires n-1 passes to sort n items, since the final item is already in place after the (n-1)st pass.
Example: [10, 4, 3, 2]
We notice 10 is the largest, so we swap last place with it. Then 4, so second last place. Repeat.
Resources:
Implementation of the Selection Sort
```python
def selectionSort(arr):
    for fillslot in range(len(arr) - 1, 0, -1):
        positionOfMax = 0
        for location in range(1, fillslot + 1):
            if arr[location] > arr[positionOfMax]:
                positionOfMax = location
        # one exchange per pass: put the max in its final slot
        temp = arr[fillslot]
        arr[fillslot] = arr[positionOfMax]
        arr[positionOfMax] = temp

arr = [10, 4, 3, 2]
selectionSort(arr)  # [2, 3, 4, 10]
```
Insertion Sort

Always maintains a sorted sublist in the lower positions of the list. Each new item is inserted back into the previous sublist such that the sorted sublist is one item larger.

- Begin by assuming the list with one item (position 0) is already sorted.
- On each pass, one for each item 1 through n-1, the current item is checked against those in the already sorted sublist.
- As we look back into the already sorted sublist, we shift those items that are greater to the right.
- When we reach a smaller item or the end of the sublist, the current item can be inserted.

Insertion sort builds the final sorted array one item at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. It runs in O(n^2), but in the best case (an already sorted list) each pass costs only one comparison. Shifting also requires roughly a third of the processing work of a full exchange.

**Implementation of the Insertion Sort**

```python
def insertionSort(arr):
    for i in range(1, len(arr)):
        currentValue = arr[i]
        position = i
        while position > 0 and arr[position - 1] > currentValue:
            arr[position] = arr[position - 1]
            position = position - 1
        arr[position] = currentValue

arr = [10, 4, 3, 2]
insertionSort(arr)  # [2, 3, 4, 10]
```
Shell sort improves on the insertion sort by breaking the original list into a number of smaller sublists.
The unique way that these sublists are chosen is the key to the shell sort.
Instead of breaking the list into contiguous sublists, shell sort uses an increment i
to create a sublist by choosing all items that are i
items apart.
If we have 9 items and an increment of 3, the items at positions n, n+3, and n+6 form each sublist.
After completing these sublists, we've moved these closer to where they belong.
Then we do the final sort with an increment of one (in this case, just a standard insertion sort).
Implementation of Shell Sort
```python
def shellSort(arr):
    sublistCount = len(arr) // 2
    # while we still have sublists
    while sublistCount > 0:
        for start in range(sublistCount):
            # use a gap insertion sort
            gapInsertionSort(arr, start, sublistCount)
        print('After increments of size:', sublistCount)
        print('Current array:', arr)
        sublistCount = sublistCount // 2

def gapInsertionSort(arr, start, gap):
    for i in range(start + gap, len(arr), gap):
        currentValue = arr[i]
        position = i
        # using the gap
        while position >= gap and arr[position - gap] > currentValue:
            arr[position] = arr[position - gap]
            position = position - gap
        # set current value
        arr[position] = currentValue

arr = [10, 4, 3, 2, 12, 35]
shellSort(arr)  # [2, 3, 4, 10, 12, 35]
```
Merge sort is a recursive algorithm that continually splits a list in half, following a divide and conquer strategy.
If the list is empty or has one item, it is sorted by definition (the base case).
If the list has more than one item, we split the list and recursively invoke a merge sort on both halves.
Once the two halves are sorted, a merge is performed: split first, and at the end, recursively merge!

Implementation of a Merge Sort
```python
def mergesort(arr):
    if len(arr) > 1:
        mid = len(arr) // 2
        lefthalf = arr[:mid]
        righthalf = arr[mid:]
        mergesort(lefthalf)
        mergesort(righthalf)

        # merge the two sorted halves back into arr
        i = 0
        j = 0
        k = 0
        while i < len(lefthalf) and j < len(righthalf):
            if lefthalf[i] < righthalf[j]:
                arr[k] = lefthalf[i]
                i += 1
            else:
                arr[k] = righthalf[j]
                j += 1
            k += 1
        while i < len(lefthalf):
            arr[k] = lefthalf[i]
            i += 1
            k += 1
        while j < len(righthalf):
            arr[k] = righthalf[j]
            j += 1
            k += 1
        print('Merging', arr)

arr = [11, 2, 5, 4, 7, 56, 2, 12, 23]
mergesort(arr)  # [2, 2, 4, 5, 7, 11, 12, 23, 56]
```
The quick sort first selects a pivot value, and there will be a left mark and a right mark.

- After choosing the pivot value, the partition process happens next
- It will find the split point and at the same time move other items to the appropriate side of the list, either less than or greater than the pivot value
- The split point will be used to divide the list for subsequent calls to the quick sort
- The left mark advances and stops when it finds an item greater than the pivot value; the right mark moves back and stops when it finds an item less than the pivot value
- When the right mark becomes less than the left mark on the array, they have crossed, and we call this position the split point
- Once crossed, we swap the item at the right mark with the pivot value
Implementation of the Quick Sort
You can choose different pivot values, but this implementation will choose the first item in the list.
```python
def quickSort(arr):
    quickSortHelper(arr, 0, len(arr) - 1)

# quickSort recursively calls the helper
def quickSortHelper(arr, first, last):
    if first < last:
        splitPoint = partition(arr, first, last)
        quickSortHelper(arr, first, splitPoint - 1)
        quickSortHelper(arr, splitPoint + 1, last)

def partition(arr, first, last):
    pivotValue = arr[first]
    leftmark = first + 1
    rightmark = last
    done = False
    while not done:
        while leftmark <= rightmark and arr[leftmark] <= pivotValue:
            leftmark += 1
        while arr[rightmark] >= pivotValue and rightmark >= leftmark:
            rightmark -= 1
        if rightmark < leftmark:
            done = True  # the marks have crossed
        else:
            temp = arr[leftmark]
            arr[leftmark] = arr[rightmark]
            arr[rightmark] = temp
    # put the pivot at the split point
    temp = arr[first]
    arr[first] = arr[rightmark]
    arr[rightmark] = temp
    return rightmark
```
A graph with no cycles is called acyclic
- we will see that we can solve several important problems if the problem can be represented as a directed acyclic graph (DAG)
Adjacency Matrix
adjacent
Adjacency List
Implementation of a Graph as an Adjacency List
```python
# Vertex() - create a new vertex with an id and a record of what it is connected to
# addNeighbour() - record a neighbour that this vertex is connected to
# getWeight() - returns the weight of the edge from this vertex to the given neighbour

class Vertex:
    def __init__(self, key):
        self.id = key
        self.connectedTo = {}

    def addNeighbour(self, nbr, weight=0):
        self.connectedTo[nbr] = weight

    def getConnections(self):
        return self.connectedTo.keys()

    def getId(self):
        return self.id

    def getWeight(self, nbr):
        return self.connectedTo[nbr]

    def __str__(self):
        return str(self.id) + ' connected to: ' + str([x.id for x in self.connectedTo])

# Graph() - create a new, empty graph
# addVertex(vert) - create a new instance of a vertex
# addEdge(fromVert, toVert, weight)
# addEdge(fromVert, toVert) - without weight
# getVertex(vertKey) - return the vertex
# getVertices() - return a list of all vertices
# in - returns True for a statement of the form `vertex in graph` if the given
#   vertex is in the graph, False otherwise

class Graph:
    def __init__(self):
        # a dict, but modelled after an adjacency list
        self.vertList = {}
        self.numVertices = 0

    def addVertex(self, key):
        self.numVertices = self.numVertices + 1
        newVertex = Vertex(key)
        self.vertList[key] = newVertex
        return newVertex

    def getVertex(self, n):
        if n in self.vertList:
            return self.vertList[n]
        else:
            return None

    # f: from, t: to, cost: weight
    def addEdge(self, f, t, cost=0):
        if f not in self.vertList:
            nv = self.addVertex(f)
        if t not in self.vertList:
            nv = self.addVertex(t)
        self.vertList[f].addNeighbour(self.vertList[t], cost)

    def getVertices(self):
        return self.vertList.keys()

    def __iter__(self):
        return iter(self.vertList.values())

    def __contains__(self, n):
        return n in self.vertList

g = Graph()
for i in range(6):
    g.addVertex(i)
g.vertList  # dict of vertices: {0: <Vertex object>, 1: ...}
g.addEdge(0, 1, 2)
for vertex in g:
    print(vertex)
    print(vertex.getConnections())
    print('\n')
# prints each vertex; the edge from 0 to 1 shows in vertex 0's connections
```
Given a graph G and a starting vertex s, BFS explores all vertices at distance k from s before finding any that are at distance k+1 from s.
Algorithm for exploration if vertex is unexplored:
- the new, unexplored vertex nbr is coloured gray
- the predecessor of nbr is set to the current node currentVert
- the distance to nbr is set to the distance to currentVert + 1
- nbr is added to the end of a queue
  - adding nbr to the end of the queue effectively schedules this node for further exploration, but not until all the other vertices on the adjacency list of currentVert have been explored

```python
def bfs(g, start):
    # assumes the Vertex class also implements setDistance, setPred,
    # setColor, getColor, getDistance and getConnections
    start.setDistance(0)
    start.setPred(None)
    vertQueue = Queue()
    vertQueue.enqueue(start)
    while vertQueue.size() > 0:
        currentVert = vertQueue.dequeue()
        for nbr in currentVert.getConnections():
            if nbr.getColor() == 'white':
                nbr.setColor('gray')
                nbr.setDistance(currentVert.getDistance() + 1)
                nbr.setPred(currentVert)
                vertQueue.enqueue(nbr)
        currentVert.setColor('black')
```
Knight's Tour Problem
On a chess board, how can the knight move? And from there, making more moves, how can we move?
Again, an adjacency matrix would be sparse - so we certainly want an adjacency list.
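Building that adjacency list starts from the knight's legal moves. A sketch of a move generator (the `knight_moves` helper, its offsets list, and the coordinate convention are my own choices for illustration):

```python
def knight_moves(row, col, board_size=8):
    """Legal knight destinations from (row, col) on a board_size x board_size board."""
    offsets = [(-2, -1), (-2, 1), (-1, -2), (-1, 2),
               (1, -2), (1, 2), (2, -1), (2, 1)]
    # keep only the destinations that stay on the board
    return [(row + dr, col + dc) for dr, dc in offsets
            if 0 <= row + dr < board_size and 0 <= col + dc < board_size]

print(len(knight_moves(0, 0)))  # 2: a corner knight has only two moves
print(len(knight_moves(4, 4)))  # 8: a central knight has all eight
```

Each square becomes a vertex, with an edge to every square returned by the move generator; the sparsity is visible in the corner squares having only two neighbours.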
DFS will explore each node as deeply as possible. That being said, the nodes can be visited more than once.
It will then return as far back as it can to find the next legal move.
The knightTour function takes four parameters:

- n - the current depth of the search tree
- path - a list of vertices visited up to this point
- u - the vertex in the graph we wish to explore
- limit - the number of nodes in the path

The function itself is also recursive. It keeps a list of candidate neighbouring vertices to try next, backtracking when a branch is exhausted.
```python
def knightTour(n, path, u, limit):
    u.setColor('gray')
    path.append(u)
    if n < limit:
        nbrList = list(u.getConnections())
        i = 0
        done = False
        while i < len(nbrList) and not done:
            if nbrList[i].getColor() == 'white':
                done = knightTour(n + 1, path, nbrList[i], limit)
            i = i + 1
        if not done:  # prepare to backtrack - haven't reached the limit
            path.pop()
            u.setColor('white')
    else:
        done = True
    return done
```
DFS Overview
DFS
```python
class DFSGraph(Graph):
    def __init__(self):
        super().__init__()
        self.time = 0

    def dfs(self):
        for aVertex in self:
            aVertex.setColor('white')
            aVertex.setPred(-1)
        for aVertex in self:
            if aVertex.getColor() == 'white':
                self.dfsvisit(aVertex)

    # dfsvisit uses a stack (implicitly, via recursion)
    def dfsvisit(self, startVertex):
        startVertex.setColor('gray')
        self.time += 1
        startVertex.setDiscovery(self.time)
        for nextVertex in startVertex.getConnections():
            if nextVertex.getColor() == 'white':
                nextVertex.setPred(startVertex)
                self.dfsvisit(nextVertex)
        startVertex.setColor('black')
        self.time += 1
        startVertex.setFinish(self.time)
```
The discovery and finish times exhibit the parenthesis property: the children of a node have later discovery times and earlier finish times than their parent, so the intervals nest like matched parentheses.