Anda di halaman 1dari 1

Data Structures and Algorithms II – Exam 2 • O(n +K) when do we hit n? when do we hit k?

Chapt 5 – Randomized Algos


• Why do we need randomized algorithms? *When do we use them?
○ Why isn’t it O(n) and O(k)?
○ Randomization helps us avoid worst case running times where the
• Be able to analyze the following:
○ Counting sort
data may arrive or be processed in an order that is not
advantageous. Since we have no idea of the distribution of data • Stable sorting. How does counting sort give a stable sort?
when it arrives, we have no control over the order. However, by • Radix sort
randomizing the data as a course of processing it, we nearly • Bucket sort
always avoid the worst case (and the best case  ) and tend  When do you hit worst case?
towards an average case.

Logb an = nlogba
• Linearity of expectation-the expectation of the sum of two random
variables is the sum of their expectations, that is:
E[X+Y] = E[X] + E[Y]
Whenever E[X] and Y[X] are defined. It is the key property that
enables us to perform probabilistic analysis.
Chapt 6 – Heapsort
• What are the properties of a heap/max-heap (think in terms of d-ary,
not just binary)?
○ We can represent a d-ary heap in a 1-dimensional array as
follows. The root is in A[1], its d children reside in order in A[2]
through A[d +1], their children reside in order in A[d +2] through
A[d2 + d +1], and so on. The following two procedures map a
node with index i to its parents and to its jth child (for 1 <= j <=
d)
○ How do you find the parent of a child?
D-ary Parent (i)= floor[ (i-2 ) / d +1 ]
Binary parent(i)=floor[ i/2]
○ How do you find the child of a parent?
D-ary child (i , j)= d( i-1 ) + j + 1
Binary left child(i)=2i and right child(i)= 2i+1
Remember: d-ary-parent(d-ary-child( i, j )) = i , for any 1<=j<=d.
○ What is the height of a heap?
Since each node has d children, the height of a d-ary heap with n
nodes is theta(logd n) = theta(lg n / lg d)
 In the binary case: height of a node in a heap is the number of
edges on the longest simple downward path from the node to a
lead; the height of the heap is the height of its root. Since a heap
of n elements is based on a complete binary tree, its height is
theta(lg n)
○ How is a heap stored in an array?
• Understand when to use:
○ Heapsort: runs in O(n lgn), sorts an array in place. Calls build-
max-heap to create a max-heap, then runs through the heap to
order the elements in the array. Since the sort may move a new
key to the root that violates the max-heap property, it also runs
max-heapify.
○ Build-Max-Heap: runs in linear time, produces a max-heap from
an unordered input array. Implements max-heapify to order
elements in the heap.
○ Max-Heapify: runs in O(lg n), key to maintaining the max-heap
property.
• How do you implement a heap in a priority queue? – page 166
○ Priority queues rely on the following functions: MAX-HEAP-INSERT,
HEAP-EXTRACT-MAX, HEAP-INCREASE-KEY HEAP-MAXIMUM. Each
of these operations runs in O(lg n)
• Understand how to convert heap to recurrence in order to analyze
with Master Method.
Chapt 7 – Quicksort
○ Be able to sort array using pivot/quicksort. What happens in
selecting a good pivot.
○ What is best case running time: in the most even possible split,
partition produces two subproblems, each of size no more than
n/2. Runtime here is T(n)=2T(n/2) + theta(n)= theta(nlgn) [by
Master Method case 2].
○ Worst case: occurs when the partitioning routine produces one
subproblem with n1 elements and one with 0 elements. Runtime:
T(n) = T(n-1) + T(0) + theta(n) = T(n-1) + theta (n) = theta(n2) .
This occurs when the input array is already completely sorted.
○ Average case matches the best case, and occurs
whenever the split has constant proportionality. The
randomized variant carries this further; thus, expected
running time of random-quicksort is theta(nlgn).
• Review PARTITION algo.- runs in theta(n), where n = r – p
+1
○ Partition always selects an element x=A[r] as a pivot
element around which to partition the subarray A[p..r]
Chapt 8 – Sort in linear time
• Linear sort is only available when we know that the range of
our inputs will be <= the number of inputs.

Anda mungkin juga menyukai