CPSC 201 -- Introduction to Programming Solutions to Assignment 7 ------------------------- Written by Paul Hudak March 26, 2008 > module Sort where > import Random Problem 1 --------- a) Lower bound: O(n) (must look at each element) Upper bound: O(1.5n) (recall class discussion) b) Lower bound: O(n+m) (must look at each element in list and pattern) Upper bound: O(m*n), using the most obvious algorithm, but see Chapter 61 in the Omnibus, which describes the "Boyer-Moore algorithm," which achieves the lower bound. c) Lower bound: O(1) is the best I can come up with! However, if table lookups are prohibited, and the only operation allowed is multiplication, then O(log n) is a lower bound. Upper bound: O(n) using the obvious algorithm. But O(log n) is also easy, taking into account the fact that if n is even, exp(x,n) = y*y where y = exp(x,n/2). Problem 2 --------- Heapsort -------- Datatype for a heap: > data Heap a = Empty > | Branch (Heap a) a (Heap a) > deriving Show > leaf :: a -> Heap a > leaf x = Branch Empty x Empty The main sorting function: > heapSort :: Ord a => [a] -> [a] > heapSort = sortH . initH Initialize the heap: > initH :: Ord a => [a] -> Heap a > initH = foldl (flip insert) Empty > insert :: Ord a => a -> Heap a -> Heap a > insert x Empty = leaf x > insert x (Branch left y right) = > if x else Branch right x (insert y left) Sort the heap: > sortH :: Ord a => Heap a -> [a] > sortH Empty = [] > sortH (Branch left x right) = x : sortH (promote left right) > promote :: Ord a => Heap a -> Heap a -> Heap a > promote Empty right = right > promote left Empty = left > promote l@(Branch l1 x1 r1) r@(Branch l2 x2 r2) = > if x1>x2 then Branch (promote l1 r1) x1 r > else Branch l x2 (promote l2 r2) Complexity analysis: Counting only comparisons, the recurrence equations for insert are: C_i(0) = 0 C_i(n) = C(n/2) + 1 whose solution is log(n). Now recalling the behavior of foldl, note that initH is equivalent to: xn `insert` (...(x2 `insert` (x1 `insert` Empty))...) So the complexity of initH is bounded by: C_initH(n) = O(nlog(n)) For sortH, note that promote requires log(n) comparisons worst-case. But the heap is not guaranteed to be balanced after each element is drawn, so sortH is bounded by O(nlog(n)) as well. The overall complexity is given by: C(n) = C_initH(n) + C_sortH(n) = O(nlog(n)) + O(nlog(n)) = O(nlog(n)) In the best case the heap will remain balanced in the sortH phase, so things will be a little faster, but still in the O(nlog(n)) range. Thus the average case won't be much better either. Mergesort --------- > mergeSort :: Ord a => [a] -> [a] > mergeSort [] = [] > mergeSort [x] = [x] > mergeSort xs = > let (ys, zs) = splitAt (length xs `quot` 2) xs > in merge (mergeSort ys) (mergeSort zs) > merge :: Ord a => [a] -> [a] -> [a] > merge [] ys = ys > merge xs [] = xs > merge (x:xs) (y:ys) = > if x < y then x : merge xs (y:ys) > else y : merge (x:xs) ys Complexity analysis: Given equal length lists, the worst-case complexity of merge is O(n/2), where n is the sum of the length of the two lists. So, the worst-case complexity of mergeSort is given by: C(0) = 0 C(1) = 1 C(n) = 2C(n/2) + (n/2) < 2C(n/2) + n The solution to this is nlog(n). In the best case merge takes NO comparisons, which will reduce the overall complexity to O(n)! The average case is somewhere in between. Testing: -------- Create a list of random numbers: > seed :: Int > seed = 42 > rands :: [Int] > rands = take 1000 (randomRs (1,1000) (mkStdGen seed)) Check to be sure that initH creates a balanced tree: > height :: Heap a -> Int > height Empty = 0 > height (Branch l _ r) = 1 + max (height l) (height r) > t1 = height (initH rands) Sort the list: > t2 = heapSort rands > t3 = mergeSort rands