
World Applied Programming, Vol (1), No (1), April 2011.

62-71
ISSN: 2222-2510
2011 WAP journal. www.waprogramming.com

Popular sorting algorithms


C. Canaan *
Information institute, Chiredzi, Zimbabwe
canaancan@gmail.com

M. S. Garai
Information institute, Chiredzi, Zimbabwe
mat.s.g@mail.com

M. Daya
Information institute, Chiredzi, Zimbabwe
d2020m@yahoo.com

Abstract: In this paper we introduce some of the most popular sorting algorithms. We begin with Bubble sort
and continue with Selection sort, Insertion sort, Shell sort, Merge sort, Heapsort, Quicksort
and Bucket sort. Each of these algorithms is described in detail in the course of the paper.

Keywords: Bubble sort, Selection sort, Insertion sort, Shell sort, Merge sort, Heapsort, Quicksort, Bucket sort

I. INTRODUCTION

In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order.
The most-used orders are numerical order and lexicographical order. Efficient sorting is important for
optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to
work correctly; it is also often useful for canonicalizing data and for producing human-readable output.
More formally, the output must satisfy two conditions:
1. The output is in nondecreasing order (each element is no smaller than the previous element
according to the desired total order);
2. The output is a permutation, or reordering, of the input.
Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due
to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort
was analyzed as early as 1956.[1] Although many consider it a solved problem, useful new sorting
algorithms are still being invented (for example, library sort was first published in 2004). Sorting
algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for
the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation,
divide and conquer algorithms, data structures, randomized algorithms, best, worst and average case
analysis, time-space tradeoffs, and lower bounds.
II. POPULAR SORTING ALGORITHMS

In this section we are going to list some of the most popular sorting algorithms and describe them.
They are: Bubble sort, Selection sort, Insertion sort, Shell sort, Merge sort, Heapsort, Quicksort and
Bucket sort.

Bubble sort
Bubble sort, also known as sinking sort, is a simple sorting algorithm that works by repeatedly
stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they
are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates
that the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top of
the list. Because it only uses comparisons to operate on elements, it is a comparison sort. The equally
simple insertion sort has better performance than bubble sort, so some have suggested no longer teaching
the bubble sort [2] [3].
Bubble sort has worst-case and average complexity both O(n²), where n is the number of items being
sorted. There exist many sorting algorithms with substantially better worst-case or average complexity of
O(n log n). Even other O(n²) sorting algorithms, such as insertion sort, tend to have better performance
than bubble sort. Therefore, bubble sort is not a practical sorting algorithm when n is large.
The only significant advantage that bubble sort has over most other implementations, even quicksort,
but not insertion sort, is that the ability to detect that the list is sorted is efficiently built into the algorithm.
Performance of bubble sort over an already-sorted list (best-case) is O(n). By contrast, most other
algorithms, even those with better average-case complexity, perform their entire sorting process on the set
and thus are more complex. However, not only does insertion sort have this mechanism too, but it also
performs better on a list that is substantially sorted (having a small number of inversions).
The positions of the elements in bubble sort will play a large part in determining its performance.
Large elements at the beginning of the list do not pose a problem, as they are quickly swapped. Small
elements towards the end, however, move to the beginning extremely slowly. This has led to these types
of elements being named rabbits and turtles, respectively.
Various efforts have been made to eliminate turtles to improve upon the speed of bubble sort.
Cocktail sort achieves this goal fairly well, but it retains O(n²) worst-case complexity. Comb sort
compares elements separated by large gaps, and can move turtles extremely quickly before proceeding to
smaller and smaller gaps to smooth out the list. Its average speed is comparable to faster algorithms like
quicksort. The algorithm can be expressed as:
procedure bubbleSort( A : list of sortable items )
    do
        swapped = false
        for each i in 1 to length(A) - 1 inclusive do:
            if A[i-1] > A[i] then
                swap( A[i-1], A[i] )
                swapped = true
            end if
        end for
    while swapped
end procedure
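
As a concrete, runnable illustration, here is a minimal Python sketch of the same procedure. It is our own translation of the pseudocode above, not code from the original paper; the function name bubble_sort and the sample list are illustrative only:

def bubble_sort(a):
    """Sort list a in place by repeatedly swapping adjacent out-of-order pairs."""
    n = len(a)
    swapped = True
    while swapped:
        swapped = False
        for i in range(1, n):
            if a[i - 1] > a[i]:
                a[i - 1], a[i] = a[i], a[i - 1]
                swapped = True
        # after each pass the largest remaining element is in place,
        # so the next pass can stop one item earlier
        n -= 1
    return a

# example usage
print(bubble_sort([5, 1, 4, 2, 8]))   # [1, 2, 4, 5, 8]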

Selection sort
Selection sort is a sorting algorithm, specifically an in-place comparison sort. It has O(n²)
complexity, making it inefficient on large lists, and generally performs worse than the similar insertion
sort. Selection sort is noted for its simplicity, and also has performance advantages over more
complicated algorithms in certain situations.
Selection sort is not difficult to analyze compared to other sorting algorithms since none of the loops
depend on the data in the array. Selecting the lowest element requires scanning all n elements (this takes
n − 1 comparisons) and then swapping it into the first position. Finding the next lowest element requires
scanning the remaining n − 1 elements and so on, for (n − 1) + (n − 2) + ... + 2 + 1 = n(n − 1)/2 ∈ Θ(n²)
comparisons. Each of these scans requires one swap for n − 1 elements (the final element is already in
place).
Selection Sort's philosophy most closely matches human intuition: It finds the largest element and
puts it in its place. Then it finds the next largest and places it and so on until the array is sorted. To put an
element in its place, it trades positions with the element in that location (this is called a swap). As a result,
the array will have a section that is sorted growing from the end of the array and the rest of the array will
remain unsorted [4].
The algorithm is as follows [5]:

for i ← 1 to n-1 do
    min_j ← i
    min_x ← A[i]
    for j ← i + 1 to n do
        if A[j] < min_x then
            min_j ← j
            min_x ← A[j]
    A[min_j] ← A[i]
    A[i] ← min_x
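
The following Python sketch mirrors the pseudocode above and shows that exactly one swap is made per pass of the outer loop. It is our own illustration; the name selection_sort and the sample list are not from the original paper:

def selection_sort(a):
    """Sort list a in place by repeatedly selecting the minimum of the unsorted suffix."""
    n = len(a)
    for i in range(n - 1):
        min_j = i
        for j in range(i + 1, n):
            if a[j] < a[min_j]:
                min_j = j
        a[i], a[min_j] = a[min_j], a[i]   # one swap per outer iteration
    return a

print(selection_sort([29, 10, 14, 37, 13]))   # [10, 13, 14, 29, 37]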

Insertion sort
Insertion sort is a simple sorting algorithm: a comparison sort in which the sorted array (or list) is built
one entry at a time. It is much less efficient on large lists than more advanced algorithms such as
quicksort, heapsort, or merge sort. However, insertion sort provides several advantages:

- Simple implementation
- Efficient for (quite) small data sets
- Adaptive, i.e. efficient for data sets that are already substantially sorted: the time complexity is O(n + d), where d is the number of inversions
- More efficient in practice than most other simple quadratic, i.e. O(n²), algorithms such as selection sort or bubble sort; the best case (nearly sorted input) is O(n)
- Stable, i.e. does not change the relative order of elements with equal keys
- In-place, i.e. only requires a constant amount O(1) of additional memory space
- Online, i.e. can sort a list as it receives it
Most humans, when sorting (ordering a deck of cards, for example), use a method that is similar to
insertion sort [6].

Every repetition of insertion sort removes an element from the input data, inserting it into the correct
position in the already-sorted list, until no input elements remain. The choice of which element to remove
from the input is arbitrary, and can be made using almost any choice algorithm.
Sorting is typically done in-place. The resulting array after k iterations has the property that the
first k + 1 entries are sorted. In each iteration the first remaining entry of the input, x, is removed and
inserted into the result at the correct position, thus extending the result; each element greater than x is
copied one place to the right as it is compared against x.
The most common variant of insertion sort, which operates on arrays, can be described as follows:

1. Suppose that there exists a function called Insert designed to insert a value into a sorted sequence
   at the beginning of an array. It operates by beginning at the end of the sequence and shifting each
   element one place to the right until a suitable position is found for the new element. The function
   has the side effect of overwriting the value stored immediately after the sorted sequence in the
   array.

2. To perform an insertion sort, begin at the left-most element of the array and invoke Insert to
   insert each element encountered into its correct position. The ordered sequence into which the
   element is inserted is stored at the beginning of the array in the set of indices already examined.
   Each insertion overwrites a single value: the value being inserted.

Below is the pseudocode for insertion sort for a zero-based array:


for j ← 1 to length(A)-1
    key ← A[j]
    // A[j] is inserted into the sorted sequence A[0 .. j-1]
    i ← j - 1
    while i >= 0 and A[i] > key
        A[i+1] ← A[i]
        i ← i - 1
    A[i+1] ← key
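
A runnable Python sketch of the same zero-based procedure may help; it is our own illustration, not code from the original paper:

def insertion_sort(a):
    """Sort list a in place; elements greater than key shift right to make room."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]   # shift the larger element one place to the right
            i -= 1
        a[i + 1] = key        # insert key into its correct position
    return a

print(insertion_sort([12, 11, 13, 5, 6]))   # [5, 6, 11, 12, 13]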

Shell sort
Shell sort is a sorting algorithm, devised by Donald Shell in 1959, that is a generalization of insertion
sort, which exploits the fact that insertion sort works efficiently on input that is already almost sorted. It
improves on insertion sort by allowing the comparison and exchange of elements that are far apart. The
last step of Shell sort is a plain insertion sort, but by then, the array of data is guaranteed to be almost
sorted.
Shell sort is an example of an algorithm that is simple to code but difficult to analyze theoretically;
its performance depends heavily on the choice of increment sequence. The algorithm was one of the first to
break the quadratic time barrier, but this fact was not proven until some time after its discovery [4].
The initial increment sequence suggested by Donald Shell was [1, 2, 4, 8, 16, ..., 2^k], but this is a very
poor choice in practice because it means that elements in odd positions are not compared with elements in
even positions until the very last step. The original implementation performs O(n²) comparisons and
exchanges in the worst case [7]. A simple change, replacing 2^k with 2^k − 1, improves the worst-case running
time to O(n^(3/2)) [8], a bound that cannot be improved [9].
A minor change given in V. Pratt's book [9] improved the bound to O(n log² n). This is worse than the
optimal comparison sorts, which are O(n log n), but lends itself to sorting networks and has the same
asymptotic gate complexity as Batcher's bitonic sorter.
Consider a small value that is initially stored in the wrong end of the array. Using an O(n²) sort such
as bubble sort or insertion sort, it will take roughly n comparisons and exchanges to move this value all
the way to the other end of the array. Shell sort first moves values using giant step sizes, so a small value
will move a long way towards its final position, with just a few comparisons and exchanges.
One can visualize Shell sort in the following way: arrange the list into a table and sort the columns
(using an insertion sort). Repeat this process, each time with a smaller number of longer columns. At the
end, the table has only one column. While transforming the list into a table makes it easier to visualize,
the algorithm itself does its sorting in-place (by incrementing the index by the step size, i.e. using i +=
step_size instead of i++).
The principle of Shell sort is to rearrange the file so that looking at every h-th element yields a sorted
file. We call such a file h-sorted. If the file is then k-sorted for some other integer k, then the file remains
h-sorted [7]. For instance, if a list was 5-sorted and then 3-sorted, the list is now not only 3-sorted, but
both 5- and 3-sorted. If this were not true, the algorithm would undo work that it had done in previous
iterations, and would not achieve such a low running time.
The algorithm draws upon a sequence of positive integers known as the increment sequence. Any
sequence will do, as long as it ends with 1, but some sequences perform better than others [8]. The
algorithm begins by performing a gap insertion sort, with the gap being the first number in the increment
sequence. It continues to perform a gap insertion sort for each number in the sequence, until it finishes
with a gap of 1. When the increment reaches 1, the gap insertion sort is simply an ordinary insertion sort,
guaranteeing that the final list is sorted. Beginning with large increments allows elements in the file to
move quickly towards their final positions, and makes it easier to subsequently sort for smaller
increments [7].
Although sorting algorithms exist that are more efficient, Shell sort remains a good choice for
moderately large files because it has good running time and is easy to code.
The following is an implementation of Shell sort written in pseudocode. The increment sequence is a
geometric sequence in which every term is roughly 2.2 times smaller than the previous one:
input: an array a of length n with array elements numbered 0 to n − 1

inc ← round(n / 2)
while inc > 0 do:
    for i = inc to n − 1 do:
        temp ← a[i]
        j ← i
        while j ≥ inc and a[j − inc] > temp do:
            a[j] ← a[j − inc]
            j ← j − inc
        a[j] ← temp
    inc ← round(inc / 2.2)
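
For illustration, the following Python sketch (our own, not from the original paper) implements the same gap sequence, dividing the gap by roughly 2.2 until it reaches 1:

def shell_sort(a):
    """Sort list a in place using gap insertion sorts with shrinking gaps."""
    n = len(a)
    inc = round(n / 2)
    while inc > 0:
        for i in range(inc, n):
            temp = a[i]
            j = i
            # gap insertion sort: compare elements that are inc positions apart
            while j >= inc and a[j - inc] > temp:
                a[j] = a[j - inc]
                j -= inc
            a[j] = temp
        inc = round(inc / 2.2)   # shrink the gap; the final pass uses gap 1
    return a

print(shell_sort([35, 33, 42, 10, 14, 19, 27, 44]))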

Merge sort
Merge sort is an O(n log n) comparison-based sorting algorithm. Most implementations produce a
stable sort, meaning that the implementation preserves the input order of equal elements in the sorted
output. It is a divide and conquer algorithm. Merge sort was invented by John von Neumann in 1945 [10].
In sorting n objects, merge sort has an average and worst-case performance of O(n log n). If the
running time of merge sort for a list of length n is T(n), then the recurrence T(n) = 2T(n/2) + n follows
from the definition of the algorithm (apply the algorithm to two lists of half the size of the original list,
and add the n steps taken to merge the resulting two lists). The closed form follows from the master
theorem.
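
As a short derivation sketch (our own addition, assuming for simplicity that n is a power of two), the closed form can also be obtained by unrolling the recurrence:

T(n) = 2T(n/2) + n = 4T(n/4) + 2n = \cdots = 2^k \, T(n/2^k) + kn,

so taking k = \log_2 n gives T(n) = n\,T(1) + n \log_2 n = O(n \log n).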
In the worst case, merge sort does a number of comparisons equal to or slightly smaller than
(n ⌈lg n⌉ − 2^⌈lg n⌉ + 1), which is between (n lg n − n + 1) and (n lg n + n + O(lg n)) [11].

For large n and a randomly ordered input list, merge sort's expected (average) number of
comparisons approaches α·n fewer than the worst case, where α = −1 + Σ_{k=0}^{∞} 1/(2^k + 1) ≈ 0.2645.

In the worst case, merge sort does about 39% fewer comparisons than quicksort does in its average
case. Merge sort always makes fewer comparisons than quicksort, except in extremely rare cases when
they tie, where merge sort's worst case coincides with quicksort's best case. In terms of
moves, merge sort's worst-case complexity is O(n log n), the same complexity as quicksort's best case,
and merge sort's best case takes about half as many iterations as the worst case.
Recursive implementations of merge sort make 2n 1 method calls in the worst case, compared to
quicksort's n, thus merge sort has roughly twice as much recursive overhead as quicksort. However,
iterative, non-recursive implementations of merge sort, avoiding method call overhead, are not difficult to
code. Merge sort's most common implementation does not sort in place; therefore, the memory size of the
input must be allocated for the sorted output to be stored in.
Merge sort as described here also has an often overlooked, but practically important, best-case
property. If the input is already sorted, its complexity falls to O(n). Specifically, n-1 comparisons and
zero moves are performed, which is the same as for simply running through the input, checking if it is
pre-sorted.
Sorting in-place is possible (e.g., using lists rather than arrays) but is very complicated, and offers
little performance gain in practice, even if the algorithm runs in O(n log n) time (Katajainen, Pasanen &
Teuhola 1996). In these cases, algorithms like heapsort usually offer comparable speed and are far less
complex. Additionally, unlike the standard merge sort, in-place merge sort is not a stable sort. In the case
of linked lists the algorithm does not use more space than that already used by the list representation,
apart from the O(log(k)) used for the recursion trace.
Merge sort is more efficient than quick sort for some types of lists if the data to be sorted can only be
efficiently accessed sequentially, and is thus popular in languages such as Lisp, where sequentially
accessed data structures are very common. Unlike some (efficient) implementations of quicksort, merge
sort is a stable sort as long as the merge operation is implemented properly.
As can be seen from the procedure merge sort, there are some demerits. One complaint we might
raise is its use of 2n locations; the additional n locations were needed because one couldn't reasonably
merge two sorted sets in place. But despite the use of this space the algorithm must still work hard: The
contents of m are first copied into left and right and later into the list result on each invocation of
merge_sort (variable names according to the pseudocode below). An alternative to this copying is to
associate a new field of information with each key (the elements in m are called keys). This field will be
used to link the keys and any associated information together in a sorted list (a key and its related
information is called a record). Then the merging of the sorted lists proceeds by changing the link values;
no records need to be moved at all. A field which contains only a link will generally be smaller than an
entire record so less space will also be used.
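
As a small illustration of this linking idea (our own sketch, not from the original paper), two sorted singly linked lists can be merged purely by rewiring next fields, so no record is ever copied:

class Node:
    def __init__(self, key, info=None):
        self.key = key
        self.info = info     # associated information travels with the key
        self.next = None     # the link field described above

def merge_linked(a, b):
    """Merge two sorted linked lists by changing links only; no record is moved."""
    dummy = Node(None)
    tail = dummy
    while a and b:
        if a.key <= b.key:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a if a else b   # append whatever remains of either list
    return dummy.next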
Another alternative for reducing the space overhead to n/2 is to maintain left and right as a combined
structure, copy only the left part of m into temporary space, and to direct the merge routine to place the
merged output into m. With this version it is better to allocate the temporary space outside the merge
routine, so that only one allocation is needed. The excessive copying mentioned in the previous paragraph
is also mitigated, since the last pair of lines before the return result statement (in the merge function
referenced by the pseudocode below) become superfluous.

function merge_sort(m)
    if length(m) ≤ 1
        return m
    var list left, right, result
    var integer middle = length(m) / 2
    for each x in m up to middle
        add x to left
    for each x in m after middle
        add x to right
    left = merge_sort(left)
    right = merge_sort(right)
    result = merge(left, right)
    return result
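
Since the merge step itself is not spelled out above, the following Python sketch (our own illustration, not code from the original paper) fills it in; comparing with <= keeps equal keys in their original order, which is what makes the sort stable:

def merge_sort(m):
    """Return a new sorted list; stable, O(n log n) divide-and-conquer sort."""
    if len(m) <= 1:
        return m
    middle = len(m) // 2
    left = merge_sort(m[:middle])
    right = merge_sort(m[middle:])
    return merge(left, right)

def merge(left, right):
    """Merge two sorted lists into one sorted list, preserving stability."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # <= keeps equal keys in original order
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])              # at most one of these is non-empty
    result.extend(right[j:])
    return result

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))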

Heapsort
Heapsort is a comparison-based sorting algorithm, and is part of the selection sort family. Although
somewhat slower in practice on most machines than a well implemented quicksort, it has the advantage of
a more favorable worst-case O(n log n) runtime. Heapsort is an in-place algorithm, but is not a stable sort.
Heapsort begins by building a heap out of the data set, and then removing the largest item and
placing it at the end of the partially sorted array. After removing the largest item, it reconstructs the heap,
removes the largest remaining item, and places it in the next open position from the end of the partially
sorted array. This is repeated until there are no items left in the heap and the sorted array is full.
Elementary implementations require two arrays - one to hold the heap and the other to hold the sorted
elements.
Heapsort inserts the input list elements into a binary heap data structure. The largest value (in a
max-heap) or the smallest value (in a min-heap) is extracted until none remain, the values having been
extracted in sorted order. The heap's invariant is preserved after each extraction, so the only cost is that of
extraction.
During extraction, the only space required is that needed to store the heap. To achieve constant space
overhead, the heap is stored in the part of the input array not yet sorted. (The storage of heaps as arrays is
diagrammed at Binary heap#Heap implementation.)
Heapsort uses two heap operations: insertion and root deletion. Each extraction places an element in
the last empty location of the array. The remaining prefix of the array stores the unsorted elements.
The following is the "simple" way to implement the algorithm in pseudocode. Arrays are zero-based
and swap is used to exchange two elements of the array. Movement 'down' means from the root towards
the leaves, or from lower indices to higher. Note that during the sort, the largest remaining element is at
the root of the heap at a[0], while at the end of the sort, the largest element is in a[end].

function heapSort(a, count) is
    input: an unordered array a of length count

    (first place a in max-heap order)
    heapify(a, count)

    end := count - 1   // in languages with zero-based arrays the children are 2*i+1 and 2*i+2
    while end > 0 do
        (swap the root (maximum value) of the heap with the last element of the heap)
        swap(a[end], a[0])
        (put the heap back in max-heap order)
        siftDown(a, 0, end - 1)
        (decrease the size of the heap by one so that the previous max value will
         stay in its proper placement)
        end := end - 1

function heapify(a, count) is
    (start is assigned the index in a of the last parent node)
    start := count / 2 - 1
    while start ≥ 0 do
        (sift down the node at index start to the proper place such that all nodes
         below the start index are in heap order)
        siftDown(a, start, count - 1)
        start := start - 1
    (after sifting down the root all nodes/elements are in heap order)

function siftDown(a, start, end) is
    input: end represents the limit of how far down the heap to sift

    root := start
    while root * 2 + 1 ≤ end do        (while the root has at least one child)
        child := root * 2 + 1          (root*2 + 1 points to the left child)
        swap := root                   (keeps track of the child to swap with)
        (check if root is smaller than left child)
        if a[swap] < a[child]
            swap := child
        (check if right child exists, and if it is bigger than what we are currently swapping with)
        if child < end and a[swap] < a[child + 1]
            swap := child + 1
        (check if we need to swap at all)
        if swap != root
            swap(a[root], a[swap])
            root := swap               (repeat to continue sifting down the child now)
        else
            return
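
The same three routines can be written as a short Python sketch. This is our own translation of the pseudocode above, not code from the original paper; the names heap_sort and sift_down follow the pseudocode:

def heap_sort(a):
    """Sort list a in place using a max-heap stored in the array itself."""
    count = len(a)
    # build the max-heap: sift down every parent node, starting from the last one
    for start in range(count // 2 - 1, -1, -1):
        sift_down(a, start, count - 1)
    # repeatedly move the maximum to the end and restore the heap on the prefix
    for end in range(count - 1, 0, -1):
        a[end], a[0] = a[0], a[end]
        sift_down(a, 0, end - 1)
    return a

def sift_down(a, start, end):
    """Sift the node at index start down until the subtree rooted there is a max-heap."""
    root = start
    while root * 2 + 1 <= end:           # while the root has at least one child
        child = root * 2 + 1             # left child
        swap = root
        if a[swap] < a[child]:
            swap = child
        if child + 1 <= end and a[swap] < a[child + 1]:
            swap = child + 1             # right child is larger
        if swap == root:
            return
        a[root], a[swap] = a[swap], a[root]
        root = swap

print(heap_sort([6, 5, 3, 1, 8, 7, 2, 4]))   # [1, 2, 3, 4, 5, 6, 7, 8]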

Quicksort
Quicksort is a sorting algorithm developed by C. A. R. Hoare that, on average, makes O(n log n) (big
O notation) comparisons to sort n items. In the worst case, it makes O(n²) comparisons, though if
implemented correctly this behavior is rare. Typically, quicksort is significantly faster in practice than
other O(n log n) algorithms, because its inner loop can be efficiently implemented on most architectures,
and in most real-world data it is possible to make design choices that minimize the probability of
requiring quadratic time. Additionally, quicksort tends to make excellent usage of the memory hierarchy,
taking perfect advantage of virtual memory and available caches. Although quicksort is usually not
implemented as an in-place sort, it is possible to create such an implementation [12].
Quicksort (also known as "partition-exchange sort") is a comparison sort and, in efficient
implementations, is not a stable sort.
Quicksort sorts by employing a divide and conquer strategy to divide a list into two sub-lists. The
steps are:
1. Pick an element, called a pivot, from the list.

2. Reorder the list so that all elements with values less than the pivot come before the pivot, while
   all elements with values greater than the pivot come after it (equal values can go either way).
   After this partitioning, the pivot is in its final position. This is called the partition operation.
3. Recursively sort the sub-list of lesser elements and the sub-list of greater elements.

The base cases of the recursion are lists of size zero or one, which never need to be sorted.
The algorithm is listed below:
function quicksort(array)
    var list less, greater
    if length(array) ≤ 1
        return array // an array of zero or one elements is already sorted
    select and remove a pivot value pivot from array
    for each x in array
        if x ≤ pivot then append x to less
        else append x to greater
    return concatenate(quicksort(less), pivot, quicksort(greater))
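
A runnable Python sketch of this simple, not-in-place variant follows. It is our own illustration, not code from the original paper; the middle element is used as the pivot, which is an arbitrary choice:

def quicksort(array):
    """Return a new sorted list using the simple (non in-place) partitioning scheme."""
    if len(array) <= 1:
        return array                          # zero or one element is already sorted
    mid = len(array) // 2
    pivot = array[mid]                        # pivot choice is arbitrary here
    rest = array[:mid] + array[mid + 1:]      # remove the pivot from the list
    less = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    return quicksort(less) + [pivot] + quicksort(greater)

print(quicksort([3, 6, 2, 7, 1, 9, 4]))   # [1, 2, 3, 4, 6, 7, 9]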

Bucket sort
Bucket sort, or bin sort, is a sorting algorithm that works by partitioning an array into a number of
buckets. Each bucket is then sorted individually, either using a different sorting algorithm, or by
recursively applying the bucket sorting algorithm. It is a distribution sort, and is a cousin of radix sort in
the most to least significant digit flavour. Bucket sort is a generalization of pigeonhole sort. Since bucket
sort is not a comparison sort, the O(n log n) lower bound is inapplicable. The computational complexity
estimates involve the number of buckets.
Bucket sort works as follows:
1. Set up an array of initially empty "buckets."
2. Scatter: go over the original array, putting each object in its bucket.
3. Sort each non-empty bucket.
4. Gather: visit the buckets in order and put all elements back into the original array.

The Bucket sort algorithm is listed below:


function bucket-sort(array, n) is
    buckets ← new array of n empty lists
    for i = 0 to (length(array) - 1) do
        insert array[i] into buckets[msbits(array[i], k)]
    for i = 0 to n - 1 do
        next-sort(buckets[i])
    return the concatenation of buckets[0], ..., buckets[n-1]
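
The pseudocode relies on a helper msbits(x, k) that maps a key to a bucket index. The following Python sketch is our own illustration under an extra assumption: the keys are numbers uniformly distributed in [0, 1), so the bucket index is simply int(n_buckets * x), and the built-in sort stands in for next-sort:

def bucket_sort(array, n_buckets=10):
    """Sort numbers assumed to lie in [0, 1) by scattering them into buckets."""
    buckets = [[] for _ in range(n_buckets)]
    for x in array:
        # scatter: the bucket index plays the role of msbits(x, k)
        buckets[int(n_buckets * x)].append(x)
    for b in buckets:
        b.sort()                 # sort each bucket (an insertion sort also works)
    result = []
    for b in buckets:            # gather: visit the buckets in order
        result.extend(b)
    return result

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]))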

III. CONCLUSION

In this paper we examined the sorting problem and investigated different solutions. We discussed the
most popular algorithms for sorting lists: Bubble sort, Selection sort, Insertion sort, Shell sort, Merge sort,
Heapsort, Quicksort and Bucket sort. Each algorithm was described in detail, its computational complexity
in the worst, average and best cases was indicated where possible, and pseudocode implementations were
given.
REFERENCES
[1] Wikipedia. http://www.wikipedia.com
[2] Owen Astrachan. Bubble Sort: An Archaeological Algorithmic Analysis. SIGCSE 2003. Available at:
    http://www.cs.duke.edu/~ola/papers/bubble.pdf
[3] Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition. Addison-Wesley, 1998.
    ISBN 0-201-89685-0. Pages 106-110 of section 5.2.2: Sorting by Exchanging.
[4] Available at: http://webspace.ship.edu/cawell/Sorting/selintro.htm
[5] Available at: http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Sorting/selectionSort.htm
[6] Robert Sedgewick. Algorithms. Addison-Wesley, 1983 (chapter 8, p. 95).
[7] Sedgewick, Robert (1998). Algorithms in C. Addison Wesley. pp. 273-279.
[8] Weiss, Mark Allen (1997). Data Structures and Algorithm Analysis in C. Addison Wesley Longman. pp. 222-226.
[9] Pratt, V. (1979). Shellsort and Sorting Networks (Outstanding Dissertations in the Computer Sciences). Garland.
    ISBN 0-8240-4406-1. (Originally presented as the author's Ph.D. thesis, Stanford University, 1971.)
[10] Merge Sort - Wolfram MathWorld. Available at: http://mathworld.wolfram.com/MergeSort.html
[11] The worst-case number given here does not agree with that given in Knuth's Art of Computer Programming, Vol. 3. The
    discrepancy is due to Knuth analyzing a variant implementation of merge sort that is slightly sub-optimal.
[12] R. Sedgewick. Implementing Quicksort Programs. Comm. ACM, 21(10):847-857, 1978. Available at:
    http://delivery.acm.org/10.1145/360000/359631/p847sedgewick.pdf?key1=359631&key2=9191985921&coll=DL&dl=ACM&CFID=6618157&CFTOKEN=73435998
