2.4. An Anagram Detection Example Problem Solving with Algori...

6/11/17, 2:56 PM

2.4. An Anagram Detection Example


A good example problem for showing algorithms with different orders of magnitude is the classic
anagram detection problem for strings. One string is an anagram of another if the second is simply a
rearrangement of the first. For example, 'heart' and 'earth' are anagrams. The strings 'python'
and 'typhon' are anagrams as well. For the sake of simplicity, we will assume that the two strings in
question are of equal length and that they are made up of symbols from the set of 26 lowercase
alphabetic characters. Our goal is to write a boolean function that will take two strings and return
whether they are anagrams.

2.4.1. Solution 1: Checking Off


Our first solution to the anagram problem will check to see that each character in the first string actually
occurs in the second. If it is possible to check off each character, then the two strings must be
anagrams. Checking off a character will be accomplished by replacing it with the special Python value
None. However, since strings in Python are immutable, the first step in the process will be to convert
the second string to a list. Each character from the first string can be checked against the characters in
the list and, if found, checked off by replacement. ActiveCode 1 shows this function.

def anagramSolution1(s1,s2):
    # strings are immutable, so work on a list copy of s2
    alist = list(s2)

    pos1 = 0
    stillOK = True

    while pos1 < len(s1) and stillOK:
        pos2 = 0
        found = False
        # search the list for the current character of s1
        while pos2 < len(alist) and not found:
            if s1[pos1] == alist[pos2]:
                found = True
            else:
                pos2 = pos2 + 1

        if found:
            alist[pos2] = None  # check off the matched character
        else:
            stillOK = False

        pos1 = pos1 + 1

    return stillOK

ActiveCode: 1 Checking Off (active5)

To analyze this algorithm, we need to note that each of the n characters in s1 will cause an iteration
through up to n characters in the list from s2. Each of the n positions in the list will be visited once to
match a character from s1. The number of visits then becomes the sum of the integers from 1 to n. We
stated earlier that this can be written as

$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2} = \frac{1}{2}n^2 + \frac{1}{2}n$$

As n gets large, the n^2 term will dominate the n term and the 1/2 can be ignored. Therefore, this solution
is O(n^2).
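
The n(n + 1)/2 figure can be checked empirically. The sketch below is an instrumented variant of the checking-off idea that counts character comparisons; the name count_checkoff_comparisons is hypothetical and not part of the text. As an assumption for illustration, it is run on a string of distinct characters and its reverse, an input for which the searches take n, n - 1, ..., 1 comparisons.

```python
def count_checkoff_comparisons(s1, s2):
    """Count character comparisons made by a checking-off anagram test."""
    remaining = list(s2)  # mutable copy of s2, as in the checking-off solution
    comparisons = 0
    for ch in s1:
        found = False
        for pos, candidate in enumerate(remaining):
            comparisons += 1
            if ch == candidate:
                remaining[pos] = None  # check off the matched character
                found = True
                break
        if not found:
            break  # s1 has a character that s2 lacks
    return comparisons


for word in ("abcde", "abcdefghij"):
    n = len(word)
    # prints n, the measured count, and n(n + 1)/2 for comparison
    print(n, count_checkoff_comparisons(word, word[::-1]), n * (n + 1) // 2)
```

For these inputs the measured count and the formula agree (15 for n = 5, 55 for n = 10).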

2.4.2. Solution 2: Sort and Compare


Another solution to the anagram problem will make use of the fact that even though s1 and s2 are
different, they are anagrams only if they consist of exactly the same characters. So, if we begin by
sorting each string alphabetically, from a to z, we will end up with the same string if the original two
strings are anagrams. ActiveCode 2 shows this solution. Again, in Python we can use the built-in sort
method on lists by simply converting each string to a list at the start.

def anagramSolution2(s1,s2):
    # convert to lists so the built-in sort method can be used
    alist1 = list(s1)
    alist2 = list(s2)

    alist1.sort()
    alist2.sort()

    pos = 0
    matches = True

    # compare the sorted lists position by position
    while pos < len(s1) and matches:
        if alist1[pos]==alist2[pos]:
            pos = pos + 1
        else:
            matches = False

    return matches

print(anagramSolution2('abcde','edcba'))

ActiveCode: 2 Sort and Compare (active6)

At first glance you may be tempted to think that this algorithm is O(n), since there is one simple iteration
to compare the n characters after the sorting process. However, the two calls to the Python sort
method are not without their own cost. As we will see in a later chapter, sorting is typically either
O(n^2) or O(n log n), so the sorting operations dominate the iteration. In the end, this algorithm will
have the same order of magnitude as that of the sorting process.
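
For comparison, the same sort-and-compare idea can be written more compactly with the built-in sorted function, which accepts a string directly and returns a sorted list of its characters. This is only a sketch, not the text's ActiveCode, and the name anagram_sorted is hypothetical. The comparison of the two lists is still linear, so the two sorts dominate the cost here as well.

```python
def anagram_sorted(s1, s2):
    """Return True if s1 and s2 contain exactly the same characters."""
    # sorted() returns a new list of the characters in ascending order
    return sorted(s1) == sorted(s2)


print(anagram_sorted("abcde", "edcba"))  # True
print(anagram_sorted("abcde", "abcdf"))  # False
```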

2.4.3. Solution 3: Brute Force


A brute force technique for solving a problem typically tries to exhaust all possibilities. For the
anagram detection problem, we can simply generate a list of all possible strings using the characters
from s1 and then see if s2 occurs. However, there is a difficulty with this approach. When generating
all possible strings from s1, there are n possible first characters, n - 1 possible characters for the
second position, n - 2 for the third, and so on. The total number of candidate strings is
n * (n - 1) * (n - 2) * ... * 3 * 2 * 1, which is n!. Although some of the strings may be duplicates, the
program cannot know this ahead of time and so it will still generate n! different strings.

It turns out that n! grows even faster than 2^n as n gets large. In fact, if s1 were 20 characters long,
there would be 20! = 2,432,902,008,176,640,000 possible candidate strings. If we processed one
possibility every second, it would still take us 77,146,816,596 years to go through the entire list. This is
probably not going to be a good solution.
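
The text gives no code for this approach. As a rough sketch, assuming itertools.permutations from the standard library is used to generate the candidate orderings (the name anagram_brute_force is hypothetical), the idea and the arithmetic above might look like this. The year count assumes a 365-day year.

```python
import math
from itertools import permutations


def anagram_brute_force(s1, s2):
    """Try every ordering of s1's characters; there are len(s1)! of them."""
    return any("".join(p) == s2 for p in permutations(s1))


# Feasible only for short strings: 5! is just 120 candidates.
print(anagram_brute_force("heart", "earth"))  # True

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year
print(math.factorial(20))                      # 2432902008176640000
print(math.factorial(20) // SECONDS_PER_YEAR)  # 77146816596
```

Because permutations yields candidates lazily, a match can stop the search early, but a non-anagram still forces all n! orderings to be examined.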

2.4.4. Solution 4: Count and Compare


Our final solution to the anagram problem takes advantage of the fact that any two anagrams will have
the same number of a's, the same number of b's, the same number of c's, and so on. In order to decide
whether two strings are anagrams, we will first count the number of times each character occurs.
Since there are 26 possible characters, we can use a list of 26 counters, one for each possible
character. Each time we see a particular character, we will increment the counter at that position. In the
end, if the two lists of counters are identical, the strings must be anagrams. ActiveCode 3 shows this
solution.

def anagramSolution4(s1,s2):
    # one counter per lowercase letter for each string
    c1 = [0]*26
    c2 = [0]*26

    for i in range(len(s1)):
        pos = ord(s1[i])-ord('a')  # map 'a'..'z' to 0..25
        c1[pos] = c1[pos] + 1

    for i in range(len(s2)):
        pos = ord(s2[i])-ord('a')
        c2[pos] = c2[pos] + 1

    # compare the two lists of counts
    j = 0
    stillOK = True
    while j<26 and stillOK:
        if c1[j]==c2[j]:
            j = j + 1
        else:
            stillOK = False

    return stillOK

print(anagramSolution4('apple','pleap'))

ActiveCode: 3 Count and Compare (active7)

Again, the solution has a number of iterations. However, unlike the first solution, none of them are
nested. The first two iterations used to count the characters are both based on n. The third iteration,
comparing the two lists of counts, always takes 26 steps since there are 26 possible characters in the
strings. Adding it all up gives us T(n) = 2n + 26 steps. That is O(n). We have found a linear order of
magnitude algorithm for solving this problem.
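
The counting approach is not tied to a fixed list of 26 counters. As a sketch, assuming the standard library's collections.Counter is an acceptable stand-in for the hand-written lists (the name anagram_counter is hypothetical), each string can be tallied in one pass and the tallies compared. This variant also accepts characters outside a to z, at the cost of storing one entry per distinct character.

```python
from collections import Counter


def anagram_counter(s1, s2):
    """Compare per-character counts of the two strings."""
    # Counter maps each character to the number of times it occurs
    return Counter(s1) == Counter(s2)


print(anagram_counter("apple", "pleap"))  # True
print(anagram_counter("apple", "apply"))  # False
```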

Before leaving this example, we need to say something about space requirements. Although the last
solution was able to run in linear time, it could only do so by using additional storage to keep the two
lists of character counts. In other words, this algorithm sacrificed space in order to gain time.

This is a common occurrence. On many occasions you will need to make decisions between time and
space trade-offs. In this case, the amount of extra space is not significant. However, if the underlying
alphabet had millions of characters, there would be more concern. As a computer scientist, when given
a choice of algorithms, it will be up to you to determine the best use of computing resources given a
particular problem.

Self Check

Q-1: Given the following code fragment, what is its Big-O running time?

test = 0
for i in range(n):
    for j in range(n):
        test = test + i * j

(A) O(n)

(B) O(n^2)

(C) O(log n)

(D) O(n^3)

Q-2: Given the following code fragment, what is its Big-O running time?

test = 0
for i in range(n):
    test = test + 1

for j in range(n):
    test = test - 1

(A) O(n)

(B) O(n^2)

(C) O(log n)

(D) O(n^3)

Q-3: Given the following code fragment, what is its Big-O running time?

i = n
while i > 0:
    k = 2 + 2
    i = i // 2

(A) O(n)

(B) O(n^2)

(C) O(log n)

(D) O(n^3)
