Session 04 - Paper 64

On order equivalences between normalized tree distance and similarity
Hector-Hugo Franco-Penya (1), Gurpreet Singh (2)
(1) Trinity College Dublin
(2) Lovely Professional University
Abstract
Our previous work [1] presents a theoretical comparison of brute tree-distance and brute tree-similarity algorithms,
both producing Tai mappings, and shows three different kinds of duality between them: A-duals, if they can generate
the same ranking of alignments for two given trees; N-duals, if they can reproduce the same ranking of alignments
of a given tree and a neighbourhood set of trees; and P-duals, if they can generate the same ranking of pairs of trees
for a given set of trees. This article expands the previous work by claiming that N-duality and P-duality are equivalent,
and by showing that for normalized measures A-duality implies N-duality and P-duality.
Keywords: tree edit distance, tree edit similarity, pattern matching.
1. Background and previous work
This section reviews our previous work, which is described in full in [1].
1.1. Assumptions about equivalence of distance and similarity
Sequence and tree comparison measures are sometimes described as distances and sometimes as similarities.
We are concerned, in what follows, with first distinguishing between these, and then with the question whether
orderings induced by a distance measure can be dualized by a similarity measure, and vice-versa. To some extent
this can be seen as applying the same kind of analysis to sequence and tree comparison measures as has been applied
to set and vector comparison measures [2, 3, 4].
Statements such as the following, from the Wikipedia entry on Needleman-Wunsch [5]:
Needleman and Wunsch formulated their problem in terms of maximizing similarity. Another possibility
is to minimize the edit distance between sequences, introduced by Vladimir Levenshtein. Peter H. Sellers
showed [6] that the two problems are equivalent.
reflect the belief that distance and similarity are interchangeable notions. Comments of this kind are not uncommon
in the literature [7, 8, 9], and they give the impression that similarity and distance (on sequences and trees) are
straightforwardly interchangeable.
1.2. Tree edit distance
Tai [10] introduced a criterion for matching nodes between tree representations, and Zhang and Shasha [11, 12]
developed an algorithm that finds an optimal matching for a given pair of trees. The importance of the algorithm is
that its computational cost is
$O(|t_1|\,|t_2|\,\min(\mathrm{depth}(t_1), \mathrm{leaves}(t_1))\,\min(\mathrm{depth}(t_2), \mathrm{leaves}(t_2)))$
and its spatial cost is $O(|t_1|\,|t_2|)$.
Proceedings of International Conference on Computing Sciences, WILKES100 ICCS 2013, ISBN: 978-93-5107-172-3, Elsevier Publications, 2013
* Corresponding author: Hector-Hugo Franco-Penya
A map $f$ defines a correspondence between the set of nodes of a source tree and the set of nodes of a target tree.
The image of a node $a$ under the map is written $f(a)$.
Figure 1: Map example
1.2.1. Tai mappings
A Tai map is a mapping between the nodes of two trees. It can be defined as follows:
S is the set of nodes of the source tree.
T is the set of nodes of the target tree.
$f$ is a map, $f : S \to T$, where S is the domain and T is the range.
If $f(a) = x$ and $f(b) = y$, the Tai mapping restrictions can be defined as:
1. One-to-one: one node cannot be matched with more than one node.
$a = b$ iff $x = y$.
2. Left-to-right order preserved: if a node $a$ is to the left of $b$, they cannot swap after the mapping.
$a < b$ iff $x < y$.
3. Ancestor order preserved: ancestor(a, b) iff ancestor(x, y) (see footnote 1).
Figure 2 shows a sample Tai mapping (2a), and samples of impermissible mappings in which the one-to-one restriction
is violated (2b), ancestry is not preserved (2c), and left-to-right order is not preserved (2d).
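The three restrictions can be checked mechanically. The following is a minimal sketch, assuming each node is identified by its preorder and postorder position; the `Node` class and all function names are hypothetical helpers introduced here for illustration and do not come from [10] or [11].

```python
from itertools import combinations


class Node:
    """A labelled node of an ordered tree (hypothetical helper type)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def index_tree(root):
    """Map every node to its (preorder, postorder) position."""
    order = {}
    counters = {"pre": 0, "post": 0}

    def visit(node):
        pre = counters["pre"]
        counters["pre"] += 1
        for child in node.children:
            visit(child)
        order[node] = (pre, counters["post"])
        counters["post"] += 1

    visit(root)
    return order


def is_ancestor(order, a, b):
    """a is a proper ancestor of b: earlier in preorder, later in postorder."""
    return order[a][0] < order[b][0] and order[a][1] > order[b][1]


def is_left_of(order, a, b):
    """a is to the left of b: earlier in both preorder and postorder."""
    return order[a][0] < order[b][0] and order[a][1] < order[b][1]


def is_tai_mapping(pairs, src_order, tgt_order):
    """Check the three restrictions for every two pairs (a, x) and (b, y)."""
    for (a, x), (b, y) in combinations(pairs, 2):
        if (a is b) != (x is y):
            return False  # one-to-one violated
        for p, q, fp, fq in ((a, b, x, y), (b, a, y, x)):
            if is_left_of(src_order, p, q) != is_left_of(tgt_order, fp, fq):
                return False  # left-to-right order violated
            if is_ancestor(src_order, p, q) != is_ancestor(tgt_order, fp, fq):
                return False  # ancestor order violated
    return True
```

A mapping is passed as a list of (source node, target node) pairs; nodes left out of the list are the ones the mapping does not touch. The check is quadratic in the number of pairs, which is adequate for illustration only.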
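To assign a score to a Tai mapping [10], here written $\sigma$, it is convenient to identify three sets:
$\mathcal{M}$: the pairs $(i, j) \in \sigma$: the matches and swaps
$\mathcal{D}$: the $i \in S$ s.t. $\forall j \in T$, $(i, j) \notin \sigma$: the deletions
$\mathcal{I}$: the $j \in T$ s.t. $\forall i \in S$, $(i, j) \notin \sigma$: the insertions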
Footnote 1: ancestor(a, b) should be read as: a is an ancestor of b.
Figure 2: Tai-mapping restrictions. (a) Tai map; (b) Multiple node match; (c) Ancestry not preserved; (d) Left to right order not preserved.
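$\mathcal{M}$ is just the mapping of the S nodes into the nodes of T. $\mathcal{D}$ and $\mathcal{I}$ are just the remaining nodes of S and T which
are not touched by the mapping. Let $(\cdot)^{\gamma}$ give the label of a node and let $C^\Delta$ be a cost table, indexed by $\Sigma \cup \{\lambda\}$,
where $\Sigma$ is the alphabet of labels and $\lambda$ stands for the empty label, which assigns costs to $\mathcal{M}$, $\mathcal{D}$ and $\mathcal{I}$ according to (see footnote 2):
for $(i, j) \in \mathcal{M}$ cost is $C^\Delta(i^{\gamma}, j^{\gamma})$
for $i \in \mathcal{D}$ cost is $C^\Delta(i^{\gamma}, \lambda)$
for $j \in \mathcal{I}$ cost is $C^\Delta(\lambda, j^{\gamma})$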
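Where $\sigma : S \to T$ is any 1-to-1 mapping from S into T, define $\Delta(\sigma : S \to T)$ by:
Definition 1.1. Distance scoring of an alignment
$$\Delta(\sigma : S \to T) = \sum_{(i,j) \in \mathcal{M}} C^\Delta(i^{\gamma}, j^{\gamma}) + \sum_{i \in \mathcal{D}} C^\Delta(i^{\gamma}, \lambda) + \sum_{j \in \mathcal{I}} C^\Delta(\lambda, j^{\gamma})$$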
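The distance score of an alignment is a plain sum over three lists of labels. A minimal sketch, assuming an alignment is already given as the labels of its matched/swapped pairs, its deleted nodes and its inserted nodes; `LAMBDA`, `distance_score` and `unit_cost` are illustrative names, and the example alignment is hypothetical rather than the one drawn in the figures.

```python
LAMBDA = None  # stands for the empty label used for deletions and insertions


def distance_score(matches, deletions, insertions, cost):
    """Sum the cost-table entries over matches/swaps, deletions and insertions.

    matches    -- list of (source label, target label) pairs
    deletions  -- labels of source nodes not touched by the mapping
    insertions -- labels of target nodes not touched by the mapping
    cost       -- function playing the role of the distance cost table
    """
    return (
        sum(cost(x, y) for x, y in matches)
        + sum(cost(x, LAMBDA) for x in deletions)
        + sum(cost(LAMBDA, y) for y in insertions)
    )


def unit_cost(x, y):
    """Unit table: 0 for a match, 1 for a swap, a deletion or an insertion."""
    return 0 if x == y else 1
```

For instance, `distance_score([("a", "a"), ("b", "c")], ["b"], ["a"], unit_cost)` adds one match, one swap, one deletion and one insertion.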
From these alignment costs, a distance score on tree pairs is defined by minimization:
Definition 1.2. Distance scoring of a tree pair S and T
The Tree- or Tai-distance $\Delta(S, T)$ between two trees S and T is the minimum value of $\Delta(\sigma : S \to T)$ over possible
Tai-mappings $\sigma$ from S to T, relative to a chosen cost table $C^\Delta$.
Footnote 2: Note that in this general setting even a pairing of two nodes with identical labels can make a non-zero cost contribution.
Figure 3 illustrates the definitions.
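[Figure 3: two trees with nodes labelled a, b and c and an alignment between them; the drawing is not recoverable from the source.]
With $C^\Delta(x, \lambda) = C^\Delta(\lambda, x) = 1$, $C^\Delta(x, x) = 0$, $C^\Delta(x, y) = 1$ for $x \neq y$,
the alignment has score $\Delta(\sigma) = 3$ and this is minimal for the given $C^\Delta$.
Figure 3: An illustration of tree distance.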

$\Delta(S, T)$ can be computed by the algorithm of [11]. Sequences can be encoded as vertical trees, and on this domain of
trees the tree distance coincides with a well known comparison measure on sequences, the (alphabet-weighted) string
edit distance [13, 14].
We have formulated the definition (see footnote 3) in terms of costs applied to mappings which respect tree-ordering properties.
In contrast to this declarative perspective, there is a procedural definition via the notion of an edit-script of atomic
operations transforming S to T in a succession of stages. For both sequences and trees the mapping-based and script-based
notions coincide [13, 10, 15], and so we omit further details of the definition via edit-scripts.
While the correctness of the Tai distance algorithm [11] (see footnote 4) does not require the cost-table $C^\Delta$ to satisfy any
particular properties, some settings of $C^\Delta$ clearly make little sense. The combination of deletion/insertion cost-entries
which are negative ($C^\Delta(x, \lambda) < 0$, $C^\Delta(\lambda, y) < 0$) with swap/match cost entries which are not negative gives the
counterintuitive effect that a supertree of S is closer (in the sense of having a lower score) to S than S itself (see footnote 5). This is
a rationale for the following non-negativity assumption:
$\forall x, y\ (C^\Delta(x, y) \ge 0,\ C^\Delta(x, \lambda) \ge 0,\ C^\Delta(\lambda, y) \ge 0)$ (1)
which is a nearly universal assumption, and from which it follows that $\Delta(S, T) \ge 0$, giving a minimum consistency
with the everyday notion of distance. In what follows we confine attention to distance based on a table $C^\Delta$
which satisfies at least (1).
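Because tree distance on vertical trees coincides with the alphabet-weighted string edit distance, the effect of negative deletion/insertion entries can be sketched on plain strings. The following is a minimal sketch using the standard dynamic programme of [13]; the function and parameter names are illustrative assumptions, not the implementation used in [1].

```python
def string_edit_distance(s, t, sub, dele, ins):
    """Alphabet-weighted edit distance between strings s and t.

    sub(x, y) -- cost of pairing symbol x with symbol y (match or swap)
    dele(x)   -- cost of deleting symbol x from s
    ins(y)    -- cost of inserting symbol y of t
    """
    rows, cols = len(s) + 1, len(t) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = d[i - 1][0] + dele(s[i - 1])
    for j in range(1, cols):
        d[0][j] = d[0][j - 1] + ins(t[j - 1])
    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(
                d[i - 1][j] + dele(s[i - 1]),              # delete s[i-1]
                d[i][j - 1] + ins(t[j - 1]),               # insert t[j-1]
                d[i - 1][j - 1] + sub(s[i - 1], t[j - 1]),  # match or swap
            )
    return d[-1][-1]


def swap(x, y):
    """Match costs 0, swap costs 1."""
    return 0 if x == y else 1


def unit(_symbol):
    """Non-negative deletion/insertion cost, consistent with (1)."""
    return 1


def negative(_symbol):
    """Negative deletion/insertion cost, violating (1)."""
    return -1


# With negative deletion/insertion entries the longer string "abc"
# scores lower against "ab" than "ab" does against itself.
self_score = string_edit_distance("ab", "ab", swap, negative, negative)
super_score = string_edit_distance("ab", "abc", swap, negative, negative)
```

With the non-negative `unit` costs the score of a string against itself is 0 and no other string can score lower, which is the minimum consistency that assumption (1) is meant to secure.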

Footnote 3: The literature contains quite a number of inequivalent notions, all referred to as tree distance; in this paper Definition 1.2 will be understood
to define the term.
Footnote 4: i.e. that it truly finds the minimal value of $\Delta(\sigma : S \to T)$ given cost-table $C^\Delta$.
Footnote 5: or a subtree.
When the cost-table $C^\Delta(x, y)$ is constrained more strictly than this to satisfy all the conditions of a distance-metric,
then it is well known that $\Delta(S, T)$ will also be a distance-metric. Whether such further restriction is desirable is moot:
in so-called stochastic variants [16, 17? ], in which the entries in $C^\Delta$ are interpreted as negated logs of probabilities (see footnote 6),
these additional distance-metric assumptions are not fulfilled. The present study only assumes that the cost-table $C^\Delta$
satisfies the non-negativity requirement of distance.

Figure 4: Alignments sample extracted from Isert [18]. (a) Alignment 1; (b) Alignment 2.
For this sample there are two different Tai maps that achieve the minimal cost. The different colours of the nodes denote
different labels.
Finally, it should be noted that for a pair of trees there can be more than one mapping
at the same distance; see the sample in Figure 4. The implementation used in the experiments of this work favours
swaps over deletions/insertions of equal cost. Therefore, as the nodes are processed in post-order traversal, mapping
(a) will prevail over mapping (b). This feature has an important impact on the results of the system when
deleting and inserting two nodes costs the same as swapping them.
1.2.2. Zhang and Shasha algorithm
Figure 5 shows a recursive description of how to find the cost of a minimal Tai map. Some modifications allow not
just the cost of the map to be found, but also the map itself.
The Zhang and Shasha algorithm presented in [11, 12] follows the same pattern, but successfully turns the recursion
into a dynamic programming procedure.
Since the definition of Tai mappings [10] and the implementation of [11], few new algorithms have surpassed the
computational efficiency of [11] while producing the same mappings [19, 20, 21].
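A recursive formulation of this kind can be sketched as follows. This is a memoised version of the standard rightmost-root forest recursion, given as an assumption about the general pattern rather than as a transcription of Figure 5 or of the algorithm of [11]; trees are represented as hypothetical nested tuples `(label, children)` and `None` stands for the empty label.

```python
from functools import lru_cache


def tree_distance(s, t, cost):
    """Minimal Tai-mapping cost between ordered trees s and t.

    A tree is a tuple (label, children) where children is a tuple of trees.
    cost(x, None) is a deletion, cost(None, y) an insertion and
    cost(x, y) a match or swap.
    """

    @lru_cache(maxsize=None)
    def forest_distance(f, g):
        if not f and not g:
            return 0
        if not g:
            label, kids = f[-1]
            # delete the rightmost root of f; its children join the forest
            return forest_distance(f[:-1] + kids, g) + cost(label, None)
        if not f:
            label, kids = g[-1]
            # insert the rightmost root of g
            return forest_distance(f, g[:-1] + kids) + cost(None, label)
        (v_label, v_kids), (w_label, w_kids) = f[-1], g[-1]
        return min(
            forest_distance(f[:-1] + v_kids, g) + cost(v_label, None),
            forest_distance(f, g[:-1] + w_kids) + cost(None, w_label),
            # pair the two rightmost roots, then solve the two sub-problems
            forest_distance(v_kids, w_kids)
            + forest_distance(f[:-1], g[:-1])
            + cost(v_label, w_label),
        )

    return forest_distance((s,), (t,))


def unit_cost(x, y):
    """0 for a match, 1 for a swap, a deletion or an insertion."""
    return 0 if x == y else 1
```

The memoisation only avoids recomputing identical sub-forests; it does not reproduce the key-root bookkeeping that gives [11] its stated complexity, so the sketch is suitable for small trees only.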
Turning now to similarity: rather than approaching the problem of comparison by minimizing accumulated costs
assigned to an alignment, a widely followed alternative, especially for sequence comparison, has been to maximize a
score assigned to an alignment, with swaps/matches rewarded and deletions/insertions punished.
Like $C^\Delta$, let $C^\Theta$ be a similarity table, again indexed by $\Sigma \cup \{\lambda\}$, where $\Sigma$ is the alphabet of labels (see footnote 7),
and where $\sigma : S \to T$ is any mapping from S to T, let $\Theta(\sigma : S \to T)$ be defined by
Definition 1.3. Similarity scoring of an alignment
Footnote 6: Therefore, the values are positive.
Footnote 7: To keep notational overhead to a minimum, we will use $x$ and $y$ for arbitrary members of the label alphabet $\Sigma$.
Figure 5: Tree distance. (a) Tree Distance; (b) Forest Distance.
The figure defines a function giving the forest/tree cost (distance) between A and B in terms of
a function giving the atomic cost of modifying a node x into a node y.
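$$\Theta(\sigma : S \to T) = \sum_{(i,j) \in \mathcal{M}} C^\Theta(i^{\gamma}, j^{\gamma}) - \sum_{i \in \mathcal{D}} C^\Theta(i^{\gamma}, \lambda) - \sum_{j \in \mathcal{I}} C^\Theta(\lambda, j^{\gamma}) \qquad (2)$$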
From this scoring of alignments, a similarity score on tree pairs is defined by maximisation:
Definition 1.4. Similarity scoring of a tree pair
The Tree- or Tai-similarity $\Theta(S, T)$ between two trees S and T is the maximum value of $\Theta(\sigma : S \to T)$ over possible
Tai-mappings $\sigma$ from S to T, relative to a chosen cost table $C^\Theta$.
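[Figure 6: the trees and alignment of Figure 3, repeated with similarity scores; the drawing is not recoverable from the source.]
With $C^\Theta(x, \lambda) = C^\Theta(\lambda, x) = 0$, $C^\Theta(x, x) = 2$, $C^\Theta(x, y) = 1$ for $x \neq y$,
the shown alignment has score $\Theta(\sigma) = 9$.
Figure 6: An illustration of tree similarity.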
Applied to the same example as shown in Figure 3, which is duplicated as Figure 6 but with similarity score
values, the shown alignment has score $\Theta(\sigma) = 9$, which is maximal for the given $C^\Theta$.
$\Theta(S, T)$ can be computed via a simple modification of the algorithm of [11]. On the domain of vertical trees this
coincides with the well known approach to sequence comparison, the (alphabet-weighted) string similarity [22, 14].
As with distance ($\Delta$), while the correctness of the algorithm for similarity ($\Theta$) is not dependent on any assumptions
about the cost-table $C^\Theta$, some settings of $C^\Theta$ make little sense. Given the formulation in Equation 2, which subtracts
the contribution from deletions and insertions, a setting where deletion/insertion cost entries are negative
($C^\Theta(x, \lambda) < 0$, $C^\Theta(\lambda, x) < 0$) gives the counter-intuitive effect that a supertree of S would be more similar
(in the sense of higher score) to S than S itself.
This gives a rationale for the nearly universal assumption of non-negative deletion/insertion entries in $C^\Theta$:
$\forall x, y\ (C^\Theta(x, \lambda) \ge 0,\ C^\Theta(\lambda, y) \ge 0)$ (3)
In what follows we will always confine attention to similarity based on a table $C^\Theta$ satisfying (3) (see footnote 8). For the
$C^\Theta$-entries which are not deletions or insertions, it is quite common in biological sequence comparisons to have both
positive and negative entries. In contrast to the notion of a distance-metric, the notion of a set of axioms for a similarity
is less well established. [? ] have recently made a proposal concerning this (see section ??).
To reiterate, for the purposes of this discussion a tree distance measure will imply a cost-table $C^\Delta$, satisfying (1),
used in accordance with Definitions 1.1 and 1.2 to score alignments and tree pairs. A tree similarity measure will imply
a cost-table $C^\Theta$, satisfying (3), used in accordance with Equation 2 and Definition 1.4 to score alignments and tree pairs. This is
sufficient to distinguish the distance approach from the similarity approach in an intuitive way without committing
to any further axioms.
1.3. Order-equivalence: Notions between Tai Distance and Similarity
Given a distance scoring of alignments, it can be set to work to induce orderings of at least three different
kinds of entities:
Alignment ordering: given fixed S and fixed T, rank the possible alignments $\sigma : S \to T$ by $\Delta(\sigma : S \to T)$.
Neighbour ordering: given fixed S and varying candidate neighbours $T_i$, rank the neighbours $T_i$ by $\Delta(S, T_i)$;
typically used in k-NN classification.
Pair ordering: given varying $S_i$ and varying $T_j$, rank the pairings $(S_i, T_j)$ by $\Delta(S_i, T_j)$; typically used in
hierarchical clustering.
Similarly, a similarity scoring of alignments induces orderings of the above kinds of entities. Comparing these
orderings motivates the following definition.
Definition 1.5. A-, N- and P-dual
When the alignment orderings induced by a choice of $C^\Delta$ (used in accordance with Definition 1.1) and by a choice of $C^\Theta$ (used
in accordance with Equation 2) are the reverse of each other, we will say that $C^\Theta$ is an A-dual of $C^\Delta$. Similarly we will say
we have an N-dual when neighbour ordering is reversed, and a P-dual when pair ordering is reversed.
For example, the following are A-duals in this sense (proven in section 1.5):
Footnote 8: While Equation 2 is formulated with deletion/insertion contributions subtracted, as is often done [22, 23], an alternative formulation
has these treated additively [14]. With the additive formulation, the same consideration suggests making deletion/insertion entries non-positive.
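Example 1.6.
$C^\Delta$ with: $C^\Delta(x, \lambda) = 0.5$, $C^\Delta(x, x) = 0$, $C^\Delta(x, y) = 0.5$ (for $x \neq y$)
$C^\Theta$ with: $C^\Theta(x, \lambda) = 0$, $C^\Theta(x, x) = 1$, $C^\Theta(x, y) = 0.5$ (for $x \neq y$)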
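On any finite collection of entities the duality of Definition 1.5 can be tested directly. The following minimal sketch treats the entities abstractly: they may be alignments (A-dual), neighbours of a fixed tree (N-dual) or tree pairs (P-dual). The function name and the way scores are supplied are assumptions made for illustration.

```python
from itertools import combinations


def are_order_duals(items, distance, similarity):
    """True when the two scorings rank the items in exactly reversed order.

    distance(item) and similarity(item) return the score of one item.
    Ties must coincide: equal distances require equal similarities.
    """
    for x, y in combinations(items, 2):
        d_x, d_y = distance(x), distance(y)
        s_x, s_y = similarity(x), similarity(y)
        if (d_x < d_y) != (s_x > s_y):
            return False
        if (d_x > d_y) != (s_x < s_y):
            return False
    return True
```

Such a check can refute a duality on a sample, but a positive answer only covers the items supplied; the conjectures that follow quantify over all trees.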
1.4. Order-relating Conjectures
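A natural question that presents itself then is whether for every choice of $C^\Delta$ there is a choice of $C^\Theta$ which is an
A-dual, N-dual or P-dual, and vice-versa. More precisely, there are the following conjectures:
A-duality
(i) $\forall C^\Delta\ \exists C^\Theta$ ($C^\Delta$ and $C^\Theta$ are A-duals)
(ii) $\forall C^\Theta\ \exists C^\Delta$ ($C^\Theta$ and $C^\Delta$ are A-duals)
N-duality
(i) $\forall C^\Delta\ \exists C^\Theta$ ($C^\Delta$ and $C^\Theta$ are N-duals)
(ii) $\forall C^\Theta\ \exists C^\Delta$ ($C^\Theta$ and $C^\Delta$ are N-duals)
P-duality
(i) $\forall C^\Delta\ \exists C^\Theta$ ($C^\Delta$ and $C^\Theta$ are P-duals)
(ii) $\forall C^\Theta\ \exists C^\Delta$ ($C^\Theta$ and $C^\Delta$ are P-duals)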
Arguably these notions go to the heart of the question whether there is really anything that can be accomplished
using an alignment distance score which cannot be accomplished via an alignment similarity score, and vice-versa.
For example, if it turns out that all these order conjectures hold, then any alignment outcome, any categorisation
outcome via k-NN and any hierarchical clustering outcome achieved by a particular distance can be replicated by a
similarity, and vice-versa, making the choice merely a matter of personal taste. On the other hand, if these duality
conjectures do not hold, then there is a substantive difference, with the outcomes achievable by distances and similarities
being distinct.
For a number of similarity and distance measures based on sets and vectors, notions analogous to N-dual and P-
dual have been considered [2, 3, 4], motivated similarly by the question whether anything which can be accomplished
by one such measure can be replicated by another such measure.
1.5. Alignment-duality
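The following lemma will be useful for considering the A-duality conjectures above:
Lemma 1.7. For any $C^\Delta$, and some choice of $\delta$ such that
$0 \le \delta/2 \le \min(C^\Delta(\cdot, \lambda), C^\Delta(\lambda, \cdot))$, let $C^\Theta$ be defined according to (i) below. For any $C^\Theta$, and a choice of $\delta$ such that
$0 \le \delta$ and $\delta \ge \max(C^\Theta(\cdot, \cdot))$, let $C^\Delta$ be defined according to (ii) below.
(i)
$C^\Theta(x, \lambda) = C^\Delta(x, \lambda) - \delta/2$
$C^\Theta(\lambda, y) = C^\Delta(\lambda, y) - \delta/2$
$C^\Theta(x, y) = \delta - C^\Delta(x, y)$
(ii)
$C^\Delta(x, \lambda) = C^\Theta(x, \lambda) + \delta/2$
$C^\Delta(\lambda, y) = C^\Theta(\lambda, y) + \delta/2$
$C^\Delta(x, y) = \delta - C^\Theta(x, y)$
then in either case, for any $\sigma : S \to T$
$$\Delta(\sigma) + \Theta(\sigma) = \delta/2 \times \Big(\sum_{s \in S} 1 + \sum_{t \in T} 1\Big) \qquad (4)$$
The proof of Lemma 1.7, including the alignment sum property (4), can be found in the appendix of [1].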
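The two conversions and the sum property (4) can be checked numerically. The following is a minimal sketch, assuming that (i) subtracts half of a constant from deletion/insertion entries and replaces every other entry by the constant minus the entry, and that (ii) is the inverse; the constant is called `delta` here, and all function names are illustrative. The distance table is the one of Example 1.6.

```python
LAMBDA = None  # stands for the empty label


def to_similarity(c_dist, delta):
    """Conversion (i): build a similarity table from a distance table."""
    def c_sim(x, y):
        if x is LAMBDA or y is LAMBDA:
            return c_dist(x, y) - delta / 2
        return delta - c_dist(x, y)
    return c_sim


def to_distance(c_sim, delta):
    """Conversion (ii): build a distance table from a similarity table."""
    def c_dist(x, y):
        if x is LAMBDA or y is LAMBDA:
            return c_sim(x, y) + delta / 2
        return delta - c_sim(x, y)
    return c_dist


def distance_score(alignment, c_dist):
    """Definition 1.1: all three contributions are added."""
    matches, deletions, insertions = alignment
    return (sum(c_dist(x, y) for x, y in matches)
            + sum(c_dist(x, LAMBDA) for x in deletions)
            + sum(c_dist(LAMBDA, y) for y in insertions))


def similarity_score(alignment, c_sim):
    """Equation 2: deletions and insertions are subtracted."""
    matches, deletions, insertions = alignment
    return (sum(c_sim(x, y) for x, y in matches)
            - sum(c_sim(x, LAMBDA) for x in deletions)
            - sum(c_sim(LAMBDA, y) for y in insertions))


def node_count(alignment):
    """Number of nodes of both trees: each match covers two nodes."""
    matches, deletions, insertions = alignment
    return 2 * len(matches) + len(deletions) + len(insertions)


def example_distance_table(x, y):
    """Distance table of Example 1.6: 0 for a match, 0.5 otherwise."""
    return 0.0 if x == y else 0.5
```

With `delta = 1.0`, `to_similarity(example_distance_table, 1.0)` reproduces the similarity table of Example 1.6, and for any alignment the two scores add up to `delta / 2 * node_count(alignment)`. Since that total is fixed for a given pair of trees, a lower distance score always means a higher similarity score, which is what A-duality asks for.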
Theorem 1.8. A-duality (i) and (ii) hold
The proof of Theorem 1.8 can be found in [1]
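Corollary 1.9 (shifting). For any $C^\Theta_1$, an alignment-equivalent $C^\Theta_2$ can be derived by the conversion (a) below,
and for any $C^\Delta_1$, an alignment-equivalent $C^\Delta_2$ can be derived by the conversion (b):
(a)
$C^\Theta_2(x, \lambda) = C^\Theta_1(x, \lambda) - \kappa/2$
$C^\Theta_2(\lambda, y) = C^\Theta_1(\lambda, y) - \kappa/2$
$C^\Theta_2(x, y) = C^\Theta_1(x, y) + \kappa$
(b)
$C^\Delta_2(x, \lambda) = C^\Delta_1(x, \lambda) + \kappa/2$
$C^\Delta_2(\lambda, y) = C^\Delta_1(\lambda, y) + \kappa/2$
$C^\Delta_2(x, y) = C^\Delta_1(x, y) + \kappa$
Proof 1.10. Proof of Corollary 1.9 (explained in [1]). (a) is the composition of (ii), for some $\delta_1$, with (i), for some
$\delta_2$, giving $\kappa = \delta_2 - \delta_1$. (b) is the composition of (i), for some $\delta_1$, with (ii), for some $\delta_2$, giving $\kappa = \delta_2 - \delta_1$.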
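The shifting of a distance table can be illustrated numerically. This minimal sketch reads the distance-table conversion of Corollary 1.9 as adding half of a shift to every deletion/insertion entry and the full shift to every match/swap entry; the name `kappa` for the shift and all function names are assumptions made for illustration.

```python
LAMBDA = None  # stands for the empty label


def shift_distance_table(c_dist, kappa):
    """Return a distance table shifted by kappa (kappa/2 for deletion/insertion)."""
    def shifted(x, y):
        if x is LAMBDA or y is LAMBDA:
            return c_dist(x, y) + kappa / 2
        return c_dist(x, y) + kappa
    return shifted


def distance_score(alignment, c_dist):
    """Sum of table entries over matches/swaps, deletions and insertions."""
    matches, deletions, insertions = alignment
    return (sum(c_dist(x, y) for x, y in matches)
            + sum(c_dist(x, LAMBDA) for x in deletions)
            + sum(c_dist(LAMBDA, y) for y in insertions))


def node_count(alignment):
    """Number of nodes of both trees covered by the alignment."""
    matches, deletions, insertions = alignment
    return 2 * len(matches) + len(deletions) + len(insertions)


def unit_cost(x, y):
    """0 for a match, 1 for a swap, a deletion or an insertion."""
    return 0 if x == y else 1


# Two hypothetical alignments of the same pair of three-node trees.
first = ([("a", "a"), ("b", "c")], ["b"], ["a"])
second = ([("a", "a")], ["b", "b"], ["c", "a"])
shifted = shift_distance_table(unit_cost, 2)
```

Every alignment of the same pair of trees gains the same amount, `kappa / 2 * node_count(alignment)`, so the ranking of alignments is unchanged, which is the sense in which the two tables are alignment-equivalent.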
The property of alignment dualizability between distance and similarity (and vice-versa) expressed above in Lemma 1.7
and Theorem 1.8 was essentially first proven for the case of sequence comparison by [22]. On this basis it might seem
that distance and similarity concepts are interchangeable. However, as noted in Section 1.3, there
is more than one kind of ordering that one might wish to be sure of replicating in switching between distance and
similarity, with N-duality coming to the fore in the context of k-NN classification, and P-duality coming to the fore
in the context of hierarchical clustering. Section 1.6.1 considers the N-duality (i) and P-duality (i) order conjectures,
and Section 1.6.2 considers the N-duality (ii) and P-duality (ii) conjectures.
1.6. Neighbour and pair ordering
1.6.1. Distance to Similarity
Having seen that A-duals can always be created in both directions, attention shifts to N-duals and P-duals.
The case of using $\delta = 0$ in the (i) conversion of Lemma 1.7 from $C^\Delta$ to $C^\Theta$ gives non-positive values for all
non-deletion, non-insertion entries in $C^\Theta$, and is an especially trivial case of dualizing a distance setting $C^\Delta$, with the
effect that $\Theta(S, T) = -1 \times \Delta(S, T)$. Because of this, this distance-to-similarity conversion not only makes A-duals, but
also N-duals and P-duals.
Theorem 1.11. N-duality (i) and P-duality (i) hold.
Proof 1.12. Proof of Theorem 1.11 (explained in [1]). By choosing $\delta = 0$ in the (i) conversion of Lemma 1.7 from $C^\Delta$
to $C^\Theta$, we have $\Theta(S, T) = -1 \times \Delta(S, T)$, and hence $\Delta(S_1, T_1) \le \Delta(S_2, T_2) \iff \Theta(S_1, T_1) \ge \Theta(S_2, T_2)$.
This distance-to-similarity conversion by negation is well known. On the other hand, concerning similarity-to-distance, in the
(ii) conversion of Lemma 1.7 from $C^\Theta$ to $C^\Delta$, one can only choose $\delta = 0$ if all $C^\Theta(x, y) \le 0$, and clearly there are many
natural settings of $C^\Theta$ where that is not true.
1.6.2. Similarity to Distance
The remaining order-equivalence conjectures of section 1.4 are N-duality (ii) and P-duality (ii), concerning the
similarity-to-distance direction. Of the remaining conjectures, P-duality (ii) is stronger than N-duality (ii). We can fairly easily
show that P-duality (ii) does not hold.
Theorem 1.13. P-duality (ii) does not hold, that is, there are $C^\Theta$ such that there is no $C^\Delta$ such that $C^\Theta$ and $C^\Delta$ are
P-duals.
Of the order-relating conjectures of section 1.4 the only remaining one is N-duality (ii), that is, the question
whether every neighbour ordering via some $C^\Theta$ can be replicated by a neighbour ordering via some $C^\Delta$. We can show
that there are neighbour orderings by a Tai-similarity which cannot be dualized by any Tai-distance whose deletion
and insertion costs are symmetric.
Theorem 1.14. There is $C^\Theta$ such that there is no $C^\Delta$ with $C^\Delta(x, \lambda) = C^\Delta(\lambda, x)$ such that $C^\Theta$ and $C^\Delta$ are N-duals.
2. N-Duality implies P-Duality
Theorem 2.1. N-duality implies P-duality.
Proof 2.2. Proof of Theorem 2.1. If $C^\Delta$ and $C^\Theta$ are N-duals, does that imply that $C^\Delta$ and $C^\Theta$ are P-duals? That is, does
$$\Delta(S_0, T_0) < \Delta(S_1, T_1) \iff \Theta(S_0, T_0) > \Theta(S_1, T_1)$$
hold, making $C^\Delta$ and $C^\Theta$ P-duals?
Assume that it is always possible to find a sequence of n trees $(X_1 \ldots X_n)$ between $T_0$ and $S_1$ that
satisfies:
$$\Delta(S_0, T_0) < \Delta(T_0, X_1) \le \Delta(X_1, X_2) \le \ldots \le \Delta(X_n, S_1) \le \Delta(S_1, T_1)$$
As $C^\Delta$ and $C^\Theta$ are N-duals, and in the previous sequence of inequalities each item is a distance comparison
which shares one tree with each of its neighbour items, it is possible to create a new sequence of inequalities
by using the N-duality property: each $\Delta(X_{m-1}, X_m) < \Delta(X_m, X_{m+1})$ implies $\Theta(X_{m-1}, X_m) > \Theta(X_m, X_{m+1})$,
which leads to:
$$\Theta(S_0, T_0) > \Theta(T_0, X_1) \ge \Theta(X_1, X_2) \ge \ldots \ge \Theta(X_n, S_1) \ge \Theta(S_1, T_1)$$
Therefore, as the proof works in both directions:
$$\Delta(S_0, T_0) < \Delta(S_1, T_1) \iff \Theta(S_0, T_0) > \Theta(S_1, T_1)$$
N-duality and P-duality are the same property.
Remark. The proof requires a demonstration that it is always possible to find the sequence of trees $X_1 \ldots X_n$. It is
easy to find an example for which it is impossible to find such a sequence; however, it is easy to find such a sequence in an
extended alphabet $\Sigma^{+}$ in which we can define the atomic costs $C^{+}$ for the new letters of the alphabet.
(5)
This may not hold if the triangle inequality property is not preserved.
Remark. On the assumption of symmetry and triangular inequality.
In [1] it was argued that a distance measure should not be negative, because the concept of a negative distance
is nonsensical. This has an important consequence: the ordering produced by any distance measure can
be reproduced by a similarity measure, but not every ordering obtained by a similarity measure can be replicated
through a distance measure.
In the same way we can assume the symmetry property for distance (and for similarity as well):
$\forall A\ \forall B:\ C(A, B) = C(B, A)$ (6)
where $C$ stands for the cost table in question.
The distance from the top of a mountain to the valley should be the same as the distance from the valley to the top of
the mountain. However, a cost score need not be symmetric (and can even be negative): for instance, the
energy consumed starting from the valley can be much higher than the energy consumed starting from the top.
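3. Normalization effect
Theorem 3.1. Normalising A-dual measures (dividing them by the number of nodes) makes them N-duals and P-duals.
Proof 3.2. Proof of Theorem 3.1.
As used in Equation 4, for any given mapping, twice the number of matches and swaps plus the numbers of deletions and
insertions add up exactly to the number of nodes in both trees together (with $\mathcal{M}$, $\mathcal{D}$ and $\mathcal{I}$ the sets of
matches/swaps, deletions and insertions):
$$2|\mathcal{M}| + |\mathcal{D}| + |\mathcal{I}| = \sum_{s \in S} 1 + \sum_{t \in T} 1$$
Therefore, normalising by the number of nodes in both trees is the same as normalising by $2|\mathcal{M}| + |\mathcal{D}| + |\mathcal{I}|$ for any given
mapping.
If Equation 4 is normalized, then:
$$\frac{\Delta(\sigma) + \Theta(\sigma)}{\sum_{s \in S} 1 + \sum_{t \in T} 1} = \frac{\delta}{2} \times \frac{\sum_{s \in S} 1 + \sum_{t \in T} 1}{\sum_{s \in S} 1 + \sum_{t \in T} 1}$$
Writing $\hat{\Delta}$ and $\hat{\Theta}$ for the normalized scores, that makes:
$$\hat{\Delta}(\sigma) + \hat{\Theta}(\sigma) = \delta/2$$
As the value of $\delta$ is independent of the pair of trees, for any mapping $x$:
$$\hat{\Delta}(x) = \delta/2 - \hat{\Theta}(x)$$
Let $\sigma$ and $\sigma'$ be two mappings between different pairs of trees.
If $\hat{\Delta}(\sigma) < \hat{\Delta}(\sigma')$ then:
$$\delta/2 - \hat{\Theta}(\sigma) < \delta/2 - \hat{\Theta}(\sigma')$$
and therefore $\hat{\Theta}(\sigma) > \hat{\Theta}(\sigma')$, which makes them A-dual, N-dual and P-dual.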
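The effect of normalisation can be seen on a small numerical example. The sketch below uses the A-dual tables of Example 1.6 and two hypothetical alignments between different pairs of trees; the alignments and all names are assumptions introduced for illustration.

```python
LAMBDA = None  # stands for the empty label


def c_dist(x, y):
    """Distance table of Example 1.6: 0 for a match, 0.5 otherwise."""
    return 0.0 if x == y else 0.5


def c_sim(x, y):
    """Similarity table of Example 1.6: 0 for deletion/insertion, 1 match, 0.5 swap."""
    if x is LAMBDA or y is LAMBDA:
        return 0.0
    return 1.0 if x == y else 0.5


def node_count(alignment):
    """Nodes of both trees: twice the matches plus deletions plus insertions."""
    matches, deletions, insertions = alignment
    return 2 * len(matches) + len(deletions) + len(insertions)


def normalised_distance(alignment):
    """Distance score of the alignment divided by the number of nodes."""
    matches, deletions, insertions = alignment
    raw = (sum(c_dist(x, y) for x, y in matches)
           + sum(c_dist(x, LAMBDA) for x in deletions)
           + sum(c_dist(LAMBDA, y) for y in insertions))
    return raw / node_count(alignment)


def normalised_similarity(alignment):
    """Similarity score of the alignment divided by the number of nodes."""
    matches, deletions, insertions = alignment
    raw = (sum(c_sim(x, y) for x, y in matches)
           - sum(c_sim(x, LAMBDA) for x in deletions)
           - sum(c_sim(LAMBDA, y) for y in insertions))
    return raw / node_count(alignment)


# Alignments between two different pairs of trees (6 and 5 nodes in total).
first = ([("a", "a"), ("b", "c")], ["b"], ["a"])
second = ([("a", "a")], [], ["b", "c", "d"])
```

Before normalisation both alignments have the same distance score (1.5) but different similarity scores (1.5 and 1.0), so the raw scores do not rank these two pairs in reversed order. After normalisation each pair of scores adds up to the same constant, and the alignment with the lower normalised distance is the one with the higher normalised similarity.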
Acknowledgements
Thanks to Martin Emms for his kind advice on this work.
References
[1] M. Emms, H.-H. Franco-Penya, On order equivalences between distance and similarity measures on sequences and trees, in: ICPRAM (1),
2012, pp. 15-24.
[2] V. Batagelj, M. Bren, Comparing resemblance measures, Journal of Classification 12 (1) (1995) 73-90.
[3] J.-F. Omhover, M. Rifqi, M. Detyniecki, Ranking Invariance Based on Similarity Measures in Document Retrieval, in: Adaptive Multimedia
Retrieval, 2006, pp. 55-64.
[4] M.-J. Lesot, M. Rifqi, Order-based equivalence degrees for similarity and distance measures, in: Proceedings of the Computational
intelligence for knowledge-based systems design, and 13th international conference on Information processing and management of uncertainty,
IPMU'10, Springer-Verlag, Berlin, Heidelberg, 2010, pp. 19-28.
[5] Wikipedia, Needleman-Wunsch algorithm (2012).
[6] P. H. Sellers, On the theory and computation of evolutionary distances, SIAM Journal on Applied Mathematics 26 (4) (1974) 787-793.
doi:10.1137/0126070.
URL http://link.aip.org/link/?SMM/26/787/1
[7] C. E. R. Alves, E. N. Cáceres, F. Dehne, Parallel dynamic programming for solving the string editing problem on a CGM/BSP, in: Proceedings
of the fourteenth annual ACM symposium on Parallel algorithms and architectures, SPAA '02, ACM, New York, NY, USA, 2002, pp. 275-281.
[8] G. Kondrak, Phonetic alignment and similarity, Computers and the Humanities 37.
[9] R. P. J. C. Bose, W. M. P. van der Aalst, Context Aware Trace Clustering: Towards Improving Process Mining Results, in: SIAM International
Conference on Data Mining, SDM, 2009, pp. 401-412.
[10] K.-C. Tai, The tree-to-tree correction problem, Journal of the ACM (JACM) 26 (3) (1979) 422-433.
doi:http://doi.acm.org/10.1145/322139.322143.
URL http://dl.acm.org/citation.cfm?id=322143
[11] K. Zhang, D. Shasha, Simple fast algorithms for the editing distance between trees and related problems, SIAM J. Comput. 18 (6) (1989)
1245-1262. doi:http://dx.doi.org/10.1137/0218082.
[12] D. Shasha, K. Zhang, Fast algorithms for the unit cost editing distance between trees, J. Algorithms 11 (4) (1990) 581-621.
doi:http://dx.doi.org/10.1016/0196-6774(90)90011-3.
[13] R. A. Wagner, M. J. Fischer, The String-to-String Correction Problem, Journal of the Association for Computing Machinery 21 (1) (1974)
168-173.
[14] D. Gusfield, Algorithms on strings, trees and sequences: computer science and computational biology, Cambridge Univ. Press, 1997.
[15] T. Kuboyama, Matching and Learning in Trees, Ph.D. thesis, University of Tokyo (2007).
[16] E. S. Ristad, P. N. Yianilos, Learning String Edit Distance, IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (5) (1998)
522-532.
[17] M. Bernard, L. Boyer, A. Habrard, M. Sebban, Learning probabilistic models of tree edit distance, Pattern Recognition 41 (8) (2008)
2611-2629. doi:http://dx.doi.org/10.1016/j.patcog.2008.01.011.
[18] C. Isert, The Editing Distance Between Trees, Institut für Informatik, Technische Universität München, Tech. rep. (1999).
[19] P. N. Klein, Computing the Edit-Distance Between Unrooted Ordered Trees, in: Proceedings of the 6th Ann. European Symp. on
Algorithms (ESA), ESA '98, Springer Berlin Heidelberg, 1998, pp. 91-102.
[20] E. D. Demaine, S. Mozes, B. Rossman, O. Weimann, An optimal decomposition algorithm for tree edit distance, ACM Transactions on
Algorithms 6 (1) (2009) 1-19. doi:10.1145/1644015.1644017.
URL http://portal.acm.org/citation.cfm?doid=1644015.1644017
[21] M. Pawlik, N. Augsten, RTED: a robust algorithm for the tree edit distance, Proceedings of the VLDB Endowment 5 (4) (2011) 334-345.
URL http://dl.acm.org/citation.cfm?id=2095692
[22] T. F. Smith, M. S. Waterman, Comparison of biosequences, Advances in Applied Mathematics 2 (4) (1981) 482-489.
[23] A. Stojmirovic, Y.-K. Yu, Geometric Aspects of Biological Sequence Comparison, Journal of Computational Biology 16 (2009) 579-610.