Modern Information Processing: From Theory to Applications
Ebook, 936 pages

About this ebook

The volume "Modern Information Processing: From Theory to Applications," edited by Bernadette Bouchon-Meunier, Giulianella Coletti and Ronald Yager, is a collection of carefully selected papers drawn from the program of IPMU'04, which was held in Perugia, Italy.

The book reflects the cultural policy of the IPMU conference, which is not focused on a narrow range of methodologies but, on the contrary, welcomes all theories for the management of uncertainty and the aggregation of information in intelligent systems, providing a medium for the exchange of ideas between theoreticians and practitioners in these and related areas.

The book is composed of seven sections:

UNCERTAINTY
PREFERENCES
CLASSIFICATION AND DATA MINING
AGGREGATION AND MULTI-CRITERIA DECISION MAKING
KNOWLEDGE REPRESENTATION
APPLIED DOMAINS

•The book contributes to enhancing our ability to deal effectively with uncertainty in all of its manifestations.
•The book can help to build bridges among theories and methods for the management of uncertainty.
•The book addresses issues which have a position of centrality in our information-centric world.
•The book presents interesting results devoted to representing knowledge: the goal is to capture the subtlety of human knowledge (richness) and to allow computer manipulation (formalization).
•The book contributes to the goal of an efficient use of information for a good decision strategy.
Language: English
Release date: Oct 13, 2011
ISBN: 9780080461694

    Book preview

    Modern Information Processing - Bernadette Bouchon-Meunier

    Uncertainty

    Entropies, Characterizations, Applications and Some History

    János Aczél, Faculty of Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada. E-mail address: jdaczel@math.uwaterloo.ca

    Abstract

    Entropies with useful and/or interesting properties are presented. Characterizations based on such properties are given, and some applications are mentioned. Attention is directed to an example of discovery and rediscovery, and to new applications in utility theory.

    1 INTRODUCTION, ENTROPY

    Defining new entropies in addition to the classical Shannon entropy seems to be an ongoing industry. I am still convinced, however, of what I wrote in [1]:

    "In the best of all possible worlds there is an information measure that originated from an applied problem, has interesting properties (usually attractive, reasonable generalizations of properties of Shannon’s entropy or of similar widely used measures), and those characterize it. Less ideal but still acceptable is in my opinion the following situation. Some natural looking weakening or generalization of properties characterizing Shannon type measures are isolated and all measures having these properties are determined. If the properties are indeed intuitive and significant then there is a good chance that the measures thus obtained may have future applications.

    But what many authors seem to do is to contrive some generalization of known information measures (usually by sticking parameters almost at random here and there), derive its often not very interesting or natural or even attractive properties and then characterize by several of these properties the ‘measures’ they have defined in the first place. Not many good or useful results can be expected from this kind of activity."

    An earlier version of the present paper appeared in [2]. As there, I express here too my belief that two families of probabilistic entropies (of which Shannon's entropy is a limit case) suffice, in addition to entropies depending on objects other than probabilities. I will define them and state (without proof but with references) some of their properties and characterizations that I consider reasonable, and mention some applications.

    Our models will mostly be complete systems of mutually exclusive events E_1, …, E_n (such as the possible outcomes of an experiment), with probabilities p_1, …, p_n, respectively.

    Entropies are measures of uncertainty in, or measures of information expected from, such systems.

    2 SHANNON ENTROPY

    The Shannon entropy is defined (for probabilities p_k > 0) by

    (1)  H_n(p_1, …, p_n) = − ∑_{k=1}^{n} p_k log_2 p_k.

    If zero probabilities are also admitted, then the convention 0 · log_2 0 = 0 is used. Here n = 1 is also permissible and gives the entropy of a single event.

    We take here the following 'reasonable' properties of the Shannon entropy. It is

    (i) SYMMETRIC: H_n is a symmetric function of its n variables (invariant under exchange of p_j and p_k (j, k = 1, …, n)),

    (n) NORMALIZED: H_2(1/2, 1/2) = 1,

    (e) EXPANSIBLE: H_{n+1}(p_1, …, p_n, 0) = H_n(p_1, …, p_n): enlarging by an event of probability 0 does not change the entropy (expected information),

    (c) SMALL FOR SMALL PROBABILITIES: H_2(1, 0) = 0, lim_{q→0} H_2(1 − q, q) = 0: little information can be expected if one of the events is almost certain (has probability close to 1, thus the other(s) close to 0). Thus H_2 is continuous at (1, 0).

    Now we get to important properties. We deal with three experiments: P with outcomes D_1, …, D_m, probabilities p_j = p(D_j) (j = 1, …, m);

    Q with outcomes E_1, …, E_n, probabilities q_k = p(E_k) (k = 1, …, n);

    "P and Q", denoted by P*Q, with outcomes "D_j and E_k", denoted by D_jE_k, having the probabilities p_jk = p(D_jE_k) (j = 1, …, m; k = 1, …, n).

    The respective entropies (information expected from experiments P, Q, P*Q) are H(P), H(Q), H(P * Q). The remaining two properties are:

    (s) SUBADDITIVITY: H(P*Q) ≤ H(P) + H(Q): information expected from two experiments is not greater than the sum of the informations expected from the single experiments;

    (a) ADDITIVITY: H(P*Q) = H(P) + H(Q) if P and Q are independent, that is, if p(D_jE_k) = p(D_j)p(E_k) (j = 1, …, m; k = 1, …, n), or, in words: information expected from two independent experiments equals the sum of the informations expected from the single experiments.
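    To make properties (s) and (a) concrete, here is a minimal numerical check in Python; the joint distribution is hypothetical, chosen only for illustration.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (1) in bits; the convention 0*log2(0) = 0 is used."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Hypothetical joint distribution p_jk of the combined experiment P*Q
# (rows: outcomes of P, columns: outcomes of Q).
pjk = np.array([[0.20, 0.10, 0.05],
                [0.05, 0.30, 0.30]])
p = pjk.sum(axis=1)   # marginal distribution of P
q = pjk.sum(axis=0)   # marginal distribution of Q

# (s) subadditivity: H(P*Q) <= H(P) + H(Q)
print(shannon_entropy(pjk.ravel()) <= shannon_entropy(p) + shannon_entropy(q))  # True

# (a) additivity holds when P and Q are independent, i.e. p_jk = p_j * q_k
indep = np.outer(p, q)
print(np.isclose(shannon_entropy(indep.ravel()),
                 shannon_entropy(p) + shannon_entropy(q)))                      # True
```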

    The following characterization theorem has been proved in [6] (see also [4]).

    THEOREM 1. If and only if H is (i) symmetric, (n) normalized, (e) expansible, (c) small for small probabilities, (s) subadditive and (a) additive, then H is the Shannon entropy.

    NOTE 2. The nonnegativity of the constant multiplier is guaranteed by (s). Without (c) and (n) a nonnegative constant times the logarithm of the number of events with positive probability can be added. The latter logarithm is often called Hartley entropy.

    An older, classical characterization of the Shannon entropy is by

    (r) RECURSIVITY: for p_1 + p_2 > 0,

    H_n(p_1, p_2, p_3, …, p_n) = H_{n−1}(p_1 + p_2, p_3, …, p_n) + (p_1 + p_2) H_2( p_1/(p_1 + p_2), p_2/(p_1 + p_2) ).

    Improving a result of D. K. Faddeev [17], Z. Daróczy [13,4] has proved the following.

    THEOREM 2. If and only if H is (i) symmetric, (n) normalized, (c) small for small probabilities, and (r) recursive, then H is the Shannon entropy (1).

    G. T. Diderrich [16] weakened condition (c) to boundedness on an interval (square).

    3 RÉNYI ENTROPY

    The Shannon entropy for positive probabilities is the weighted arithmetic mean (with the probabilities as weights) of the quantities −log_2 p_k (k = 1, …, n), which can be considered (see Note 1) entropies of single events. The arithmetic mean is not the only interesting average and the Shannon entropy is not the only interesting entropy.

    The Rényi [24,25] (see also [4]) entropy of order α is defined by

    (2)  _αH_n(p_1, …, p_n) = (1/(1 − α)) log_2 ( ∑_{k=1}^{n} p_k^α )

    [p_k > 0 (k = 1, …, n) for the sake of simplicity]. Here α ≠ 1, but lim_{α→1} _αH_n(p_1, …, p_n) is the Shannon entropy (1).

    For further reference we define a straightforward generalization of the weighted arithmetic mean of the entropies −log_2 p_k of the single events, weighted by p_k (k = 1, …, n):

    (3)  f^{−1}( ∑_{k=1}^{n} p_k f(−log_2 p_k) ),

    where f is a continuous, strictly monotonic function and f^{−1} is its inverse.

    The Rényi entropies of positive order (including the Shannon entropy as of order 1) have the following characterization ([3], see also [4]).

    THEOREM 3. The weighted quasiarithmetic mean (3) is (a) additive and (c) small for small probabilities if and only if it is either the Shannon entropy (1) or a Rényi entropy (2) of positive (but ≠ 1) order.

    The Rényi entropies _αH_n (α ≠ 1) need not be (s) subadditive. Rényi entropies have applications to random search problems [24,25], questionnaire theory [8], optimal coding (the greatest lower bounds of the arithmetic or exponential mean codeword lengths are the Shannon and the Rényi entropies, respectively) [11], and even to differential geometry [12].
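    As a quick illustration of definition (2) and of the Shannon entropy as its α → 1 limit, a small Python sketch (the probability vector is arbitrary):

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Rényi entropy of order alpha (alpha > 0, alpha != 1), in bits, per (2)."""
    p = np.asarray(p, dtype=float)
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def shannon_entropy(p):
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log2(p))

p = [0.5, 0.25, 0.125, 0.125]             # all probabilities positive, as in (2)
for alpha in (0.5, 0.999, 1.001, 2.0):
    print(alpha, renyi_entropy(p, alpha))
print("Shannon:", shannon_entropy(p))     # the values near alpha = 1 approach this
```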

    4 HAVRDA-CHARVÁT-DARÓCZY-TSALLIS ENTROPY

    In 1988, C. Tsallis [27] introduced the entropy

    (4)  ( ∑_{k=1}^{n} p_k^α − 1 ) / (1 − α)

    (this is an equivalent form, slightly altered for the sake of the comparisons below). It has been named the Tsallis entropy and applied to generalizing Boltzmann-Gibbs statistical mechanics and related fields [27,26]. Characterizations were also supplied, including the most concise set of axioms in [26].

    Already in 1967, however, J. Havrda and F. Charvát [19] defined the entropy

    (5)  ( ∑_{k=1}^{n} p_k^α − 1 ) / (2^{1−α} − 1),

    that is, a constant times (4). In 1970 Z. Daróczy [14] studied and characterized these entropies of degree α; their limit as α → 1 is the Shannon entropy. The entropy (5) (and also (4)) satisfies

    (r_α) RECURSIVITY OF DEGREE α: for p_1 + p_2 > 0,

    H_n(p_1, p_2, p_3, …, p_n) = H_{n−1}(p_1 + p_2, p_3, …, p_n) + (p_1 + p_2)^α H_2( p_1/(p_1 + p_2), p_2/(p_1 + p_2) ).

    The entropies (5) have the following characterisation [14,4] (and (4) has a similar one).

    THEOREM 4. An entropy (sequence of functions of p1, …,pn (n = 2,3,…)) is of the form (5) if and only if it is (i) symmetric, (n) normalized, and (rα) recursive of degree α.

    The entropies (5) are (s) subadditive if α > 1. Neither (5) nor (4) need be additive, but they are pseudoadditive: for independent P and Q,

    H(P*Q) = H(P) + H(Q) + (2^{1−α} − 1) H(P) H(Q)

    for (5), with (1 − α) in place of (2^{1−α} − 1) for (4).
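    A numerical check of pseudoadditivity, assuming the form of (4) reconstructed above; the distributions are hypothetical.

```python
import numpy as np

def tsallis_entropy(p, alpha):
    """Entropy (4): (sum_k p_k^alpha - 1) / (1 - alpha)."""
    p = np.asarray(p, dtype=float)
    return (np.sum(p ** alpha) - 1.0) / (1.0 - alpha)

alpha = 1.7
p = np.array([0.6, 0.3, 0.1])
q = np.array([0.5, 0.5])
joint = np.outer(p, q).ravel()            # independent experiments P and Q

lhs = tsallis_entropy(joint, alpha)
rhs = (tsallis_entropy(p, alpha) + tsallis_entropy(q, alpha)
       + (1.0 - alpha) * tsallis_entropy(p, alpha) * tsallis_entropy(q, alpha))
print(np.isclose(lhs, rhs))               # True: pseudoadditivity instead of additivity
```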

    B. Forte and C.T. Ng ([18]) gave the following characterization of (5) for all α by conditions that do not contain α.

    THEOREM 5. An entropy is of the form (5) if and only if it is (i) symmetric, (n) normalized, furthermore continuous, and satisfies H_2(1, 0) = 0, branching:

    and compositivity:

    where

    Notice that the branching property is a generalization of recursivity and of recursivity of degree α. Compositivity is related to weighted quasiarithmeticity (3).

    There is a connection between (2) and (5) or between (2) and (4) [though not so simple as between (4) and (5)]: the Rényi entropy (2) equals (1/(1 − α)) log_2 [1 + (1 − α) × (4)] = (1/(1 − α)) log_2 [1 + (2^{1−α} − 1) × (5)].

    5 ENTROPIES CONTAINING OBJECTS OTHER THAN PROBABILITIES

    There exists a theory of information without probability. We don't go into it here but refer the reader to the survey [20] by J. Kampé de Fériet. We do, however, make short mention of the mixed theory of information (see e.g. [5,1]). There, entropies, called inset entropies, may depend upon the events themselves, not only upon their probabilities; the events are elements of Boolean rings of sets which contain Ω itself. By an appropriate generalization of the (r) recursivity, one can characterize the simplest inset entropies ([5]):

    (6)  −a ∑_{k=1}^{n} p_k log_2 p_k + ∑_{k=1}^{n} p_k h(E_k),

    where E_1, …, E_n are elements of the Boolean ring of sets, p_1, …, p_n are their probabilities, h is an arbitrary real-valued function on the Boolean ring, and a is an arbitrary constant.

    It has applications, among others, in geographical and economic analysis [9,10], in the fuzzy theory of information and in gas dynamics [1].

    An interesting early application (before the mixed theory of information was formally developed) is due to J. R. Meginniss ([22]; cf. [1]). He considers the second term in (6) as the expected gain in gambling (E_k being the k-th outcome of the gamble, with the gain h(E_k) attached to it). So it is the first term that has to be explained. Since the expected gain alone would not motivate gambling (it is almost always nonpositive), he interprets the first term as quantifying the joy of gambling. He characterized expression (6) in this interpretation as the utility of a gamble, and also the one corresponding to (5),

    (0 < c ≠ 1; see also [7] for a characterization of the latter inset entropy under weaker assumptions within the mixed theory of information). R. D. Luce and coauthors recently initiated ([21,23]) a new theory of entropy-modified linear utility where, in both of Meginniss's expressions ([22]), probabilities are replaced by more general weights associated with events. They also characterized these expressions under considerably weaker assumptions than Meginniss ([22]).
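    As a purely illustrative sketch of Meginniss's reading of (6), the following snippet evaluates a gamble as a "joy of gambling" term plus the expected gain; the Shannon-type form of the joy term and the numbers are assumptions for illustration, not taken from [22].

```python
import numpy as np

def gamble_utility(p, gains, a=1.0):
    """Utility of a gamble read off (6): a 'joy of gambling' term plus the expected
    gain sum_k p_k h(E_k). The Shannon-type form of the joy term is an assumption."""
    p = np.asarray(p, dtype=float)
    gains = np.asarray(gains, dtype=float)
    joy = -a * np.sum(p[p > 0] * np.log2(p[p > 0]))
    return joy + np.sum(p * gains)

# Hypothetical two-outcome gamble: win 10 with probability 0.1, lose 2 otherwise.
# The expected gain is negative (-0.8); the joy term partly offsets it.
print(gamble_utility([0.1, 0.9], [10.0, -2.0]))
```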

    We could have spoken about entropies of incomplete systems also in sections 3 and 4, about conditional entropies, entropies of continuous distributions, information measures for several systems (distributions) etc. everywhere. But here we stop.

    Acknowledgement

    This work has been supported in part by Natural Sciences and Engineering Research Council of Canada grant #OGP0002972.

    REFERENCES

    1. Aczél, J., Characterizing Information Measures: Approaching the End of an Era. International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems. Selected and Extended Contributions. Lecture Notes in Computer Science; Vol. 286. Springer, Berlin/New York, 1987:359–384.

    2. Aczél, J., Entropies Old and New (and Both New and Old) and Their Characterizations. Bayesian Inference and Maximum Entropy Methods in Science and Engineering. AIP Conference Proceedings; Vol. 707. American Institute of Physics, Melville, NY, 2004:119–126.

    3. Aczél, J., Daróczy, Z. Sur la caractérisation axiomatique des entropies d'ordre positif, y comprise l'entropie de Shannon. C. R. Acad. Sci. Paris. 1963; 257:1581–1584.

    4. Aczél, J., Daróczy, Z., On Measures of Information and Their Characterizations. Mathematics in Science and Engineering; Vol. 115. Academic Press, New York, 1975.

    5. Aczél, J., Daróczy, Z. A Mixed Theory of Information, I: Symmetric, Recursive and Measurable Entropies of Randomized Systems of Events. RAIRO Informat. Théor. 1978; 12:149–155.

    6. Aczél, J., Forte, B., Ng, C. T. Why the Shannon and Hartley Entropies Are ‘Natural’. Adv. in Appl. Probab. 1974; 6:131–146.

    7. Aczél, J., Kannappan, P. A Mixed Theory of Information, III. Inset Entropies of Degree β. Inform. and Control. 1978; 39:315–322.

    8. Aggarwal, N. L., Cesari, Y., Picard, C.-F. Propriétés de branchement liées aux questionnaires de Campbell et à l'information de Rényi. C. R. Acad. Sci. Paris. 1972; 275A:437–440.

    9. Batten, D. F. Spatial Analysis of Intersecting Economies. Kluwer, Boston, 1983.

    10. Batty, M. Speculations on an Information Theoretical Approach to Spatial Representation. In: Spatial Representation and Spatial Interaction. Leiden/Boston: Nijhoff; 1978:115–147.

    11. Campbell, L. L. A Coding Theorem and Rényi’s Entropy. Inform. and Control. 1965; 8:423–429.

    12. Campbell, L. L. The Relation Between Information Theory and the Differential Geometry Approach to Statistics. Inform. Sci. 1985; 35:199–210.

    13. Daróczy, Z. On the Shannon Measure of Information (Hungarian). Magyar Tud. Akad. Mat. Fiz. Oszt. Közl. 1969; 19:9–24. [English translation in Selected Translations in Mathematical Statistics and Probability, Vol. 10, Inst. Math. Statist./Amer. Math. Soc., Providence, RI, 1972, pp. 193–210.]

    14. Daróczy, Z. Generalized Information Functions. Inform. and Control. 1970; 16:36–51.

    15. Devijver, P. A. Entropies of Degree β and the Lower Bound of the Average Error Rate. Inform. and Control. 1977; 34:222–226.

    16. Diderrich, G. T. Local Boundedness and the Shannon Entropy. Inform. and Control. 1978; 36:149–161.

    17. Faddeev, D. K. On the Concept of Entropy of a Finite Probabilistic Scheme (Russian). Uspekhi Mat. Nauk. 1956; 11(No. 1(67)):227–231. [German translation in Mathematische Forschungsberichte, No. IV, DVW, Berlin, 1967, pp. 88–90.]

    18. Forte, B., Ng, C. T. On a Characterization of the Entropies of Degree β. Utilitas Math. 1973; 4:193–205.

    19. Havrda, J., Charvát, F. Quantification Method of Classification Processes. Concept of Structural α-Entropy. Kybernetika (Prague). 1967; 3:30–35.

    20. Kampé de Fériet, J. La théorie géneralisée de l’information et la mesure subjective de l’information. In: Théories de l’information, Actes Rencontres, Marseille-Luminy. New York: Springer; 1974:1–35.

    21. R. D. Luce, C. T. Ng, A. A. J. Marley, and J. Aczél, Merging Savage and Shannon: Entropy-Modified Linear Additive Utility, in preparation.

    22. Meginniss, J. R. A New Class of Symmetric Utility Rules for Gambles, Subjective Marginal Probability Functions, and a Generalized Bayes Rule. Bus. and Econ. Stat. Sec. Proc. Amer. Stat. Assoc. 1976:471–476.

    23. C. T. Ng, R. D. Luce, and A. A. J. Marley, Utility of Gambling: Extending the Approach of Meginniss, in preparation.

    24. Rényi, A., On Measures of Entropy and Information. Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability; Vol. I. University of California Press, Berkeley, 1961:547–561.

    25. Rényi, A. Probability Theory. North Holland/Elsevier, Amsterdam/New York, 1970.

    26. Suyari, H. On the Most Concise Set of Axioms and the Uniqueness Theorem for Tsallis Entropy. J. Phys. A Math. Gen. 2002; 35:10731–10738.

    27. Tsallis, C. Possible Generalization of Boltzmann-Gibbs Statistics. J. Statist. Phys. 1988; 52(no. 1-2):479–487.

    Belief function theory on the continuous space with an application to model based classification

    B. Ristic (ISR Division, DSTO, Edinburgh, Australia) and Ph. Smets (IRIDIA, Université libre de Bruxelles, Bruxelles, Belgium)

    Abstract

    When our domain knowledge is represented by the pignistic probability density, we build the corresponding least committed belief function. The theory is applied to model-based classification and the results are compared to the classical Bayesian approach.

    Key words

    Belief function theory

    evidential theory

    transferable belief model

    target classification

    1 Introduction

    The belief function theory (evidential theory) has been primarily developed for discrete frames of discernment (frames). Following [ ], it is extended here to a continuous frame.

    When our domain knowledge is partial and represented only by a potential betting behavior on the observation (in the continuous domain), we model it by a pignistic probability density. In this case we can build the least committed belief function among those which correspond to the given one. Then we can apply the usual tools of the belief function theory, such as the Generalised Bayesian theorem, combination rules (e.g. Dempster’s rule of combination), etc. The theory is applied to model based target classification and the results are compared to those achieved by the classical Bayesian approach.
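    For readers unfamiliar with the combination rules mentioned above, here is a minimal sketch of the unnormalized conjunctive combination used in the TBM on a small discrete frame; the frame and masses are hypothetical and unrelated to the paper's data.

```python
from itertools import product

FRAME = frozenset({"c1", "c2", "c3"})   # discrete frame of discernment (illustrative)

def conjunctive_combination(m1, m2):
    """Unnormalized conjunctive rule of the TBM: mass may be left on the empty set.
    m1, m2: dicts mapping frozenset focal sets to basic belief masses."""
    out = {}
    for (A, mA), (B, mB) in product(m1.items(), m2.items()):
        C = A & B
        out[C] = out.get(C, 0.0) + mA * mB
    return out

m1 = {frozenset({"c1", "c2"}): 0.7, FRAME: 0.3}
m2 = {frozenset({"c2", "c3"}): 0.6, FRAME: 0.4}
print(conjunctive_combination(m1, m2))
# mass 0.42 on {'c2'}, 0.28 on {'c1','c2'}, 0.18 on {'c2','c3'}, 0.12 on FRAME
```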

    We accept that beliefs are quantified by belief functions as described in the transferable belief model (TBM) [ ].

    2 Belief functions on [α, β]

    This section presents extracts from a more thorough study presented in [ ]. Let I_{[α,β]} be the set of closed intervals in [α, β]; formally, I_{[α,β]} = {[a, b] : α ≤ a ≤ b ≤ β}.

    2.1 Finite number of focal sets

    Consider a basic belief assignment (bba) on [0,1] with a finite number of non-empty focal intervals. A bba is a function m : I_{[0,1]} → [0,1] with the property that its masses sum to one; the intervals with positive mass are the focal sets of this bba. For convenience, we use the notation A_i for the i-th focal interval.

    There is a very convenient graphical representation of these intervals: every A = [a, b], such that a, b ∈ [0,1] and a ≤ b, corresponds to a single point in the triangle of Figure 1, and vice versa. This triangle is defined as T[0,1] = {(a, b) : 0 ≤ a ≤ b ≤ 1}; the mass m([a, b]) is assigned to the point (a, b) for every focal interval A. The convention for the axes x and y is adopted as shown in Figure 1. In order to further illustrate this concept, consider the following example.

    Fig. 1 Point K = (a, b) ∈ T[0,1] uniquely defines the interval [a, b] ⊆ [0,1]

    2.1.1 Example 1.

    Let A = [a, b] be an interval in [0,1], with a = 0.2 and b = 0.7, and consider the bba with six focal sets given in Table 1.

    Table 1

    bba defined on A with six focal sets, and the corresponding belief, commonality and plausibility of A = [0.2,0.7]

    Fig. 2 Graphical representation of the focal set corresponding to Table 1

    The belief bel(A) is the sum of all the masses given to the subsets of A, thus to the non-empty intervals [x, y] ⊆ [a, b]. The corresponding points must lie in the shaded triangle of Figure 3(a): this triangle contains all (and only) the intervals [x, y] such that x ≥ a and y ≤ b. In our example bel(A) = 0.12.

    Fig. 3 Graphical representation of (a) belief; (b) commonality; (c) plausibility

    The commonality q(A) is defined as the sum of the masses given to the intervals that contain A. The corresponding points must lie in the shaded rectangle of Figure 3(b): this rectangle contains all (and only) the intervals [x, y] such that x ≤ a and y ≥ b.

    The plausibility pl(A) is defined as the sum of the masses given to the intervals that intersect A. The corresponding points must lie in the shaded area of Figure 3(c): this area contains all (and only) the intervals [x, y] such that x ≤ b and y ≥ a.

    The points on the diagonal of T (where a = b) are zero-length intervals. If all the focal sets are non-empty intervals (as in our example), we can compute the pignistic probability density function (pdf) over singletons s (where 0 ≤ s ≤ 1) as follows [11]:

    Betf(s) = ∑ m([x, y]) / (y − x),

    where the sum runs over the focal intervals [x, y] containing s, i.e. defined by 0 ≤ x ≤ s and s ≤ y ≤ 1. In our example, the pignistic pdf at, say, s = 0.35 would involve the focal sets 1, 2 and 3 and would result in:

    Bet f is a proper probability density function.
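    A small Python sketch of these quantities for a bba with interval focal sets; the six focal sets and masses below are hypothetical (the actual values of Table 1 are not reproduced in this preview), and the pignistic density assumes no mass on the empty set.

```python
# Hypothetical bba on closed intervals of [0, 1]; the masses of Table 1 are
# not reproduced in this preview, so these values are for illustration only.
focal_sets = [((0.10, 0.80), 0.20),
              ((0.20, 0.70), 0.15),
              ((0.30, 0.60), 0.25),
              ((0.25, 0.90), 0.10),
              ((0.00, 0.50), 0.20),
              ((0.40, 0.45), 0.10)]

def bel(a, b):
    """Belief of [a, b]: mass of focal intervals [x, y] included in [a, b]."""
    return sum(m for (x, y), m in focal_sets if x >= a and y <= b)

def q(a, b):
    """Commonality of [a, b]: mass of focal intervals containing [a, b]."""
    return sum(m for (x, y), m in focal_sets if x <= a and y >= b)

def pl(a, b):
    """Plausibility of [a, b]: mass of focal intervals intersecting [a, b]."""
    return sum(m for (x, y), m in focal_sets if x <= b and y >= a)

def betf(s):
    """Pignistic density at s: each focal interval spreads its mass uniformly."""
    return sum(m / (y - x) for (x, y), m in focal_sets if x <= s <= y)

print(bel(0.2, 0.7), q(0.2, 0.7), pl(0.2, 0.7), betf(0.35))

# Betf integrates to 1 over [0, 1] (crude Riemann-sum check):
n = 100000
print(sum(betf((i + 0.5) / n) for i in range(n)) / n)
```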

    2.2 Continuous domain

    What we described so far essentially remains valid, except that masses become densities and sums become integrals. Let m([a, b]) be a basic belief density (bbd); we write bbd instead of bba to emphasize that m is now a density, with the property that:

    (1)  ∫_{x=0}^{1} ∫_{y=x}^{1} m([x, y]) dy dx ≤ 1.

    The integral in (1) may result in a value less than 1, with the missing belief allocated to the empty set, just as is done in the TBM [ ]. The belief, commonality and plausibility of an interval [a, b] are obtained by integrating the bbd, with the limits of integration defined by the shaded areas in Figure 3. Thus we have:

    bel([a, b]) = ∫_{x=a}^{b} ∫_{y=x}^{b} m([x, y]) dy dx,  q([a, b]) = ∫_{x=0}^{a} ∫_{y=b}^{1} m([x, y]) dy dx,  pl([a, b]) = ∫_{x=0}^{b} ∫_{y=max(x,a)}^{1} m([x, y]) dy dx.

    Using derivative-integral identities one can also write:

    (2)

    (3)

    The pignistic probability density Betf is obtained from the bbd as follows:

    (4)  Betf(a) = lim_{ε→0} ∫_{x=0}^{a} ∫_{y=a}^{1} m([x, y]) / (y − x + ε) dy dx

    for a ∈ [0,1]. We do not put ε = 0 directly in (4) in order to avoid division by zero.

    2.2.1 Example 2.

    Suppose the basic belief density is uniform on the triangle T[0,1], that is, m([x, y]) = 2 for 0 ≤ x ≤ y ≤ 1. Then using (4) we get

    Betf(a) = −2 [ a ln a + (1 − a) ln(1 − a) ]

    for 0 < a < 1 (see Figure 4).

    Fig. 4 Bet f (a) generated by a uniform density on T[0,1]
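    Example 2 can be checked numerically: assuming a uniform bbd on T[0,1] (density m([x, y]) = 2, total mass 1), the double integral in (4) can be evaluated on a grid and compared with the closed form quoted above.

```python
import numpy as np

def betf_uniform_bbd(a, n=1000):
    """Pignistic density at a for a bbd uniform on T[0,1] = {(x, y): 0 <= x <= y <= 1},
    i.e. m([x, y]) = 2. Evaluates Betf(a) = integral over {x <= a <= y} of 2/(y - x)."""
    xs = (np.arange(n) + 0.5) * (a / n)              # midpoints of x-cells in [0, a]
    ys = a + (np.arange(n) + 0.5) * ((1.0 - a) / n)  # midpoints of y-cells in [a, 1]
    X, Y = np.meshgrid(xs, ys)
    return np.sum(2.0 / (Y - X)) * (a / n) * ((1.0 - a) / n)

for a in (0.1, 0.35, 0.5):
    closed_form = -2.0 * (a * np.log(a) + (1 - a) * np.log(1 - a))
    print(a, betf_uniform_bbd(a), closed_form)   # agree up to discretization error
```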

    2.2.2 Generalisation to ℝ.

    So far we have developed belief functions on [0,1]; one just needs to replace 0 (the lower limit) with −∞ and 1 (the upper limit) with +∞. Thus [0,1] is replaced with (−∞, ∞). Let us denote by I the set of closed intervals of ℝ and by T the set of pairs (x, y) ∈ ℝ² with x ≤ y. Then we say that m, bel, q and pl are defined on the Borel sigma algebra generated by I.

    3 The least committed bbd

    Suppose your domain knowledge is partial and based only on some potential betting behaviour, represented by the pignistic density function Betf(a). Since the pignistic transform is a many-to-one transform, an infinite number of belief density functions can induce the same Betf. These belief functions are said to be isopignistic. In order to apply the belief function theory (in the continuous domain) one needs to formulate a method of building a belief density (BD) from the pignistic density. The least commitment principle [11], [5] suggests choosing, among all isopignistic belief densities, the belief density which maximizes the commonality function q. As in the discrete case [12], the solution is a consonant belief density: all focal sets on I are nested, i.e. they can be ordered in such a way that each focal interval is contained in the following one.

    We will further concentrate on a unimodal pignistic density with mode μ = argmax_a Betf(a). The focal sets of the least committed (LC) belief density are intervals [a, b] which satisfy Betf(a) = Betf(b). Consequently, for every focal interval [a, b] of the LC-BD we have that μ ∈ [a, b]. Another very important property of the focal intervals of the LC-BD is that they form a line on the triangle T. This line has the following properties:

    – It starts from (x, y) = (μ, μ); the plausibility at this point is pl([μ, μ]) = 1.

    – For all symmetrical pignistic densities Betf (e.g. normal, Laplace, Cauchy) centered at μ, this is the straight line given by x + y = 2μ.

    Figure 5 shows the line of focal intervals in T for (a) a normal pignistic density with μ = 2.5 and σ = 1; (b) a gamma pignistic density Betf(s) = s e^{−s} (s > 0), with mode μ = 1.

    Fig. 5 The focal sets of the LC belief density (solid line in the upper triangle) induced by: (a)normal pignistic density; (b) gamma pignistic density

    The relationship between Betf and any basic belief density in general is expressed by (4). Let us denote the LC bbd (induced by Betf) as φ(u), where u ≥ 0. We have seen that the focal sets of this bbd are points on a line in T, and u corresponds to the distance from the point (μ, μ). Due to this specific form of the LC bbd, the relationship between Betf(s) and φ(u) has a much simpler form than in (4). If s ≥ μ, then

    (5)

    is a function of u. By differentiation of (5) we obtain that:

    (6)

    The bbd φ(s) is always positive since:

    For model-based classification problems, we apply the generalised Bayes theorem [11], which requires computing the plausibility function from the bbd. Since the LC bbd is a consonant belief function, with the property that its focal sets are the points along a line in T, we can write:

    (7)

    (8)

    The limits of integration in (7) reflect the fact that only the focal intervals with the property x ≤ a ≤ ∞ have a non-empty intersection with {x}; we obtain:

    (9)

    3.1 Example 3.

    Suppose the pignistic density is a normal density, i.e. Betf(x) = N(x; μ, σ). In order to work out the LC bbd φ(x) and its corresponding plausibility pl(x), we use the substitution y = (x − μ)/σ. Application of (6) and (9) yields for y > 0:

    (10)

    (11)

    It follows that φ(x) = φ(y)/σ and pl(x) = pl(y). The two functions are shown in Figure 6 for μ = 1 and σ = 1.5.

    Fig. 6 The LC bbd φ(x) (thin line) and its plausibility pl(x) (thick line), corresponding to Betf(x) = N(x; 1, 1.5)
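    Since the closed forms (10) and (11) are not reproduced in this preview, the following is only a rough numerical companion to Example 3. It parametrizes the nested focal intervals by their half-width u (which may differ from the paper's parametrization by a constant factor) and uses the relation φ(u) = −2u·Betf′(μ + u), which holds for a symmetric unimodal Betf under this parametrization.

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 1.0, 1.5   # as in Example 3

def betf(s):
    return norm.pdf(s, loc=mu, scale=sigma)

# Nested focal intervals [mu - u, mu + u], u >= 0; phi(u) is the mass density over
# the half-width u, obtained from phi(u) = -2u * dBetf/ds at s = mu + u
# (here by a central finite difference).
us = np.linspace(0.0, 10.0 * sigma, 20001)
du, eps = us[1] - us[0], 1e-6
d_betf = (betf(mu + us + eps) - betf(mu + us - eps)) / (2.0 * eps)
phi = -2.0 * us * d_betf

print(np.all(phi >= -1e-9), phi.sum() * du)       # nonnegative, integrates to ~1

def pl_point(x):
    """Plausibility of the singleton {x}: total mass of focal intervals covering x."""
    return phi[us >= abs(x - mu)].sum() * du

print(pl_point(mu), pl_point(mu + 2.0 * sigma))   # ~1.0 at the mode, smaller away
```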

    3.2 Example 4.

    Let Betf(x) be an exponential density:

    (12)  Betf(x) = (1/θ) e^{−(x−a)/θ},  x ≥ a.

    Using the substitution y = (x − a)/θ we proceed as before; φ(x) = φ(y)/θ and pl(x) = pl(y).

    4 Application to model-based target classification

    In order to demonstrate an application of the theory presented above, let us consider one of the most difficult problems in military air surveillance: correct identification of non-cooperative flying objects in the surveillance volume. In general three groups of target attributes (features) are exploited for identification, those based on target shape, kinematic behaviour and electro-magnetic (EM) emissions [1]. Let us consider a simple example where the aim is to classify targets into one of the three platform categories [7]:

    Class 1 – Commercial planes;

    Class 2 – Large military aircraft (such as transporters, bombers);

    Class 3 – Light and agile military aircraft (fighter planes).

    4.1 Speed as a target feature

    We will assume that the only available target feature is its speed (a kinematic feature obtained from the radar) [2], [16]. The speed profiles for our three classes can be described by Table 2 [2]:

    Table 2

    Speed intervals for three air platform categories (in km/h)

    First we present target classification using the Bayesian classifier, which is followed by the Belief function classifier.

    4.1.1 Bayesian analysis

    In order to apply the Bayesian classifier we must adopt a suitable probability density function of the speed conditioned on the class. Various possibilities are applicable, such as the uniform, beta, Gaussian, etc. Let us adopt Gaussian densities, with the parameters selected in such a way that P{s_min < x < s_max} = 0.99876, where [s_min, s_max] is the speed interval given in Table 2. Figure 7 shows the distribution of the speed feature conditioned on the class. The hypothesis space is defined

    Fig. 7 Adopted pdf models (Gaussian) of target speed, conditioned on the class

    as C = {c_1, c_2, c_3}. The Bayesian classifier (assuming a uniform prior over the classes) will compute the probability of class c_i (i = 1, 2, 3) given feature x as:

    (13)  P{c_i | x} = α p(x | c_i),

    where α is a normalisation constant. Figure 8 displays the class probabilities P{c_i | x} computed for a range of speed values x ∈ [400, 1000] km/h.
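    A sketch of the classifier (13) with uniform priors. The speed intervals of Table 2 are not reproduced in this preview, so the values below are placeholders; the factor 3.23 comes from requiring P{s_min < x < s_max} = 0.99876 for a Gaussian.

```python
from scipy.stats import norm

# Hypothetical speed intervals [s_min, s_max] in km/h (Table 2 is not reproduced
# in this preview, so these values are placeholders for illustration only).
speed_intervals = {
    "c1 (commercial)":     (600.0, 1000.0),
    "c2 (large military)": (550.0, 1100.0),
    "c3 (fighter)":        (650.0, 1800.0),
}
Z = 3.23   # P{|x - mu| < Z*sigma} ~= 0.99876 for a Gaussian

def class_likelihoods(x):
    """Gaussian p(x | c_i) with mu, sigma chosen so that P{s_min < x < s_max} = 0.99876."""
    out = {}
    for c, (smin, smax) in speed_intervals.items():
        mu = 0.5 * (smin + smax)
        sigma = (smax - smin) / (2.0 * Z)
        out[c] = norm.pdf(x, loc=mu, scale=sigma)
    return out

def posterior(x):
    """Bayesian classifier (13): P{c_i | x} = alpha * p(x | c_i), uniform priors."""
    likes = class_likelihoods(x)
    alpha = 1.0 / sum(likes.values())
    return {c: alpha * v for c, v in likes.items()}

for x in (700.0, 900.0, 1400.0):
    print(x, posterior(x))
```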
