Modern Information Processing: From Theory to Applications
Ebook, 936 pages

About this ebook

The volume "Modern Information Processing: From Theory to Applications," edited by Bernadette Bouchon-Meunier, Giulianella Coletti and Ronald Yager, is a collection of carefully selected papers drawn from the program of IPMU'04, which was held in Perugia, Italy.

The book reflects the cultural policy of the IPMU conference, which is not focused on a narrow range of methodologies but, on the contrary, welcomes all theories for the management of uncertainty and the aggregation of information in intelligent systems, providing a medium for the exchange of ideas between theoreticians and practitioners in these and related areas.

The book is composed of seven sections:

UNCERTAINTY
PREFERENCES
CLASSIFICATION AND DATA MINING
AGGREGATION AND MULTI-CRITERIA DECISION MAKING
KNOWLEDGE REPRESENTATION
APPLIED DOMAINS

•The book contributes to enhancing our ability to deal effectively with uncertainty in all of its manifestations.
•The book can help to build bridges among theories and methods for the management of uncertainty.
•The book addresses issues which have a position of centrality in our information-centric world.
•The book presents interesting results devoted to representing knowledge: the goal is to capture the subtlety of human knowledge (richness) and to allow computer manipulation (formalization).
•The book contributes to the goal of an efficient use of information for a good decision strategy.
Language: English
Release date: Oct 13, 2011
ISBN: 9780080461694

    Book preview

    Modern Information Processing - Bernadette Bouchon-Meunier

    Uncertainty

    Entropies, Characterizations, Applications and Some History

    János Aczél, Faculty of Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada. E-mail address: jdaczel@math.uwaterloo.ca

    Abstract

    Entropies with useful and/or interesting properties are presented. Characterizations based on such properties are given, and some applications are mentioned. Attention is directed to an example of discovery and rediscovery, and to new applications in utility theory.

    1 INTRODUCTION, ENTROPY

    Defining new entropies in addition to the classical Shannon entropy seems to be an ongoing industry. I am still convinced, however, of what I wrote in [1]:

    "In the best of all possible worlds there is an information measure that originated from an applied problem, has interesting properties (usually attractive, reasonable generalizations of properties of Shannon’s entropy or of similar widely used measures), and those characterize it. Less ideal but still acceptable is in my opinion the following situation. Some natural looking weakening or generalization of properties characterizing Shannon type measures are isolated and all measures having these properties are determined. If the properties are indeed intuitive and significant then there is a good chance that the measures thus obtained may have future applications.

    But what many authors seem to do is to contrive some generalization of known information measures (usually by sticking parameters almost at random here and there), derive its often not very interesting or natural or even attractive properties and then characterize by several of these properties the ‘measures’ they have defined in the first place. Not many good or useful results can be expected from this kind of activity."

    An earlier version of the present paper appeared in [2]. As there, I express here too my belief that two families of probabilistic entropies (of which Shannon's entropy is a limit case) suffice, in addition to entropies depending on objects other than probabilities. I will define them and state (without proof but with references) some of their properties and characterizations that I consider reasonable, and mention some applications.

    Our models will mostly be complete systems of mutually exclusive events E_1, …, E_n (such as the possible outcomes of an experiment), with probabilities p_1, …, p_n, respectively.

    Entropies are measures of uncertainty in, or measures of information expected from, such systems.

    2 SHANNON ENTROPY

    The Shannon entropy is defined (for probabilities p_k > 0) by

    (1)  H_n(p_1, …, p_n) = − ∑_{k=1}^{n} p_k log_2 p_k.

    If zero probabilities are also admitted, then the convention 0 · log_2 0 = 0 is used. Here n = 1 is also permissible and gives the entropy of a single event.

    We take here the following 'reasonable' properties of the Shannon entropy. It is

    (i) SYMMETRIC: H_n is a symmetric function of its n variables (invariant under exchange of p_j and p_k (j, k = 1, …, n)),

    (n) NORMALIZED: H_2(1/2, 1/2) = 1,

    (e) EXPANSIBLE: H_{n+1}(p_1, …, p_n, 0) = H_n(p_1, …, p_n): enlarging by an event of probability 0 does not change the entropy (expected information),

    (c) SMALL FOR SMALL PROBABILITIES: H_2(1, 0) = 0, lim_{q→0} H_2(1 − q, q) = 0: little information can be expected if one of the events is almost certain (has probability close to 1, thus the other(s) close to 0). Thus H_2 is continuous at (1, 0).

    Now we get to important properties. We deal with three experiments: P with outcomes D_1, …, D_m, probabilities p_j = p(D_j) (j = 1, …, m);

    Q with outcomes E_1, …, E_n, probabilities q_k = p(E_k) (k = 1, …, n);

    "P and Q", denoted by P*Q, with outcomes "D_j and E_k", denoted by D_jE_k, having the probabilities p_jk = p(D_jE_k) (j = 1, …, m; k = 1, …, n).

    The respective entropies (information expected from experiments P, Q, P*Q) are H(P), H(Q), H(P * Q). The remaining two properties are:

    (s) SUBADDITIVITY: H(P*Q) ≤ H(P) + H(Q): information expected from two experiments is not greater than the sum of the informations expected from the single experiments;

    (a) ADDITIVITY: H(P*Q) = H(P) + H(Q) if P and Q are independent, that is, if p(D_jE_k) = p(D_j)p(E_k) (j = 1, …, m; k = 1, …, n), or, in words: information expected from two independent experiments equals the sum of the informations expected from the single experiments.
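    To make properties (s) and (a) concrete, here is a minimal numerical check in Python; the joint distribution is hypothetical, chosen only for illustration.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (1) in bits; the convention 0*log2(0) = 0 is used."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Hypothetical joint distribution p_jk of the combined experiment P*Q
# (rows: outcomes of P, columns: outcomes of Q).
pjk = np.array([[0.20, 0.10, 0.05],
                [0.05, 0.30, 0.30]])
p = pjk.sum(axis=1)   # marginal distribution of P
q = pjk.sum(axis=0)   # marginal distribution of Q

# (s) subadditivity: H(P*Q) <= H(P) + H(Q)
print(shannon_entropy(pjk.ravel()) <= shannon_entropy(p) + shannon_entropy(q))  # True

# (a) additivity holds when P and Q are independent, i.e. p_jk = p_j * q_k
indep = np.outer(p, q)
print(np.isclose(shannon_entropy(indep.ravel()),
                 shannon_entropy(p) + shannon_entropy(q)))                      # True
```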

    The following characterization theorem has been proved in [6] (see also [4]).

    THEOREM 1. If and only if H is (i) symmetric, (n) normalized, (e) expansible, (c) small for small probabilities, (s) subadditive and (a) additive, then H is the Shannon entropy.

    NOTE 2. The nonnegativity of the constant multiplier is guaranteed by (s). Without (c) and (n) a nonnegative constant times the logarithm of the number of events with positive probability can be added. The latter logarithm is often called Hartley entropy.

    An older, classical characterization of the Shannon entropy is by

    (r) RECURSIVITY: for p_1 + p_2 > 0,

    H_n(p_1, p_2, p_3, …, p_n) = H_{n−1}(p_1 + p_2, p_3, …, p_n) + (p_1 + p_2) H_2( p_1/(p_1 + p_2), p_2/(p_1 + p_2) ).

    Improving a result of D. K. Faddeev [17], Z. Daróczy [13,4] has proved the following.

    THEOREM 2. If and only if H is (i) symmetric, (n) normalized, (c) small for small probabilities, and (r) recursive, then H is the Shannon entropy (1).

    G. T. Diderrich [16] weakened condition (c) to boundedness on an interval (square).

    3 RÉNYI ENTROPY

    The Shannon entropy for positive probabilities is the weighted arithmetic mean (with the probabilities as weights) of the quantities −log_2 p_k (k = 1, …, n), which can be considered (see Note 1) entropies of single events. The arithmetic mean is not the only interesting average and the Shannon entropy is not the only interesting entropy.

    The Rényi [24,25] (see also [4]) entropy of order α is defined by

    (2)  _αH_n(p_1, …, p_n) = (1/(1 − α)) log_2 ( ∑_{k=1}^{n} p_k^α )

    [p_k > 0 (k = 1, …, n) for the sake of simplicity]. Here α ≠ 1, but lim_{α→1} _αH_n(p_1, …, p_n) is the Shannon entropy (1).

    For further reference we define a straightforward generalization of the weighted arithmetic mean of the entropies −log_2 p_k of the single events, weighted by p_k (k = 1, …, n):

    (3)  f^{−1}( ∑_{k=1}^{n} p_k f(−log_2 p_k) ),

    where f is a continuous, strictly monotonic function and f^{−1} is its inverse.

    The Rényi entropies of positive order (including the Shannon entropy as of order 1) have the following characterization ([3], see also [4]).

    THEOREM 3. The weighted quasiarithmetic mean (3) is (a) additive and (c) small for small probabilities if and only if it is either the Shannon entropy (1) or a Rényi entropy (2) of positive (but ≠ 1) order.

    The Rényi entropies _αH_n (α ≠ 1) need not be (s) subadditive. Rényi entropies have applications to random search problems [24,25], questionnaire theory [8], optimal coding (the greatest lower bounds of the arithmetic or exponential mean codeword lengths are the Shannon and the Rényi entropies, respectively) [11], and even to differential geometry [12].
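    As a quick illustration of definition (2) and of the Shannon entropy as its α → 1 limit, a small Python sketch (the probability vector is arbitrary):

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Rényi entropy of order alpha (alpha > 0, alpha != 1), in bits, per (2)."""
    p = np.asarray(p, dtype=float)
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def shannon_entropy(p):
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log2(p))

p = [0.5, 0.25, 0.125, 0.125]             # all probabilities positive, as in (2)
for alpha in (0.5, 0.999, 1.001, 2.0):
    print(alpha, renyi_entropy(p, alpha))
print("Shannon:", shannon_entropy(p))     # the values near alpha = 1 approach this
```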

    4 HAVRDA-CHARVÁT-DARÓCZY-TSALLIS ENTROPY

    In 1988, C. Tsallis [27] introduced the entropy

    (4)  ( ∑_{k=1}^{n} p_k^α − 1 ) / (1 − α)

    (this is an equivalent form, slightly altered for the sake of the comparisons below). It has been named the Tsallis entropy and applied to generalizing Boltzmann-Gibbs statistical mechanics and related fields [27,26]. Characterizations were also supplied, including the most concise set of axioms in [26].

    Already in 1967, however, J. Havrda and F. Charvát [19] defined the entropy

    (5)  ( ∑_{k=1}^{n} p_k^α − 1 ) / (2^{1−α} − 1),

    that is, a constant times (4). In 1970 Z. Daróczy [14] studied and characterized these entropies of degree α; their limit as α → 1 is the Shannon entropy. The entropy (5) (and also (4)) satisfies

    (r_α) RECURSIVITY OF DEGREE α: for p_1 + p_2 > 0,

    H_n(p_1, p_2, p_3, …, p_n) = H_{n−1}(p_1 + p_2, p_3, …, p_n) + (p_1 + p_2)^α H_2( p_1/(p_1 + p_2), p_2/(p_1 + p_2) ).

    The entropies (5) have the following characterisation [14,4] (and (4) has a similar one).

    THEOREM 4. An entropy (sequence of functions of p1, …,pn (n = 2,3,…)) is of the form (5) if and only if it is (i) symmetric, (n) normalized, and (rα) recursive of degree α.

    The entropies (5) are (s) subadditive if α > 1. Neither (5) nor (4) need be additive, but they are pseudoadditive: for independent P and Q,

    H(P*Q) = H(P) + H(Q) + (2^{1−α} − 1) H(P) H(Q)

    for (5), with (1 − α) in place of (2^{1−α} − 1) for (4).
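    A numerical check of pseudoadditivity, assuming the form of (4) reconstructed above; the distributions are hypothetical.

```python
import numpy as np

def tsallis_entropy(p, alpha):
    """Entropy (4): (sum_k p_k^alpha - 1) / (1 - alpha)."""
    p = np.asarray(p, dtype=float)
    return (np.sum(p ** alpha) - 1.0) / (1.0 - alpha)

alpha = 1.7
p = np.array([0.6, 0.3, 0.1])
q = np.array([0.5, 0.5])
joint = np.outer(p, q).ravel()            # independent experiments P and Q

lhs = tsallis_entropy(joint, alpha)
rhs = (tsallis_entropy(p, alpha) + tsallis_entropy(q, alpha)
       + (1.0 - alpha) * tsallis_entropy(p, alpha) * tsallis_entropy(q, alpha))
print(np.isclose(lhs, rhs))               # True: pseudoadditivity instead of additivity
```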

    B. Forte and C.T. Ng ([18]) gave the following characterization of (5) for all α by conditions that do not contain α.

    THEOREM 5. An entropy is of the form (5) if and only if it is (i) symmetric, (n) normalized, furthermore continuous, and satisfies H_2(1, 0) = 0, branching:

    and compositivity:

    where

    Notice that the branching property is a generalization of recursivity and of recursivity of degree α. Compositivity is related to weighted quasiarithmeticity (3).

    There is a connection between (2) and (5) or between (2) and (4) [though not so simple as between (4) and (5)]: the Rényi entropy (2) equals (1/(1 − α)) log_2 [1 + (1 − α) × (4)] = (1/(1 − α)) log_2 [1 + (2^{1−α} − 1) × (5)].

    5 ENTROPIES CONTAINING OBJECTS OTHER THAN PROBABILITIES

    There exists a theory of information without probability. We don't go into it here but refer the reader to the survey [20] by J. Kampé de Fériet. We do, however, make short mention of the mixed theory of information (see e.g. [5,1]). There, entropies, called inset entropies, may depend upon the events themselves, not only upon their probabilities; the events are elements of Boolean rings of sets which contain Ω itself. By an appropriate generalization of the (r) recursivity, one can characterize the simplest inset entropies ([5]):

    (6)  −a ∑_{k=1}^{n} p_k log_2 p_k + ∑_{k=1}^{n} p_k h(E_k),

    where E_1, …, E_n are elements of the Boolean ring of sets, p_1, …, p_n are their probabilities, h is an arbitrary real-valued function on the Boolean ring, and a is an arbitrary constant.

    It has applications, among others, in geographical and economic analysis [9,10], in the fuzzy theory of information and in gas dynamics [1].

    An interesting early application (before the mixed theory of information was formally developed) is due to J. R. Meginniss ([22]; cf. [1]). He considers the second term in (6) as the expected gain in gambling (E_k being the k-th outcome of the gamble, with the gain h(E_k) attached to it). So it is the first term that has to be explained. Since the expected gain alone would not motivate gambling (it is almost always nonpositive), he interprets the first term as quantifying the joy of gambling. He characterized expression (6) in this interpretation as the utility of a gamble, and also the one corresponding to (5),

    (0 < c ≠ 1; see also [7] for a characterization of the latter inset entropy under weaker assumptions within the mixed theory of information). R. D. Luce and coauthors recently initiated ([21,23]) a new theory of entropy-modified linear utility where, in both of Meginniss's expressions ([22]), probabilities are replaced by more general weights associated with events. They also characterized these expressions under considerably weaker assumptions than Meginniss ([22]).
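    As a purely illustrative sketch of Meginniss's reading of (6), the following snippet evaluates a gamble as a "joy of gambling" term plus the expected gain; the Shannon-type form of the joy term and the numbers are assumptions for illustration, not taken from [22].

```python
import numpy as np

def gamble_utility(p, gains, a=1.0):
    """Utility of a gamble read off (6): a 'joy of gambling' term plus the expected
    gain sum_k p_k h(E_k). The Shannon-type form of the joy term is an assumption."""
    p = np.asarray(p, dtype=float)
    gains = np.asarray(gains, dtype=float)
    joy = -a * np.sum(p[p > 0] * np.log2(p[p > 0]))
    return joy + np.sum(p * gains)

# Hypothetical two-outcome gamble: win 10 with probability 0.1, lose 2 otherwise.
# The expected gain is negative (-0.8); the joy term partly offsets it.
print(gamble_utility([0.1, 0.9], [10.0, -2.0]))
```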

    We could have spoken about entropies of incomplete systems also in sections 3 and 4, about conditional entropies, entropies of continuous distributions, information measures for several systems (distributions) etc. everywhere. But here we stop.

    Acknowledgement

    This work has been supported in part by Natural Sciences and Engineering Research Council of Canada grant #OGP0002972.

    REFERENCES

    1. Aczél, J., Characterizing Information Measures: Approaching the End of an Era. International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems. Selected and Extended Contributions. Lecture Notes in Computer Science; Vol. 286. Springer, Berlin/New York, 1987:359–384.

    2. Aczél, J., Entropies Old and New (and Both New and Old) and Their Characterizations. Bayesian Inference and Maximum Entropy Methods in Science and Engineering. AIP Conference Proceedings; Vol. 707. American Institute of Physics, Melville, NY, 2004:119–126.

    3. Aczél, J., Daróczy, Z. Sur la caractérisation axiomatique des entropies d'ordre positif, y comprise l'entropie de Shannon. C. R. Acad. Sci. Paris. 1963; 257:1581–1584.

    4. Aczél, J., Daróczy, Z., On Measures of Information and Their Characterizations. Mathematics in Science and Engineering; Vol. 115. Academic Press, New York, 1975.

    5. Aczél, J., Daróczy, Z. A Mixed Theory of Information, I: Symmetric, Recursive and Measurable Entropies of Randomized Systems of Events. RAIRO Informat. Théor. 1978; 12:149–155.

    6. Aczél, J., Forte, B., Ng, C. T. Why the Shannon and Hartley Entropies Are ‘Natural’. Adv. in Appl. Probab. 1974; 6:131–146.

    7. Aczél, J., Kannappan, P. A Mixed Theory of Information, III. Inset Entropies of Degree β. Inform. and Control. 1978; 39:315–322.

    8. Aggarwal, N. L., Cesari, Y., Picard, C.-F. Propriétés de branchement liées aux questionnaires de Campbell et à l'information de Rényi. C. R. Acad. Sci. Paris. 1972; 275A:437–440.

    9. Batten, D. F. Spatial Analysis of Intersecting Economies. Kluwer, Boston, 1983.

    10. Batty, M. Speculations on an Information Theoretical Approach to Spatial Representation. In: Spatial Representation and Spatial Interaction. Leiden/Boston: Nijhoff; 1978:115–147.

    11. Campbell, L. L. A Coding Theorem and Rényi’s Entropy. Inform. and Control. 1965; 8:423–429.

    12. Campbell, L. L. The Relation Between Information Theory and the Differential Geometry Approach to Statistics. Inform. Sci. 1985; 35:199–210.

    13. Daróczy, Z. On the Shannon Measure of Information (Hungarian). Magyar Tud. Akad. Mat. Fiz. Oszt. Közl. 1969; 19:9–24. [English translation in Selected Translations in Mathematical Statistics and Probability, Vol. 10, Inst. Math. Statist./Amer. Math. Soc., Providence, RI, 1972, pp. 193–210.]

    14. Daróczy, Z. Generalized Information Functions. Inform. and Control. 1970; 16:36–51.

    15. Devijver, P. A. Entropies of Degree β and the Lower Bound of the Average Error Rate. Inform. and Control. 1977; 34:222–226.

    16. Diderrich, G. T. Local Boundedness and the Shannon Entropy. Inform. and Control. 1978; 36:149–161.

    17. Faddeev, D. K. On the Concept of Entropy of a Finite Probabilistic Scheme (Russian). Uspekhi Mat. Nauk. 1956; 11(No. 1(67)):227–231. [German translation in Mathematische Forschungsberichte, No. IV, DVW, Berlin, 1967, pp. 88–90.]

    18. Forte, B., Ng, C. T. On a Characterization of the Entropies of Degree β. Utilitas Math. 1973; 4:193–205.

    19. Havrda, J., Charvát, F. Quantification Method of Classification Processes. Concept of Structural α-Entropy. Kybernetika (Prague). 1967; 3:30–35.

    20. Kampé de Fériet, J. La théorie géneralisée de l’information et la mesure subjective de l’information. In: Théories de l’information, Actes Rencontres, Marseille-Luminy. New York: Springer; 1974:1–35.

    21. R. D. Luce, C. T. Ng, A. A. J. Marley, and J. Aczél, Merging Savage and Shannon: Entropy-Modified Linear Additive Utility, in preparation.

    22. Meginniss, J. R. A New Class of Symmetric Utility Rules for Gambles, Subjective Marginal Probability Functions, and a Generalized Bayes Rule. Bus. and Econ. Stat. Sec. Proc. Amer. Stat. Assoc. 1976:471–476.

    23. C. T. Ng, R. D. Luce, and A. A. J. Marley, Utility of Gambling: Extending the Approach of Meginniss, in preparation.

    24. Rényi, A., On Measures of Entropy and Information. Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability; Vol. I. University of California Press, Berkeley, 1961:547–561.

    25. Rényi, A. Probability Theory. North Holland/Elsevier, Amsterdam/New York, 1970.

    26. Suyari, H. On the Most Concise Set of Axioms and the Uniqueness Theorem for Tsallis Entropy. J. Phys. A Math. Gen. 2002; 35:10731–10738.

    27. Tsallis, C. Possible Generalization of Boltzmann-Gibbs Statistics. J. Statist. Phys. 1988; 52(no. 1-2):479–487.

    Belief function theory on the continuous space with an application to model based classification

    B. Ristic (ISR Division, DSTO, Edinburgh, Australia) and Ph. Smets (IRIDIA, Université libre de Bruxelles, Bruxelles, Belgium)

    Abstract

    When our domain knowledge is represented by the pignistic probability density, we build the corresponding least committed belief function. The theory is applied to model-based classification and the results are compared to the classical Bayesian approach.

    Key words

    Belief function theory

    evidential theory

    transferable belief model

    target classification

    1 Introduction

    The belief function theory (evidential theory) has been primarily developed for discrete frames of discernment (frames). Following [ ], it is extended here to a continuous frame.

    When our domain knowledge is partial and represented only by a potential betting behavior on the observation (in the continuous domain), we model it by a pignistic probability density. In this case we can build the least committed belief function among those which correspond to the given one. Then we can apply the usual tools of the belief function theory, such as the Generalised Bayesian theorem, combination rules (e.g. Dempster’s rule of combination), etc. The theory is applied to model based target classification and the results are compared to those achieved by the classical Bayesian approach.
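    For readers unfamiliar with the combination rules mentioned above, here is a minimal sketch of the unnormalized conjunctive combination used in the TBM on a small discrete frame; the frame and masses are hypothetical and unrelated to the paper's data.

```python
from itertools import product

FRAME = frozenset({"c1", "c2", "c3"})   # discrete frame of discernment (illustrative)

def conjunctive_combination(m1, m2):
    """Unnormalized conjunctive rule of the TBM: mass may be left on the empty set.
    m1, m2: dicts mapping frozenset focal sets to basic belief masses."""
    out = {}
    for (A, mA), (B, mB) in product(m1.items(), m2.items()):
        C = A & B
        out[C] = out.get(C, 0.0) + mA * mB
    return out

m1 = {frozenset({"c1", "c2"}): 0.7, FRAME: 0.3}
m2 = {frozenset({"c2", "c3"}): 0.6, FRAME: 0.4}
print(conjunctive_combination(m1, m2))
# mass 0.42 on {'c2'}, 0.28 on {'c1','c2'}, 0.18 on {'c2','c3'}, 0.12 on FRAME
```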

    We accept that beliefs are quantified by belief functions as described in the transferable belief model (TBM) [ ].

    2 Belief functions on [α, β]

    This section presents extracts from a more thorough study presented in [ ]. Let I_{[α,β]} be the set of closed intervals in [α, β]; formally, I_{[α,β]} = {[a, b] : α ≤ a ≤ b ≤ β}.

    2.1 Finite number of focal sets

    Consider a basic belief assignment (bba) on [0,1] with a finite number of non-empty focal intervals. A bba is a function m : I_{[0,1]} → [0,1] with the property that its masses sum to one; the intervals with positive mass are the focal sets of this bba. For convenience, we use the notation A_i for the i-th focal interval.

    There is a very convenient graphical representation of these intervals: every A = [a, b], such that a, b ∈ [0,1] and a ≤ b, corresponds to a single point in the triangle of Figure 1, and vice versa. This triangle is defined as T[0,1] = {(a, b) : 0 ≤ a ≤ b ≤ 1}; the mass m([a, b]) is assigned to the point (a, b) for every focal interval A. The convention for the axes x and y is adopted as shown in Figure 1. In order to further illustrate this concept, consider the following example.

    Fig. 1 Point K = (a, b) ∈ T[0,1] uniquely defines the interval [a, b] ⊆ [0,1]

    2.1.1 Example 1.

    Let A = [a, b] be an interval in [0,1], with a = 0.2 and b = 0.7, and consider the bba with six focal sets given in Table 1.

    Table 1

    bba defined on A with six focal sets, and the corresponding belief, commonality and plausibility of A = [0.2,0.7]

    Fig. 2 Graphical representation of the focal set corresponding to Table 1

    The belief bel(A) is the sum of all the masses given to the subsets of A, thus to the non-empty intervals [x, y] ⊆ [a, b]. The corresponding points must lie in the shaded triangle of Figure 3(a): this triangle contains all (and only) the intervals [x, y] such that x ≥ a and y ≤ b. In our example bel(A) = 0.12.

    Fig. 3 Graphical representation of (a) belief; (b) commonality; (c) plausibility

    The commonality q(A) is defined as the sum of the masses given to the intervals that contain A. The corresponding points must lie in the shaded rectangle of Figure 3(b): this rectangle contains all (and only) the intervals [x, y] such that x ≤ a and y ≥ b.

    The plausibility pl(A) is defined as the sum of the masses given to the intervals that intersect A. The corresponding points must lie in the shaded area of Figure 3(c): this area contains all (and only) the intervals [x, y] such that x ≤ b and y ≥ a.

    The points on the diagonal of T (where a = b) are zero-length intervals. If all the focal sets are non-empty intervals (as in our example), we can compute the pignistic probability density function (pdf) over singletons s (where 0 ≤ s ≤ 1) as follows [11]:

    Betf(s) = ∑ m([x, y]) / (y − x),

    where the sum runs over the focal intervals [x, y] containing s, i.e. defined by 0 ≤ x ≤ s and s ≤ y ≤ 1. In our example, the pignistic pdf at, say, s = 0.35 would involve the focal sets 1, 2 and 3 and would result in:

    Bet f is a proper probability density function.
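    A small Python sketch of these quantities for a bba with interval focal sets; the six focal sets and masses below are hypothetical (the actual values of Table 1 are not reproduced in this preview), and the pignistic density assumes no mass on the empty set.

```python
# Hypothetical bba on closed intervals of [0, 1]; the masses of Table 1 are
# not reproduced in this preview, so these values are for illustration only.
focal_sets = [((0.10, 0.80), 0.20),
              ((0.20, 0.70), 0.15),
              ((0.30, 0.60), 0.25),
              ((0.25, 0.90), 0.10),
              ((0.00, 0.50), 0.20),
              ((0.40, 0.45), 0.10)]

def bel(a, b):
    """Belief of [a, b]: mass of focal intervals [x, y] included in [a, b]."""
    return sum(m for (x, y), m in focal_sets if x >= a and y <= b)

def q(a, b):
    """Commonality of [a, b]: mass of focal intervals containing [a, b]."""
    return sum(m for (x, y), m in focal_sets if x <= a and y >= b)

def pl(a, b):
    """Plausibility of [a, b]: mass of focal intervals intersecting [a, b]."""
    return sum(m for (x, y), m in focal_sets if x <= b and y >= a)

def betf(s):
    """Pignistic density at s: each focal interval spreads its mass uniformly."""
    return sum(m / (y - x) for (x, y), m in focal_sets if x <= s <= y)

print(bel(0.2, 0.7), q(0.2, 0.7), pl(0.2, 0.7), betf(0.35))

# Betf integrates to 1 over [0, 1] (crude Riemann-sum check):
n = 100000
print(sum(betf((i + 0.5) / n) for i in range(n)) / n)
```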

    2.2 Continuous domain

    What we described so far essentially remains valid, except that masses become densities and sums become integrals. Let m([a, b]) be a basic belief density (bbd); we write bbd instead of bba to emphasize that m is now a density, with the property that:

    (1)  ∫_{x=0}^{1} ∫_{y=x}^{1} m([x, y]) dy dx ≤ 1.

    The integral in (1) may result in a value less than 1, with the missing belief allocated to the empty set, just as is done in the TBM [ ]. The belief, commonality and plausibility of an interval [a, b] are obtained by integrating the bbd, with the limits of integration defined by the shaded areas in Figure 3. Thus we have:

    bel([a, b]) = ∫_{x=a}^{b} ∫_{y=x}^{b} m([x, y]) dy dx,  q([a, b]) = ∫_{x=0}^{a} ∫_{y=b}^{1} m([x, y]) dy dx,  pl([a, b]) = ∫_{x=0}^{b} ∫_{y=max(x,a)}^{1} m([x, y]) dy dx.

    Using derivative-integral identities one can also write:

    (2)

    (3)

    The pignistic probability density Betf is obtained from the bbd as follows:

    (4)  Betf(a) = lim_{ε→0} ∫_{x=0}^{a} ∫_{y=a}^{1} m([x, y]) / (y − x + ε) dy dx

    for a ∈ [0,1]. We do not put ε = 0 directly in (4) in order to avoid division by zero.

    2.2.1 Example 2.

    Suppose the basic belief density is uniform on the triangle T[0,1], that is, m([x, y]) = 2 for 0 ≤ x ≤ y ≤ 1. Then using (4) we get

    Betf(a) = −2 [ a ln a + (1 − a) ln(1 − a) ]

    for 0 < a < 1 (see Figure 4).

    Fig. 4 Bet f (a) generated by a uniform density on T[0,1]
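    Example 2 can be checked numerically: assuming a uniform bbd on T[0,1] (density m([x, y]) = 2, total mass 1), the double integral in (4) can be evaluated on a grid and compared with the closed form quoted above.

```python
import numpy as np

def betf_uniform_bbd(a, n=1000):
    """Pignistic density at a for a bbd uniform on T[0,1] = {(x, y): 0 <= x <= y <= 1},
    i.e. m([x, y]) = 2. Evaluates Betf(a) = integral over {x <= a <= y} of 2/(y - x)."""
    xs = (np.arange(n) + 0.5) * (a / n)              # midpoints of x-cells in [0, a]
    ys = a + (np.arange(n) + 0.5) * ((1.0 - a) / n)  # midpoints of y-cells in [a, 1]
    X, Y = np.meshgrid(xs, ys)
    return np.sum(2.0 / (Y - X)) * (a / n) * ((1.0 - a) / n)

for a in (0.1, 0.35, 0.5):
    closed_form = -2.0 * (a * np.log(a) + (1 - a) * np.log(1 - a))
    print(a, betf_uniform_bbd(a), closed_form)   # agree up to discretization error
```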

    2.2.2 Generalisation to ℝ.

    So far we have developed belief functions on [0,1]; one just needs to replace 0 (the lower limit) with −∞ and 1 (the upper limit) with +∞. Thus [0,1] is replaced with (−∞, ∞). Let us denote by I the set of closed intervals of ℝ and by T the set of pairs (x, y) ∈ ℝ² with x ≤ y. Then we say that m, bel, q and pl are defined on the Borel sigma algebra generated by I.

    3 The least committed bbd

    Suppose your domain knowledge is partial and based only on some potential betting behaviour, represented by the pignistic density function Betf(a). Since the pignistic transform is a many-to-one transform, an infinite number of belief density functions can induce the same Betf. These belief functions are said to be isopignistic. In order to apply the belief function theory (in the continuous domain) one needs to formulate a method of building a belief density (BD) from the pignistic density. The least commitment principle [11], [5] suggests choosing, among all isopignistic belief densities, the belief density which maximizes the commonality function q. As in the discrete case [12], the solution is a consonant belief density: all focal sets on I are nested, i.e. they can be ordered in such a way that each focal interval is contained in the following one.

    We will further concentrate on a unimodal pignistic density with mode μ = argmax_a Betf(a). The focal sets of the least committed (LC) belief density are intervals [a, b] which satisfy Betf(a) = Betf(b). Consequently, for every focal interval [a, b] of the LC-BD we have that μ ∈ [a, b]. Another very important property of the focal intervals of the LC-BD is that they form a line on the triangle T. This line has the following properties:

    – It starts from (x, y) = (μ, μ); the plausibility at this point is pl([μ, μ]) = 1.

    – For all symmetrical pignistic densities Betf (e.g. normal, Laplace, Cauchy) centered at μ, this is the straight line given by x + y = 2μ.

    Figure 5 shows the line of focal intervals in T for (a) a normal pignistic density with μ = 2.5 and σ = 1; (b) a gamma pignistic density Betf(s) = s e^{−s} (s > 0), with mode μ = 1.

    Fig. 5 The focal sets of the LC belief density (solid line in the upper triangle) induced by: (a)normal pignistic density; (b) gamma pignistic density

    The relationship between Betf and any basic belief density in general is expressed by (4). Let us denote the LC bbd (induced by Betf) as φ(u), where u ≥ 0. We have seen that the focal sets of this bbd are points on a line in T, and u corresponds to the distance from the point (μ, μ). Due to this specific form of the LC bbd, the relationship between Betf(s) and φ(u) has a much simpler form than in (4). If s ≥ μ, then

    (5)

    is a function of u. By differentiation of (5) we obtain that:

    (6)

    The bbd φ(s) is always positive since:

    For model-based classification problems, we apply the generalised Bayes theorem [11], which requires computing the plausibility function from the bbd. Since the LC bbd is a consonant belief function, with the property that its focal sets are the points along a line in T, we can write:

    (7)

    (8)

    The limits of integration in (7) reflect the fact that only the focal intervals with the property x ≤ a ≤ ∞ have a non-empty intersection with {x}; we obtain:

    (9)

    3.1 Example 3.

    Suppose the pignistic density is a normal density, i.e. Betf(x) = N(x; μ, σ). In order to work out the LC bbd φ(x) and its corresponding plausibility pl(x), we use the substitution y = (x − μ)/σ. Application of (6) and (9) yields for y > 0:

    (10)

    (11)

    It follows that φ(x) = φ(y)/σ and pl(x) = pl(y). The two functions are shown in Figure 6 for μ = 1 and σ = 1.5.

    Fig. 6 The LC bbd φ(x) (thin line) and its plausibility pl(x) (thick line), corresponding to Betf(x) = N(x; 1, 1.5)
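    Since the closed forms (10) and (11) are not reproduced in this preview, the following is only a rough numerical companion to Example 3. It parametrizes the nested focal intervals by their half-width u (which may differ from the paper's parametrization by a constant factor) and uses the relation φ(u) = −2u·Betf′(μ + u), which holds for a symmetric unimodal Betf under this parametrization.

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 1.0, 1.5   # as in Example 3

def betf(s):
    return norm.pdf(s, loc=mu, scale=sigma)

# Nested focal intervals [mu - u, mu + u], u >= 0; phi(u) is the mass density over
# the half-width u, obtained from phi(u) = -2u * dBetf/ds at s = mu + u
# (here by a central finite difference).
us = np.linspace(0.0, 10.0 * sigma, 20001)
du, eps = us[1] - us[0], 1e-6
d_betf = (betf(mu + us + eps) - betf(mu + us - eps)) / (2.0 * eps)
phi = -2.0 * us * d_betf

print(np.all(phi >= -1e-9), phi.sum() * du)       # nonnegative, integrates to ~1

def pl_point(x):
    """Plausibility of the singleton {x}: total mass of focal intervals covering x."""
    return phi[us >= abs(x - mu)].sum() * du

print(pl_point(mu), pl_point(mu + 2.0 * sigma))   # ~1.0 at the mode, smaller away
```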

    3.2 Example 4.

    Let Betf(x) be an exponential density:

    (12)  Betf(x) = (1/θ) e^{−(x−a)/θ},  x ≥ a.

    Using the substitution y = (x − a)/θ we proceed as before; φ(x) = φ(y)/θ and pl(x) = pl(y).

    4 Application to model-based target classification

    In order to demonstrate an application of the theory presented above, let us consider one of the most difficult problems in military air surveillance: correct identification of non-cooperative flying objects in the surveillance volume. In general three groups of target attributes (features) are exploited for identification, those based on target shape, kinematic behaviour and electro-magnetic (EM) emissions [1]. Let us consider a simple example where the aim is to classify targets into one of the three platform categories [7]:

    Class 1 – Commercial planes;

    Class 2 – Large military aircraft (such as transporters, bombers);

    Class 3 – Light and agile military aircraft (fighter planes).

    4.1 Speed as a target feature

    We will assume that the only available target feature is its speed (a kinematic feature obtained from the radar) [2], [16]. The speed profiles for our three classes can be described by Table 2 [2]:

    Table 2

    Speed intervals for three air platform categories (in km/h)

    First we present target classification using the Bayesian classifier, which is followed by the Belief function classifier.

    4.1.1 Bayesian analysis

    In order to apply the Bayesian classifier we must adopt a suitable probability density function of the speed conditioned on the class. Various possibilities are applicable, such as the uniform, beta, Gaussian, etc. Let us adopt Gaussian densities, with the parameters selected in such a way that P{s_min < x < s_max} = 0.99876, where [s_min, s_max] is the speed interval given in Table 2. Figure 7 shows the distribution of the speed feature conditioned on the class. The hypothesis space is defined

    Fig. 7 Adopted pdf models (Gaussian) of target speed, conditioned on the class

    as C = {c_1, c_2, c_3}. The Bayesian classifier (assuming a uniform prior over the classes) will compute the probability of class c_i (i = 1, 2, 3) given feature x as:

    (13)  P{c_i | x} = α p(x | c_i),

    where α is a normalisation constant. Figure 8 displays the class probabilities P{c_i | x} computed for a range of speed values x ∈ [400, 1000] km/h.
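    A sketch of the classifier (13) with uniform priors. The speed intervals of Table 2 are not reproduced in this preview, so the values below are placeholders; the factor 3.23 comes from requiring P{s_min < x < s_max} = 0.99876 for a Gaussian.

```python
from scipy.stats import norm

# Hypothetical speed intervals [s_min, s_max] in km/h (Table 2 is not reproduced
# in this preview, so these values are placeholders for illustration only).
speed_intervals = {
    "c1 (commercial)":     (600.0, 1000.0),
    "c2 (large military)": (550.0, 1100.0),
    "c3 (fighter)":        (650.0, 1800.0),
}
Z = 3.23   # P{|x - mu| < Z*sigma} ~= 0.99876 for a Gaussian

def class_likelihoods(x):
    """Gaussian p(x | c_i) with mu, sigma chosen so that P{s_min < x < s_max} = 0.99876."""
    out = {}
    for c, (smin, smax) in speed_intervals.items():
        mu = 0.5 * (smin + smax)
        sigma = (smax - smin) / (2.0 * Z)
        out[c] = norm.pdf(x, loc=mu, scale=sigma)
    return out

def posterior(x):
    """Bayesian classifier (13): P{c_i | x} = alpha * p(x | c_i), uniform priors."""
    likes = class_likelihoods(x)
    alpha = 1.0 / sum(likes.values())
    return {c: alpha * v for c, v in likes.items()}

for x in (700.0, 900.0, 1400.0):
    print(x, posterior(x))
```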
