Abstract
Traditional databases handle data that is crisp, deterministic and precise in nature. However, our
reasoning and decision-making processes are uncertain and vague. This paper gives an insight into
the world of uncertainty. It explains the concept of fuzziness in databases and the ways of handling
fuzzy queries to crisp and fuzzy databases. We propose two models by which
uncertainty can be handled in databases. The first model deals with fuzzy queries to a crisp database, while
the second model deals with the storage and retrieval of fuzzy information in a database. A prototype of both
models has also been implemented in JAVA.
Keywords:
Fuzzy Logic, Fuzzy Sets, Fuzzy Relational Databases, Fuzzy SQL
Most of our traditional tools for formal modeling, reasoning and computing are
crisp, deterministic and precise in nature. Precision assumes that the parameters
of a model represent exactly either our perception of the phenomenon modeled
or the features of the real system that has been modeled. Certainty eventually
indicates that we assume the structures and parameters of the model to be
definitely known.
However, if the model or theory asserts factuality, then the modeling language
has to be suited to model the characteristics of the situation under study
appropriately. For factual models or modeling
languages, two major complications arise:
1. Real situations are very often not crisp and deterministic and cannot be
described precisely, i.e., real situations are very often uncertain or vague
in a number of ways.
The last two types of uncertainties can be classified as a higher uncertainty type,
ambiguity, which means any situation in which it remains unclear which of
several alternatives should be accepted as the genuine one. In general, ambiguity
results from lack of certain distinctions characterizing an object, from conflicting
distinctions or from both of these.
PROJECT (Student_Name)
WHERE 19 ≤ AGE ≤ 23 and 3 ≤ GPA ≤ 4
But this system has a major flaw. Consider a student, Krishna, whose age is 24
and who has a perfect GPA of 4 out of 4. He should have been selected but is not,
because of the rigid boundary conditions set by crisp logic.
In fuzzy logic we would do the same by specifying two fuzzy sets, YOUNG and
GPA (Fig. 1), and each student will have some membership grade associated with the two sets.
Fig. 1: (a) Age, with axis marks at 0, 17, 19, 23 and 25; (b) GPA, with axis marks at 0, 3, 3.5 and 4. Both vertical axes run up to 1.
So according to our definition Krishna will have a non-zero membership grade
in YOUNG, although it will be lower than that of students in the age group 19-23.
Hence Krishna will also be included in the result set, since he
satisfies the query to some extent, which is represented by his membership
grade.
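The membership grades described above can be computed with simple piecewise-linear functions. The following is a minimal sketch, assuming the breakpoints read off Fig. 1 (17, 19, 23 and 25 for age; a ramp from 3 to 3.5 for GPA). The function names and the use of min to combine the two grades are illustrative assumptions, not part of the original text.

```python
def trapezoid(x, a, b, c, d):
    """Grade of x in a trapezoidal fuzzy set with a < b <= c < d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    if x <= c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling edge


def rising_ramp(x, a, b):
    """Grade that climbs from 0 at a to 1 at b and stays at 1 afterwards."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def young(age):
    # Assumed breakpoints, read off Fig. 1 (a).
    return trapezoid(age, 17, 19, 23, 25)


def good_gpa(gpa):
    # Assumed shape, read off Fig. 1 (b): ramp from 3 to 3.5, then grade 1.
    return rising_ramp(gpa, 3.0, 3.5)


def query_grade(age, gpa):
    # Assumption: the two conditions are combined with min (fuzzy AND).
    return min(young(age), good_gpa(gpa))


krishna = query_grade(24, 4.0)
print(krishna)
```

Under these assumed breakpoints Krishna (age 24, GPA 4) receives a grade of 0.5 instead of being rejected outright, while a student aged 21 with the same GPA receives 1.0.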
Definition:
• Union, OR
• Intersection, AND
• Complement, NOT
• Fuzzy Complement,
~A(x) = 1 - A(x)
• Fuzzy Union,
(A∪B)(x) = max[A(x), B(x)].
• Fuzzy Intersection,
(A∩B)(x) = min[A(x), B(x)].
More information regarding fuzzy operators and their properties can be found in
[5], [10].
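The three fuzzy operators defined above translate directly into code. The sketch below represents a fuzzy set as a dictionary from element to membership grade, which is a representation chosen here for illustration; the student names other than Krishna and all grades are hypothetical.

```python
def fuzzy_complement(a):
    """~A(x) = 1 - A(x)"""
    return {x: 1 - grade for x, grade in a.items()}


def fuzzy_union(a, b):
    """(A∪B)(x) = max[A(x), B(x)]; a missing element has grade 0."""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in a.keys() | b.keys()}


def fuzzy_intersection(a, b):
    """(A∩B)(x) = min[A(x), B(x)]; a missing element has grade 0."""
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in a.keys() | b.keys()}


# Hypothetical membership grades for three students.
young = {"Asha": 1.0, "Krishna": 0.5, "Ravi": 0.0}
good_gpa = {"Asha": 0.25, "Krishna": 1.0, "Ravi": 0.75}

not_young = fuzzy_complement(young)
young_or_good = fuzzy_union(young, good_gpa)
young_and_good = fuzzy_intersection(young, good_gpa)
```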
3. Fuzzy Databases
3.1 Need For Fuzzy Databases
One of the major concerns in the design and implementation of fuzzy databases
is efficiency, i.e., these systems must be fast enough to make interaction with
human users feasible.
Fig. 2: Possibility distribution for an approximate value (APPROX X), spanning X - d to X + d
The parameter d gives the margin around X within which the information value
lies.
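As an illustration of how such an approximate value could be evaluated, the sketch below assumes a triangular distribution that peaks at X and falls linearly to 0 at X - d and X + d. The text only fixes the span, so the triangular shape and the function name are assumptions.

```python
def approx_grade(value, x, d):
    """Possibility that `value` matches APPROX x with margin d.

    Assumed shape: triangular, 1 at x and 0 at x - d and x + d.
    """
    if d <= 0:
        raise ValueError("margin d must be positive")
    return max(0.0, 1.0 - abs(value - x) / d)


# Hypothetical example: APPROX 85 with a margin of 5.
print(approx_grade(85, 85, 5))
print(approx_grade(87.5, 85, 5))
print(approx_grade(90, 85, 5))
```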
The information in this case is totally vague and we associate a fuzzy set with the
information. A linguistic term is the name given to the fuzzy set.
e.g., X is SMALL
Temperature is HOT
These are considered to have a trapezoidal possibility distribution, as shown
below.
Fig. 3: Possibility distribution for the linguistic term SMALL of the linguistic variable HEIGHT (grades from 0 to 1, with the parameters α, β, γ and δ marked on the horizontal axis)
The easiest way of introducing fuzziness in the database model is to use classical
relational databases and formulate a front end to it that shall allow fuzzy
querying to the database. A limitation imposed on the system is that because we
are not extending the database model nor are we defining a new model in any
way, the underlying database model is crisp and hence the fuzziness can only be
incorporated in the query.
Fig. 4: Age (axis marks, in order: αY; βY; γY, αM; δY, βM; γM, αO; δM, βO; γO; δO)
For this we take the example of a student database which has a table STUDENTS
with the following attributes:
At the level of meta knowledge we need to add only a single table, LABELS with
the following structure:
LABELS
This table is used to store the information of all the fuzzy sets defined on all the
attribute domains. A description of each column in this table is as follows:
• Label: This is the primary key of this table and stores the linguistic term
associated with the fuzzy set.
• Column_Name: Stores the linguistic variable associated with the given
linguistic term.
• Alpha, Beta, Gamma, Delta: Stores the range of the fuzzy set as shown in
Fig. 3 above.
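The role of the LABELS table in a fuzzy front end over a crisp database can be sketched as follows. This is only one possible design, not the paper's implementation: it uses an in-memory SQLite database instead of MySQL, and the STUDENTS columns, the student names other than Krishna, the YOUNG parameters and the function fuzzy_select are all hypothetical.

```python
import sqlite3


def trapezoid(x, alpha, beta, gamma, delta):
    """Membership grade for the trapezoid (alpha, beta, gamma, delta)."""
    if x <= alpha or x >= delta:
        return 0.0
    if x < beta:
        return (x - alpha) / (beta - alpha)
    if x <= gamma:
        return 1.0
    return (delta - x) / (delta - gamma)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENTS (NAME TEXT, AGE INTEGER)")
conn.execute(
    "CREATE TABLE LABELS (Label TEXT PRIMARY KEY, Column_Name TEXT, "
    "Alpha REAL, Beta REAL, Gamma REAL, Delta REAL)"
)
conn.executemany(
    "INSERT INTO STUDENTS VALUES (?, ?)",
    [("Krishna", 24), ("Asha", 21), ("Ravi", 30)],
)
conn.execute("INSERT INTO LABELS VALUES ('YOUNG', 'AGE', 17, 19, 23, 25)")


def fuzzy_select(conn, column, label):
    """Evaluate '<column> IS <label>' against the crisp STUDENTS table."""
    params = conn.execute(
        "SELECT Alpha, Beta, Gamma, Delta FROM LABELS "
        "WHERE Label = ? AND Column_Name = ?",
        (label, column),
    ).fetchone()
    if params is None:
        raise ValueError(f"no label {label!r} defined on {column!r}")
    alpha, beta, gamma, delta = params
    # The column name matched a Column_Name stored in LABELS, so it is
    # safe to place in the statement. The crisp WHERE clause keeps only
    # rows inside the support of the fuzzy set (grade > 0).
    rows = conn.execute(
        f"SELECT NAME, {column} FROM STUDENTS "
        f"WHERE {column} > ? AND {column} < ? ORDER BY NAME",
        (alpha, delta),
    ).fetchall()
    return [(name, trapezoid(v, alpha, beta, gamma, delta)) for name, v in rows]


result = fuzzy_select(conn, "AGE", "YOUNG")
print(result)
```

With these hypothetical rows, Asha (21) is returned with grade 1.0 and Krishna (24) with grade 0.5, while Ravi (30) falls outside the support and is filtered out by the crisp query itself.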
4.3 Implementation
The main issue in the implementation of this system is the parsing of the input
fuzzy query.
As the underlying database is crisp, i.e., no fuzzy data is stored in it,
the INSERT query does not change and need not be parsed; it can therefore be
presented to the database as it is.
During parsing the query is parsed and divided into the following
The implementation of the proposed system has been done in JAVA using a
MySQL database as the backend and the mm.mysql.jdbc-1.2c type 3 JDBC
driver.
In the previous section, we discussed how vague queries can be used on
relational databases. We now present the design of a Fuzzy Relational Database
to which fuzzy queries can be applied and in which fuzzy information can
also be stored.
Consider the same database as given in section 4.1, with the difference that
the attributes AGE, PERCENTAGE and ABSENCES can now hold fuzzy
information, while the remaining attributes are considered to be crisp.
Based on this classification of information, the attributes in the database are
of two types:
• Type 1 : The attribute can store only crisp values.
• Type 2 : The attribute is fuzzy and can take either a crisp value, an
approximate value or a linguistic term.
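The three kinds of value that a Type 2 attribute can take might be told apart as in the sketch below. The textual form "APPROX 85" follows the query examples in this paper, while the function name, the returned tuples and the set of known labels are assumptions made for illustration.

```python
def parse_type2_value(text, known_labels):
    """Classify a Type 2 value as crisp, approximate or linguistic.

    Returns ("approx", x), ("label", name) or ("crisp", x).
    Raises ValueError if the text is none of the three.
    """
    token = text.strip()
    upper = token.upper()
    if upper.startswith("APPROX "):
        return ("approx", float(token[len("APPROX "):]))
    if upper in known_labels:
        return ("label", upper)
    return ("crisp", float(token))


# Hypothetical labels, as they might be defined in the meta knowledge.
labels = {"LOW", "YOUNG"}

print(parse_type2_value("85", labels))
print(parse_type2_value("APPROX 85", labels))
print(parse_type2_value("low", labels))
```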
5.2 Metadata
In this case, at the level of Meta knowledge we require three tables as discussed
below.
1. COLUMNS_IN_DB
Column_Name Type
Fig. 8 (a): Part of Meta Knowledge
This table stores the types (section 5.1) of all the attributes in the table. A
description of each column in this table is as follows:
• Column_Name: This is the primary key of this table and its tuples
correspond to the attributes in the table, STUDENTS.
• Type: This stores the type of the corresponding attribute and this
can have two values, namely, 1 and 2 for the two types as
mentioned in section 5.1 (crisp and fuzzy).
2. APPROXIMATE_VALUES_TABLE
Column_Name Margin
Fig. 8 (b): Part of Meta Knowledge
3. LABELS
Label Column_Name Alpha Beta Gamma Delta
Fig. 8 (c): Part of Meta Knowledge
This table is used to store the information of all the fuzzy sets defined on
all the attribute domains, along with their parameters α, β, γ and δ, as
shown in Fig. 3. A description of each column in this table is as follows:
• Label: This is the primary key of this table and stores the linguistic
term associated with the fuzzy set.
• Column_Name: This is a foreign key here and corresponds to
COLUMNS_IN_DB.
• Alpha / Beta / Gamma / Delta: These correspond to the
parameters α, β, γ and δ.
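The three meta-knowledge tables could be declared as in the sketch below. It uses an in-memory SQLite database for self-containment rather than the MySQL backend of the prototype. The column types, the CHECK constraint, the foreign key on APPROXIMATE_VALUES_TABLE, the NAME attribute and all inserted values are assumptions, and the margin is taken to be the parameter d of Fig. 2.

```python
import sqlite3

SCHEMA = """
CREATE TABLE COLUMNS_IN_DB (
    Column_Name TEXT PRIMARY KEY,
    Type        INTEGER NOT NULL CHECK (Type IN (1, 2))
);
CREATE TABLE APPROXIMATE_VALUES_TABLE (
    Column_Name TEXT REFERENCES COLUMNS_IN_DB (Column_Name),
    Margin      REAL NOT NULL
);
CREATE TABLE LABELS (
    Label       TEXT PRIMARY KEY,
    Column_Name TEXT REFERENCES COLUMNS_IN_DB (Column_Name),
    Alpha REAL, Beta REAL, Gamma REAL, Delta REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

# Type 1 = crisp, Type 2 = fuzzy (section 5.1). NAME is hypothetical.
conn.executemany(
    "INSERT INTO COLUMNS_IN_DB VALUES (?, ?)",
    [("NAME", 1), ("AGE", 2), ("PERCENTAGE", 2), ("ABSENCES", 2)],
)
# Hypothetical margin and label parameters.
conn.execute("INSERT INTO APPROXIMATE_VALUES_TABLE VALUES ('PERCENTAGE', 5.0)")
conn.execute("INSERT INTO LABELS VALUES ('YOUNG', 'AGE', 17, 19, 23, 25)")


def fuzzy_columns(conn):
    """Names of all Type 2 (fuzzy) attributes, in alphabetical order."""
    rows = conn.execute(
        "SELECT Column_Name FROM COLUMNS_IN_DB WHERE Type = 2 "
        "ORDER BY Column_Name"
    ).fetchall()
    return [name for (name,) in rows]


def margin_for(conn, column):
    """Margin stored for approximate values of a column, or None."""
    row = conn.execute(
        "SELECT Margin FROM APPROXIMATE_VALUES_TABLE WHERE Column_Name = ?",
        (column,),
    ).fetchone()
    return row[0] if row else None


print(fuzzy_columns(conn))
print(margin_for(conn, "PERCENTAGE"))
```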
5.3 Implementation
Here again the main issue in the implementation of this system is the parsing of
the input fuzzy query.
During parsing the query is parsed and divided into the following
The implementation of the proposed system has been done in JAVA using a
MySQL database as the backend and the mm.mysql.jdbc-1.2c type 3 JDBC
driver.
6. Query Language
The syntax of the query language remains the same for both the models and is
defined as follows:
SELECT
The syntax of the SELECT statement is as follows
CON is a connective that is used to combine two conditions, e.g. OR or
AND.
THOLD specifies the alpha cut [5][10] that is to be applied to the result
set.
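The effect of CON and THOLD on a result set can be sketched as follows, assuming that AND and OR are evaluated with the min and max operators of section 2 and that the alpha cut keeps every row whose grade is at least THOLD. The function names, student names other than Krishna, and grades are hypothetical.

```python
def combine(grade1, con, grade2):
    """Combine the grades of two conditions with the connective CON."""
    con = con.upper()
    if con == "AND":
        return min(grade1, grade2)   # fuzzy intersection
    if con == "OR":
        return max(grade1, grade2)   # fuzzy union
    raise ValueError(f"unsupported connective: {con}")


def alpha_cut(graded_rows, thold):
    """Keep the rows whose membership grade is at least thold."""
    return [(row, grade) for row, grade in graded_rows if grade >= thold]


# Hypothetical grades of two conditions for three students.
graded = [
    ("Asha", combine(1.0, "AND", 0.25)),
    ("Krishna", combine(0.5, "AND", 1.0)),
    ("Ravi", combine(0.0, "AND", 0.75)),
]
result = alpha_cut(graded, 0.5)
print(result)
```

A THOLD of 0.5 keeps only Krishna here; lowering it to 0.25 would also admit Asha.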
The type of operations and their syntax that we shall allow to the database are:
a. INSERT
This is the same as specified in SQL and has the following structure,
b. DELETE
The structure of the DELETE statement is
DELETE
FROM <TABLE>
[WHERE <CONDITION1> [<CON> <CONDITION2> …]]
where CONDITION and CON are defined as in section 6 above.
e.g. DELETE
FROM STUDENTS
WHERE PERCENTAGE > 85 AND ABSENCES ARE LOW
c. UPDATE
The structure of the UPDATE statement is
UPDATE <TABLE>
SET VALUES <ATTRIBUTE1> = <expression1>
[, <ATTRIBUTE2> = <expression2> …]
[WHERE <CONDITION1> [<CON> <CONDITION2> …]]
This is a minimal set of operators and more can be added if the need arises.
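A WHERE clause such as the one in the DELETE example above has to be split into its conditions and connectives before the fuzzy parts can be evaluated. The sketch below shows one way this could be done with regular expressions; it is not the parser of the prototype. It assumes that only AND and OR occur as connectives, that a fuzzy condition has the form "<attribute> IS|ARE <label>" (IS is an assumption, only ARE appears in the examples), and that no string literal contains " AND " or " OR ".

```python
import re

CONNECTIVE = re.compile(r"\s+(AND|OR)\s+", re.IGNORECASE)
FUZZY_CONDITION = re.compile(r"^(\w+)\s+(?:IS|ARE)\s+(\w+)$", re.IGNORECASE)


def split_where(clause):
    """Split a WHERE clause into (conditions, connectives)."""
    # With a capturing group, re.split keeps the matched connectives,
    # so conditions and connectives alternate in the result.
    parts = CONNECTIVE.split(clause.strip())
    conditions = parts[0::2]
    connectives = [con.upper() for con in parts[1::2]]
    return conditions, connectives


def classify(condition):
    """Tag a condition as fuzzy (attribute, label) or crisp (raw text)."""
    match = FUZZY_CONDITION.match(condition.strip())
    if match:
        return ("fuzzy", match.group(1).upper(), match.group(2).upper())
    return ("crisp", condition.strip())


conditions, connectives = split_where("PERCENTAGE > 85 AND ABSENCES ARE LOW")
tagged = [classify(c) for c in conditions]
print(tagged, connectives)
```

The crisp part can then be passed to the database unchanged, while each fuzzy part is resolved through the LABELS table.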
The type of operations and their syntax that we shall allow to the database are:
a. INSERT
b. DELETE
The structure of the DELETE statement is
DELETE
FROM <TABLE>
[WHERE <CONDITION1> [<CON> <CONDITION2> …]]
e.g. DELETE
FROM STUDENTS
WHERE PERCENTAGE > APPROX 85 AND ABSENCES ARE LOW
c. UPDATE
The structure of the UPDATE statement is
UPDATE <TABLE>
SET VALUES <ATTRIBUTE1> = <expression1>
[, <ATTRIBUTE2> = <expression2> …]
[WHERE <CONDITION1> [<CON> <CONDITION2> …]]
This is a minimal set of operators and more can be added if the need arises.
Even though the second model is more flexible, fuzzy databases are still not
widely used, because people are reluctant to replace their crisp data with fuzzy data
before they are convinced that it is worthwhile or necessary to do so. From this
point of view, the first model scores over the second, as it works with
crisp data while still making use of the power of fuzzy theory.
[1] Bosc P., Liétard I and Pivert O. “Evaluation of flexible queries : The
quantified statement case”, Technologies for Constructing Intelligent
Systems I, Physica-Verlag Heidelberg New York, pp 337-350 (2002)
[5] Klir G. J. and Yuan B. [2001], “Fuzzy Sets and Fuzzy Logic : Theory and
Applications”, Prentice Hall, Inc. Englewood Cliffs, N. J., U.S.A.
[6] Medina J. M., Pons O., Vila M.A. “GEFRED, A Generalized Model of
Fuzzy Relational Databases”. Information Sciences, 76, 1-2, pp 87-109.
(1994)
[7] Yang Q., Zhang W., Liu C., Wu J., Nakajima H. and Rishe N.D. “Efficient
Processing of Nested Fuzzy SQL Queries in a Fuzzy Database”, IEEE
Trans. On Knowledge and Data Eng., vol. 13, no. 6, pp. 884-901, Nov/Dec
2001
[8] Zadeh, L.A. “Fuzzy Sets.” Information and Control, 8(3), pp. 338-353.
(1965)
[10] Zimmerman J. [2001], “Fuzzy Set Theory – And Its Applications”, Kluwer
Academic Publishers, Norwell, Massachusetts, U.S.A.
Profile of the authors