
Copyright Protection of Relational Database Systems

Ali Al-Haj1, Ashraf Odeh2, and Shadi Masadeh3


1 Princess Sumaya University for Technology, Amman, Jordan
ali@psut.edu.jo
2 Royal Scientific Society, Amman, Jordan
asrf_odeh@yahoo.com
3 Al-Isra Private University, Amman, Jordan
masadeh@ipu.edu.jo

Abstract. Due to the increasing use of databases in many real-life applications,
database watermarking has been suggested lately as a vital technique for
copyright protection of databases. In this paper, we propose an efficient database
watermarking algorithm based on inserting a binary image watermark in Arabic
character attributes. Experimental results demonstrate the robustness of the
proposed algorithm against common database attacks.

Keywords: Relational databases, copyright protection, watermarking.

1 Introduction
A considerable amount of research has been done on watermarking multimedia data
[1,2,3]; however, there has been relatively little research on watermarking
database systems [4,5,6]. Database watermarking differs from database security, which
focuses on issues such as access control and data security rather than on
securing proof of rights over relational data. This is why database watermarking has
been suggested lately as a vital technique for copyright protection of databases. The
increasing use of databases in many real-life applications is creating an ever-increasing
need for watermarking databases. The following are examples where database
watermarking might be of crucial importance [7,8,9,10]:
• Protecting rights over outsourced relational databases is of ever-increasing
  interest, especially in areas where sensitive, valuable data is to be
  outsourced. A good example is a data mining application, where data is sold
  in pieces to parties specialized in mining it. Given the nature of the data, it is
  hard to associate the rights of the originator with it. Watermarking can be used to
  solve this issue.
• There are companies specialized in compiling large numbers of semiconductor
  parts into databases. Such companies license these databases at high
  prices to design engineers, so they need a methodology to verify ownership
  of their databases in cases where design engineers may manipulate the
  databases and claim ownership of them.
• The internet is exerting tremendous pressure on data providers to create
  web services that allow users to search and access databases remotely.
  While this trend is a boon to end users, it is exposing the data providers to
  the threat of data theft. They are therefore demanding capabilities for
  identifying pirated copies of their data.

F. Zavoral et al. (Eds.): NDT 2010, Part I, CCIS 87, pp. 143–150, 2010.
© Springer-Verlag Berlin Heidelberg 2010

Watermarking databases has unique requirements that differ from those of
watermarking digital audio or video products. Such requirements include maintaining
watermarked database usability, preserving database semantics, preserving database
structure, watermarked database robustness, blind watermark extraction, and
incremental updatability, among many others.
In this paper we describe a database watermarking algorithm for Arabic character
attributes. The remainder of this paper is organized as follows. In section 2 the
watermarking algorithm is described in detail. In section 3 the robustness of the
proposed algorithm is evaluated against common database attacks. Finally,
concluding remarks are given in section 4.

2 Proposed Arabic-Character Attribute-Based Watermarking


Many Arabic characters are expandable, and thus watermark bits can be hidden in
their extensions without sacrificing the readability and appearance of the character.
For example, the Arabic word for 'one' can be written in its normal form or in an
extended form. Both have the same meaning, but the second can carry binary
information. In the algorithm, a binary image is used to watermark the relational
database. The bits of the image are segmented into short binary strings that are
encoded in non-numeric, multi-word attributes of selected tuples of the database.
The embedding of each short string is based on expanding the first expandable
character of a word whose location is determined by the decimal equivalent of the
short string. Extraction of a short string is done by locating the word in which one
of the characters was expanded. The image watermark is then reconstructed by
converting the decimals back into binary strings. A major advantage of this approach
is the large bit-capacity available for hiding the watermark, which facilitates
embedding large watermarks or multiple small watermarks. This is in contrast to
bit-based algorithms, where watermark bits have limited potential locations that can
be used to hide bits without being subjected to removal or destruction. The proposed
algorithm has two procedures: a watermark embedding procedure and a watermark
extraction procedure. The two procedures are described in the following
sub-sections.

2.1 Watermark Embedding Procedure

The watermark embedding procedure consists of the following operational steps:

Step 1: Arrange the watermark image into m strings, each n bits long.
Step 2: Divide the database logically into sub-sets, each having m tuples.
Step 3: Embed the m short strings of the watermark image into each m-tuple sub-set.
Step 4: Embed each n-bit binary string in the corresponding tuple of a sub-set as
follows:
    • Find the decimal equivalent of the string. Let the decimal equivalent
      be d.
    • Embed the decimal number d in a pre-selected non-numeric, multi-word
      attribute by expanding the first expandable character of the dth
      word of the attribute.
Step 5: Repeat step 4 for each tuple in the sub-set.
Step 6: Repeat steps 4 and 5 for each sub-set of the database under watermarking.
As an example, consider a watermark that is a 3 x 3 binary image. Each of the three
3-bit binary strings is transformed into its decimal equivalent as shown in
Figure 1(a), and embedded in the 3-tuple sub-set, as shown in Figure 1(b). The
position of the word with the red-colored character extension (-) indicates the
decimal equivalent of the embedded short binary string. Extension is performed on
the first expandable character of the word.
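
The embedding steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: it assumes the character extension is realised with the Unicode tatweel (kashida) character U+0640, that a letter counts as "expandable" when it can join to a following letter, and that words are counted from zero so that the all-zero string maps to the first word (the text does not say how d = 0 is handled). The names `expand_word`, `embed_string` and `embed_watermark` are hypothetical.

```python
TATWEEL = "\u0640"  # Unicode Arabic tatweel (kashida); assumed extension mark

# Arabic letters that never join to the following letter (assumed list):
# hamza, alef variants, waw variants, teh marbuta, dal, thal, reh, zain,
# alef maksura. A tatweel cannot be placed after them.
NON_JOINING = set(
    "\u0621\u0622\u0623\u0624\u0625\u0627\u0629"
    "\u062F\u0630\u0631\u0632\u0648\u0649"
)


def is_arabic_letter(ch):
    """True for a basic Arabic letter other than the tatweel itself."""
    return "\u0621" <= ch <= "\u064A" and ch != TATWEEL


def expand_word(word):
    """Insert a tatweel after the first expandable character of `word`."""
    for i in range(len(word) - 1):
        if (is_arabic_letter(word[i]) and word[i] not in NON_JOINING
                and is_arabic_letter(word[i + 1])):
            return word[:i + 1] + TATWEEL + word[i + 1:]
    raise ValueError("word has no expandable character")


def embed_string(bits, attribute):
    """Hide one n-bit string in a multi-word attribute value (Step 4)."""
    d = int(bits, 2)                  # decimal equivalent of the short string
    words = attribute.split(" ")
    if d >= len(words):
        raise ValueError("attribute has too few words for this string")
    words[d] = expand_word(words[d])  # zero-based word index (assumption)
    return " ".join(words)


def embed_watermark(rows, attributes):
    """Embed the m rows of the watermark image into one m-tuple sub-set."""
    return [embed_string(bits, attr) for bits, attr in zip(rows, attributes)]
```

Under these assumptions an attribute value must contain at least 2^n words to carry every possible n-bit string, which is why the sketch raises an error rather than silently skipping a tuple; a real deployment would need a policy for attributes that are too short.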



Fig. 1. (a) Binary image watermark, and (b) its decimal equivalent vector

An illustration of the embedding procedure is given in Figures 1 and 2. The
binary image watermark is transformed into its decimal equivalent vector as shown in
Figure 1, and a snapshot of the relational database after embedding the watermark
throughout the database is shown in Figure 2. The tuples in the figure constitute the
records of the database, and the Ai's are the watermarked non-numeric, multi-word
attributes of each tuple.

2.2 Watermark Extraction Procedure


The watermark extraction procedure is blind. It requires neither knowledge of the
original un-watermarked database nor the watermark itself. This property is critical as
it allows the watermark to be detected in a copy of the database relation, irrespective
of later updates to the original relation. The watermark extraction procedure is a direct
reversal of the watermark embedding procedure as described in the following steps:
Step 1: Locate the tuples of each sub-set in the database.
Step 2: Locate the non-numeric, multi-word attribute of each tuple in the sub-set.
Step 3: In the selected attribute:
    • Find the word which has one of its characters expanded.
    • Count the position of the word, starting from the beginning.
    • Convert the count (a decimal number) into its equivalent binary
      string.
Step 4: Repeat steps 2 and 3 for all tuples of the sub-set.
Step 5: Construct the watermark by putting the extracted strings together into an
m x n image.
Step 6: Repeat steps 1 through 5 to extract all copies of the embedded watermark.
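
The extraction steps can be sketched in the same hedged way. The snippet below assumes that an extension appears in the data as the Unicode tatweel character U+0640 and that word positions are counted from zero; both are assumptions of this illustration, and the function names are hypothetical. Note that it needs only the suspect attribute values and the string length n, which reflects the blind nature of the procedure.

```python
TATWEEL = "\u0640"  # assumed extension mark (Unicode Arabic tatweel)


def extract_string(attribute, n):
    """Recover one n-bit string from a watermarked attribute value.

    Returns None when no word carries an extension (e.g. an unmarked tuple).
    """
    for index, word in enumerate(attribute.split(" ")):
        if TATWEEL in word:                          # find the expanded word
            return format(index, "0{}b".format(n))   # position -> n-bit string
    return None


def extract_watermark(attributes, n):
    """Rebuild the rows of the m x n watermark image from one sub-set."""
    return [extract_string(attr, n) for attr in attributes]
```

Returning None for an unmarked tuple lets a caller distinguish "no watermark found here" from a genuine all-zero string, which matters when tuples have been added or altered.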

[Table: rows tuple1 ... tuplen, columns A0 ... An; cell contents not reproduced]

3 Performance Evaluation

The database watermarking algorithm described in this paper has been evaluated
and tested on an experimental database that we constructed. The database
consists of 1200 tuples and runs on the Oracle platform. We concentrated our
performance evaluation on the robustness of the proposed algorithm, because
database watermarking algorithms must be developed in such a way as to make it
difficult for an adversary to remove or alter the watermark beyond detection without
destroying the value of the object. In particular, the database watermarking algorithm
should make the watermarked database robust against the following types of attacks:
subset deletion attack, subset addition attack, subset alteration attack, and
subset selection attack.

(1). Subset Deletion Attack: In this type of attack, the attacker may delete a subset of
the tuples of the watermarked database, hoping that the watermark will be removed.
The graph shown in Fig. 3 indicates that the watermark will be removed only if
most of the database tuples are deleted. That is, even the removal of 95% of
the database will not result in removing the watermark. This is due to the fact that the
proposed algorithm embeds the same watermark everywhere in the database, making
this type of attack ineffective.

[Plot: watermark detected (%) versus subset deletion (%)]
Fig. 3. Robustness results due to the 'subset deletion attack'
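
To illustrate why repeating the watermark makes deletion ineffective, the following Python sketch simulates random deletion. It is not the procedure used to produce the reported results: it assumes that tuple k carries row (k mod m) of the watermark (that is, sub-set membership is derived from a primary key) and that the watermark counts as recoverable when every row still has at least one surviving carrier. Both assumptions and all names are hypothetical.

```python
import random


def watermark_survives(surviving_keys, m):
    """True if every one of the m watermark rows still has a carrier tuple.

    Assumes tuple k carries row (k mod m) of the watermark (sketch only).
    """
    rows_present = {k % m for k in surviving_keys}
    return len(rows_present) == m


def simulate_deletion(num_tuples, m, fraction_deleted, seed=0):
    """Delete a random fraction of the tuples and test watermark recovery."""
    rng = random.Random(seed)
    keep = num_tuples - round(num_tuples * fraction_deleted)
    survivors = rng.sample(range(num_tuples), keep)
    return watermark_survives(survivors, m)
```

With 1200 tuples and m = 3, each row has 400 carriers, so a random deletion has to remove all 400 carriers of some row before recovery fails under this model.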

(2). Subset Addition Attack: In this type of attack, the attacker adds a set of tuples to
the original database. This type of attack has little impact on the watermark embedded
through our algorithm. The graph shown in Fig. 4 indicates that the watermark
will not be removed even if the added tuples are as many as the original tuples.
That is, only the added tuples will lack the watermark information.

[Plot: watermark detected (%) versus subset addition (%)]
Fig. 4. Robustness results due to the 'subset addition attack'

(3). Subset Alteration Attack: In this type of attack, the attacker alters the tuples of
the database through operations such as linear transformations, hoping to
remove the watermark from the database. The graph shown in
Fig. 5 indicates that the watermark will remain in the watermarked database even if
90% of the tuples of the database are altered. This is due to the fact that the
proposed algorithm embeds the same watermark everywhere in the database, making this
type of attack ineffective.

[Plot: watermark detected (%) versus subset alteration (%)]
Fig. 5. Robustness results due to the 'subset alteration attack'

(4). Subset Selection Attack: In this type of attack, the attacker randomly selects a
subset of the original database that might still provide value for its intended purpose.
The attacker hopes by doing so that the selected subset will not contain the
watermark. However, since the proposed algorithm embeds the watermark in the whole
database, this attack has little impact. The graph shown in Fig. 6 indicates that the
watermark will remain in the watermarked database even if the attacker selects a
subset as small as 10% of the original database. That is, no matter how small the
subset the attacker selects, the watermark will remain in the selected subset and thus
maintain the required copyright protection.

[Plot: watermark detected (%) versus subset selection (%)]
Fig. 6. Robustness results due to the 'subset selection attack'



Finally, we computed the embedding and extraction times of the Arabic-character
attribute-based watermarking algorithm as a function of the size of the database
(number of tuples used for watermarking). Furthermore, to compare the time performance
of the algorithm with that of other reported database watermarking algorithms, we
applied the algorithms of [11,12] to the same experimental database. The results are
shown in Fig. 7 for the embedding time and in Fig. 8 for the extraction time. As seen in
the figures, our algorithm takes more time than the other two algorithms.

[Plot "Embedding Results": embedding time (ms) versus data size (tuples), 1000 to
11000 tuples; curves: Statistical WM (Sion et al. 2003), Bit-based WM (Huang et al.
2004), Arabic-character attribute WM]
Fig. 7. Embedding time of the Arabic-character attribute-based WM algorithm compared
with two other algorithms

[Plot "Extracting Results": extracting time (ms) versus data size (tuples), 1000 to
11000 tuples; curves: Statistical WM (Sion et al. 2003), Bit-based WM (Huang et al.
2004), Arabic-character attribute WM]
Fig. 8. Extracting time of the Arabic-character attribute-based WM algorithm compared
with two other algorithms

4 Conclusions
In this paper, we described a watermarking algorithm based on hiding watermark bits
in the extensions of expandable Arabic characters of non-numeric, multi-word
attributes of subsets of tuples. A major advantage of this approach is the large
bit-capacity available to hide large watermarks. This is in contrast to other proposed
algorithms, where watermark bits have limited potential bit-locations that can be used
to hide them effectively without being subjected to removal or destruction. The
robustness of the proposed algorithm was verified against a number of database attacks
such as subset deletion, subset addition, subset alteration and subset selection attacks.

References
1. Potdar, V., Han, S., Chang, E.: A Survey of Digital Image Watermarking Techniques. In:
Proceedings of the IEEE International Conference on Industrial Informatics, pp. 709–716
(2005)
2. Langelaar, G., Setyawan, I.: Watermarking Digital Image and Video Data. IEEE Signal
Processing Magazine 17, 20–43 (2000)
3. Arnold, M.: Audio Watermarking: Features, Applications and Algorithms. In: Proc. of the
5th IEEE International Conference on Computer and Multimedia and Expo., pp. 1013–1016
(2000)
4. Agrawal, R., Haas, P., Kiernan, J.: Watermarking relational data: framework, algorithms
and analysis. The VLDB Journal – The International Journal on Very Large Data
Bases 12(3), 157–169 (2003)
5. Lee, Y., Swarup, V., Jajodia, S.: Fingerprinting Relational Databases: Schemes and
Specialties. IEEE Trans. Dependable and Secure Computing 2(1), 34–45 (2005)
6. Agrawal, R., Kiernan, J.: Watermarking Relational Databases. In: Proc. of the 28th
International Conference on Very Large Databases, Hong Kong, August 2002, pp. 946–950
(2002)
7. Zhang, Z., Jin, X., Wang, J., Li, D.: Watermarking Relational Database Using Image.
In: Proc. of the 2004 International Conference on Machine Learning and Cybernetics,
China, August 2004, pp. 1739–1744 (2004)
8. Hildebrandt, E., Saake, G.: User Authentication in Multi-database Systems. In: The 9th
International Workshop on Database and Expert Systems Applications, Vienna, Austria
(1998)
9. SIIA: Database Protection: Making the case for a new federal database protection law
(2000),
http://www.siia.net/sharedcontent/gove/issues/ip/dbbrief.html
10. Zhang, Z., Jin, X., Wang, J., Li, D.: A Robust Watermarking Scheme for Relational Data.
In: Proc. of the 13th Workshop on Information Technology and Engineering, December
2003, pp. 195–200 (2003)
11. Huang, M., Cao, J., Peng, Z., Fang, Y.: A new watermark mechanism for relational data.
In: Proc. of the 4th International Conference on Computer and Technology, China,
September 2004, pp. 946–950 (2004)
12. Sion, R., Atallah, M., Prabhakar, S.: Rights Protection for Relational Data. IEEE Trans.
Knowledge and Data Engineering 16(12), 1509–1525 (2004)
