
This paper appears in: Knowledge and Data Engineering, IEEE Transactions on

Issue Date: Jan. 2011


Volume: 23 Issue:1
On page(s): 51 - 63
ISSN: 1041-4347
INSPEC Accession Number: 11656217
Digital Object Identifier: 10.1109/TKDE.2010.100
Date of Publication: 17 June 2010
Date of Current Version: 22 November 2010
Sponsored by: IEEE Computer Society



[1] R. Agrawal and J. Kiernan, "Watermarking Relational Databases," Proc. 28th Int'l Conf. Very Large Data
Bases (VLDB '02), VLDB Endowment, pp. 155-166, 2002.
[2] P. Bonatti, S.D.C. di Vimercati, and P. Samarati, "An Algebra for Composing Access Control
Policies," ACM Trans. Information and System Security, vol. 5, no. 1, pp. 1-35, 2002.
[3] P. Buneman, S. Khanna, and W.C. Tan, "Why and Where: A Characterization of Data Provenance," Proc.
Eighth Int'l Conf. Database Theory (ICDT '01), J.V. den Bussche and V. Vianu, eds., pp. 316-330, Jan. 2001.
[4] P. Buneman and W.-C. Tan, "Provenance in Databases," Proc. ACM SIGMOD, pp. 1171-1173, 2007.
[5] Y. Cui and J. Widom, "Lineage Tracing for General Data Warehouse Transformations," The VLDB J., vol.
12, pp. 41-58, 2003.
[6] S. Czerwinski, R. Fromm, and T. Hodes, "Digital Music Distribution and Audio
Watermarking," http://www.scientificcommons.org/43025658, 2007.
[7] F. Guo, J. Wang, Z. Zhang, X. Ye, and D. Li, "An Improved Algorithm to Watermark Numeric Relational
Data," Information Security Applications, pp. 138-149, Springer, 2006.
[8] F. Hartung and B. Girod, "Watermarking of Uncompressed and Compressed Video," Signal
Processing, vol. 66, no. 3, pp. 283-301, 1998.
[9] S. Jajodia, P. Samarati, M.L. Sapino, and V.S. Subrahmanian, "Flexible Support for Multiple Access
Control Policies," ACM Trans. Database Systems, vol. 26, no. 2, pp. 214-260, 2001.
[10] Y. Li, V. Swarup, and S. Jajodia, "Fingerprinting Relational Databases: Schemes and Specialties," IEEE
Trans. Dependable and Secure Computing, vol. 2, no. 1, pp. 34-45, Jan.-Mar. 2005.
[11] B. Mungamuru and H. Garcia-Molina, "Privacy, Preservation and Performance: The 3 P's of Distributed
Data Management," technical report, Stanford Univ., 2008.
[12] V.N. Murty, "Counting the Integer Solutions of a Linear Equation with Unit Coefficients," Math.
Magazine, vol. 54, no. 2, pp. 79-81, 1981.
[13] S.U. Nabar, B. Marthi, K. Kenthapadi, N. Mishra, and R. Motwani, "Towards Robustness in Query
Auditing," Proc. 32nd Int'l Conf. Very Large Data Bases (VLDB '06), VLDB Endowment, pp. 151-162, 2006.
[14] P. Papadimitriou and H. Garcia-Molina, "Data Leakage Detection," technical report, Stanford Univ.,
2008.
[15] P.M. Pardalos and S.A. Vavasis, "Quadratic Programming with One Negative Eigenvalue Is NP-Hard," J.
Global Optimization, vol. 1, no. 1, pp. 15-22, 1991.
[16] J.J.K.O. Ruanaidh, W.J. Dowling, and F.M. Boland, "Watermarking Digital Images for Copyright
Protection," IEE Proc. Vision, Signal and Image Processing, vol. 143, no. 4, pp. 250-256, 1996.
[17] R. Sion, M. Atallah, and S. Prabhakar, "Rights Protection for Relational Data," Proc. ACM SIGMOD, pp.
98-109, 2003.
[18] L. Sweeney, "Achieving K-Anonymity Privacy Protection Using Generalization and
Suppression," http://en.scientificcommons.org/43196131, 2002.
Index Terms:
Allocation strategies, data leakage, data privacy, fake records, leakage model.
Citation:
Panagiotis Papadimitriou, Hector Garcia-Molina, "Data Leakage Detection," IEEE Transactions on Knowledge
and Data Engineering, vol. 23, no. 1, pp. 51-63, Jan. 2011, doi:10.1109/TKDE.2010.100


Preparing for data protection
While there are plenty of tools on the market for keeping mobile and stationary data from
leaving the company surreptitiously, the best ones use a combination of prevention and
detection methods, such as a detection engine and a data blocker.
However, before doing anything, it's crucial to understand what data types are being protected
and the level of risk. You should create and codify data classification levels for all of your
company's data according to the organization's IT security standards. Data types can be ranked
on a scale from low to high, based on the risk of their loss or exposure.
Some examples of high-risk data might include the following:
• Customer or employee information with names, addresses, Social Security numbers and
other identity-related information

• Customer lists that could be used by a competitor for poaching clients

• Trade secrets and intellectual property

• Confidential engineering and manufacturing plans for products

• Financial information or soon-to-be-released marketing plans for upcoming products
Once you understand what data should be protected and have classified and documented risk
levels, you can begin investigating which tools would best suit your enterprise's needs.
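A minimal sketch of what a codified classification scheme might look like is shown here. The three-step scale, the `Risk` and `DataType` names, and the catalogue entries are all assumptions made for illustration; a real scheme would follow the organization's own IT security standards.

```python
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    """Hypothetical three-step scale; a real policy may define more levels."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class DataType:
    name: str
    risk: Risk


# Illustrative catalogue: the entries and their rankings are assumptions.
CATALOGUE = [
    DataType("public press releases", Risk.LOW),
    DataType("internal meeting notes", Risk.MEDIUM),
    DataType("customer identity records", Risk.HIGH),
    DataType("trade secrets", Risk.HIGH),
]


def needs_protection(catalogue, threshold=Risk.HIGH):
    """Return the names of data types ranked at or above the threshold."""
    return [d.name for d in catalogue if d.risk >= threshold]


print(needs_protection(CATALOGUE))
```

Keeping the catalogue as data rather than prose makes it easy to hand the same list of high-risk types to whichever monitoring tool is eventually chosen.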
Data leakage prevention tools
Data leakage prevention tools can be roughly compared to application-level firewalls.
Like firewalls, they examine the content of outbound data, rather than just ports and packet
types, and ultimately decide what can leave the company. When investigating data leakage
prevention tools, you'll find that the three big players in the market are Vontu Inc., Reconnex
Inc. and Vericept Corp.
• The Vontu 6.0 suite contains a set of tools that can monitor all types of Web traffic,
including SSL, IM and Web mail. It detects malicious outbound traffic with its three
algorithms: Exact Data Matching, Indexed Document Matching and Described Content
Matching. Vontu 6.0 can be finely tuned to target specific groups of employees, locations or
types of content.

• Reconnex's iGuard platform consists of two useful devices. Reconnex's iGuard is a network
appliance that monitors the content of outbound traffic and can also spot malicious activity.
Their other product, Reconnex InSight Console, is a database that makes detection easier by
storing sensitive data info. As with Vontu, the Reconnex platform can be tuned to suit a
company's needs.

• Vericept's 360-degree Visibility and Control is a customizable tool predominantly used for
content monitoring. It uses what it calls its proprietary Intelligent Content Control Engine.
Vericept not only monitors the whole range of Web traffic, like FTP, SSL, IM and P2P, but
also monitors blog postings, chat rooms and Web sites, all places where sensitive company
data and secrets could end up.

Two other vendors that may be useful are PortAuthority Technologies Inc. and GTB Technologies
Inc. Like the other products mentioned above, these companies offer hardware appliances that
monitor outbound IP traffic for specific types of corporate data.
Since these products are network appliances that simply sit behind firewalls, it is important to
ensure they integrate with your existing security infrastructure. Vontu's product, for example,
can be integrated with products from Cisco Systems Inc., IronPort Systems Inc. and Blue Coat
Systems Inc. Reconnex and Vericept products also work with Blue Coat and other Web proxies.
Mobile devices and data leakage
Mobile devices present yet another challenge for data leakage. USB keys, Bluetooth devices or
removable CD drives, for example, can all circumvent network controls without a system
administrator's knowledge. As hardware storage devices, they outdo the sophisticated Internet
and Web-monitoring tools just described.
One tool aimed at this problem, Safend Protector V3.0, can be installed as a client on all the
desktops and laptops in your enterprise. It can be centrally managed via a Web-based interface
and, like the Web monitoring tools, can be tuned to check for certain types of data being moved
through USB, FireWire or wireless ports. The tool is tamper-proof, invisible to users and silent
until something is connected to an external port. Additionally, Safend Protector V3.0 can be
tuned to completely block access to any removable device, restrict certain devices based on
capacity or allow read-only access, and policies can be integrated into the Group Policy Objects
(GPO) of Active Directory to provide access to devices for selected users.
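The kinds of rules described above (block everything, restrict by capacity, allow read-only access, exempt selected users) can be pictured as a small decision function. This is only an illustrative sketch: the `Device` and `Policy` types, their fields, and the order in which rules are checked are assumptions, not the product's actual configuration model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Device:
    kind: str           # e.g. "usb", "firewire", "bluetooth"
    capacity_gb: float
    user: str           # user logged in when the device was connected


@dataclass(frozen=True)
class Policy:
    block_all: bool = False
    max_capacity_gb: Optional[float] = None  # None means no capacity limit
    read_only: bool = True
    full_access_users: frozenset = frozenset()


def decide(device: Device, policy: Policy) -> str:
    """Return "full", "read-only", or "blocked" for a newly connected device."""
    # Assumed precedence: explicitly exempted users bypass every other rule.
    if device.user in policy.full_access_users:
        return "full"
    if policy.block_all:
        return "blocked"
    if policy.max_capacity_gb is not None and device.capacity_gb > policy.max_capacity_gb:
        return "blocked"
    return "read-only" if policy.read_only else "full"


policy = Policy(max_capacity_gb=4.0, full_access_users=frozenset({"alice"}))
print(decide(Device("usb", 32.0, "bob"), policy))    # too large for bob
print(decide(Device("usb", 2.0, "bob"), policy))     # small enough, read-only
print(decide(Device("usb", 32.0, "alice"), policy))  # exempted user
```

In a directory-integrated deployment, the set of exempted users would come from group membership rather than a hard-coded set.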
At first glance, the problem of data leakage prevention seems overwhelming. But with a few
commercially available tools, leakage can be tamed, whether online through the Web or by
storage device.
About the author
Joel Dubin, CISSP, is an independent computer security consultant. He is a Microsoft MVP in
security, specializing in Web and application security, and is the author of The Little Black Book
of Computer Security, available from Amazon.







Data Loss Prevention (DLP) is a computer security term referring to systems that identify, monitor, and
protect data in use (e.g. endpoint actions), data in motion (e.g. network actions), and data at rest (e.g.
data storage) through deep content inspection, contextual security analysis of transactions (attributes of
originator, data object, medium, timing, recipient/destination and so on) and a centralized
management framework. Systems are designed to detect and prevent unauthorized use and transmission
of confidential information. Vendors refer to the term as Data Leak Prevention, Information Leak
Detection and Prevention (ILDP), Information Leak Prevention (ILP), Content Monitoring and
Filtering (CMF), Information Protection and Control (IPC) or Extrusion Prevention System, by
analogy to intrusion-prevention system.
Types of DLP Systems
Network DLP (aka Data in Motion <DiM>)
Typically a software or hardware solution that is installed at network egress points near the perimeter. It
analyzes network traffic to detect sensitive data that is being sent in violation of information security
policies.
Storage DLP (aka Data at Rest <DaR>)
Typically a software solution that is installed in data centers to discover confidential data that is stored in
inappropriate and/or unsecured locations (e.g. an open file share).
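As a rough illustration of the discovery step, the sketch below walks a directory tree and reports files whose text matches a sensitive-data pattern. The pattern (US Social Security numbers written as 123-45-6789), the `scan_tree` helper, and the temporary "share" used in the demo are assumptions for the example; a real storage scanner would also handle binary formats, permissions, and much larger volumes.

```python
import os
import re
import tempfile

# Assumed pattern: US Social Security numbers written as 123-45-6789.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scan_tree(root, pattern=SSN):
    """Walk a directory tree and return relative paths of files matching the pattern."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    if pattern.search(handle.read()):
                        hits.append(os.path.relpath(path, root))
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
    return sorted(hits)


def demo():
    """Build a throwaway 'file share' with one harmless and one sensitive file."""
    with tempfile.TemporaryDirectory() as share:
        with open(os.path.join(share, "notes.txt"), "w", encoding="utf-8") as f:
            f.write("lunch menu for Friday")
        with open(os.path.join(share, "payroll.txt"), "w", encoding="utf-8") as f:
            f.write("J. Doe, SSN 078-05-1120")
        return scan_tree(share)


print(demo())
```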
Endpoint DLP (aka Data in Use <DiU>)
Such systems run on end-user workstations or servers in the organization. Like network-based systems,
endpoint-based systems can address internal as well as external communications, and can therefore be
used to control information flow between groups or types of users (e.g. 'Chinese walls'). They can also
control email and Instant Messaging communications before they are stored in the corporate archive, such
that a blocked communication (i.e., one that was never sent, and therefore not subject to retention rules)
will not be identified in a subsequent legal discovery situation. Endpoint systems have the advantage that
they can monitor and control access to physical devices (such as mobile devices with data storage
capabilities) and in some cases can access information before it has been encrypted. Some endpoint-based
systems can also provide application controls to block attempted transmissions of confidential
information, and provide immediate feedback to the user. They have the disadvantage that they need to
be installed on every workstation in the network, and cannot be used on mobile devices (e.g., cell phones
and PDAs) or where they cannot be practically installed (for example on a workstation in an Internet café).
Data identification
DLP solutions include a number of techniques for identifying confidential or sensitive information.
Sometimes confused with discovery, data identification is a process by which organizations use a DLP
technology to determine what to look for (in motion, at rest, or in use). DLP solutions use multiple
methods for deep content analysis, ranging from keywords, dictionaries, and regular expressions to
partial document matching and fingerprinting. The strength of the analysis engine directly correlates to its
accuracy. The accuracy of DLP identification is important to lowering/avoiding false positives and
negatives. Accuracy can depend on many variables, some of which may be situational or technological.
Testing for accuracy is recommended to ensure a solution has virtually zero false positives/negatives.
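Two of the techniques named above can be sketched briefly. The first pairs a regular expression with a Luhn checksum to cut false positives when looking for payment card numbers; the second approximates partial document matching by hashing overlapping word windows ("shingles") of a protected document and measuring how many reappear in outbound text. The function names, the window size `k`, and the sample strings are assumptions for illustration and do not describe any particular vendor's engine.

```python
import hashlib
import re

# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str):
    """Return digit strings that look like card numbers and pass the Luhn check."""
    found = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):
            found.append(digits)
    return found


def shingles(text: str, k: int = 5):
    """Hash every run of k consecutive words; texts shorter than k words yield nothing."""
    words = re.findall(r"\w+", text.lower())
    return {
        hashlib.sha256(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(max(len(words) - k + 1, 0))
    }


def overlap(protected: str, outbound: str, k: int = 5) -> float:
    """Fraction of the protected document's shingles that reappear in the outbound text."""
    source = shingles(protected, k)
    if not source:
        return 0.0
    return len(source & shingles(outbound, k)) / len(source)


PROTECTED = ("the quarterly forecast projects a twelve percent rise "
             "in revenue for the northern region")
OUTBOUND = "fyi: forecast projects a twelve percent rise in revenue, see you"

print(find_card_numbers("card 4111 1111 1111 1111 on file"))
print(overlap(PROTECTED, OUTBOUND))
```

The checksum step shows why engine quality matters for accuracy: a bare regular expression would flag any 16-digit order number, while the added validation discards most of those false positives.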
Data leakage detection
Sometimes a data distributor gives sensitive data to a set of third parties. Some time later, some of the
data is found in an unauthorized place (e.g., on the web or on a user's laptop). The distributor must then
investigate if data leaked from one or more of the third parties, or if it was independently gathered by
other means.
[1]
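One way to make this investigation concrete is to estimate, for each third party, the probability that it leaked at least one of the objects found. The sketch below is a simplified estimate in the spirit of the guilt model from the "Data Leakage Detection" paper cited at the top of this document; it assumes that each leaked object was obtained independently with probability `p`, that otherwise it came from exactly one of the parties holding it (each equally likely), and that objects are leaked independently. The function name and the sample allocation are hypothetical.

```python
def guilt_probability(agent, allocations, leaked, p):
    """Estimate the probability that `agent` leaked at least one object in `leaked`.

    allocations maps each agent name to the set of objects it was given;
    p is the assumed probability that a leaked object was gathered independently.
    """
    not_guilty = 1.0
    for obj in leaked & allocations[agent]:
        # Number of agents who were given this object.
        holders = sum(1 for objs in allocations.values() if obj in objs)
        # Chance that this particular object did NOT come from `agent`.
        not_guilty *= 1.0 - (1.0 - p) / holders
    return 1.0 - not_guilty


allocations = {
    "A1": {"t1", "t2"},
    "A2": {"t1", "t3"},
}
leaked = {"t1", "t2"}

for name in sorted(allocations):
    print(name, guilt_probability(name, allocations, leaked, p=0.5))
```

In this example A1 holds both leaked objects, one of them exclusively, so its estimate is higher than A2's, which shares its only leaked object with A1.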

Data at Rest
"Data at rest" specifically refers to old archived information that is stored on a client PC hard drive,
on a network storage drive or remote file server, or on a backup system, such as tape
or CD media. This information is of great concern to businesses and government institutions simply
because the longer data is left unused in storage, the more likely it is to be retrieved by unauthorized
individuals outside the network.
