Ohm

ABSTRACT:
In this work, we introduce the notions of sufficient set and necessary set for
distributed processing of probabilistic top-k queries in cluster-based wireless
sensor networks. These two concepts allow simple and accurate localized data
pruning within clusters. Accordingly, we develop a suite of algorithms, namely,
sufficient set-based (SSB), necessary set-based (NSB), and boundary-based (BB),
for intercluster query processing with a bounded number of communication rounds.
Additionally, in response to dynamic changes of the data distribution in the
network, we develop an adaptive algorithm that dynamically switches among the
three proposed algorithms to reduce the transmission cost. We show the
applicability of sufficient set and necessary set to wireless sensor networks
with both two-tier hierarchical and tree-structured network topologies.
Experimental results indicate that the proposed algorithms reduce data
transmissions significantly and incur only a small, constant number of
communication rounds. The experimental results also demonstrate the superiority
of the adaptive algorithm, which attains near-optimal performance under various
conditions.
Consolidated Report:

Sl. No   Date                        Work Done
1        10-02-2014 To 17-02-2014    Introduction of data mining
2        18-02-2014 To 28-02-2014    Studied concepts of programme
3        01-03-2014 To 05-03-2014    Analyzed existing methods
4        06-03-2014 To 16-03-2014    Literature Survey
5        17-03-2014 To 30-03-2014    System Architecture (customized mobile)

Introduction of Data Mining:

In this period, I concentrated on the basic concepts of data mining that can be
applied to the project. In this project, mining is done mainly to extract
information from the data and transform it into an understandable form for
future use. We also use classification and clustering to mine the click-through
data.
Data Mining:
Data mining is the process of finding useful knowledge in a collection of data,
that is, the extraction of hidden predictive information from large databases.
More precisely, it is the extraction of interesting (non-trivial, implicit,
previously unknown, and potentially useful) information or patterns from data in
large databases. Knowledge Discovery in Databases (KDD) is the non-trivial
process of identifying valid, previously unknown, and potentially useful
patterns in data, and data mining plays an essential role in this process.
Analyzed existing methods:

In this period, I analyzed the disadvantages of the existing system and how the
proposed system can overcome them using new methods.
Literature Survey:

In this period, I studied various reference papers that will be helpful to the
project development. The reference papers are:
A cross pruning framework for top-k data collection in wireless sensor networks
Semantics of ranking queries for probabilistic data and expected ranks
A unified approach to ranking in probabilistic databases
Ranking with uncertain scores
LITERATURE SURVEY:
Title: A cross pruning framework for top-k data collection in wireless
sensor networks
Authors: X. Liu, J. Xu, and W.-C. Lee
Year: 2010
Description :
Energy conservation is a significant problem for algorithm design in wireless
sensor networks. In this work, the authors explore in-network aggregation
techniques for answering top-k queries in wireless sensor networks. A top-k
query retrieves the k data items with the highest scores, as evaluated by a
scoring function on the features of interest in the sensor readings. Their study
shows that existing techniques for processing top-k queries, e.g., Tiny
Aggregation Service (TAG), are not energy efficient due to deficiencies in their
routing structures and data aggregation mechanisms. To address these
deficiencies, they propose a new cross pruning (XP) aggregation framework for
top-k data collection in wireless sensor networks. The XP framework incorporates
several novel ideas to facilitate efficient in-network aggregation and
filtering, including 1) building a cluster-tree routing structure to aggregate
more items locally; 2) adopting a broadcast-then-filter approach for efficiently
suppressing redundant data transmissions; and 3) providing a cross pruning
technique to enhance in-network filtering effectiveness. An extensive set of
simulation-based experiments has been conducted to evaluate the performance of
TAG and the proposed XP framework. The experimental results validate the
proposals and show that XP significantly outperforms TAG in energy cost.
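
To make the top-k definition above concrete, here is a minimal centralized
sketch in Python. It only illustrates what a top-k query with a scoring
function returns; it is not the in-network XP framework. The sensor readings
and the scoring function fire_risk are hypothetical and chosen purely for
illustration.

```python
import heapq

def top_k(readings, score, k):
    """Return the k readings with the highest scores, best first."""
    return heapq.nlargest(k, readings, key=score)

# Hypothetical sensor readings: (sensor_id, temperature, humidity).
readings = [
    ("s1", 31.0, 40.0),
    ("s2", 35.5, 20.0),
    ("s3", 28.0, 65.0),
    ("s4", 33.0, 30.0),
]

def fire_risk(reading):
    """Hypothetical scoring function: hotter and drier scores higher."""
    _, temperature, humidity = reading
    return temperature - 0.1 * humidity

top2_ids = [r[0] for r in top_k(readings, fire_risk, 2)]
print(top2_ids)
```

An in-network scheme aims to return the same answer while transmitting as few
readings as possible toward the base station.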
Title: Semantics of Ranking Queries for Probabilistic Data and Expected Ranks
Authors: G. Cormode, F. Li, and K. Yi
Year: 2009
Description :
When dealing with large volumes of information, top-k queries are a powerful
method for returning only the k most important tuples for inspection, based on a
scoring function. The problem of answering such ranking queries efficiently has
been studied and evaluated extensively in traditional database settings. The
significance of top-k is perhaps even greater in probabilistic databases, where
a relation can represent a very large number of possible worlds. There have been
several recent attempts to propose definitions and algorithms for ranking
queries over probabilistic data. Specifically, the authors define a number of
important properties, including exact-k, containment, unique-rank,
value-invariance, and stability, which are all satisfied by ranking queries on
certain data. They argue that all of these conditions should also be satisfied
by any reasonable definition for ranking uncertain data. This work proposes an
intuitive new approach, the expected rank. It uses the well-founded notion of
the expected rank of each tuple across all possible worlds. We are able to prove
that, in contrast to all
existing approaches, the expected rank satisfies all the required properties for a
ranking query. We provide efficient solutions to compute this ranking across the
major models of uncertain data, such as attribute-level and tuple-level uncertainty.
For an uncertain relation of N tuples, the processing cost is O(N log N), no worse
than simply sorting the relation. In settings where there is a high cost for
generating each tuple in turn, we provide pruning techniques based on probabilistic
tail bounds that can terminate the search early and guarantee that the top-k has
been found. Finally, a comprehensive experimental study confirms the
effectiveness of our approach.
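
As an illustration of the expected-rank idea, the following Python sketch
computes expected ranks under attribute-level uncertainty, assuming each tuple
has an independent discrete score distribution. It relies on linearity of
expectation: the expected rank of a tuple equals the sum, over every other
tuple, of the probability that the other tuple outscores it. This is a
brute-force quadratic sketch, not the O(N log N) algorithm from the paper; the
relation and all function names are hypothetical, and ties are counted as not
outscoring.

```python
def prob_greater(dist_a, dist_b):
    """Pr[A > B] for independent discrete score distributions {score: prob}."""
    return sum(pa * pb
               for a, pa in dist_a.items()
               for b, pb in dist_b.items()
               if a > b)

def expected_ranks(relation):
    """Expected number of tuples that outscore each tuple (lower is better)."""
    return {
        name: sum(prob_greater(other, dist)
                  for other_name, other in relation.items()
                  if other_name != name)
        for name, dist in relation.items()
    }

def top_k_by_expected_rank(relation, k):
    """Names of the k tuples with the smallest expected rank."""
    ranks = expected_ranks(relation)
    return sorted(ranks, key=ranks.get)[:k]

# Hypothetical uncertain relation: tuple name -> {score: probability}.
relation = {
    "t1": {100: 0.4, 70: 0.6},
    "t2": {92: 0.6, 80: 0.4},
    "t3": {85: 1.0},
}

ranks = expected_ranks(relation)
print(ranks)
print(top_k_by_expected_rank(relation, 2))
```

In this example t1 can reach the highest score, yet it has the worst expected
rank because it is more likely to take its low score.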
Title: A Unified Approach to Ranking in Probabilistic Databases
Authors: J. Li, B. Saha, and A. Deshpande
Year: 2009
Description :
The dramatic growth in the number of application domains that naturally
generate probabilistic, uncertain data has resulted in a need for efficiently
supporting complex querying and decision-making over such data. In this paper,
we present a unified approach to ranking and top-k query processing in
probabilistic databases by viewing it as a multi-criteria optimization problem, and
by deriving a set of features that capture the key properties of a probabilistic
dataset that dictate the ranked result. We contend that a single, specific ranking
function may not suffice for probabilistic databases, and we instead propose two
parameterized ranking functions, called PRF-w and PRF-e, that generalize or can
approximate many of the previously proposed ranking functions. We present novel
generating functions-based algorithms for efficiently ranking large datasets
according to these ranking functions, even if the datasets exhibit complex
correlations modeled using probabilistic and/xor trees or Markov networks. We
further propose that the parameters of the ranking function be learned from user
preferences, and we develop an approach to learn those parameters. Finally, we
present a comprehensive experimental study that illustrates the effectiveness of our
parameterized ranking functions, especially PRF-e, at approximating other ranking
functions and the scalability of our proposed algorithms for exact or approximate
ranking.
Title: Ranking with uncertain scores
Authors: M.A. Soliman and I.F. Ilyas
Year: 2009
Description :
Large databases with uncertain data are becoming common in various applications,
including data integration, location tracking, and Web search. In these
applications, ranking records with uncertain attributes raises new problems that
differ from conventional ranking. Specifically, uncertainty in the records'
scores induces a partial order over the records, as opposed to the total order
that is assumed in conventional ranking settings. In this work, the authors
present a new probabilistic model, based on partial orders, to encapsulate the
space of possible rankings originating from score uncertainty. Under this model,
they formulate several types of ranking queries with different semantics. They
describe and analyze a set of efficient query evaluation algorithms and show
that their techniques can be used to solve the problem of rank aggregation in
partial orders. In addition, they design novel sampling techniques to compute
approximate query answers. Their experimental evaluation uses both real and
synthetic data. The experimental study validates the efficiency and
effectiveness of the techniques in different settings.
Existing System:
Wireless sensor network technology has had a major impact on a wide array of
applications in a variety of fields, including military, science, industry,
commerce, transport, and health-care. However, the quality of sensors varies
significantly in terms of their sensing precision, accuracy, tolerance to
hardware/external noise, and so on. For example, studies show that the
distribution of noise varies widely across different photovoltaic sensors, that
the precision and accuracy of readings usually vary extensively in humidity
sensors, and that the errors in GPS devices can be up to several meters.
Probabilistic top-k queries, however, have mostly been studied under a
centralized system setting. In this paper, we explore the problem of processing
probabilistic top-k queries in distributed wireless sensor networks. Here, we
first use an environmental monitoring application of a wireless sensor network
to introduce some basics of probabilistic databases. Due to sensing imprecision
and environmental interference, the sensor readings are usually noisy. Thus,
multiple sensors are deployed in certain zones in order to improve monitoring
quality. In this network, sensor nodes are grouped into clusters; within each
cluster, one sensor is selected as the cluster head to perform localized data
processing. A boundary-based algorithm is applied to the data processing.
DISADVANTAGE OF EXISTING SYSTEM:
Probabilistic top-k queries have mostly been studied under a centralized
setting, not in distributed wireless sensor networks.
The wind station operates very slowly.
The data is not accurately cleaned.
Communication from one station to another is delayed.
PROPOSED SYSTEM:
Three algorithms are proposed to reduce the transmission cost. We illustrate the
applicability of sufficient set and necessary set to wireless sensor networks
with both two-tier hierarchical and tree-structured network topologies. Several
top-k query semantics and solutions have been proposed recently, including
U-Topk, U-kRanks, PT-Topk, PK-Topk, expected rank, and so on. A common way to
process probabilistic top-k queries is to first sort all tuples based on the
scoring attribute, and then process the tuples in sorted order to compute the
final answer set. Nevertheless, while focusing on optimizing the transmission
bandwidth, the existing techniques require numerous iterations of computation
and communication, introducing tremendous communication overhead and resulting
in long latency. As argued in prior work, this is not desirable for many
distributed applications, e.g., network monitoring, that require queries to be
answered with a good response time and minimal energy consumption. In this
paper, we aim at developing energy-efficient algorithms optimized for a fixed
number of communication rounds.
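
The sort-then-process strategy described above can be sketched as follows for
one threshold-based semantics, in the spirit of PT-Topk. This is a minimal
centralized sketch under stated assumptions, not the SSB, NSB, or BB algorithm:
each tuple is assumed to exist independently with a given probability, scores
are assumed distinct, k is at least 1, and all names and data are hypothetical.
A tuple is in the top-k exactly when it exists and fewer than k higher-scored
tuples exist, so one pass over the sorted tuples is enough.

```python
def top_k_probabilities(tuples, k):
    """Map each tuple name to Pr[tuple is among the k highest-scored existing tuples].

    tuples: list of (name, score, existence_probability), assumed independent.
    """
    ordered = sorted(tuples, key=lambda t: t[1], reverse=True)
    # count_dist[j] = Pr[exactly j of the tuples processed so far exist], j < k.
    count_dist = [1.0] + [0.0] * (k - 1)
    result = {}
    for name, _score, p in ordered:
        # In the top-k iff it exists and fewer than k better tuples exist.
        result[name] = p * sum(count_dist)
        # Fold this tuple into the distribution of existing-tuple counts.
        updated = [0.0] * k
        for j, q in enumerate(count_dist):
            updated[j] += q * (1 - p)
            if j + 1 < k:
                updated[j + 1] += q * p
        count_dist = updated
    return result

def pt_topk(tuples, k, threshold):
    """Names of tuples whose top-k probability is at least the threshold."""
    probs = top_k_probabilities(tuples, k)
    return [name for name, pr in probs.items() if pr >= threshold]

# Hypothetical uncertain readings: (name, score, existence probability).
data = [("a", 90, 0.5), ("b", 80, 0.8), ("c", 70, 1.0)]
probs = top_k_probabilities(data, 2)
answer = pt_topk(data, 2, 0.55)
print(probs)
print(answer)
```

Here tuple c misses the top-2 only when both a and b exist, which happens with
probability 0.5 x 0.8 = 0.4, so its top-2 probability is 0.6.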
ADVANTAGE OF PROPOSED SYSTEM:

NSB and BB take advantage of the skewed necessary sets and necessary boundaries
among local clusters to obtain their global boundaries, respectively, which are
very effective for intercluster pruning.
The transmission cost increases for all algorithms because the number of
tuples needed for query processing is increased.
REFERENCES:
X. Liu, J. Xu, and W.-C. Lee, "A Cross Pruning Framework for Top-k Data
Collection in Wireless Sensor Networks," Proc. 11th Int'l Conf. Mobile Data
Management, pp. 157-166, 2010.
M.A. Soliman and I.F. Ilyas, "Ranking with Uncertain Scores," Proc. IEEE Int'l
Conf. Data Eng. (ICDE '09), 2009.
G. Cormode, F. Li, and K. Yi, "Semantics of Ranking Queries for Probabilistic
Data and Expected Ranks," Proc. IEEE Int'l Conf. Data Eng. (ICDE '09), 2009.
J. Li, B. Saha, and A. Deshpande, "A Unified Approach to Ranking in
Probabilistic Databases," Proc. Int'l Conf. Very Large Data Bases (VLDB),
vol. 2, no. 1, pp. 502-513, 2009.
Plan for next Report:

Sl. No   Date                        Work To Be Done
1        06-04-2014 To 16-04-2014    Implementing Existing Methods
2        17-04-2014 To 21-04-2014    Result Analysis
3        21-04-2014 To 06-05-2014    Draft of the Phase-1
4        07-05-2014 To 12-05-2014    Plan for Phase-II

Implementing Existing Methods: Here, I am going to implement the existing
methods to find the drawbacks of the existing system.
Result Analysis: Here, I am going to check the results of the existing methods.
Draft of the Phase-1: Here, I am going to prepare the thesis report of Phase-1.
Plan for Phase-II: Here, I am going to make a plan for Phase-II.
