Association Rule Mining of Relational Data

As shown, an instance for a pattern with no unipartite edges can be arbitrarily extended to instances of a larger pattern. In order to solve this problem, a pattern constraint must be introduced that requires valid patterns to have at least one unipartite edge connected to each entity node.

Related Research Areas

A related area of research is graph-based pattern mining. Traditional graph-based pattern mining does not produce association rules but rather focuses on the task of frequent subgraph discovery. Most graph-based methods consider a single label or attribute per node. When there are multiple attributes, the data are either modeled with zero or one label per node or as a bipartite graph. One graph-based task addresses multiple graph transactions where the data are a set of graphs (Inokuchi et al., 2000; Yan & Han, 2002; Kuramochi & Karypis, 2004; Hasan et al., 2007). Since each record or transaction is a graph, a subgraph pattern is counted once for each graph in which it exists at least once. In that sense transactional methods are not much different from single-table item set methods.

Single graph settings differ from transactional settings since they contain only one input graph rather than a set of graphs (Kuramochi & Karypis, 2005; Vanetik, 2006; Chen et al., 2007). They cannot use simple existence of a subgraph as the aggregation function; otherwise the pattern supports would be either one or zero. If all examples were counted without aggregation then the problem would no longer satisfy downward closure. Instead, only those instances are counted as discussed in the previous section.

In relational pattern mining multiple items or attributes are associated with each node and the main challenge is to achieve scaling with respect to the number of items per node. Scaling to large subgraphs is usually less relevant due to the “small world” property of many types of graphs. For most networks of practical interest any node can be reached from almost any other by means of no more than some small number of edges (Barabasi & Bonabeau, 2003). Association rules that involve longer distances are therefore unlikely to produce meaningful results.

There are other areas of research on ARM in which related transactions are mined in some combined fashion. Sequential pattern or episode mining (Agrawal & Srikant, 1995; Yan, Han, & Afshar, 2003) and inter-transaction mining (Tung et al., 1999) are two main categories. Generally the interest in association rule mining is moving beyond the single-table setting to incorporate the complex requirements of real-world data.

FUTURE TRENDS

The consensus in the data mining community on the importance of relational data mining was recently paraphrased by Dietterich (2003) as “I.i.d. learning is dead. Long live relational learning”. The statistics, machine learning, and ultimately data mining communities have invested decades into sound theories based on a single table. It is now time to afford as much rigor to relational data. When taking this step it is important not only to specify generalizations of existing algorithms but also to identify novel questions that may be asked that are specific to the relational setting. It is, furthermore, important to identify challenges that only occur in the relational setting, including skewing due to traversal of the relational link structure and correlations that are frequent in relational neighbors.

CONCLUSION

Association rule mining of relational data is a powerful frequent pattern mining technique that is useful for several data structures including graphs. Two main approaches are distinguished. Inductive logic programming provides a high degree of flexibility, while mining of joined relations is a fast technique that allows the study of problems related to skewed or uninteresting results. The potential computational complexity of relational algorithms and specific properties of relational data make its mining an important current research topic. Association rule mining takes a special role in this process, being one of the most important frequent pattern algorithms.

REFERENCES

Agrawal, R. & Srikant, R. (1994). Fast Algorithms for Mining Association Rules in Large Databases. In Proceedings of the 20th International Conference on Very Large Data Bases, San Francisco, CA, 487-499.


