An Efficient Bayesian Network Approach For Discovering Interesting Patterns
Price
Free (open access)
Volume
37
Pages
11
Published
2006
Size
437 kb
Paper DOI
10.2495/DATA060111
Copyright
WIT Press
Author(s)
R. Malhas & Z. Al Aghbari
Abstract
The main problem faced by all association rule/pattern mining algorithms is that they produce a large number of rules, which creates a secondary mining problem: mining the interesting association rules/patterns. The problem is compounded by the fact that discovered rules reflecting ‘common knowledge’ are not interesting, yet they are usually strong rules with high support and confidence levels, the classical measures. In this paper, we present an efficient algorithm for discovering interesting (unexpected) patterns based on background knowledge represented by a Bayesian network. A pattern/rule is unexpected if it is ‘surprising’ to the user. The algorithm profiles a pattern as interesting (unexpected) if the absolute difference between its support estimated from the dataset and its support estimated from the Bayesian network exceeds a user-specified threshold (ε). Itemsets with the most divergent supports are considered the most interesting. The efficiency of the Java implementation of the algorithm is verified experimentally.

1 Introduction

Since the inception of the classical Apriori algorithm [1] for mining association rules, the development of interestingness measures has been an active area of research, aiming to mine interesting patterns out of a sheer volume of obvious and irrelevant rules. The problem is compounded because obvious ‘common knowledge’ rules are not interesting, yet they are usually strong rules with high support and confidence levels, the classical measures in [1]. In this paper, we present an efficient algorithm that discovers interesting/unexpected patterns based on background knowledge, represented by
Keywords
interesting patterns, association rules, frequent itemsets, Bayesian network, background knowledge.
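
The interestingness criterion stated in the abstract (an itemset is flagged when the absolute difference between its dataset support and its Bayesian-network support exceeds ε) can be illustrated with a minimal Python sketch. This is not the paper's algorithm or its Java implementation: the toy network, the item names, and every function name below are hypothetical, all variables are assumed binary, and the network support is computed by brute-force enumeration of the joint distribution, which is exponential in the number of nodes and only suitable for illustration.

```python
from itertools import product

# Hypothetical background knowledge: a toy network rain -> umbrella.
# Each node maps to (parents, cpt); cpt gives P(node = 1 | parent values).
NETWORK = {
    "rain": ((), {(): 0.25}),
    "umbrella": (("rain",), {(0,): 0.25, (1,): 0.75}),
}

def joint_probability(network, assignment):
    """Probability of one full 0/1 assignment via the chain rule."""
    p = 1.0
    for node, (parents, cpt) in network.items():
        p_one = cpt[tuple(assignment[q] for q in parents)]
        p *= p_one if assignment[node] == 1 else 1.0 - p_one
    return p

def network_support(network, itemset):
    """P(all items in itemset = 1), by summing over the full joint (brute force)."""
    nodes = list(network)
    total = 0.0
    for values in product((0, 1), repeat=len(nodes)):
        assignment = dict(zip(nodes, values))
        if all(assignment[item] == 1 for item in itemset):
            total += joint_probability(network, assignment)
    return total

def data_support(transactions, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if set(itemset) <= t)
    return hits / len(transactions)

def interesting_itemsets(network, transactions, candidates, epsilon):
    """Keep itemsets whose support gap exceeds epsilon, largest gap first."""
    scored = []
    for itemset in candidates:
        gap = abs(data_support(transactions, itemset)
                  - network_support(network, itemset))
        if gap > epsilon:
            scored.append((itemset, gap))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up dataset of 8 transactions, purely for illustration.
TRANSACTIONS = [{"rain", "umbrella"}] * 2 + [{"umbrella"}] * 4 + [set()] * 2
CANDIDATES = [("rain",), ("umbrella",), ("rain", "umbrella")]
```

Calling `interesting_itemsets(NETWORK, TRANSACTIONS, CANDIDATES, epsilon=0.1)` ranks only those candidates whose observed support diverges from what the background network predicts. In this made-up data, umbrellas appear far more often than the network expects, so `("umbrella",)` is the itemset that stands out, while `("rain",)` matches the network exactly and is filtered out.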