Mining Association Rules With Negative Terms Using Candidate Pruning
Price
Free (open access)
Volume
33
Pages
10
Published
2004
Size
247 kb
Paper DOI
10.2495/DATA040151
Copyright
WIT Press
Author(s)
T. Shintani & D. Hayashi
Abstract
In this paper, we discuss association rules with negative terms, in which negative and affirmative conditions are intermingled, such as “80% of customers who buy A and B but do not buy X, also buy C and D”. Association rules with negative terms can yield higher-confidence rules and therefore more valuable information. To find them, itemsets containing negative conditions must be checked. We propose two candidate pruning methods suited to handling these itemsets: upper bound pruning and database partition pruning. Upper bound pruning detects itemsets that cannot generate rules satisfying the user-specified minimum thresholds. Database partition pruning detects itemsets that do not appear in the database. Through performance evaluations, we show that the proposed methods not only reduce the number of candidate itemsets but also avoid finding frequent itemsets that are useless for rule derivation. Moreover, we show an example of rules obtained by applying the proposed methods to a real dataset: the hospitalization data of the cardiovascular medicine of the University of Tokyo hospital.
1 Introduction
Mining association rules within a large database is a representative problem in data mining. Several effective algorithms have been proposed [1, 2], but only affirmative information has been taken into account. In order to apply association rule mining to more complicated applications, we must consider rules that contain negative conditions. In [3, 4], negative conditions were considered in association rules. [3] introduced a negative association rule X ⇒ Y , such as “60% of customers who buy A and B do not buy D”. Furthermore, the other forms of negative
Keywords
association rule, negative term, candidate pruning, medical data.
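The abstract's claim that a negative term can raise a rule's confidence can be illustrated with a minimal sketch. It is not the authors' algorithm and implements neither upper bound pruning nor database partition pruning; it only computes confidence for a rule whose antecedent mixes affirmative and negative conditions. The toy transaction database and the helper names `matches`, `support_count`, and `confidence` are assumptions made for illustration.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def matches(t: Transaction, positive: FrozenSet[str], negative: FrozenSet[str]) -> bool:
    """True if t contains every positive item and none of the negative items."""
    return positive <= t and not (negative & t)


def support_count(db: List[Transaction],
                  positive: FrozenSet[str],
                  negative: FrozenSet[str] = frozenset()) -> int:
    """Number of transactions satisfying the mixed positive/negative condition."""
    return sum(1 for t in db if matches(t, positive, negative))


def confidence(db: List[Transaction],
               ante_pos: FrozenSet[str],
               ante_neg: FrozenSet[str],
               cons_pos: FrozenSet[str]) -> float:
    """Confidence of the rule (ante_pos and not ante_neg) => cons_pos."""
    denom = support_count(db, ante_pos, ante_neg)
    if denom == 0:
        return 0.0  # the antecedent never occurs, so the rule has no support
    numer = support_count(db, ante_pos | cons_pos, ante_neg)
    return numer / denom


# Hypothetical toy data, not taken from the paper.
TOY_DB: List[Transaction] = [
    frozenset({"A", "B", "C", "D"}),
    frozenset({"A", "B", "C", "D"}),
    frozenset({"A", "B", "X"}),
    frozenset({"A", "B", "X", "C"}),
    frozenset({"A", "B", "C", "D"}),
    frozenset({"B", "X"}),
]

# Affirmative-only rule: {A, B} => {C, D}
plain = confidence(TOY_DB, frozenset({"A", "B"}), frozenset(), frozenset({"C", "D"}))

# Rule with a negative term: {A, B, not X} => {C, D}
with_negation = confidence(TOY_DB, frozenset({"A", "B"}), frozenset({"X"}), frozenset({"C", "D"}))

print(f"confidence without negative term: {plain:.2f}")
print(f"confidence with 'not X':          {with_negation:.2f}")
```

On this toy data, excluding the customers who buy X removes exactly the transactions that break the rule, so the rule with the negative term has higher confidence than the affirmative-only rule. A real miner would also have to enumerate candidate itemsets containing negated items, which is the cost the paper's pruning methods are meant to reduce.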