A regression-based approach for threshold selections of support and confidence in association rule mining

Journal Article


Abstract


  • Association rule mining is one of the most important topics in data mining and knowledge discovery because it can extract interesting relationships between items in a dataset. Generally, the number of association rules in a particular dataset depends primarily on the measures of ‘support’ and ‘confidence’. To obtain satisfactory rules, a mining procedure usually needs to be run many times with different minimum thresholds of ‘support’ and ‘confidence’. The thresholds are usually chosen by experience or by trial and error, so threshold selection is a time-consuming task, especially when users work on large-scale datasets. The majority of the existing literature focuses on improving the computational efficiency of the mining algorithms, but little work has been done to address the inefficiency of threshold selection. To solve this problem, this paper proposes a regression-based approach to improve the efficiency of threshold selection for ‘support’ and ‘confidence’ in association rule mining. The proposed approach employs non-linear regression analysis to detect potential relationships between ‘support’, ‘confidence’ and the number of association rules in a large dataset. The approach can also be used in broad domains with different types of datasets. A case study is given to demonstrate the efficiency of the proposed approach on a real-world dataset.
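
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the regression form, so the sketch assumes an exponential model, N ≈ exp(b0 + b1·s + b2·c) − 1, fitted by least squares on log(1 + N). The toy transactions, the brute-force rule counter (standing in for a real miner such as Apriori) and all function names are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical toy transactions (not taken from the paper).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def count_rules(transactions, min_sup, min_conf, max_size=3):
    """Brute-force count of rules X -> Y that meet both thresholds."""
    items = sorted(set().union(*transactions))
    n_rules = 0
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            sup_whole = support(frozenset(combo), transactions)
            if sup_whole == 0 or sup_whole < min_sup:
                continue
            # Every non-empty proper subset of the itemset is a candidate antecedent.
            for k in range(1, size):
                for lhs in combinations(combo, k):
                    conf = sup_whole / support(frozenset(lhs), transactions)
                    if conf >= min_conf:
                        n_rules += 1
    return n_rules

def build_samples(transactions, supports, confs):
    """Run the miner on a small grid of thresholds: (s, c, rule count)."""
    return [(s, c, count_rules(transactions, s, c)) for s in supports for c in confs]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_log_linear(samples):
    """Least-squares fit of log(1 + N) = b0 + b1*s + b2*c (assumed model form)."""
    rows = [(1.0, s, c) for s, c, _ in samples]
    ys = [math.log1p(n) for _, _, n in samples]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(xtx, xty)

def predict(beta, s, c):
    """Estimated rule count at thresholds (s, c) without re-running the miner."""
    return math.expm1(beta[0] + beta[1] * s + beta[2] * c)

if __name__ == "__main__":
    samples = build_samples(TRANSACTIONS, [0.2, 0.4, 0.6, 0.8], [0.5, 0.7, 0.9])
    beta = fit_log_linear(samples)
    print("coefficients:", beta)
    print("estimated rules at s=0.5, c=0.8:", predict(beta, 0.5, 0.8))
```

Once fitted on a handful of mining runs, such a model can be queried for untested threshold pairs, which is the kind of saving over repeated trial-and-error mining that the abstract describes. On real data the model form would need to be chosen and validated against the observed rule counts.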

Publication Date


  • 2015

Citation


  • Le, D. T., Zhang, M., Ren, F., & Luo, X. (2015). A regression-based approach for threshold selections of support and confidence in association rule mining. International Journal of Computers and their Applications, 22(2), 59-74.

Scopus Eid


  • 2-s2.0-85096962392

Start Page


  • 59

End Page


  • 74

Volume


  • 22

Issue


  • 2
