Using cost-sensitive learning and feature selection algorithms to improve the performance of imbalanced classification

Journal Article


Abstract


  • The imbalanced data problem is widespread in network intrusion detection, spam filtering, biomedical engineering, finance, and science, and it poses a challenge in many real-life data-intensive applications. Classifier bias occurs when traditional classification algorithms are applied to imbalanced data. The General Vector Machine (GVM) algorithm is known to have good generalization ability, but it does not work well for imbalanced classification. The state-of-the-art Binary Ant Lion Optimizer (BALO) algorithm, in turn, has high exploitability and a fast convergence rate. Based on these facts, this paper proposes a Cost-sensitive Feature selection General Vector Machine (CFGVM) algorithm, built on the GVM and BALO algorithms, that tackles the imbalanced classification problem by assigning different cost weights to different classes of samples. In this method, the BALO algorithm determines the cost weights and extracts the more significant features to improve classification performance. Experiments conducted on eleven imbalanced data sets show that the CFGVM algorithm significantly improves classification performance on minority class samples. Compared with similar algorithms and state-of-the-art algorithms, the proposed algorithm achieves significantly better classification results.
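
The cost-sensitive idea in the abstract can be illustrated with a minimal Python sketch. It is not the CFGVM implementation: the function names, the 0/1 labels, and the cost values are assumptions chosen for illustration. It shows how a binary feature mask (the kind of solution a binary optimizer such as BALO searches over) selects columns, and how per-class cost weights change the error assigned to a classifier that ignores the minority class.

```python
from typing import List, Sequence


def apply_mask(rows: Sequence[Sequence[float]], mask: Sequence[int]) -> List[List[float]]:
    """Keep only the feature columns whose mask bit is 1 (hypothetical helper)."""
    return [[value for value, keep in zip(row, mask) if keep] for row in rows]


def cost_weighted_error(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    cost_minority: float,
    cost_majority: float,
    minority_label: int = 1,
) -> float:
    """Total misclassification cost divided by the largest possible cost.

    Each sample carries the cost weight of its true class, so an error on a
    minority sample can count for more than an error on a majority sample.
    """
    incurred = 0.0
    worst_case = 0.0
    for truth, guess in zip(y_true, y_pred):
        cost = cost_minority if truth == minority_label else cost_majority
        worst_case += cost
        if truth != guess:
            incurred += cost
    return incurred / worst_case if worst_case else 0.0


# Toy data (assumed): one minority sample (label 1) among ten samples.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 10

plain_error = cost_weighted_error(y_true, y_pred, cost_minority=1.0, cost_majority=1.0)
weighted_error = cost_weighted_error(y_true, y_pred, cost_minority=9.0, cost_majority=1.0)

# A binary mask that keeps the first and third of three features.
selected = apply_mask([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], mask=[1, 0, 1])
```

With equal costs, predicting the majority class for every sample scores an error of 0.1 on this toy data. Weighting the minority class nine times more heavily raises the score to 0.5, so an optimizer minimizing it is pushed away from that trivial solution.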

UOW Authors


  •   Feng, Fang (external author)
  •   Li, Kuan-Ching (external author)
  •   Shen, Jun
  •   Zhou, Qingguo (external author)
  •   Yang, Xuhui (external author)

Publication Date


  • 2020

Citation


  • Feng, F., Li, K., Shen, J., Zhou, Q. & Yang, X. (2020). Using cost-sensitive learning and feature selection algorithms to improve the performance of imbalanced classification. IEEE Access, 8 (1), 69979-69996.

Scopus Eid


  • 2-s2.0-85083886309

Number Of Pages


  • 17

Start Page


  • 69979

End Page


  • 69996

Volume


  • 8

Issue


  • 1

Place Of Publication


  • United States
