Abstract
Scatter-matrix-based class separability is a simple and efficient
feature selection criterion in the literature. However, the conventional
trace-based formulation does not take feature redundancy into account and
is prone to selecting a set of discriminative but mutually redundant features.
In this brief, we first prove theoretically that, under this trace-based
criterion, the presence of sufficiently correlated features always prevents
the optimal feature set from being selected. On top of this criterion,
we then propose redundancy-constrained feature selection (RCFS). To
ensure the algorithm’s efficiency and scalability, we study the conditions
on the redundancy constraints under which the resulting constrained 0–1 optimization can
be solved efficiently and globally. By using the totally unimodular (TUM)
concept in integer programming, a necessary condition for such constraints
is derived. This condition reveals an interesting special case in which qualified
redundancy constraints can be conveniently generated via a clustering
of features. We study this special case and develop an efficient feature selection
approach based on Dinkelbach’s algorithm. Experiments on benchmark
data sets demonstrate that our approach outperforms its counterparts
without redundancy constraints.
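For a concrete picture of the criterion the brief starts from, the following Python sketch scores a candidate feature subset by a ratio of scatter-matrix traces, tr(S_b)/tr(S_w). The function name, the ratio form, and the subset argument are illustrative assumptions; this is not the RCFS method itself, which additionally imposes redundancy constraints and solves the resulting 0–1 program with Dinkelbach's algorithm.

```python
import numpy as np

def trace_separability(X, y, subset):
    """Trace-based class separability of a candidate feature subset.

    Illustrative sketch only: scores the subset by trace(S_b) / trace(S_w),
    one common scatter-matrix-based separability measure. X is (n_samples,
    n_features), y holds class labels, subset is a list of feature indices.
    """
    Xs = X[:, subset]                       # restrict to the candidate features
    mean_all = Xs.mean(axis=0)
    d = Xs.shape[1]
    Sw = np.zeros((d, d))                   # within-class scatter
    Sb = np.zeros((d, d))                   # between-class scatter
    for c in np.unique(y):
        Xc = Xs[y == c]
        mean_c = Xc.mean(axis=0)
        Sw += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - mean_all).reshape(-1, 1)
        Sb += Xc.shape[0] * (diff @ diff.T)
    return np.trace(Sb) / np.trace(Sw)
```

As the brief argues, maximizing such a score alone can favor a set of strongly correlated (hence redundant) features, which is precisely what the proposed redundancy constraints are designed to rule out.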