Abstract
Spectral feature selection identifies relevant features by
measuring their capability of preserving sample similarity.
It provides a powerful framework for both supervised
and unsupervised feature selection, and has
proven effective in many real-world applications.
One common drawback associated with most existing
spectral feature selection algorithms is that they evaluate
features individually and cannot identify redundant
features. Since redundant features can have a significant
adverse effect on learning performance, it is necessary
to address this limitation for spectral feature selection.
To this end, we propose a novel spectral feature selection
algorithm to handle feature redundancy, adopting
an embedded model. The algorithm is derived from
a formulation based on a sparse multi-output regression
with an $\ell_{2,1}$-norm constraint. We conduct a theoretical
analysis of the properties of its optimal solutions,
paving the way for designing an efficient path-following
solver. Extensive experiments show that the proposed
algorithm performs well at both selecting relevant features
and removing redundancy.
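
To illustrate why a row-wise norm constraint suits feature selection, the following minimal sketch computes the standard $\ell_{2,1}$ norm of a weight matrix (the sum of the Euclidean norms of its rows) and reads off the features whose rows are nonzero. It is not the proposed solver. The matrix `W`, the helper names `l21_norm` and `selected_features`, and the tolerance `tol` are assumptions introduced only for this example.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def selected_features(W, tol=1e-8):
    """Indices of features whose weight row is not (numerically) zero."""
    return np.flatnonzero(np.linalg.norm(W, axis=1) > tol)

# Hypothetical weights: 4 features (rows) by 2 regression outputs (columns).
# Rows 1 and 3 are entirely zero, so those features are dropped for all outputs.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.0, 2.0],
              [0.0, 0.0]])

norm_value = l21_norm(W)       # row norms are 5, 0, 2, 0
kept = selected_features(W)    # features with a nonzero row
```

Because the norm penalizes whole rows, shrinking it tends to zero out a feature across every output at once, which is what lets a regression of this kind act as a feature selector.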