## Dimensionality Reduction via Sparse Support Vector Machines

Jinbo Bi (1), Kristin P. Bennett (1), Mark Embrechts (2), Curt Breneman (3)

(1) Department of Mathematical Sciences

(2) Department of Decision Science and
Engineering Systems

(3) Department of Chemistry

Rensselaer Polytechnic Institute

110 8th Street

Troy, NY 12180

bij2@rpi.edu, bennek@rpi.edu, embrem@rpi.edu, brenec@rpi.edu

We describe a methodology for performing
variable selection and ranking using support vector machines (SVMs). The
basic idea of the method is simple: construct a series of sparse linear
SVMs that exhibit good generalization, take the subset of variables
having nonzero weights in those linear models, and then use this subset of variables
in a nonlinear SVM to produce the final regression or classification function.
The method exploits the fact that a linear SVM with 1-norm regularization
(no kernels) inherently performs variable selection as a side-effect of
minimizing capacity in the SVM model. In a linear 1-norm SVM, the optimal
weight vector will have relatively few nonzero weights, with the degree
of sparsity depending on the SVM model parameters. The variables with nonzero
weights then become candidate attributes for the nonlinear SVM.
In some sense, we trade the variable selection problem for the model parameter
selection problem in SVM.
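The two-stage procedure can be sketched as follows. This is a minimal illustration using scikit-learn, which is an assumption of ours rather than the authors' implementation; the specific classes (`LinearSVC`, `SVC`), the synthetic data, and the parameter values (e.g. `C=0.1`) are chosen only to demonstrate the idea that the 1-norm penalty drives some weights exactly to zero.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC, SVC

# Synthetic data: 100 samples, 20 variables, only 5 of them informative.
X, y = make_classification(n_samples=100, n_features=20,
                           n_informative=5, random_state=0)

# Stage 1: sparse linear SVM with 1-norm (L1) regularization.
# The model parameter C controls the degree of sparsity.
sparse_svm = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=10000)
sparse_svm.fit(X, y)

# Keep only the variables with nonzero weights.
selected = np.flatnonzero(np.abs(sparse_svm.coef_.ravel()) > 1e-6)
print(f"Selected {len(selected)} of {X.shape[1]} variables:", selected)

# Stage 2: nonlinear (kernel) SVM trained on the reduced variable set.
final_model = SVC(kernel="rbf").fit(X[:, selected], y)
print("Training accuracy:", final_model.score(X[:, selected], y))
```

Tuning `C` (and the kernel parameters of the second-stage SVM) then plays the role that explicit variable-subset search would otherwise play.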

Our methodology has proven to be very effective
on regression problems in drug design: the number of variables is dramatically
reduced, and in cross-validation testing the method outperforms SVM models
trained using all the attributes. Chemists have also found the visualization
of the weight sensitivities to be useful. We are currently testing the approach on
classification tasks and hope to test the method on the workshop datasets
as well.