The NIPS 2003 challenge in feature selection
is to find feature selection algorithms that significantly outperform
methods using all features, benchmarked on ALL five datasets
formatted for that purpose. To make it easy to enter results on all five datasets,
all tasks are two-class classification problems. During the development period,
participants may submit validation set results on a subset of the datasets.
How to participate:
Simply download the five datasets from
the challenge web site.
If you are a Matlab user, we provide
sample code to read and check the data.
Otherwise, the data follow a straightforward ASCII
format. Check the latest challenge results.
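The ASCII format is simple enough to parse without the Matlab sample code. Below is a minimal Python sketch, assuming a dense layout with one pattern per line, whitespace-separated feature values, and one +/-1 label per line in the training label file; some datasets may use a sparse encoding instead, so consult the dataset description for the exact format. The file names are hypothetical.

```python
# Minimal loaders for a dense ASCII data matrix and its label file.
# The dense whitespace-separated layout is an assumption; check the
# dataset documentation, since some datasets use sparse encodings.

def load_data(data_path):
    """Read one pattern per line, feature values separated by whitespace."""
    with open(data_path) as f:
        return [[float(tok) for tok in line.split()] for line in f if line.strip()]

def load_labels(labels_path):
    """Read one +/-1 integer label per line (provided for the training set only)."""
    with open(labels_path) as f:
        return [int(line) for line in f if line.strip()]
```

A quick sanity check after loading is to verify that the number of rows matches the published number of examples and that every label is +1 or -1.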
Each dataset is split into training,
validation, and test sets. Only the training labels are provided. During the
development period, participants can return classification results on the
validation set, even for a subset of the datasets. They will receive
their validation set scores in return. At any time (but presumably after some development
period) the participants can submit their final classification results on
ALL the datasets (with a limit of five submissions per person).
CLOSED. The submission deadline was:
December 1st, 2003.
Questions: Check our challenge FAQ.
How to submit a workshop contribution:
The workshop is open to contributions
related to feature extraction at large, including theoretical and practical
contributions on feature construction, space dimensionality reduction, and
feature selection. Participating in the challenge
is not a prerequisite to submitting an abstract, but some priority will
be given to challenge participants with competitive methods. Abstracts
less than one page long should be sent to firstname.lastname@example.org.
CLOSED: The deadline to submit
abstracts was: December 1, 2003.
The workshop was a success: we had
17 presentations and 98 participants.
Springer will publish the best papers
as part of an edited book on feature extraction.
Friday Dec. 12, morning session 7:30am-10:30am
7:30am Benchmark datasets and challenge
[dataset description] [December 1st results] [December 8th results]
Isabelle Guyon, Steve Gunn, Asa Ben-Hur,
and Gideon Dror
7:50am Classification for
High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion
Radford M. Neal and Jianguo Zhang
8:20am Random Forests and Regularized
Least Squares Classifiers
Kari Torkkola and Eugene Tuv
8:40am Feature Selection using
SVM and Random Forest
Yi-Wei Chen and Chih-Jen Lin
9:10am Feature Selection using
Transductive Support Vector Machine
Zhi-li Wu and Chun-hung Li
9:30am Boosting Flexible Learning
Ensembles with Dynamic Feature Selection
Alexander Borisov, Victor Eruhimov
and Eugene Tuv
9:50am Piecewise Linear Regularized
Saharon Rosset and Ji Zhu
10:10am Feature Selection with
Sensitivity Analysis for Direct Kernel Partial Least Squares (DK-PLS)
Mark J. Embrechts
Dec. 12, afternoon session 4:00pm-7:00pm
4:00pm Spectral Dimensionality
Reduction via Learning Eigenfunctions
4:30pm Protein Sequence Motifs:
Highly Discriminative Features for Function Prediction
Asa Ben-Hur
4:50pm Feature Construction: Variations
on PCA and Company
5:10pm Feature Extraction for Image
Ilya Levner and Vadim Bulitko
5:40pm Feature Extraction with
Description Logics Functional Subsumption
Rodrigo de Salvo Braz and Dan Roth
6:00pm Feature Selection with the
Potential Support Vector Machine
6:20pm Information Based Supervised
and Semi-Supervised Feature Selection
Sang-Keun Lee, Seung-Joon Yi and Byoung-Tak
6:40pm Lessons Learned from the
Feature Selection Competition
Nitesh V. Chawla, Grigoris Karakoulas,
and Danny Roobaert
6:55pm Method description
Thomas Navin Lal and Olivier Chapelle
Contributions from challenge participants not coming to the workshop:
Nameless: Feature Selection Challenge
Ran Gilad-Bachrach and Amir Navot
NIPS Feature Selection Challenge:
Details On Methods
Amir Reza Saffari Azar
Special issue of JMLR
on variable and feature selection:
The Journal of Machine Learning Research published
this year the proceedings of the NIPS 2001 workshop on variable and feature
selection, together with other contributions on that topic. This issue, organized and
edited by Isabelle Guyon and André Elisseeff, contains 14 papers, including an introduction
to the field by the guest editors. In addition to the papers, many of the
authors have made available the data sets and software used in their research.
Data mining competitions:
A list of data mining competitions
maintained by KDnuggets, including the well-known KDD Cup.
Collections of datasets for machine learning:
A rather comprehensive list, with pointers to software and data.
The collections include the famous UCI repositories, the DELVE platform
of the University of Toronto, and other resources.
Critical Assessment of Microarray Data
Analysis, an annual conference on gene expression microarray data analysis.
This conference includes a contest with emphasis on gene selection, a special
case of feature selection.
International Conference on Document
Analysis and Recognition, a biennial conference featuring a contest in printed
text recognition. Feature extraction/selection is a key component of winning
such a contest.
Text REtrieval Conference (TREC), held
every year by NIST. The conference is
organized around the results of a competition. Past winners have had to address
feature extraction/selection effectively.
In conjunction with the International
Conference on Pattern Recognition, ICPR 2004, a face recognition contest
is being organized.
An important competition in protein
structure prediction is the Critical Assessment of
Techniques for Protein Structure Prediction (CASP).
955, Creston Road,
Berkeley, CA 94708, U.S.A.
Tel/Fax: (510) 524 6211
Proceedings publication: Masoud Nikravesh.
Program advisors: Kristin Bennett, Richard Caruana.
Challenge assistants: Asa Ben-Hur, André Elisseeff, Gideon Dror.
Challenge webmaster: Steve Gunn.
We thank the people who made publicly available the data
we are using. They will be acknowledged by name
at the end of the challenge, when we reveal the identity of the datasets.