Motivation
The Deep Learning community develops learning machines with multi-layer architectures, with the purpose of learning complex tasks better or more effectively than shallow architectures. The objective of this project is to collect examples of tasks that demonstrate the advantages of deep architectures over shallow architectures, and to organize public evaluations (challenges) using such tasks.

Our first challenge, taking place in 2011, will be on unsupervised learning. One of the recent developments in deep learning research has been the invention of algorithms for learning internal representations using unsupervised learning. The unsupervised learning challenge will test the ability of algorithms to create representations in a totally unsupervised way for use in supervised learning tasks. To emphasize the capability of the unsupervised learning systems to develop useful abstractions, the supervised learning tasks used to evaluate them will make use of very few labeled training examples and will include the case of "one-shot learning" (learning from a single labeled example). In practical applications there is always some kind of supervision available, at least to perform model selection and select the best representation created by unsupervised learning algorithms. To acknowledge this, there will be a second phase in the challenge in which some labeled training data will be made available. However, the labels will be drawn from a task distinct from the task on which the data representations generated by the participants will be evaluated. Hence this will be a transfer learning problem: learning a data representation to solve a given task, then using it for other similar tasks.
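The kind of evaluation described above can be illustrated with a minimal sketch. It is not the challenge's evaluation method, which is defined in the rules draft. Here PCA stands in for the unsupervised learner, a nearest-labeled-example classifier stands in for the supervised task, and the synthetic data, the function names (fit_pca, nearest_example_accuracy, sample) and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def fit_pca(unlabeled, n_components):
    """Learn a linear representation from unlabeled data only (PCA via SVD)."""
    mean = unlabeled.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(unlabeled - mean, full_matrices=False)
    components = vt[:n_components]
    return lambda x: (x - mean) @ components.T


def nearest_example_accuracy(transform, x_train, y_train, x_test, y_test):
    """Label each test point with the class of its nearest labeled example."""
    z_train, z_test = transform(x_train), transform(x_test)
    # Squared Euclidean distances, shape (n_test, n_train).
    d2 = ((z_test[:, None, :] - z_train[None, :, :]) ** 2).sum(axis=2)
    predictions = y_train[d2.argmin(axis=1)]
    return float((predictions == y_test).mean())


def sample(rng, n_per_class, n_distractors=200):
    """Hypothetical toy data: 2 informative features plus pure-noise distractors."""
    labels = np.repeat([0, 1], n_per_class)
    centers = np.where(labels[:, None] == 0, -2.0, 2.0)  # shape (n, 1), broadcasts
    signal = centers + rng.normal(size=(labels.size, 2))
    distractors = rng.normal(size=(labels.size, n_distractors))
    return np.hstack([signal, distractors]), labels


rng = np.random.default_rng(0)

# Unsupervised phase: the labels are discarded, only the inputs are used.
x_unlabeled, _ = sample(rng, 1000)
to_representation = fit_pca(x_unlabeled, n_components=1)


def identity(x):
    """Baseline: classify directly on the raw features."""
    return x


# Supervised phase: one labeled example per class, averaged over random episodes.
raw_scores, rep_scores = [], []
for _ in range(20):
    x_train, y_train = sample(rng, 1)
    x_test, y_test = sample(rng, 100)
    raw_scores.append(
        nearest_example_accuracy(identity, x_train, y_train, x_test, y_test))
    rep_scores.append(
        nearest_example_accuracy(to_representation, x_train, y_train, x_test, y_test))

raw_acc = float(np.mean(raw_scores))
rep_acc = float(np.mean(rep_scores))
print(f"one-shot accuracy, raw features: {raw_acc:.2f}")
print(f"one-shot accuracy, learned representation: {rep_acc:.2f}")
```

In this toy setting the representation helps because it discards the distractor features before the single labeled example per class is used. The sketch covers only the fully unsupervised phase; the second, transfer learning phase would additionally use labels from a different task to select among candidate representations.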

Our second challenge will be a "live challenge" using video data to demonstrate the capability of deep learning systems trained on unrelated tasks to perform one-shot learning of poses or short actions. Such a capability would be useful for developing consumer products for video retrieval, video surveillance, recognition of signals or commands, and games. The competitors will be screened using pre-recorded data but will then have to demonstrate their system with live video at the site of a conference where the competition will take place.

Examples of datasets for the first challenge

 

 

| # | Domain | Name and description | Size/type of data |
|---|--------|----------------------|-------------------|
| 1 | Chemo-informatics | CHEMO: Library of small molecules coded with QSAR features. The task is to predict molecule toxicity. | 51440 examples available as molecule formulas or in feature representation (851 features). 2-class classification or regression. |
| 2 | Handwriting recognition | AVICENA: The task is to spot Arabic words in an ancient manuscript to facilitate indexing. | 35070 examples available as raw images or in feature representation (92 features). 15 classes. |
| 3 | Object recognition from still images | IMAR: The task is to label the image with the most prominent object(s) for indexing purposes. | Possibly use data from Caltech 256. Images could be preprocessed in a way that makes them difficult to identify. 30608 pictures, 2567 classes. |
| 4 | Situation recognition from video clips | SITUAR: The task is to recognize the occurrence of a given situation (pose or short action) in a short video clip, such as someone making a phone call, someone getting out of a car, or an animal crossing a road. | Possibly use KTH (2391 sequences, from 600 videos = 25 subjects x 6 actions x 4 scenarios) or the Hollywood dataset (3669 video clips from 69 movies, ~150 examples per class, 12 classes of actions). |
| 5 | Speech recognition | SPEAK: The task is word spotting: find the presence of a given word in a spoken sequence. | Possibly use the TIMIT database. Licensing issues? |
| 6 | Socio-economic data | SOCIO: The task will be to predict revenue using census data. | Publicly available census data, millions of entries available. |
| 7 | Text processing | PROTEXT: Classifying or ranking text based on queries. | A large publicly available dataset, possibly OpenTable. |
| 8 | Ecology data | SYLVESTER: Classification of forest cover. | 72626 examples coded by 12 real features and many distractors. 2 classes. |


Examples of videos for the live challenge
We show below tentative examples for a task that lends itself to a live demonstration. You can download the full archive or click on the links to play the movies.

demo video
demonstration.mov: A sequence of poses and gestures that could be used to test a demo.

Possible training examples:
tr01_aggressor.mov
tr02_glasseson.mov
tr03_pickupphone.mov
tr04_pickupphonewglasses.mov
tr05_glassesoff.mov
tr06_nodwphone.mov
tr07_nodewphonenglasses.mov
tr08_nod.mov
tr09_nodwglasses.mov
tr10_nodlookingatcamera.mov
tr11_lookatcamera.mov
tr12_puthandinfrontofmouth.mov
tr13_scratchhead.mov
tr14_puthandinfrontofmouthwglasses.mov
tr15_scratchheadwglasses.mov
tr16_drinkwglasses.mov
tr17_drink.mov
tr18_clap.mov
tr19_clapwglasses.mov

Simplified version involving poses only: staticposes_demonstration.mov

Academic video datasets that could be used for training:
The Hollywood2 dataset
The KTH dataset

Other videos from the Internet that would make good training data:
Videolectures.net
YouTube videos:
Little girl explains Star Wars
Video surveillance at home
Shoplifting

Examples of real-life scenarios
Here is a growing list of real-life scenarios in which having a versatile system capable of learning actions from a few examples would be useful:
- Sentiment analysis: find whether someone approves (nods yes), is hesitant or doubtful (puts hand in front of mouth) or is negative (nods no).
- Detection of suspect behaviors: recognize people who might be shoplifters because they quickly look around.
- Recognition of signals or commands: time-out, slow down, stop, no-no, got-it, etc.

Challenge

Protocol
A draft of the rules of the first challenge is available on the challenge website (in preparation).
Evaluation
A draft of the first challenge evaluation method is available on the challenge website (in preparation).
For the live challenge, the candidates will first be evaluated with the same method as in the first challenge. The best-ranking contestants will be invited to demonstrate their systems in a live demonstration at the site of the conference. The system voted best by the public will win.
Participation

All challenges will be open to everyone who accepts the rules of the challenge. We will avoid providing datasets with licensing restrictions.

Schedule

August 31, 2010: Deadline for NIPS 2010 demo proposals.
September 20, 2010: Deadline for NIPS 2010 demo proposals.
December 6-9, 2010: NIPS 2010 conference, Vancouver, Canada.

December 2010-January 2011: First competition starts.
June 11-14, 2011: ICML 2011, Seattle, WA.
July 31-August 5, 2011: IJCNN 2011, San Jose, CA

December, 2011: NIPS 2011 conference, Vancouver, Canada.

Links to related workshops/competitions

WCCI 2010 special session on active and autonomous learning. Discussion of the results of the active learning challenge.

AISTATS 2010 workshop on active learning and experimental design. Tutorial on experimental design by Donald Rubin. Papers presenting the results of the active learning challenge.

Active learning challenge: Using for the first time the virtual lab of the causality workbench, the participants could buy labels for virtual cash and monitor the tradeoff between getting good classification accuracy and spending a lot on getting labels.

NIPS 2009 causality and time series mini-symposium.  Featuring a memorial lecture of Clive Granger by Halbert White.

NIPS 2008 causality workshop: objectives and assessment. The second challenge in causality organized by the causality workbench.

WCCI  2008 causation and prediction challenge. A first activity of the causality workbench.

NIPS 2006 workshop on causality and feature selection. The ancestor of this workshop.

IJCNN 2007 Agnostic learning vs. Prior knowledge challenge. “When everything fails, ask for additional domain knowledge” is the current motto of machine learning. Therefore, assessing the real added value of prior/domain knowledge is both a deep and a practical question. The participants competed in two tracks: the “prior knowledge track”, for which they had access to the raw data and information about the data, and the “agnostic learning track”, for which they had access to preprocessed data with no knowledge of the identity of the features.

WCCI 2006 performance prediction challenge. “How good are you at predicting how good you are?” 145 participants tried to answer that question. Cross-validation came out very strong. Can you do better? Measure yourself against the winners by participating in the model selection game.

NIPS 2003 workshop on feature extraction and feature selection challenge. We organized a competition on five data sets in which hundreds of entries were made. The web site of the challenge is still available for post-challenge submissions. Measure yourself against the winners! See the book we published with a CD containing the datasets, tutorials, and papers on state-of-the-art methods.

Pascal challenges: The Pascal network is sponsoring several challenges in Machine learning.

Data mining competitions:
A list of data mining competitions maintained by KDnuggets, including the well known KDD cup.

List of data sets for machine learning:
A rather comprehensive list maintained by MLnet.

UCI machine learning repository: A great collection of datasets for machine learning research.

DELVE: A platform developed at the University of Toronto to benchmark machine learning algorithms.

CAMDA
Critical Assessment of Microarray Data Analysis, an annual conference on gene expression microarray data analysis. This conference includes a contest with emphasis on gene selection, a special case of feature selection.

ICDAR
International Conference on Document Analysis and Recognition, a biennial conference proposing a contest in printed text recognition. Feature extraction/selection is a key component to win such a contest.

TREC
Text Retrieval conference, organized every year by NIST. The conference is organized around the result of a competition. Past winners have had to address feature extraction/selection effectively.

ICPR
In conjunction with the International Conference on Pattern Recognition, ICPR 2004, a face recognition contest is being organized.

CASP
An important competition in protein structure prediction called Critical Assessment of Techniques for Protein Structure Prediction.

Contact information

deeplearning@ clopinet . com.

Sponsors:

US Naval Research Labs