Discovering informative patterns and data cleaning.

I. Guyon, N. Matic, and V. Vapnik.
In U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pages 181--203. MIT Press.

We present a method for discovering informative patterns from data. With this method, large databases can be reduced to only a few representative data entries. Our framework also encompasses methods for cleaning databases containing corrupted data. Both on-line and off-line algorithms are proposed and experimentally checked on databases of handwritten images. The generality of the framework makes it an attractive candidate for new applications in knowledge discovery.

Keywords: knowledge discovery, machine learning, informative patterns, data cleaning, information gain.
