SVM Application List

This list of Support Vector Machine applications grows thanks to visitors like you who ADD new entries. Thank you in advance for your contribution.

Support vector machines-based generalized predictive control

This work presents an application of the previously proposed Support Vector Machines Based Generalized Predictive Control (SVM-Based GPC) method to the problem of controlling chaotic dynamics with small parameter perturbations. The Generalized Predictive Control (GPC) method, which belongs to the class of Model Predictive Control, requires an accurate model of the plant, which plays a crucial role in the control loop. On the other hand, chaotic systems exhibit very complex behavior peculiar to them, so it is considerably difficult to obtain an accurate model of them over the whole phase space. In this work, the Support Vector Machines (SVMs) regression algorithm is used to obtain an acceptable model of the chaotic system to be controlled. SVM-Based GPC exploits some advantages of the SVM approach and utilizes the obtained model in the GPC structure. Simulation results on several chaotic systems indicate that the SVM-Based GPC scheme provides excellent performance with respect to local stabilization of the target (an originally unstable equilibrium point). Furthermore, it performs targeting to some extent, that is, the task of steering the chaotic system towards the target by applying relatively small parameter perturbations. It considerably reduces the waiting time until the system, starting from random initial conditions, enters the local control region, a small neighborhood of the chosen target. Moreover, SVM-Based GPC maintains its performance when the measured output is corrupted by additive Gaussian noise.

Reference(s):
“Support vector machines-based generalized predictive control,” Serdar Iplikci, INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, Vol. 16, pp. 843-862, 2006
Reference link(s):
http://ietfec.oxfordjournals.org/cgi/content/abstract/E89-A/10/2787
Data link(s):

Entered by: Serdar Iplikci <iplikci@pau.edu.tr> - Monday, October 23, 2006 at 18:05:17 (GMT)
Comments:

Dynamic Reconstruction of Chaotic Systems from Inter-spike Intervals Using Least Squares Support Vector Machines

This work presents a methodology for dynamic reconstruction of chaotic systems from inter-spike interval (ISI) time series obtained via integrate-and-fire (IF) models. In this methodology, least squares support vector machines (LSSVMs) have been employed for approximating the dynamic behaviors of the systems under investigation.
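As an illustration of the LSSVM function approximation mentioned above, here is a minimal sketch under stated assumptions; it is not the method of the cited paper. It fits a least squares SVM regressor to one-step pairs from a logistic map, which serves only as a stand-in chaotic series (no inter-spike intervals or integrate-and-fire model are involved). The function names, the RBF kernel, and the values of `gamma` and `sigma` are all illustrative choices.

```python
import math

def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def lssvm_fit(xs, ys, gamma=1000.0, sigma=0.3):
    """LS-SVM regression: training reduces to one linear system,
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(xs)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(xs[i], xs[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term from the squared-error loss
        A.append(row)
    sol = solve(A, [0.0] + list(ys))
    return sol[0], sol[1:]  # bias b, coefficients alpha

def lssvm_predict(x, xs, b, alphas, sigma=0.3):
    """Evaluate f(x) = b + sum_i alpha_i * k(x, x_i)."""
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alphas, xs))

# Stand-in chaotic series (assumption): logistic map x[t+1] = 4 x[t] (1 - x[t]).
series = [0.3]
for _ in range(40):
    series.append(4.0 * series[-1] * (1.0 - series[-1]))
xs, ys = series[:-1], series[1:]

b, alphas = lssvm_fit(xs, ys)
print(lssvm_predict(0.5, xs, b, alphas))  # the true map value at 0.5 is 1.0
```

Unlike a standard SVM, which needs a quadratic programming solver, this formulation needs only a linear solve, at the cost of losing sparsity in the coefficients.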

Reference(s):
Physica D, Vol. 216, pp. 282-293, 2006
Reference link(s):
Data link(s):

Entered by: Serdar Iplikci <iplikci@pau.edu.tr> - Monday, May 29, 2006 at 12:53:56 (GMT)
Comments:

Application of the Kernel Method to the Inverse Geosounding Problem

Determining the layered structure of the earth demands the solution of a variety of inverse problems; in the case of electromagnetic soundings at low induction numbers, the problem is linear, for the measurements may be represented as a linear functional of the electrical conductivity distribution. In this work, an application of the Support Vector (SV) Regression technique to the inversion of electromagnetic data is presented. We take advantage of the regularizing properties of the SV learning algorithm and use it as a modeling technique with synthetic and field data. The SV method presents better recovery of synthetic models than Tikhonov's regularization. As the SV formulation is solved in the space of the data, which has a small dimension in this application, a smaller problem than that considered with Tikhonov's regularization is produced. For field data, the SV formulation develops models similar to those obtained via linear programming techniques, but with the added characteristic of robustness.

Reference(s):
"Application of the kernel method to the inverse geosounding problem", Hugo Hidalgo, Sonia Sosa and E. Gómez-Treviño, Neural Networks, vol. 16, pp. 349-353, 2003
Reference link(s):
http://cienciascomp.cicese.mx/recopat/articulos/NeuralNetworks03.pdf
Data link(s):

Entered by: Hugo Hidalgo <hugo@cicese.mx> - Wednesday, March 22, 2006 at 14:04:25 (MST)
Comments:

Support Vector Machines Based Modeling of Seismic Liquefaction Potential

This paper investigates the potential of a support vector machines based classification approach to assess the liquefaction potential from actual standard penetration test (SPT) and cone penetration test (CPT) field data. Support vector machines are based on statistical learning theory and have been found to work well in comparison to neural networks in several other applications. Both CPT and SPT field data sets are used with support vector machines for predicting the occurrence and nonoccurrence of liquefaction based on different input parameter combinations. With the SPT and CPT test data sets, highest accuracies of 96% and 97%, respectively, were achieved with support vector machines. This suggests that support vector machines can effectively be used to model the complex relationship between different soil parameters and the liquefaction potential. Several other combinations of input variables were used to assess the influence of different input parameters on liquefaction potential. The proposed approach suggests that neither the normalized cone resistance value with CPT data nor the calculation of the standardized SPT value with SPT data is required. Further, support vector machines require few user-defined parameters and provide better performance in comparison to the neural network approach.

Reference(s):
Goh ATC. Seismic Liquefaction Potential Assessed by Neural Networks. Journal of Geotechnical Engineering 1994; 120(9): 1467-1480.
Goh ATC. Neural-Network Modeling of CPT Seismic Liquefaction Data. Journal of Geotechnical Engineering 1996; 122(1): 70-73
Reference link(s):
Accepted for publication in International Journal for Numerical and Analytical Methods in Geomechanics.
Data link(s):

Entered by: Mahesh Pal <mpce_pal@yahoo.co.uk> - Wednesday, February 22, 2006 at 06:50:07 (GMT)
Comments:

SVM for Geo- and Environmental Sciences

Statistical learning theory for geo(spatial) and spatio-temporal environmental data analysis and modelling. Comparisons with geostatistical predictions and simulations

Reference(s):
1. N. Gilardi, M. Kanevski, M. Maignan and E. Mayoraz. Environmental and Pollution Spatial Data Classification with Support Vector Machines and Geostatistics. Workshop W07 “Intelligent techniques for Spatio-Temporal Data Analysis in Environmental Applications”. ACAI99, Greece, July, 1999. pp. 43-51. www.idiap.ch
2. M Kanevski, N Gilardi, E Mayoraz, M Maignan. Spatial Data Classification with Support Vector Machines. Geostat 2000 congress. South Africa, April 2000.
3. Kanevski M., Wong P., Canu S. Spatial Data Mapping with Support Vector Regression and Geostatistics. 7th International Conference on Neural Information Processing, Taepon, Korea. Nov. 14-18, 2000. Pp. 1307-1311.
4. N GILARDI, Alex GAMMERMAN, Mikhail KANEVSKI, Michel MAIGNAN, Tom MELLUISH, Craig SAUNDERS, Volodia VOVK. Application des méthodes d’apprentissage pour l’étude des risques de pollution dans le Lac Léman. 5e Colloque transfrontalier CLUSE. Risques majeurs: perception, globalisation et management. Université de Genève, 2000.
5. M. Kanevski. Evaluation of SVM Binary Classification with Nonparametric Stochastic Simulations. IDIAP Research Report, IDIAP-RR-01-07, 17 p. 2001. www.idiap.ch
6. M. Kanevski, A. Pozdnukhov, S. Canu, M. Maignan. Advanced Spatial Data Analysis and Modelling with Support Vector Machines. International Journal on Fuzzy Systems 2002. p. 606-615.
7. M. Kanevski, A. Pozdnukhov, S. Canu, M. Maignan, P.M. Wong, S.A.R. Shibli. “Support Vector Machines for Classification and Mapping of Reservoir Data”. In: “Soft Computing for Reservoir Characterization and Modelling”. P. Wong, F. Aminzadeh, M. Nikravesh (Eds.). Physica-Verlag, Heidelberg, N.Y. pp. 531-558, 2002.
8. Kanevski M., Pozdnukhov A., McKenna S., Murray Ch., Maignan M. Statistical Learning Theory for Spatial Data. In proceedings of GeoENV2002 conference. Barcelona, 2002.
9. M. Kanevski et al. Environmental data mining and modelling based on machine learning algorithms and geostatistics. Journal of Environmental Modelling and Software, 2004. vol. 19, pp. 845-855.
10. M. Kanevski, M. Maignan et al. Advanced geostatistical and machine learning models for spatial data analysis of radioactively contaminated territories. Journal of Environmental Sciences and Pollution Research, pp.137-149, 2003.
11. Kanevski M., Maignan M. and Piller G. Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland. International conference EnviroInfo, 2004. http://www.enviroinfo2004.org/cdrom/Datas/Kanevski.htm
12. Kanevski M., Maignan M. and Pozdnukhov A. Active Learning of Environmental Data Using Support Vector Machines. Conference of the International Association for Mathematical Geology, Toronto 2005. http://www.iamgconference.com/
13. M. Kanevski, A. Pozdnukhov, M. Tonini, M. Motelica, E. Savelieva, M. Maignan. Statistical Learning Theory for Geospatial Data. Case study: Aral Sea. 14th European colloquium on Theoretical and Quantitative Geography. Portugal, September 2005.
14. Pozdnukhov A., Kanevski M. Monitoring network optimisation using support vector machines. In: Geostatistics for Environmental applications. (Renard Ph., Demougeot-Renard H and Froidevaux, Eds.). Springer, 2005. pp. 39-50.
15. Pozdnukhov A. and Kanevski M. Monitoring Network Optimisation for Spatial Data Classification Using Support Vector Machines. (2006). International Journal of Environment and Pollution. Vol.28. 20 pp.
Reference link(s):
www.unil.ch/igar
www.idiap.ch
Data link(s):

Entered by: Mikhail Kanevski <Mikhail.Kanevski@unil.ch> - Sunday, February 12, 2006 at 16:30:07 (GMT)
Comments:

SVM for Protein Fold and Remote Homology Detection

Motivation: Protein remote homology detection is a central problem in computational biology. Supervised learning algorithms based on support vector machines are currently one of the most effective methods for remote homology detection. The performance of these methods depends on how the protein sequences are modeled and on the method used to compute the kernel function between them. Results: We introduce two classes of kernel functions that are constructed by combining sequence profiles with new and existing approaches for determining the similarity between pairs of protein sequences. These kernels are constructed directly from these explicit protein similarity measures and employ effective profile-to-profile scoring schemes for measuring the similarity between pairs of proteins. Experiments with remote homology detection and fold recognition problems show that these kernels are capable of producing results that are substantially better than those produced by all of the existing state-of-the-art SVM-based methods. In addition, the experiments show that these kernels, even when used in the absence of profiles, produce results that are better than those produced by existing non-profile-based schemes.

Reference(s):
Profile based direct kernels for remote homology detection and fold recognition by Huzefa Rangwala and George Karypis (Bioinformatics 2005)
Reference link(s):
http://bioinformatics.oxfordjournals.org/cgi/content/abstract/bti687v1
Data link(s):
http://bioinfo.cs.umn.edu/supplements/remote-homology/

Entered by: Huzefa Rangwala <rangwala@cs.umn.edu> - Sunday, November 06, 2005 at 06:02:08 (GMT)
Comments:

Content-Based Image Retrieval

Relevance feedback schemes based on support vector machines (SVM) have been widely used in content-based image retrieval (CBIR). However, the performance of SVM-based relevance feedback is often poor when the number of labeled positive feedback samples is small. This is mainly due to three reasons: (1) an SVM classifier is unstable on a small-sized training set; (2) SVM’s optimal hyper-plane may be biased when the positive feedback samples are far fewer than the negative feedback samples; and (3) overfitting happens because the number of feature dimensions is much higher than the size of the training set. In this paper, we develop a mechanism to overcome these problems. To address the first two problems, we propose an asymmetric bagging based SVM (AB-SVM). For the third problem, we combine the random subspace method and SVM for relevance feedback, which is named random subspacing SVM (RS-SVM). Finally, by integrating AB-SVM and RS-SVM, an asymmetric bagging and random subspacing SVM (ABRS-SVM) is built to solve these three problems and further improve the relevance feedback performance.
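The sampling side of asymmetric bagging and random subspacing can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it assumes each bag keeps every positive sample and bootstraps an equally sized set of negatives, that each subspace is a random subset of feature indices, and that the individual classifiers are combined by a simple majority vote. The SVM training step itself is omitted, and all function names are hypothetical.

```python
import random

def asymmetric_bags(positives, negatives, n_bags, rng):
    """Each bag keeps all positives and bootstraps (with replacement)
    as many negatives as there are positives, so every bag is balanced."""
    bags = []
    for _ in range(n_bags):
        sampled_neg = [rng.choice(negatives) for _ in positives]
        bags.append((list(positives), sampled_neg))
    return bags

def random_subspaces(n_features, subspace_size, n_subspaces, rng):
    """Each subspace is a sorted random subset of feature indices."""
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_subspaces)]

def majority_vote(votes):
    """Aggregate +1/-1 votes from the individual classifiers (assumed rule)."""
    return 1 if sum(votes) >= 0 else -1

# Toy usage: sample ids stand in for image feature vectors.
rng = random.Random(0)
positives = ["p1", "p2", "p3"]
negatives = ["n%d" % i for i in range(30)]
bags = asymmetric_bags(positives, negatives, n_bags=5, rng=rng)
subspaces = random_subspaces(n_features=100, subspace_size=20, n_subspaces=5, rng=rng)
# One classifier would be trained per (bag, subspace) pair; training is omitted here.
print(len(bags), len(subspaces[0]))
```

Balancing each bag addresses the biased hyperplane, while training on feature subsets lowers the dimension relative to the small training set.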

Reference(s):
Dacheng Tao, Xiaoou Tang, Xuelong Li, and Xindong Wu, Asymmetric Bagging and Random Subspacing for Support Vector Machines-based Relevance Feedback in Image Retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence, accepted, to appear.
Reference link(s):
Data link(s):

Entered by: Dacheng Tao <Dacheng Tao> - Tuesday, October 11, 2005 at 19:03:18 (GMT)
Comments:

Data Classification Using SSVM

Smoothing methods, extensively used for solving important mathematical programming problems and applications, are applied here to generate and solve an unconstrained smooth reformulation of the support vector machine for data or pattern classification using a completely arbitrary kernel. The basic SVM is reformulated into a smooth support vector machine (SSVM) that possesses the mathematical property of strong convexity, which turns basic SVM classification into a smooth minimization problem. In this work, the Newton-Armijo algorithm is used to solve the unconstrained SSVM optimization problem. On larger datasets, SSVM is faster than SVMlight. SSVM can also generate a highly nonlinear separating surface such as a checkerboard.

Reference(s):

[1] O. L. Mangasarian. A Finite Newton Method for Classification Problems.
[2] O. L. Mangasarian. A Smooth Support Vector Machine for Classification.
[3] K.P. Soman. XSVMs and Applications.
Reference link(s):
Data link(s):

Entered by: Aduru . Venkateswarlu <venkatsherma@yahoo.com> - Monday, September 19, 2005 at 04:35:39 (GMT)
Comments:

DTREG SVM and decision tree modeling

DTREG builds SVM and decision tree based predictive models.

Reference(s):
Reference link(s):
http://www.dtreg.com/svm.htm
Data link(s):

Entered by: Phil Sherrod <phil.sherrod@sandh.com> - Saturday, September 10, 2005 at 20:32:24 (GMT)
Comments:

DTREG - SVM and Decision Tree Predictive Modeling

DTREG builds Support Vector Machine (SVM) and Decision Tree predictive models. SVM features automatic grid search for optimal parameter selection and V-fold cross-validation for measuring model generalization. Decision tree models provided by DTREG include classical single trees, TreeBoost series of boosted trees and Decision Tree Forest of parallel trees that "vote" on the outcome.
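The combination of grid search and V-fold cross-validation described above can be sketched generically. This is not DTREG's implementation or interface; the function names, the interleaved fold assignment, and the `evaluate` callback (which would train and score an SVM for one parameter setting) are assumptions made for illustration.

```python
import itertools

def vfold_indices(n_rows, v):
    """Split row indices 0..n_rows-1 into v nearly equal, interleaved folds."""
    return [list(range(i, n_rows, v)) for i in range(v)]

def grid_search(param_grid, n_rows, v, evaluate):
    """Try every parameter combination and keep the one with the best mean
    cross-validated score. evaluate(params, train_idx, test_idx) must return
    a score where higher is better."""
    folds = vfold_indices(n_rows, v)
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for i, test_idx in enumerate(folds):
            # Train on every fold except the one held out for testing.
            train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
            scores.append(evaluate(params, train_idx, test_idx))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

# Toy usage with a fake scoring function that peaks at C=10, gamma=0.1.
def fake_evaluate(params, train_idx, test_idx):
    return -abs(params["C"] - 10) - abs(params["gamma"] - 0.1)

grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}
print(grid_search(grid, n_rows=50, v=5, evaluate=fake_evaluate))
```

Because each row is held out exactly once, the mean score estimates how well a parameter setting generalizes rather than how well it fits the training rows.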

Reference(s):
Reference link(s):
http://www.dtreg.com/svm.htm
Data link(s):

Entered by: Phil Sherrod <phil.sherrod@sandh.com> - Friday, August 26, 2005 at 20:09:46 (GMT)
Comments: DTREG supports Linear, Polynomial, Sigmoid and Radial Basis kernel functions. It can handle problems with millions of data rows and hundreds of variables.
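For reference, the four kernel families named in the comment are commonly written as below. This is a generic sketch, not DTREG code; the parameter names `gamma`, `coef0`, and `degree` and their default values follow a widespread convention and are assumptions here.

```python
import math

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    """k(x, y) = x . y"""
    return dot(x, y)

def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    """k(x, y) = (gamma * x . y + coef0) ** degree"""
    return (gamma * dot(x, y) + coef0) ** degree

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """k(x, y) = tanh(gamma * x . y + coef0)"""
    return math.tanh(gamma * dot(x, y) + coef0)

def rbf_kernel(x, y, gamma=1.0):
    """k(x, y) = exp(-gamma * ||x - y||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v), polynomial_kernel(u, v), rbf_kernel(u, u))
```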

Facial expression classification

Facial expression classification using statistical models of shape and SVMs

Reference(s):
J. Ghent and J. McDonald, "Facial Expression Classification using a One-Against-All Support Vector Machine", proceedings of the Irish Machine Vision and Image Processing Conference, Aug 2005.

J. Ghent and J. McDonald, "Holistic Facial Expression Classification", SPIE Opto-Ireland, pp 5823-18, April 2005.
Reference link(s):
Data link(s):

Entered by: John Ghent <jghent@cs.may.ie> - Tuesday, August 09, 2005 at 10:14:08 (GMT)
Comments:

End-depth and discharge prediction in semi-circular and circular shaped channels

The results of an application of a support vector machine based modelling technique to determine the end-depth ratio and discharge of a free overfall occurring over an inverted smooth semi-circular channel and a circular channel with flat bases are presented in this paper. The results of the study indicate that the support vector machine technique can be used effectively for predicting the end-depth ratio and the discharge for such channels. For subcritical flow, the predicted value of the end-depth ratio compares favorably to the values obtained by using empirical relations derived in previous studies, while for supercritical flow, the support vector machines perform equally well and are found to work better than the empirical relationship proposed in earlier studies. The results also suggest the usefulness of support vector machine based modelling techniques in predicting the end-depth ratio and discharge for a semi-circular channel using the model created for the circular channel data, and vice versa, for supercritical flow conditions.

Reference(s):
C. Cortes and V.N. Vapnik, Support vector networks, Machine Learning 20 (1995), pp. 273–297.
S. Dey, Free over fall in circular channels with flat base: a method of open channel flow measurement, Flow Meas. Instrum. 13 (2002), pp. 209–221.
S. Dey, Free over fall in open channels: state-of-the-art review, Flow Meas. Instrum. 13 (2002), pp. 247–264.
Y.B. Dibike, S. Velickov, D.P. Solomatine and M.B. Abbott, Model induction with support vector machines: Introduction and applications, J. Comput. Civil Eng. 15 (2001), pp. 208–216.
D. Luenberger, Linear and Nonlinear Programming, Addison-Wesley (1984).
H. Rouse, Discharge characteristics of the free overfall, Civil Engineering, ASCE 6 (1936) (4), pp. 257–260.
R.V. Raikar, D. Nagesh Kumar and S. Dey, End depth computation in inverted semi circular channels using ANNs, Flow Meas. Instrum. 15 (2004), pp. 285–293.
A.J. Smola, Regression estimation with support vector learning machines, Master’s Thesis, Technische Universität München, Germany, 1996.
M. Sterling and D.W. Knight, The free overfall as a flow measuring device in a circular channel, Water and Maritime Engineering Proceedings of Institution of Civil Engineers London 148 (December) (2001), pp. 235–243.
V.N. Vapnik, Statistical Learning Theory, John Wiley and Sons, New York (1998).
Reference link(s):
http://www.sciencedirect.com/science/journal/09555986
Data link(s):

Entered by: mahesh pal <mpce_pal@yahoo.co.uk> - Monday, August 01, 2005 at 10:20:34 (GMT)
Comments:

Identification of alternative exons using SVM

Alternative splicing is a major component of the regulatory action on mammalian transcriptomes. It is estimated that over half of all human genes have more than one splice variant. Previous studies have shown that alternatively spliced exons possess several features that distinguish them from constitutively spliced ones. Recently, we have demonstrated that such features can be used to distinguish alternative from constitutive exons. In the current study, we used advanced machine learning methods to generate a robust classifier of alternative exons. Results: We extracted several hundred local sequence features of constitutive as well as alternative exons. Using feature selection methods, we find seven attributes that are dominant for the task of classification. Several less informative features help to slightly increase the performance of the classifier. The classifier achieves a true positive rate of 50% for a false positive rate of 0.5%. This result enables one to reliably identify alternatively spliced exons in exon databases that are believed to be dominated by constitutive exons.

Reference(s):
Dror G., Sorek R. and Shamir S.
Accurate identification of alternatively spliced exons using Support Vector Machine
Bioinformatics. 2005 Apr 1;21(7):897-901.
Epub 2004 Nov 5.
Reference link(s):
http://www2.mta.ac.il/~gideon/nns_pub.html
Data link(s):

Entered by: Gideon Dror <gideon@mta.ac.il> - Monday, June 20, 2005 at 11:55:09 (GMT)
Comments: Two classes: 243 positive and 1753 negative instances; 228 features in total; Gaussian kernel. Baseline systems: neural networks and Naive Bayes. SVM outperformed them in terms of area under the ROC curve but, most importantly, in its ability to achieve a very high true positive rate (50%) at a very low false positive rate (0.5%). This performance would enable an effective scan of exon databases in search of novel alternatively spliced exons, in the human or other genomes.
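An operating point of the kind quoted above (true positive rate at a capped false positive rate) can be read off classifier scores as in the following sketch. The function name, the toy scores, and the label convention (1 = alternative exon, 0 = constitutive exon) are illustrative assumptions and are not taken from the cited study.

```python
def tpr_at_fpr(scores, labels, max_fpr):
    """Return (threshold, tpr, fpr) for the loosest decision threshold whose
    false positive rate does not exceed max_fpr. A sample is predicted
    positive when its score is >= threshold."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    best = (float("inf"), 0.0, 0.0)  # predict nothing positive
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        if fp / n_neg > max_fpr:
            break  # lowering the threshold further can only add false positives
        best = (t, tp / n_pos, fp / n_neg)
    return best

# Toy data: higher score means "more likely alternative".
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
print(tpr_at_fpr(scores, labels, max_fpr=0.0))
```

When negatives vastly outnumber positives, as in an exon database dominated by constitutive exons, a small false positive rate is what keeps the list of predicted positives usable.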

Support Vector Machines For Texture Classification

Sathishkumar is currently pursuing an M.Tech in computer vision and image processing at Amrita Institute of Technology (Amrita Vishwa Vidyapeetham), Coimbatore, Tamil Nadu, India. He received his B.E. from Bharathiar University in 2003. His interests are soft computing and data mining techniques for image processing.

Reference(s):
1. Support Vector Machines for Texture Classification, Kwang In Kim, Keechul Jung, Se Hyun Park, and Hang Joon Kim, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 11, November 2002.
2. “An Introduction to Support Vector Machines and Other Kernel-based Learning Methods” by Nello Cristianini & John Shawe-Taylor (http://www.support-vector.net/).
Reference link(s):
Data link(s):

Entered by: sathishkumar <sathishkumar.maddy@gmail.com> - Thursday, June 02, 2005 at 05:02:34 (GMT)
Comments:

SVM application in E-learning

Personalized, learner-centered learning is receiving increasing attention because of the increased learning rate it offers. AI techniques that tailor the content to the learner, depending on his or her context and needs, are being deployed for such tasks. SVMs stand out due to their better performance, especially in handling the high dimensionality that text content possesses. Lecture material could be reprocessed to create a suitable feature space, and the contents could then be presented to the learner according to need. This will save time and also avoid information overload.

Reference(s):
Reference link(s):
Data link(s):

Entered by: sandeep dixit <sandeepdixit2004@yahoo.com> - Thursday, March 31, 2005 at 15:14:17 (GMT)
Comments:

Text Classification with SVMs

Classifying large volumes of documents into classes.

Reference(s):
Reference link(s):
Data link(s):

Entered by: Duong DInh DUng <dungngtq8@yahoo.com> - Thursday, March 24, 2005 at 06:03:04 (GMT)
Comments:

Isolated Handwritten Jawi Characters Categorization Using Support Vector Machines (SVM).

Isolated Handwritten Jawi Characters Categorization Using Support Vector Machines (SVM). Application is targeted at routing in a mobile computing environment.

Reference(s):
Reference link(s):
Data link(s):

Entered by: Suhaimi Abd Latif <suhaimie@iiu.edu.my> - Wednesday, January 19, 2005 at 06:02:27 (GMT)
Comments:

Image Clustering

Clustering is an important task for image compression. This clustering can be done by SVM efficiently.

Reference(s):
Reference link(s):
Data link(s):

Entered by: Ahmed Yousuf Saber <saber_uap@yahoo.com> - Wednesday, January 19, 2005 at 02:16:09 (GMT)
Comments:

NewsRec, a SVM-driven Personal Recommendation System for News Websites

NewsRec is a SVM-driven personal recommender system designed for news websites. It uses SVMs to predict whether articles are interesting or not.

Reference(s):
Bomhardt, C. (2004): NewsRec, a SVM-driven Personal Recommendation System for News Websites
In: Web Intelligence, IEEE/WIC/ACM International Conference on (WI'04)
Keywords: Personal Recommendation, Support-Vector-Machine, Personalization, Text Classification
Reference link(s):
http://csdl.computer.org/comp/proceedings/wi/2004/2100/00/2100toc.htm
Data link(s):

Entered by: Christian Bomhardt <christian.bomhardt@etu.uni-karlsruhe.de> - Monday, October 11, 2004 at 15:26:58 (GMT)
Comments: about 1200 datasets, about 30000 features, linear kernel, SVMs are very fast compared to other methods and can handle the large number of features.

Equbits Foresight

Equbits Foresight is a SVM based predictive modeling application designed for HTS and ADME-Tox Chemists.

Reference(s):
Reference link(s):
www.equbits.com
Data link(s):

Entered by: Ravi Mallela <ravi@equbits.com> - Saturday, October 09, 2004 at 15:37:20 (GMT)
Comments:

Speaker/Speech Recognition

Speaker/speech recognition. Utterance verification for speech recognition: SVMs are used to accept keywords or reject non-keywords. Speaker verification/recognition: the PolyVar telephone database is used. New method for normalizing the polynomial kernel for use with SVMs: YOHO database, text independent, best EER = 0.34%. Combined Gaussian mixture model in SVM outputs: text-independent speaker verification, best EER = 1.56%.

Reference(s):
X.DONG......
ELECTRONICS LETTERS ,VOL.37,PP.527-529,(2001)
C.MA.RANDOLPH ....
IEEE INT.CONFERENCE ON ACOUSTICS,SPEECH,AND SIGNAL PROCESSING VOL.1,PP.381-384,(2001)
V.WAN...
IEEE WORKSHOP ON NEURAL NETWORK FOR SIGNAL PROCESSING X,VOL.2,(2000)
Reference link(s):
Data link(s):

Entered by: MEHDI GHAYOUMI <M_GHAYOUMI@YAHOO.COM> - Tuesday, March 09, 2004 at 06:25:10 (GMT)
Comments:

STUDENT IN AI

Reference(s):
Reference link(s):
Data link(s):

Entered by: MEHDI GHAYOUMI <M_GHAYOUMI@YAHOO.COM> - Tuesday, March 09, 2004 at 06:23:03 (GMT)
Comments:

Analysis and Applications of Support Vector Forecasting Model Based on Chaos Theory

A novel support vector forecasting model based on chaos theory is presented. It adopts support vector machines as the nonlinear forecaster. The number of input variables is determined by computing the saturated embedding dimension of the reconstructed phase space, and the maximum number of effective forecasting steps is determined by computing the largest Lyapunov exponent of the chaotic time series. Application results in modeling an aeroengine compressor show that the presented method achieves much better precision, which indicates that the method is feasible and effective. The method offers guidance for nonlinear forecasting of chaotic time series via support vector machines.
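The phase-space reconstruction step, which turns a scalar series into the input vectors fed to the forecaster, can be sketched as a time-delay embedding. The function name, the delay parameter `tau`, and the one-step-ahead target are assumptions for illustration; the embedding dimension `dim` would come from a saturation analysis such as the one the abstract mentions, which is not implemented here.

```python
def delay_embed(series, dim, tau=1):
    """Build (input vector, next value) training pairs from a scalar series.
    Each input is [x(t-(dim-1)*tau), ..., x(t-tau), x(t)] and the target
    is x(t+1)."""
    span = (dim - 1) * tau
    pairs = []
    for t in range(span, len(series) - 1):
        x = [series[t - k * tau] for k in range(dim - 1, -1, -1)]
        pairs.append((x, series[t + 1]))
    return pairs

# Toy usage on a short ramp so the windowing is easy to follow.
for x, target in delay_embed([0, 1, 2, 3, 4, 5], dim=3):
    print(x, "->", target)
```

The resulting pairs can be passed to any regression learner, for example a support vector regressor, to obtain a one-step forecaster that is then iterated for multi-step prediction.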

Reference(s):
[1] Lü Jinhu et al. Chaotic Time Series Analysis and Its Applications [M]. Wuhan: Wuhan University Press, 2001.
[2] Stefania Tronci, Massimiliano Giona, Roberto Baratti, “Reconstruction of chaotic time series by neural models: a case study,” Neurocomputing, vol. 55, pp. 581-591, 2003.
[3] He Taigang, Zheng Chongxun. Nonlinear prediction of chaotic sequences [J]. Chinese Journal of Nature (Ziran Zazhi), 19(1): 10-13, 2001.
[4] Li Dongmei, Wang Zheng'ou. Modeling and multi-step prediction of chaotic time series based on RBF networks [J]. Systems Engineering and Electronics, 24(6): 81-83, 2002.
[5] Jiang Tao. Research on surge/stall prediction models and fault detection for aeroengines [D]. Doctoral dissertation, Xi'an: Engineering College, Air Force Engineering University, 2002.
[6] Wang Haiyan, Sheng Zhaohan. Methods for selecting phase-space reconstruction parameters of chaotic time series [J]. Journal of Southeast University, 30(5): 113-117, 2000.
[7] L.-Y. Cao, “Practical method for determining minimum embedding dimension of a scalar time series,” Physica D, vol. 110, pp. 43-52, 1997.
[8] Eckmann J.P, Kamphorst S.O, “Lyapunov exponent from time series,” Phys. Rev. A, vol. 34, no. 6, pp. 4971-4979, Dec. 1986.
[9] Oiwa N.N, Fiedler-Ferrara N, “A fast algorithm for estimating lyapunov exponents from time series,” Physics Letter A, vol. 246, pp. 117-121, Sep. 1998.
[10] Fabio Sattin, “Lyap: A FORTRAN 90 program to compute the Lyapunov exponents of a dynamical system from a time series,” Computer Physics Communications, vol. 107, pp. 253-257, 1997.
[11] K.R. Müller, A.J. Smola, G. Rätsch, “Predicting time series with support vector machines,” in Proceeding of ICANN ’97, Berlin: Springer LNCS, vol. 1327, pp. 999-1004, 1997.
[12] B.-J. Chen, “Load forecasting using Support vector machines: A study on EUNITE Competition 2001,” unpublished.
[13] L.-J. Cao, Q.-M. Gu, “Dynamic support vector machines for non-stationary time series forecasting,” Intelligent Data Analysis, vol. 6, no. 1, pp. 67-83, 2002.
[14] F.E.H. Tay, L.-J. Cao, “Applications of support vector machines in financial forecasting,” Omega, vol. 9, no. 4, pp. 309-317, Aug. 2001.
[15] K.W. Lau, Q.-H. Wu, “Local prediction of chaotic time series based on Gaussian processes,” in Proceeding of the 2002 IEEE International Conference on Control Applications, Glasgow, Scotland, U.K, pp. 1309-1313, Sep. 18-20 2002.
[16] Sayan Mukherjee, Edgar Osuna, Frederico Girosi, “Nonlinear prediction of chaotic time series using support vector machines,” in Proc. of IEEE NNSP 97, Amelia Island, FL, Sep. 1997.
Reference link(s):
In press
Proceeding of WCICA 2004
Data link(s):

Entered by: xunkai <skyhawkf119@163.com> - Monday, February 23, 2004 at 04:51:26 (GMT)
Comments: It seems impossible, but SVMs perform perfectly well!

A Comparison Of The Performance Of Artificial Neural Networks And Support Vector Machines For The Prediction Of Traffic Speed and Travel Time

The ability to predict traffic variables such as speed, travel time or flow based on real time data and historic data collected by various systems in transportation networks is vital to the intelligent transportation systems (ITS) components such as in-vehicle route guidance systems (RGS), advanced traveler information systems (ATIS), and advanced traffic management systems (ATMS). This predicted information enables the drivers to select the shortest path for the intended journey. Accurate prediction of traffic speed and travel time is also useful for evaluating the planning, design, operations and safety of roadways. In the context of prediction methodologies, different time series, and artificial neural networks (ANN) models have been developed in addition to the historic and real time approach. The present paper proposes the application of a recently developed pattern classification and regression technique called support vector machines (SVM) for the short-term prediction of traffic speed and travel time. An ANN model is also developed and a comparison of the performance of both these approaches is carried out, along with real time and historic approach results. Data from the freeways of San Antonio, Texas is used for the analysis.

Reference(s):
V. Kecman, Learning And Soft Computing: Support Vector Machines, Neural Networks, And Fuzzy Logic Models, The MIT press, Cambridge, Massachusetts, London, England.

S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall, 1999

N. Cristianini and J. Shawe-Taylor, An Introduction To Support Vector Machines And Other Kernel Based Learning Methods, Cambridge University Press, 2000

S. R. Gunn, “Support Vector Machines for Classification and Regression,” http://www.ecs.soton.ac.uk/~srg/publications/pdf/SVM.pdf
Reference link(s):
Data link(s):

Entered by: Lelitha Vanajakshi <lelitha@yahoo.com> - Friday, January 30, 2004 at 17:39:08 (GMT)
Comments: When less training data was available, SVM outperformed ANN; when enough data was available, both performed more or less the same.

none

Reference(s):
Reference link(s):
Data link(s):

Entered by: leechs <leechs@sohu.com> - Sunday, January 25, 2004 at 13:44:16 (GMT)
Comments:

SVM learning

SVM face learning

Reference(s):
Reference link(s):
Data link(s):

Entered by: burak <burakkaragoz2002@yahoo.com> - Monday, December 08, 2003 at 16:06:03 (GMT)
Comments:

Protein Structure Prediction

The task of predicting protein structure from protein sequence is an important application of support vector machines. A protein's function is closely related to its structure, which is difficult to determine experimentally. There are mainly two types of methods for predicting protein structure. The first type includes threading and comparative modeling, which rely on a priori knowledge of the similarity between the sequence and known structures. The second type, called de novo or ab-initio methods, predicts the protein structure from the sequence alone without relying on similarity to known structures. Currently, it is difficult to predict high-resolution 3D structure with ab-initio methods for purposes such as studying the docking of macro-molecules, predicting protein-partner interactions, designing and improving ligands, and studying protein-protein interactions. The prediction of protein relative solvent accessibility gives us useful information for predicting tertiary protein structure. The SVMpsi method, which uses support vector machines (SVMs) and the position specific scoring matrix (PSSM) generated from PSI-BLAST, has been applied to achieve better prediction accuracy of the relative solvent accessibility. We have introduced a three-dimensional local descriptor that contains information about the expected remote contacts via the long-range interaction matrix as well as neighbor sequences. The support vector machine approach has successfully been applied to solvent accessibility prediction by considering long-range interaction and handling unbalanced data.

Reference(s):
1. Kim, H. and H. Park, "Prediction of protein relative solvent accessibility with support vector machines and long-range interaction 3D local descriptor",
Proteins: Structure, Function, and Genetics, to appear. (pdf download)
2. Kim, H. and H. Park, "Protein secondary structure prediction by support vector machines and position-specific scoring matrices",
Protein Engineering, to appear. (pdf download)
Reference link(s):
http://www.cs.umn.edu/~hpark/papers/surface.pdf
http://www.cs.umn.edu/~hpark/papers/protein2.pdf
Data link(s):

Entered by: Dr. Haesun Park <hpark@cs.umn.edu> - Friday, July 11, 2003 at 18:59:18 (GMT)
Comments:

Support vector classifiers for land cover classification

SVMs were used to classify different land covers using remote sensing data. Results from this study suggest that multi-class support vector machines perform well in comparison with neural network and decision tree classifiers.

Reference(s):
Mahesh Pal recently finished his PhD at the University of Nottingham, UK, and is presently working as a lecturer in the Department of Civil Engineering, NIT Kurukshetra, Haryana, India.
Reference link(s):
http://www.gisdevelopment.net/technology/rs/pdf/23.pdf
Data link(s):

Entered by: Mahesh Pal <mpce_pal@yahoo.co.uk> - Wednesday, May 21, 2003 at 07:17:46 (GMT)
Comments:

Intrusion Detection

Intrusion Detection Systems (IDSs) have become more widely used to protect computer networks. However, it is difficult to build highly effective IDS since some of the pattern recognition problems involved are intractable. In this paper, we propose the use of Support Vector Machines (SVMs) for intrusion detection and analyze its performance. We conduct experiments using a large set of DARPA-provided intrusion data. Two groups of SVMs are built to perform, respectively, binary classifications (normal pattern vs. attack pattern) and five-class classifications (normal pattern, and four classes of attack patterns). Detailed analysis is provided on the classification accuracy and time performance (regarding both learning time and running time). Performance comparison between SVMs and neural networks (that are built using the same training and testing data) is also given. Based on the simulation results, we argue that SVMs are superior to neural networks in several important aspects of IDS. Our results therefore indicate that SVMs are superior candidates to be used as the learning machines for IDSs.

Reference(s):
Srinivas Mukkamala joined the Computer Science graduate program of New Mexico Tech in 2000 and is currently a Ph.D. student. He received his B.E. degree in computer science and engineering in 1999 from University of Madras. His interests are information assurance, information hiding, artificial intelligence, soft computing techniques for computer security.

Andrew H. Sung is professor and chairman of the Computer Science Department, and the Coordinator of the Information Technology Program, at New Mexico Tech. He received his Ph.D. in computer science from the State University of New York at Stony Brook in 1984. His interests are intelligent systems, soft computing, and information assurance.
Reference link(s):
www.cs.nmt.edu/~IT
Data link(s):
http://kdd.ics.uci.edu/

Entered by: Srinivas Mukkamala <srinivas@cs.nmt.edu> - Thursday, January 09, 2003 at 05:02:19 (GMT)
Comments: SVMs are superior to ANNs for intrusion detection in three critical respects: SVMs train and run an order of magnitude faster; SVMs scale much better; and SVMs give higher classification accuracy. For details on the number of classes, kernels used, input features, number of support vectors, and input feature selection and ranking methods, please read our latest versions. If you need our latest versions or any assistance, please send the author an email: srinivas@cs.nmt.edu Sincerely, Srinivas Mukkamala

The Gaussian Dynamic Time Warping (GDTW) kernel for On-line Handwriting Recognition

In recent years the task of on-line handwriting recognition has gained immense importance in everyday applications, mainly due to the increasing popularity of the personal digital assistant (PDA). Currently, a next generation of "smart phones" and tablet-style PCs, which also rely on handwriting input, is further targeting the consumer market. However, in the majority of these devices the handwriting input method is still not satisfactory. In current PDAs people still use input methods that abstract from the natural writing style, e.g. the widespread Graffiti.

Thus there is demand for a handwriting recognition system which is accurate, efficient and which can deal with the natural handwriting of a wide range of different writers.

Reference(s):

Claus Bahlmann, Bernard Haasdonk and Hans Burkhardt. On-line Handwriting Recognition using Support Vector Machines - A kernel approach. In Int. Workshop on Frontiers in Handwriting Recognition (IWFHR) 2002, Niagara-on-the-Lake, August 2002.
Description of the frog on hand recognition system
Reference link(s):
Data link(s):

UNIPEN Train-R01/V07

Entered by: Claus Bahlmann <bahlmann@informatik.uni-freiburg.de> - Monday, September 09, 2002 at 11:52:27 (GMT)
Comments:

Usual SVM kernels are designed to deal with data of fixed dimension. However, on-line handwriting data is not of a fixed dimension, but of a variable-length sequential form. In this respect SVMs cannot be applied to HWR straightforwardly.

We have addressed this issue by developing an appropriate SVM kernel for sequential data, the Gaussian dynamic time warping (GDTW) kernel. The basic idea of the GDTW kernel is, that instead of the squared Euclidean distance in the usual Gaussian kernel it uses the dynamic time warping distance. In addition to on-line handwriting recognition the GDTW kernel can be straightforwardly applied to all classification problems, where DTW gives a reasonable distance measure, e.g. speech recognition or genome processing.

Experiments have shown a superior recognition rate in comparison to an HMM-based classifier for relatively small training sets (~ 6000) and comparable rates for larger training sets.
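
The kernel idea described in these comments can be illustrated with a minimal sketch. It assumes a textbook DTW recursion with squared Euclidean local cost and no path-length normalization, so it is not necessarily the exact formulation used by the authors; the function names, the `gamma` parameter, and the toy strokes are hypothetical.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of points.

    Each sequence is a list of equal-dimension tuples. The local cost is the
    squared Euclidean distance between aligned points (an assumption here).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = sum((p - q) ** 2 for p, q in zip(a[i - 1], b[j - 1]))
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def gdtw_kernel(a, b, gamma=1.0):
    """Gaussian kernel with the DTW distance in place of the squared Euclidean distance."""
    return math.exp(-gamma * dtw_distance(a, b))

# Toy pen trajectories: the same stroke sampled at two different lengths,
# and a different stroke.
stroke_short = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
stroke_long = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (2.0, 2.0)]
other = [(0.0, 2.0), (1.0, 1.0), (2.0, 0.0)]

k_same = gdtw_kernel(stroke_short, stroke_long)
k_diff = gdtw_kernel(stroke_short, other)
```

Because the warping absorbs the repeated points, the two samplings of the same stroke get the maximal kernel value even though their lengths differ, which is the property a fixed-dimension kernel cannot offer.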

forecast

forecast stock

Reference(s):
Reference link(s):
Data link(s):

Entered by: shen <shen0204@yahoo.com.tw> - Thursday, September 05, 2002 at 07:24:00 (GMT)
Comments:

Detecting Steganography in digital images

Techniques for information hiding have become increasingly more sophisticated and widespread. With high-resolution digital images as carriers, detecting hidden messages has become considerably more difficult. This paper describes an approach to detecting hidden messages in images that uses a wavelet-like decomposition to build higher-order statistical models of natural images. Support vector machines are then used to discriminate between untouched and adulterated images.

Reference(s):
Detecting Hidden Messages Using Higher-Order Statistics and Support Vector Machines

S. Lyu and H. Farid
5th International Workshop on Information Hiding, Noordwijkerhout, The Netherlands, 2002
Reference link(s):
http://www.cs.dartmouth.edu/~farid/publications/ih02.html
Data link(s):
http://www.cs.dartmouth.edu/~farid/publications/ih02.html

Entered by: Siwei Lyu <lsw@cs.dartmouth.edu> - Thursday, August 22, 2002 at 15:58:54 (GMT)
Comments: 2 classes; 3600 training examples; over 18,000 testing samples; 1100 SVs; RBF kernel; LibSVM.

Fast Fuzzy Cluster

real time adaptive pattern recognition

Reference(s):
Reference link(s):
members.aol.com/awareai
Data link(s):

Entered by: Michael Bickel <awareai@aol.com> - Tuesday, July 23, 2002 at 01:11:11 (GMT)
Comments:

Breast Cancer Prognosis: Chemotherapy Effect on Survival Rate

A linear support vector machine (SVM) is used to extract 6 features from a total of 31 features from the Wisconsin breast cancer dataset of 253 patients. We cluster the 253 breast cancer patients into three prognostic groups: Good, Intermediate and Poor. Each of the three groups has a significantly distinct Kaplan-Meier survival curve. Of particular significance is the Intermediate group, because this group comprises patients for whom chemotherapy gives distinctly better survival times than for those in the same group who did not undergo chemotherapy. This is the reverse of the case for the overall population studied, for which patients without chemotherapy have better longevity. We prescribe a procedure that utilizes three nonlinear smooth support vector machines (SSVMs) for classifying breast cancer patients into the three prognostic groups above with 82.7% test set correctness. These results suggest that patients in the Good group should not receive chemotherapy while Intermediate group patients should receive chemotherapy based on our survival curve analysis. To our knowledge this is the first instance of a classifiable group of breast cancer patients for which chemotherapy enhances survival.

Reference(s):
Yuh-Jye Lee, O. L. Mangasarian and W. H. Wolberg: "Survival-Time Classification of Breast Cancer Patients", Data Mining Institute Technical Report 01-03, March 2001.

Yuh-Jye Lee, O. L. Mangasarian and W. H. Wolberg: "Breast Cancer Survival and Chemotherapy: A Support Vector Machine Analysis", DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 55 (2000), pp. 1-10.

Yuh-Jye Lee and O. L. Mangasarian: "SSVM: Smooth Support Vector Machine for Classification", Computational Optimization and Applications (20)1: pp. 5-22.
Reference link(s):
Data link(s):
WPBCC: Wisconsin Prognostic Breast Cancer Chemotherapy Database.
ftp://ftp.cs.wisc.edu/math-prog/cpo-dataset/machine-learn/WPBCC/

Entered by: Yuh-Jye Lee <yjlee@cs.ccu.edu.tw> - Wednesday, October 24, 2001 at 19:38:50 (MDT)
Comments:

Underground Cable Temperature Prediction

Support Vector Regression was used to predict the temperature of a cable buried underground, based on weather data from the previous 24 hours. Training examples typically numbered 100-200, testing examples 400. It outperformed the previous hybrid forecasting system, a Neural Net with fuzzy logic. The software used was not my own but is a Matlab toolbox implementation written by Steve Gunn at Southampton University (http://www.isis.ecs.soton.ac.uk/resources/svminfo/). Training takes about a minute - I'm now using about 400 training examples and the entire testing dataset is about 26000 examples. The feature space has about 50 dimensions: 5 weather elements repeated ten times over the past 24 hours. So far the predictions on the entire dataset are within about a degree; I could probably optimise it further but I'm running out of time!
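
To make the feature layout concrete, here is a small sketch of how 5 weather elements sampled ten times could be flattened into one 50-dimensional input vector for a regressor. The function name, the evenly spaced sampling, and the synthetic readings are assumptions for illustration; the entry does not say how the original Matlab code arranged its inputs.

```python
def lagged_features(history, n_lags=10):
    """Flatten the last n_lags weather readings into one feature vector.

    history: list of readings, oldest first; each reading is a list of
    weather elements (5 in the entry above). Evenly spaced sampling over
    the 24-hour window is an assumption of this sketch.
    """
    if len(history) < n_lags:
        raise ValueError("need at least n_lags readings")
    window = history[-n_lags:]
    return [value for reading in window for value in reading]

# Hypothetical readings: 12 time steps, 5 weather elements each.
readings = [[float(t + k / 10) for k in range(5)] for t in range(12)]
x = lagged_features(readings)  # one 50-dimensional regression input
```

Each such vector would be paired with the measured cable temperature at the prediction time to form one training example.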

Reference(s):
Reference link(s):
Data link(s):

Entered by: Robin Willis <rew198@soton.ac.uk> - Friday, May 04, 2001 at 08:31:41 (PDT)
Comments:

Image classification

Classification of natural images from the Corel database using a SVM and a color histogram as input feature

Reference(s):
Reference link(s):
www.ens-lyon.fr/~ochapell/tnn99.ps.gz
Data link(s):

Entered by: Olivier Chapelle <chapelle@research.att.com> - Tuesday, April 04, 2000 at 13:50:39 (PDT)
Comments: Number of classes = 6 or 14. Dimension of the input features = 4096. Kernel = RBF with various distances. SVM outperforms KNN. The choice of the distance in the RBF kernel is critical.
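
A rough sketch of what "RBF with various distances" on color histograms can look like follows. The 16 bins per channel are an assumption chosen because 16^3 = 4096 matches the stated input dimension; the three distances, the `rho` parameter, and the toy pixel lists are illustrative and not taken from the referenced paper.

```python
import math

def color_histogram(pixels, bins_per_channel=16):
    """Normalized joint histogram of 3-channel pixels with values in 0..255.

    Assumes a non-empty pixel list.
    """
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for c1, c2, c3 in pixels:
        index = ((c1 // step) * bins_per_channel + (c2 // step)) * bins_per_channel + (c3 // step)
        hist[index] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]

def l1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def l2_squared(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def chi_square(x, y):
    # Bins that are empty in both histograms contribute nothing.
    return sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)

def rbf_kernel(x, y, distance, rho=1.0):
    """Generalized RBF kernel exp(-rho * d(x, y)) with a pluggable distance d."""
    return math.exp(-rho * distance(x, y))

# Toy images given as pixel lists: mostly red versus entirely blue.
h_red = color_histogram([(250, 10, 10)] * 3 + [(10, 10, 250)])
h_blue = color_histogram([(10, 10, 250)] * 4)

similarities = {d.__name__: rbf_kernel(h_red, h_blue, d) for d in (l1, l2_squared, chi_square)}
```

Swapping the `distance` argument changes how strongly differences in sparsely populated bins are penalized, which is one way to see why the choice of distance matters.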

Particle and Quark-Flavour Identification in High Energy Physics

The aim is to classify events of high-energy electron-positron collisions according to the quark flavour they originate from. A second application is to identify particles in an event (e.g. muons). The event data comes from a big detector, the OPAL experiment, which measures physical properties of the particles coming out of the collision. Monte Carlo methods are used to simulate these collisions in the detector. This allows the use of supervised machine-learning methods to classify the data. The input-variable distributions of the different quark classes partially overlap very strongly, so no complete separation is possible. Identifying muons is an easier problem, where neural methods prove to be successful.
We compared the performance of SVMs (RBF kernel) with NNs (trained with backprop). The amount of available data is very large; we tested on 3x100k patterns for the quark-flavour problem.

Reference(s):
Classifying LEP Data with Support Vector Algorithms
P. Vannerem, K.-R. Müller, B. Schölkopf, A. Smola, S. Söldner-Rembold
submitted to Proceedings of AIHENP'99.
Reference link(s):
http://wwwrunge.physik.uni-freiburg.de/preprints/EHEP9901.ps
ftp://ftp.physics.uch.gr/aihenp99/Vannerem/
Data link(s):

Entered by: Philippe Vannerem <philippe.vannerem@cern.ch> - Tuesday, October 19, 1999 at 16:17:56 (PDT)
Comments: We saw only small differences in performance between NNs and SVMs.

Object Detection

The problem of object detection is to differentiate a certain class of objects (the A class) from all other patterns and objects (the not-A class). This is contrasted with object recognition where the problem is to be able to differentiate between elements of the same class.

Reference(s):

A Pattern Classification Approach to Dynamical Object Detection

Anuj Mohan
Object Detection in Images by Components
CBCL Paper #178/AI Memo #1664, Massachusetts Institute of Technology

Constantine Papageorgiou, Theodoros Evgeniou, Tomaso Poggio
A Trainable Pedestrian Detection System
Proceedings of Intelligent Vehicles, 1998, pp. 241-246

Constantine P. Papageorgiou, Michael Oren, Tomaso Poggio
A General Framework for Object Detection
Proceedings of ICCV, 1998, pp. 555-562

Reference link(s):

http://www.ai.mit.edu/projects/cbcl/publications/ps/dyn-obj-iccv99.ps.gz

ftp://publications.ai.mit.edu/ai-publications/1500-199/AIM-1664.ps

www.ai.mit.edu/projects/cbcl/publications/ps/ped-det-iv98.ps.gz

www.ai.mit.edu/projects/cbcl/publications/ps/gen-obj-det-iccv98.ps.gz

Data link(s):

Entered by: Constantine Papageorgiou <cpapa@ai.mit.edu> - Wednesday, October 06, 1999 at 14:33:37 (PDT)
Comments (entered by Isabelle Guyon): In Papageorgiou-Oren-Poggio-98, the authors investigate face detection and pedestrian detection. From the point of view of static images, they obtain 75% correct face detection for a rate of 1 false detection in 7500 windows and 70% correct pedestrian detection for a rate of 1 false detection in 15000 windows. Of particular interest is their method to increase the number of negative examples with a "bootstrap method": they start with a training set consisting of positive examples (faces or pedestrians) and a small number of negative examples that are "meaningless" images. After a first round of training and testing on fresh examples, the negative examples corresponding to false detections are added to the training set. Training/test set enlargement is iterated. The dynamic system that uses motion to refine performance is roughly 20% better. In this first paper, the authors reduce the dimensionality of input space before training with SVM down to 20-30 input features. Thousands of examples are used for training. In contrast, in Papageorgiou-Poggio-99, using again the problem of pedestrian detection in motion pictures, the authors train an SVM with 5201 examples directly in a 6630 dimensional input space consisting of wavelet features at successive time steps. They find that their system is simpler and faster than traditional HMM or Kalman filter systems and has lower false positive rates than static systems.
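
The "bootstrap method" for collecting negative examples can be sketched as a short loop. To stay self-contained, the sketch uses a one-dimensional threshold rule as a stand-in for the SVM, and all names and numbers are hypothetical; only the loop structure (train, scan fresh non-object examples, add the false detections, retrain) follows the description above.

```python
def train_threshold(positives, negatives):
    """Stand-in classifier (not an SVM): threshold halfway between the class means."""
    mean_pos = sum(positives) / len(positives)
    mean_neg = sum(negatives) / len(negatives)
    threshold = (mean_pos + mean_neg) / 2.0
    return lambda score: score > threshold

def bootstrap_negatives(train, positives, negatives, fresh_batches):
    """Grow the negative set with false detections found on fresh non-object examples."""
    negatives = list(negatives)  # copy so the caller's list is left untouched
    detector = train(positives, negatives)
    for batch in fresh_batches:
        # Every example in a batch is known to be a non-object, so any
        # detection here is a false detection.
        false_detections = [x for x in batch if detector(x)]
        if not false_detections:
            continue
        negatives.extend(false_detections)
        detector = train(positives, negatives)
    return detector, negatives

# Toy one-dimensional "images": larger scores look more like the object.
positives = [8.0, 9.0, 10.0]
initial_negatives = [0.0, 1.0]
fresh_batches = [[2.0, 5.0, 6.0], [3.0, 5.5, 6.5]]

detector, grown_negatives = bootstrap_negatives(
    train_threshold, positives, initial_negatives, fresh_batches
)
```

In this toy run the first detector fires on the score 5.0, while the detector returned after two rounds no longer does, because the hard negatives pulled the decision boundary toward the positive class.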

Combustion Engine Knock Detection

In our research project we developed an engine knock detection system for combustion engine control using advanced neural detection algorithms. Knocking is an undesired fast combustion which can destroy the engine. To approach this problem we collected a large database with different engine states (2000 and 4000 revolutions per minute; non-knocking, borderline- and hard-knocking). Because of the high non-linearity of the problem, neural approaches are very promising, and we have already shown their high performance in this application.

Reference(s):

Construction of a Support Vector Machine with Local Experts.

International Joint Conference on Artificial Intelligence (IJCAI 99)

Support Vector Approaches for Engine Knock Detection

International Joint Conference on Neural Networks (IJCNN 99)

Reference link(s):

Data link(s):

Entered by: Matthias Rychetsky <rychetsky@mes.tu-darmstadt.de> - Tuesday, October 05, 1999 at 01:03:12 (PDT)
Comments: We compared for our database (unfortunately not public domain) SVM approaches, MLP nets and Adaboost. The SV Machines outperformed all other approaches significantly. For this application real time calculation is an issue, therefore we currently examine methods to reduce computational burden at recall phase (e.g. reduced set algorithms or integer based approaches).

Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites

In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points from which regions encoding proteins start, the so-called translation initiation sites. This can be modeled as a classification problem. We demonstrate the power of support vector machines for this task, and show how to successfully incorporate biological prior knowledge by engineering an appropriate kernel function.

Reference(s):

Reference link(s):

http://www.bioinfo.de/isb/gcb99/talks/zien/

Data link(s):

Entered by: Alexander Zien <Alexander.Zien@gmd.de> - Friday, September 24, 1999 at 02:27:26 (PDT)
Comments: SVMs beat a neural network.

Detection of Remote Protein Homologies

A core problem in statistical biosequence analysis is the annotation of new protein sequences with structural and functional features. To a degree, this can be achieved by relating the new sequences to proteins for which such structural properties are already known. Although numerous techniques have been applied to this problem with success, the detection of remote protein homologies has remained a challenge.

Reference(s):

David Haussler

Reference link(s):

http://www.cse.ucsc.edu/research/compbio/discriminative/Jaakola2-1998.ps

Data link(s):

Dataset (12Mb compressed)

Entered by: Isabelle Guyon <isabelle@clopinet.com> - Monday, September 20, 1999 at 15:04:05 (PDT)
Comments: Jaakkola et al combine SVMs with HMMs and show the superiority of this approach compared to several baseline systems.

Function Approximation and Regression

Function approximation and regression problems seek to determine from pairs of examples (x,y) an approximation to an unknown function y=f(x). The application of SVMs to such problems has been intensively benchmarked with "synthetic data" coming from known functions. Although this demonstrated that SVMs are a very promising technique, this hardly qualifies as an application. There are only a few applications to real world problems. For example, in the Boston housing problem, house prices must be predicted from socio-economic and environmental factors, such as crime rate, nitric oxide concentration, distance to employment centers, and age of a property.

Reference(s):

Support Vector Regression Machines.

Support Vector Regression with ANOVA Decomposition Kernels.

Mark O. Stitson

Reference link(s):

Drucker-97

Data link(s):

ftp://ftp.ics.uci.edu/pub/machine-learning-databases/housing

Entered by: Isabelle Guyon <isabelle@clopinet.com> - Monday, September 20, 1999 at 14:23:51 (PDT)
Comments: Drucker-97 finds that SVMs outperform the baseline system (bagging) on the Boston housing problem. It is noted that SVMs can make a real difference when the dimensionality of input space and the order of the approximation create a dimensionality of feature space which is intractable with other methods. The results of Drucker et al are further improved in Stitson-99 (overall 35% better than the baseline method).

3-D Object Recognition Problems

3-D Object Recognition encompasses a wide variety of problems in Pattern Recognition that have to do with classifying representations of 3-dimensional objects. This ranges from face recognition to Automatic Target Recognition (ATR) from radar images. Some of the challenges include that objects are usually seen from only one angle at a time, and may be partially occluded. After training, the system must be able both to classify correctly the objects of interest and reject other "confusers" or "distractors".

Reference(s):

Visual Learning and Recognition of 3-D Object from Appearance
H. Murase and S.K. Nayar
Int. J. Comput. Vision, Vol. 14, 1995, pp. 5--24.
Comparison of view-based object recognition algorithms using realistic 3D models.
V. Blanz, B. Schölkopf, H. Bülthoff, C. Burges, V. Vapnik & T. Vetter,
In: C. von der Malsburg, W. von Seelen, J. C. Vorbrüggen, and B. Sendhoff (eds.): Artificial Neural Networks - ICANN'96.
Springer Lecture Notes in Computer Science Vol. 1112, Berlin, 251-256.
1996.
(also published in Proc of ICANN'96,
LNCS Vol. 1112, 1996, pp. 251--256.)
Training Support Vector Machines: an Application to Face Detection,
Edgar Osuna, Robert Freund and Federico Girosi.
Proceedings of CVPR'97, Puerto Rico,
1997.
Kernel Principal Component Analysis.
B. Schölkopf, A. Smola, K.-R. Müller,
Proceedings ICANN'97, p.583. Springer Lecture Notes in Computer Science.
1997.
Schölkopf, B.; 1997. Support Vector Learning. PhD Thesis. Published by: R. Oldenbourg Verlag, Munich, 1997.
Direct aspect-based 3-D object recognition.
Massimiliano Pontil, and A. Verri.
Proc. Int. Conf on Image Analysis and Processing, Firenze,
1997.
Automatic Target Recognition with Support Vector Machines,
Qun Zhao and Jose Principe,
NIPS-98 Workshop on Large Margin Classifiers,
1998.
The kernel adatron algorithm: a fast and simple learning procedure for support vector machines.
T.-T. Friess, N. Cristianini, C. Campbell.
15th Intl. Conf. Machine Learning, Morgan Kaufmann Publishers.
1998.
View-based 3D object recognition with Support Vector Machines.
Danny Roobaert & Marc M. Van Hulle
Proc. IEEE Neural Networks for Signal Processing Workshop
1999.
Improving the Generalisation of Linear Support Vector Machines: an Application to 3D Object Recognition with Cluttered Background
Danny Roobaert
Proc. Workshop on Support Vector Machines at the 16th International Joint Conference on Artificial Intelligence, July
31-August 6, Stockholm, Sweden, p. 29-33
1999.

Reference link(s):

Data link(s):

Chair data set

Sonar data

Entered by: Isabelle Guyon <isabelle@clopinet.com> - Saturday, September 18, 1999 at 17:54:51 (PDT)
                     Pontil Massimiliano <pontil@ai.mit.edu> - Monday, October 04, 1999 at 18:48:38 (PDT)
                    Danny Roobaert <roobaert@nada.kth.se> - Friday, October 08, 1999 at 12:01:21 (PDT)
                    Edited by Isabelle Guyon - Thursday, October 14, 1999.
Comments: SVM's have been used either in the classification stage or in the pre-processing (Kernel Principal Component Analysis).
In Blanz-96, Support Vector Classifiers show excellent performance, leaving behind other methods. Osuna-97 demonstrates that SVCs can be trained on very large data sets (50,000 examples). The classification performance reaches that of one of the best known systems while being 30 times faster at run time.
In Schölkopf-97, the advantage of KPCA is more measured in terms of simplicity, ensured convergence, and ease of understanding of the non-linearities.
Zhao-98 notes that SVCs with Gaussian kernels handle the rejection of unknown "confusers" particularly well. Friess-98 reports performance on the sonar data of Gorman and Sejnowski (1988). Their kernel Adatron SVM achieves a 95.2% success rate, compared to 90.2% for the best Backpropagation Neural Networks.
Papageorgiou-98 applies SVM with a wavelet preprocessing to face and people detection, showing improvements with respect to their base system.
Roobaert-99 shows that an SVM system working on raw data, not incorporating any domain knowledge about the task, matches the performance of their baseline system that does incorporate such knowledge.
Massimiliano Pontil points out that, as shown by the comparison with other techniques, it appears that SVMs can be effectively trained even if the number of examples is much lower than the dimensionality of the object space. In the paper Pontil-Verri-97, linear SVMs are used for 3-D object recognition. The potential of SVMs is illustrated on a database of 7200 images of 100 different objects. The proposed system does not require feature extraction and performs recognition on images regarded as points of a space of high dimension without estimating pose. The excellent recognition rates achieved in all the performed experiments indicate that SVMs are well-suited for aspect-based recognition.
In Roobaert-99, 3 methods for the improvement of Linear Support Vector Machines are presented for the case of Pattern Recognition with a number of irrelevant dimensions. A method for 3D object recognition without segmentation is proposed.

Text Categorization

Text categorization is the assignment of natural language texts to one or more predefined categories based on their content. Applications include: assigning subject categories to documents to support text retrieval, routing, and filtering; email or files sorting into folder hierarchies; web page sorting into search engine categories.

Reference(s):

Text Categorization with Support Vector Machines: Learning with Many Relevant Features.

T. Joachims

Inductive Learning Algorithms and Representations for Text Categorization,

J. Platt

Support Vector Machines for Spam Categorization. H. Drucker, with D. Wu and V. Vapnik. IEEE Trans. on Neural Networks , vol 10, number 5, pp. 1048-1054. 1999.
Transductive Inference for Text Classification using Support Vector Machines.

Reference link(s):

Joachims-98 Postscript

Joachims-98 PDF

Dumais et al 98

Drucker et al 98

Joachims-99 Postscript

Joachims-99 PDF

Data link(s):

Reuters-21578

Entered by: Isabelle Guyon <isabelle@clopinet.com> - Friday, September 17, 1999 at 15:19:48 (PDT). Last modified, October 13, 1999.
Comments: Joachims-98 reports that SVMs are well suited to learning in very high dimensional spaces (> 10000 inputs). They achieve substantial improvements over the currently best performing methods, eliminating the need for feature selection. The tests were run on the Ohsumed corpus of William Hersh and Reuters-21578. Dumais et al report that they use linear SVMs because they are both accurate and fast (to train and to use). They are 35 times faster to train than the next most accurate classifier that they tested (Decision Trees). They have applied SVMs to the Reuters-21578 collection, emails and web pages. Drucker et al classify emails as spam and non spam. They find that boosting trees and SVMs have similar test performance in terms of accuracy and speed, but SVMs train significantly faster. Joachims-99 reports that transduction is a very natural setting for many text classification and information retrieval tasks. Transductive SVMs improve performance especially in cases with very small amounts of labelled training data.
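
As a small illustration of why text leads to very high dimensional but sparse inputs, the sketch below represents a document as term counts and scores it with a linear decision function w . x + b, the form a trained linear SVM takes. The whitespace tokenizer, the weights, and the bias are hypothetical placeholders, not values from any of the cited papers.

```python
from collections import Counter

def bag_of_words(text):
    """Sparse term-count vector: one dimension per distinct lowercase token."""
    return Counter(text.lower().split())

def linear_score(weights, bias, features):
    """Decision value w . x + b of a linear classifier over sparse features."""
    return sum(weights.get(term, 0.0) * count for term, count in features.items()) + bias

# Hypothetical weights standing in for what a trained linear SVM would supply.
weights = {"free": 1.5, "offer": 1.0, "meeting": -2.0}
bias = -0.5

spam_like = bag_of_words("Free offer free gift")
work_like = bag_of_words("Meeting agenda for Monday")

spam_score = linear_score(weights, bias, spam_like)  # positive: spam side
work_score = linear_score(weights, bias, work_like)  # negative: non-spam side
```

Only the terms that actually occur in a document are touched when scoring, so the cost per document stays small even when the vocabulary has tens of thousands of entries.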

Time Series Prediction and Dynamic Reconstruction of Chaotic Systems

Dynamic reconstruction is an inverse problem that deals with reconstructing the dynamics of an unknown system, given a noisy time-series representing the evolution of one variable of the system with time. The reconstruction proceeds by utilizing the time-series to build a predictive model of the system and, then, using iterated prediction to test what the model has learned from the training data on the dynamics of the system.
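
The two steps named above, building a one-step predictive model from the time series and then running it in iterated (closed-loop) mode, can be sketched as follows. A nearest-neighbour lookup stands in for the support vector regressor so the example needs no external library; the logistic map, the embedding dimension of 3, and all function names are assumptions for illustration.

```python
def delay_embed(series, dim):
    """Training pairs (last `dim` values -> next value) from a scalar series."""
    return [(tuple(series[i:i + dim]), series[i + dim]) for i in range(len(series) - dim)]

def nearest_neighbour_model(pairs):
    """Stand-in one-step predictor (not an SVM): target of the closest stored window."""
    def predict(window):
        def sq_dist(pair):
            return sum((a - b) ** 2 for a, b in zip(pair[0], window))
        return min(pairs, key=sq_dist)[1]
    return predict

def iterate_prediction(predict, seed, steps):
    """Feed each prediction back into the input window to roll the model forward."""
    window = list(seed)
    trajectory = []
    for _ in range(steps):
        nxt = predict(tuple(window))
        trajectory.append(nxt)
        window = window[1:] + [nxt]
    return trajectory

# Chaotic toy series: the logistic map x -> 4 x (1 - x).
series = [0.3]
for _ in range(49):
    series.append(4.0 * series[-1] * (1.0 - series[-1]))

pairs = delay_embed(series, dim=3)
model = nearest_neighbour_model(pairs)
rollout = iterate_prediction(model, series[:3], steps=10)
```

Seeded with windows taken from the training series, this lookup model simply replays the stored data; the informative test is to seed it with unseen initial conditions and compare the generated trajectory with the true dynamics, since errors compound through the feedback loop.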

Reference(s):

Nonlinear Prediction of Chaotic Time Series using a Support Vector Machine
S. Mukherjee, E. Osuna, and F. Girosi NNSP'97, 1997.

Using Support Vector Machines for Time Series Prediction
Müller, K.-R.; Smola, A.; Rätsch, G.; Schölkopf, B.; Kohlmorgen, J.; Vapnik, V.
in Advances in Kernel Methods, B. Schölkopf, C.J.C. Burges, and A.J. Smola Eds.
Pages 242-253, MIT Press, 1999. ISBN 0-262-19416-3.

Support Vector Machines for Dynamic Reconstruction of a Chaotic System
Davide Mattera and Simon Haykin
in Advances in Kernel Methods, B. Schölkopf, C.J.C. Burges, and A.J. Smola Eds.
Pages 211-241, MIT Press, 1999. ISBN 0-262-19416-3.

Reference link(s):

Müller et al

Mukherjee et al

Data link(s):

Santa Fe competition Data Set D

Entered by: Isabelle Guyon <isabelle@clopinet.com> - Thursday, September 16, 1999 at 14:54:32 (PDT)
Comments: Müller et al report excellent performance of SVM. They set a new record on the Santa Fe competition data set D, 37% better than the winning approach during the competition. Mattera et al report that SVM are effective for such tasks and that their main advantage is the possibility of trading off the required accuracy with the number of Support Vectors.

Support Vector Machine Classification of Microarray Gene Expression Data

We introduce a new method of functionally classifying genes using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). We describe SVMs that use different similarity metrics, including a simple dot product of gene expression vectors, polynomial versions of the dot product, and a radial basis function. The radial basis function SVM appears to provide superior performance in classifying functional classes of genes when compared to the other SVM similarity metrics. In addition, SVM performance is compared to four standard machine learning algorithms. SVMs have many features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers.
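
The three families of similarity metrics mentioned above can be written down in a few lines. This is a generic sketch: the +1 offset in the polynomial kernel, the `sigma` width, and the toy expression vectors are assumptions, not the settings reported by the authors.

```python
import math

def dot(x, y):
    """Simple dot product of two expression vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2):
    """Polynomial version of the dot product; the +1 offset is an assumption."""
    return (dot(x, y) + 1.0) ** degree

def rbf_kernel(x, y, sigma=1.0):
    """Radial basis function kernel with width sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(vectors, kernel):
    """Pairwise kernel values; this matrix is what an SVM trainer consumes."""
    return [[kernel(u, v) for v in vectors] for u in vectors]

# Toy expression profiles for three genes across three experiments.
genes = [
    [1.0, 0.0, -1.0],
    [0.5, 0.0, -0.5],
    [-1.0, 0.0, 1.0],
]
gram = gram_matrix(genes, rbf_kernel)
```

Genes 0 and 1 have the same expression shape at different amplitudes and therefore score as similar under all three metrics, while gene 2, the mirror image of gene 0, scores low.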

Reference(s):

Reference link(s):

Data link(s):

Entered by: Nello Cristianini <nello.cristianini@bristol.ac.uk> - Friday, September 10, 1999 at 03:11:46 (PDT)
Comments: SVMs outperformed all other classifiers when provided with a specifically designed kernel to deal with very imbalanced data.

Handwritten digit recognition problems

Support vector classifiers were applied to the recognition of isolated, optically scanned handwritten digits. This is a subtask of automatic zipcode reading and courtesy amount recognition on checks.

Reference(s):

A training algorithm for optimal margin classifiers.

B. Boser

I. Guyon

V. Vapnik

Writer adaptation for on-line handwritten character recognition.

N. Matic

Support Vector Networks.

C. Cortes

V. Vapnik

Learning algorithms for classification: A comparison on handwritten digit recognition.

Discovering informative patterns and data cleaning.

Incorporating Invariances in Support Vector Learning Machines.

B. Schölkopf

C. Burges

Prior Knowledge in Support Vector Kernels

Pairwise Classification and Support Vector Machines

Ulrich H.-G. Kressel

The kernel adatron algorithm: a fast and simple learning procedure for support vector machines.

T.-T. Friess

Reference link(s):

Data link(s):

http://www.clopinet.com/isabelle/Projects/LITTLE1200/

http://www.research.att.com/~yann/ocr/mnist/

Entered by: Isabelle Guyon <isabelle@clopinet.com> - Thursday, September 09, 1999 at 16:27:38 (PDT)
Comments: This is one of the first applications of SVCs. It was demonstrated that SVCs could be applied directly to pixel maps and nearly match or outperform other techniques requiring elaborate pre-processing, architecture design (structured neural networks), and/or a metric incorporating prior knowledge of the task (tangent distance); see e.g. Lecun-95. Elaborate metrics such as tangent distance can be used in combination with SVCs (Schölkopf-96-97) and yield improved performance. SVCs are also attractive for handwriting recognition tasks because they lend themselves to easy writer adaptation and data cleaning by making use of the support vectors (Matic-93 and Guyon-96). In Friess-98, the kernel Adatron SVM slightly outperforms the original SVM on the USPS character recognition benchmark.

Breast cancer diagnosis and prognosis

Support vector machines have been applied to breast cancer diagnosis and prognosis. The Wisconsin breast cancer dataset contains 699 patterns with 10 attributes for a binary classification task (the tumor is malignant or benign).

Reference(s):

P. S. Bradley, O. L. Mangasarian and W. Nick Street: "Feature selection via mathematical programming", INFORMS Journal on Computing 10, 1998, 209-217.

P. S. Bradley, O. L. Mangasarian and W. Nick Street: "Clustering via concave minimization", in "Advances in Neural Information Processing Systems 9" (NIPS*96), M. C. Mozer, M. I. Jordan and T. Petsche, editors, MIT Press, Cambridge, MA, 1997, 368-374.

T.-T. Friess; N. Cristianini; C. Campbell.
The kernel adatron algorithm: a fast and simple learning procedure for support vector machines.
15th Intl. Conf. Machine Learning, Morgan Kaufmann Publishers.
1998.

Reference link(s):

Data link(s):

WDBC: Wisconsin Diagnostic Breast Cancer Database

BC: Wisconsin Prognostic Breast Cancer Database

Entered by: Prof. Olvi L. Mangasarian <olvi@cs.wisc.ed> - Thursday, September 09, 1999 at 15:25:50 (PDT)
Modified by: Isabelle Guyon <isabelle@clopinet.com> - Monday, September 20, 1999 at 9:30 (PDT)
Comments: Mangasarian et al use an underlying linear programming formulation that can be interpreted as an SVM. Their system (XCYT) is a highly accurate non-invasive breast cancer diagnostic program currently in use at University of Wisconsin Hospital. Friess et al report that the Wisconsin breast cancer dataset has been extensively studied. Their system, which uses Adatron SVMs, has a 99.48% success rate, compared to 94.2% (CART), 95.9% (RBF), 96% (linear discriminant), and 96.6% (backpropagation network), all results reported elsewhere in the literature.

Support Vector Decision Tree Methods for Database Marketing

We introduce a support vector decision tree method for customer targeting in the framework of large databases (database marketing). The goal is to provide a tool to identify the best customers based on historical data (model development). This tool is then used to forecast the best potential customers among a pool of prospects through a process of scoring. We begin by recursively constructing a decision tree. Each decision consists of a linear combination of the independent attributes. A linear program motivated by the support vector machine method from Vapnik's Statistical Learning Theory is used to construct each decision. A gains chart table is used to verify the goodness of fit of the targeting, the likely prospects, and the expected utility of profit. Successful results are given for three industrial problems. The linear program automatically performs dimensionality reduction. The method consistently produced trees with a very small number of decision nodes. Each decision consisted of a relatively small number of attributes. The trees produced a clear division of the population into likely prospects, unlikely prospects, and ambiguous prospects. The largest training dataset tested contained 15,700 points with 866 attributes. The commercial optimization package used, CPLEX, is capable of solving even larger problems.
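The scoring and gains chart step described above can be sketched as follows. This is a generic cumulative gains table under stated assumptions, not the authors' implementation: the function name, the equal-sized binning, and the toy scores and response flags are all hypothetical.

```python
def gains_chart(scores, responded, n_bins=10):
    """Rank prospects by descending score and return, for each bin, a pair
    (fraction of prospects contacted, fraction of all responders captured)."""
    if len(scores) != len(responded):
        raise ValueError("scores and responded must have the same length")
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    total_responders = sum(flag for _, flag in ranked)
    if total_responders == 0:
        raise ValueError("at least one responder is needed to build the chart")

    n = len(ranked)
    rows = []
    captured = 0
    start = 0
    for b in range(1, n_bins + 1):
        end = (b * n) // n_bins  # integer bin boundary in the ranked list
        captured += sum(flag for _, flag in ranked[start:end])
        rows.append((end / n, captured / total_responders))
        start = end
    return rows


# Hypothetical model scores and observed responses (1 = responded).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responded = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

for contacted, gain in gains_chart(scores, responded, n_bins=5):
    print(f"top {contacted:.0%} of prospects -> {gain:.0%} of responders")
```

A model that targets well captures most responders in the first few bins; a chart that rises no faster than the fraction contacted indicates scoring no better than random selection.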

Reference(s):

Reference link(s):

Data link(s):

Entered by: Kristin P Bennett <bennek@rpi.edu> - Thursday, September 09, 1999 at 15:16:07 (PDT)
Comments: The support vector decision tree performed better than C4.5. SVDT produced very simple trees using few attributes.
