# Support-Vector Networks

Corinna Cortes & Vladimir Vapnik. Machine Learning, vol. 20, no. 3 (1995): 273-297

Abstract

The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine.

Introduction

- Fisher (Fisher, 1936) suggested the first algorithm for pattern recognition
- He considered a model of two normally distributed populations, N(m₁, Σ₁) and N(m₂, Σ₂), of n-dimensional vectors x with mean vectors m₁ and m₂ and covariance matrices Σ₁ and Σ₂, and showed that the optimal (Bayesian) solution is a quadratic decision function. In the case where Σ₁ = Σ₂ = Σ this quadratic decision function degenerates to a linear function. To estimate the quadratic decision function one has to determine n(n+3)/2 free parameters; both decision functions are sketched below.
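
For reference, the two decision functions can be sketched as follows. This is a reconstruction from the standard Gaussian (Bayesian) discriminant result the bullet describes, not a verbatim copy of the paper's equations (1) and (2), whose sign conventions may differ.

```latex
% Quadratic (Bayesian) decision function for N(m_1, \Sigma_1) vs. N(m_2, \Sigma_2):
F_{\mathrm{sq}}(x) = \operatorname{sign}\!\left[
  (x - m_2)^{\top}\Sigma_2^{-1}(x - m_2)
  - (x - m_1)^{\top}\Sigma_1^{-1}(x - m_1)
  + \ln\frac{|\Sigma_2|}{|\Sigma_1|} \right]

% With equal covariance matrices \Sigma_1 = \Sigma_2 = \Sigma it degenerates
% to a linear function:
F_{\mathrm{lin}}(x) = \operatorname{sign}\!\left[
  (m_1 - m_2)^{\top}\Sigma^{-1} x
  - \tfrac{1}{2}\left( m_1^{\top}\Sigma^{-1} m_1 - m_2^{\top}\Sigma^{-1} m_2 \right) \right]

% Counting the n(n+1)/2 quadratic and n linear coefficients gives the
% n(n+3)/2 free parameters mentioned above.
```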

Highlights

- According to the properties of the soft margin classifier method, the vector w can be written as a linear combination of support vectors
- The convolution of the dot-product in feature space can be given by any function satisfying Mercer's condition; in particular, to construct a polynomial classifier of degree d in n-dimensional input space one can use the polynomial function sketched in the code after this list
- This paper introduces the support-vector network as a new learning machine for two-group classification problems
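
As an illustration of the first two highlights, here is a minimal Python sketch. It is not the paper's training procedure: the support vectors, multipliers, and names such as `poly_kernel` and `decision` are made up for the example, whereas in the paper these quantities come from solving a quadratic programming problem.

```python
import numpy as np

def poly_kernel(u, v, degree=3):
    """Polynomial convolution of the dot product: K(u, v) = (u . v + 1)^degree."""
    return (np.dot(u, v) + 1.0) ** degree

def decision(x, support_vectors, labels, alphas, bias, degree=3):
    """Two-group decision rule written as an expansion over support vectors:
    f(x) = sign( sum_i alpha_i * y_i * K(s_i, x) + b ),
    so the weight vector w never has to be formed explicitly in feature space."""
    total = sum(a * y * poly_kernel(sv, x, degree)
                for sv, y, a in zip(support_vectors, labels, alphas))
    return np.sign(total + bias)

# Toy, made-up support vectors, labels, and multipliers (illustrative only).
support_vectors = [np.array([1.0, 1.0]), np.array([-1.0, -1.0])]
labels = [+1.0, -1.0]
alphas = [0.4, 0.4]
bias = 0.0

print(decision(np.array([0.5, 2.0]), support_vectors, labels, alphas, bias))   # +1.0
print(decision(np.array([-1.5, -0.5]), support_vectors, labels, alphas, bias)) # -1.0
```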

Methods

**The Method of Convolution of the Dot-Product in Feature Space**

- The algorithms described in the previous sections construct hyperplanes in the input space.
- The convolution of the dot-product in feature space can be given by any function satisfying Mercer's condition; in particular, to construct a polynomial classifier of degree d in n-dimensional input space one can use the function sketched below.
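
The polynomial convolution referred to above is, in its standard form (given here as a sketch rather than a verbatim quote of the paper's equation):

```latex
% Polynomial convolution of the dot product for a degree-d classifier
% in n-dimensional input space:
K(u, v) = \left( u \cdot v + 1 \right)^{d}
```

Replacing every dot product in the hyperplane algorithms by K corresponds to constructing the hyperplane in the very high-dimensional space of monomials of degree up to d, without ever computing that space explicitly.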

Results

- The 7th degree polynomial has only 30% more support vectors than the 3rd degree polynomial, and even fewer than the 1st degree polynomial.

Conclusion

- This paper introduces the support-vector network as a new learning machine for two-group classification problems.
- The support-vector network combines three ideas: the solution technique from optimal hyperplanes, the idea of convolution of the dot-product, and the notion of soft margins.
- The algorithm has been tested and compared to the performance of other classical algorithms.
- Despite the simplicity of the design of its decision surface, the new algorithm exhibits very fine performance in the comparison study.
- Other characteristics, like capacity control and ease of changing the implemented decision surface, render the support-vector network an extremely powerful and universal learning machine.
- In Appendix A (Constructing Separating Hyperplanes) the authors derive both the method for constructing optimal hyperplanes and the method for constructing soft margin hyperplanes; a sketch follows this list.
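
As a sketch of what that appendix derives, written in the now-standard form with a linear penalty on the slack variables (the paper's own soft-margin functional allows a more general monotone convex penalty), the soft-margin hyperplane and its dual quadratic program are:

```latex
% Soft-margin optimal hyperplane over training pairs (x_i, y_i), y_i \in \{-1, +1\}:
\min_{w,\, b,\, \xi}\; \tfrac{1}{2}\|w\|^{2} + C\sum_{i=1}^{\ell}\xi_i
\quad\text{s.t.}\quad y_i\,(w \cdot x_i + b) \ge 1 - \xi_i,\;\; \xi_i \ge 0.

% Dual quadratic program; feature-space dot products appear only through
% a convolution K(x_i, x_j), e.g. the polynomial kernel:
\max_{\alpha}\; \sum_{i=1}^{\ell}\alpha_i
  - \tfrac{1}{2}\sum_{i,j=1}^{\ell}\alpha_i\,\alpha_j\,y_i\,y_j\,K(x_i, x_j)
\quad\text{s.t.}\quad 0 \le \alpha_i \le C,\;\; \sum_{i=1}^{\ell}\alpha_i\, y_i = 0.

% The solution expands the decision function over the support vectors
% (the x_i with \alpha_i > 0), as in the Highlights above.
```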

- Table 1: Performance of various classifiers collected from publications and the authors' own experiments. For references see text.
- Table 2: Results obtained for dot products of polynomials of various degrees. The number of "support vectors" is a mean value per classifier.
- Table 3: Results obtained for a 4th degree polynomial classifier on the NIST database. The size of the training set is 60,000 patterns, and the size of the test set is 10,000 patterns.

References

- Aizerman, M., Braverman, E., & Rozonoer, L. (1964). Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control, 25:821-837.
- Anderson, T.W., & Bahadur, R.R. (1966). Classification into two multivariate normal distributions with different covariance matrices. Ann. Math. Stat., 33:420-431.
- Boser, B.E., Guyon, I., & Vapnik, V.N. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 5, 144-152, Pittsburgh, ACM.
- Bottou, L., Cortes, C., Denker, J.S., Drucker, H., Guyon, I., Jackel, L.D., LeCun, Y., Sackinger, E., Simard, P., Vapnik, V., & Müller, U.A. (1994). Comparison of classifier methods: A case study in handwritten digit recognition. Proceedings of the 12th International Conference on Pattern Recognition and Neural Networks.
- Bromley, J., & Sackinger, E. (1991). Neural-network and k-nearest-neighbor classifiers. Technical Report 11359910819-16TM, AT&T.
- Courant, R., & Hilbert, D. (1953). Methods of Mathematical Physics. Interscience, New York.
- Fisher, R.A. (1936). The use of multiple measurements in taxonomic problems. Ann. Eugenics, 7:111-132.
- LeCun, Y. (1985). Une procedure d'apprentissage pour reseau a seuil assymetrique. Cognitiva 85: A la Frontiere de l'Intelligence Artificielle, des Sciences de la Connaissance, des Neurosciences, 599-604, Paris.
- LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., & Jackel, L.D. (1990). Handwritten digit recognition with a back-propagation network. Advances in Neural Information Processing Systems, 2, 396-404, Morgan Kaufman.
- Parker, D.B. (1985). Learning logic. Technical Report TR-47, Center for Computational Research in Economics and Management Science, Massachusetts Institute of Technology, Cambridge, MA.
- Rosenblatt, F. (1962). Principles of Neurodynamics. Spartan Books, New York.
- Rumelhart, D.E., Hinton, G.E., & Williams, R.J. (1986). Learning representations by back-propagating errors. Nature, 323:533-536.
- Rumelhart, D.E., Hinton, G.E., & Williams, R.J. (1987). Learning internal representations by error propagation. In James L. McClelland & David E. Rumelhart (Eds.), Parallel Distributed Processing, 1, 318-362, MIT Press.
- Vapnik, V.N. (1982). Estimation of Dependences Based on Empirical Data, Addendum 1. New York: Springer-Verlag.
