What is the support vector machine?

What IS the support vector machine? Can someone clarify my confusion?

Possible answers:

  1. The SVM is the problem: given data (xn,yn),n=1,,N

\text{ subject to: } y_n(w \cdot x_n + b) \geq 1, n=1,…,N

See Solving Support Vector Machine
with Many Examples

Paweł Białoń, “…Various methods of dealing with linear support vector machine (SVM) problems with a large number of examples
are presented and compared…”

See http://www.cs.tau.ac.il/~mansour/ml-course-10/scribe9.pdf “We begin with building the intuition behind SVMs, continue to define
SVM as an optimization problem

  1. SVM is the algorithm that solves the problem,

    given data (x_n, y_n), n = 1, \ldots, N

\min_{w, b}\frac{1}{2}||w||^2
\text{ subject to: } y_n(w \cdot x_n + b) \geq 1, n=1,…,N

“SVMs are among the best (and many believe are indeed the best)
“off-the-shelf” supervised learning algorithms.” – Andrew Ng

See reference, “SVM is a supervised learning algorithm”

But this means that QP solvers are SVMs…

  1. SVM is the solution to the following problem:

    given data (x_n, y_n), n = 1, \ldots, N

\min_{w, b}\frac{1}{2}||w||^2
\text{ subject to: } y_n(w \cdot x_n + b) \geq 1, n=1,…,N

See, https://cel.archives-ouvertes.fr/cel-01003007/file/Lecture1_Linear_SVM_Primal.pdf (Page 22) “SVM is the solution to the problem….” (but then the author immediately contradicts himself by calling the SVM a quadratic programming problem – so is it the solution or the problem??)

See, reference “SVM is a discriminative classifier”


At the outset, I think it is worth noting that there is a lot of variation in terminology, not only in the referent of the term “support vector machine”, but also in the more general term “machine”. Usually the term “machine” is used to refer to the algorithm that performs a particular optimisation. However, as you can see from the various quotes in your question, it is often used more loosely to refer either to the optimisation problem, the solution, or the algorithm that computes that solution (or even the combination of all three of these elements). This is somewhat natural, since all three of these things are aspects of the same statistical prediction problem, and it is simple to differentiate these aspects of the problem directly by description.

Bearing that caveat in mind, in deciding the proper referent of the term “support vector machine”, it is worth tracing the term back through the initial literature. A good starting point for this is Moguerza and Munoz (2006), which gives an overview of support vector machines and their applications. According to this source, the term “support vector” was first used in Cortez and Vapnik (1995), where the authors referred to the “support vector network” as a “learning machine”. On p. 275 of the latter paper, the authors explain that the “support vectors” refer to a small number of vectors in the training data that determine the boundaries of the optimal hyperplane in the method. The description of the “support vector network” (a type of “learning machine”) is given on p. 276, and this clearly refers to a method —i.e., an algorithm— for computing the optimal hyperplane. Moreover, the substantive contribution of the paper is clearly related to the construction of an algorithm to solve a long-standing optimisation problem.

This terminology is consistent with the more general practice of using the term “machine” (or “learning machine”) to refer to the algorithm that solves a particular learning/optimisation problem. This goes all the way back to Turing machines and other types of conceptual computing machines, which have always been regarded as tools for computing outputs of a problem. Later literature on “support vector machines” appears to have adopted a shortened version of the nomenclature of Cortez and Vapnik (1995), where the terminology of a “support vector network” that is a “learning machine” is concatenated simply to “support vector machine”. This historical evolution of the terminology, and the substantive analysis done in that paper, suggest that the “support vector machine” refers to the algorithm that performs the optimisation. Of course, as you have observed, there are some authors who use this terminology loosely, either to refer to the optimisation problem itself, its solution point, or even the whole triumvirate of problem/solution/algorithm.

Source : Link , Question Author : Fraïssé , Answer Author : Ben

Leave a Comment