| dbp:mathStatement
|
- Given a dataset $D$, such that $\max_{(x, y) \in D} \|x\| = R$, and it is linearly separable by some unit vector $w^*$, with margin $\gamma := \min_{(x, y) \in D} y \, (w^* \cdot x) > 0$:
Then the perceptron 0-1 learning algorithm converges after making at most $(R/\gamma)^2$ mistakes, for any learning rate, and any method of sampling from the dataset (a runnable sketch follows this list).
- If the dataset has only finitely many points, then there exists an upper bound number $M$, such that for any starting weight vector $w_0$, every weight vector $w_t$ produced by the algorithm has norm bounded by $\|w_t\| \leq \|w_0\| + M$.
|
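As a concrete illustration of the convergence statement, the following is a minimal Python sketch of the perceptron 0-1 learning algorithm on a toy linearly separable dataset. The dataset, the separator `w_star`, the margin threshold, and the learning rate `r` are all illustrative assumptions, not taken from the source; the point is only that the observed mistake count stays below $(R/\gamma)^2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset (assumed): labels come from a known unit separator w_star,
# with points too close to the decision boundary removed so gamma > 0.
w_star = np.array([0.6, 0.8])            # unit vector, ||w_star|| = 1
X = rng.uniform(-1.0, 1.0, size=(200, 2))
X = X[np.abs(X @ w_star) > 0.2]          # enforce a positive margin
y = np.sign(X @ w_star)

R = np.linalg.norm(X, axis=1).max()      # R = max ||x|| over the dataset
gamma = (y * (X @ w_star)).min()         # margin attained by w_star

r = 1.0                                  # learning rate; any r > 0 gives the same bound
w = np.zeros(2)
mistakes = 0
made_mistake = True
while made_mistake:                      # sweep until a clean pass over the data
    made_mistake = False
    for x_i, y_i in zip(X, y):
        if y_i * (w @ x_i) <= 0:         # mistake: apply the perceptron update
            w += r * y_i * x_i
            mistakes += 1
            made_mistake = True

print(f"mistakes = {mistakes}, bound (R/gamma)^2 = {(R / gamma) ** 2:.1f}")
```

On this toy data the loop terminates after a handful of updates, and the mistake bound is independent of both $r$ and the order in which points are visited, in line with the statement.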
| dbp:proof
|
- Suppose at step $t$, the perceptron with weight $w_t$ makes a mistake on data point $(x, y)$; then it updates to $w_{t+1} = w_t + r y x$.
If $y = -1$, the argument is symmetric, so we omit it.
WLOG, $y = 1$; then $w_t \cdot x < 0$, $y \, (w^* \cdot x) = w^* \cdot x$, and $w_{t+1} = w_t + r x$.
By assumption, we have separation with margin: $y \, (w^* \cdot x) \geq \gamma$ for all $(x, y) \in D$. Thus, $w^* \cdot w_{t+1} = w^* \cdot w_t + r \, (w^* \cdot x) \geq w^* \cdot w_t + r \gamma$.
- Also, $\|w_{t+1}\|^2 = \|w_t + r x\|^2 = \|w_t\|^2 + 2 r \, (w_t \cdot x) + r^2 \|x\|^2$, and since the perceptron made a mistake, $w_t \cdot x < 0$, and so $\|w_{t+1}\|^2 \leq \|w_t\|^2 + r^2 \|x\|^2 \leq \|w_t\|^2 + r^2 R^2$.
- Since we started with $w_0 = 0$, after making $N$ mistakes we have $w^* \cdot w_N \geq N r \gamma$, but also $\|w_N\|^2 \leq N r^2 R^2$.
- Combining the two, $N r \gamma \leq w^* \cdot w_N \leq \|w^*\| \, \|w_N\| \leq \sqrt{N} \, r R$, and therefore $N \leq (R/\gamma)^2$ (a numeric instance follows this list).
|
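To make the final combining step concrete, here is a worked instance of the bound with assumed values $R = 1$ and $\gamma = 0.2$ (illustrative, not from the source); note that the learning rate $r$ cancels, matching "for any learning rate" in the statement.

```latex
% Worked instance of the final step; R = 1 and gamma = 0.2 are assumed values.
\[
  N r \gamma \;\le\; w^* \cdot w_N \;\le\; \|w^*\|\,\|w_N\| \;\le\; \sqrt{N}\, r R
  \quad\Longrightarrow\quad
  N \;\le\; \left(\frac{R}{\gamma}\right)^{2} = \left(\frac{1}{0.2}\right)^{2} = 25 .
\]
```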