# Simple classifiers from data dependent hypothesis classes

This **article** is part of the collection entitled Office of Scientific & Technical Information Technical Reports and was provided to the Digital Library by the UNT Libraries Government Documents Department.

#### Extracted Text

mization over traditional linear classifiers, LPP admits a low-order polynomial-time algorithm for empirical error minimization that is simple, numerically robust, fully parallelizable, and has no free parameters. In addition we provide bounds on estimation error for LPP that are independent of dimension and similar in form to those obtained for a traditional class with VC dimension equal to 3. We also provide similar bounds on the difference between training and generalization error, which is beneficial when it is important to have an accurate estimate of the actual performance of the classifier in the absence of a large test set. Because the estimation error bounds are independent of dimension, we are motivated to map to a higher-dimensional space to improve performance, and since LPP classifiers are linear, computations can still be performed in the original space by employing kernel mappings. This situation is similar to support vector machines, but here we optimize empirical error over a restricted class rather than margin over the unrestricted class. In summary, the simple class LPP, when coupled with empirical error minimization, successfully addresses nearly all the key issues in the classifier design problem.
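Since the paragraph notes that kernel mappings let a linear classifier be evaluated entirely in the original input space, here is a minimal sketch of that idea. The RBF kernel and the coefficients `alpha` and `b` are illustrative assumptions for exposition, not output of the paper's LPP training algorithm:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

# A linear classifier in the (implicit) feature space has the form
# f(x) = I(sum_i alpha_i k(x_i, x) + b > 0), so it can be evaluated
# using only kernel values computed on the original inputs.
def kernel_classifier(X_train, alpha, b, gamma=1.0):
    def f(X):
        return (rbf_kernel(X, X_train, gamma) @ alpha + b > 0).astype(int)
    return f

# Toy usage: alpha and b chosen by hand (illustrative placeholders).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
alpha = np.array([1.0, 1.0, -1.0, -1.0])
f = kernel_classifier(X, alpha, b=0.0)
print(f(X))  # [1 1 0 0]
```

The point mirrored in the text: the classifier is linear in feature space, yet no explicit high-dimensional coordinates are ever formed.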

The outstanding issue is approximation error, which we do not analyze theoretically but rather through simulations. Our results are consistent with the traditional framework in that they depend heavily on how well the class is matched to the process generating the data. We can address this issue in our framework by using elementary constructions to create additional simple classes. Cannon et al. (2003) construct alternative simple classes to demonstrate this observation. In this paper we treat only the LPP class, but we consider the use of kernels and boosting (Schapire et al., 1998) as means for reducing approximation error. Kernelizing LPP is motivated by the above discussion about estimation error bounds. Boosting is especially suitable because LPP admits computationally efficient algorithms, even for weighted empirical error minimization. Boosting calls for the production of a base classifier at each round that minimizes a weighted empirical error. For most nontrivial hypothesis classes, minimizing weighted empirical error is computationally intractable. Our empirical results demonstrate that even when LPP performs poorly on a particular problem instance, it may be kernelized and/or boosted to a performance that is comparable to powerful methods such as support vector machines and random forests.

## 2. Error Minimization over Data Dependent Hypothesis Spaces

Consider a set $X$, the binary set $Y = \{0, 1\}$ and a probability space $Z = X \times Y$. Let $z = (x, y)$ denote the corresponding random variable with probability measure $P$, conditional probability measures $P_y$, and $y$-marginal probability measure $\bar P$. Let $F$ denote a class of functions (classifiers) $f : X \to Y$ and let

$$e(f) = P(f(x) \neq y)$$

denote the generalization error of the classifier $f$. Let $e^* = \inf_{f \in F} e(f)$ denote the best error achievable in the class $F$. We further suppose that we collect $n$ independent identically distributed (i.i.d.) samples $(z(1), z(2), \ldots, z(n))$ from $P$ and use them to construct an empirical error function

$$\hat e(f) = \frac{1}{n} \sum_{i=1}^{n} I\big(f(x(i)) \neq y(i)\big).$$
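The empirical error above, and the weighted variant that the boosting discussion in Section 1 relies on, is straightforward to compute. A minimal sketch, assuming classifiers that act on numpy arrays (the threshold classifier is an illustrative stand-in, not LPP):

```python
import numpy as np

def empirical_error(f, X, y):
    """(1/n) * sum_i I(f(x_i) != y_i): the fraction of samples f misclassifies."""
    return float(np.mean(f(X) != y))

def weighted_empirical_error(f, X, y, w):
    """sum_i w_i * I(f(x_i) != y_i), with nonnegative weights summing to 1
    (the criterion a boosting base learner is asked to minimize)."""
    return float(np.sum(w * (f(X) != y)))

# Toy usage with a fixed 1-D threshold classifier (illustrative only).
X = np.array([0.2, 0.4, 0.6, 0.8])
y = np.array([0, 0, 1, 1])
f = lambda X: (X > 0.5).astype(int)
print(empirical_error(f, X, y))                              # 0.0: the threshold separates this sample
print(weighted_empirical_error(f, X, y, np.full(4, 0.25)))   # 0.0 under uniform weights
```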

The work of Vapnik and Chervonenkis (1974) justifies the time-honored learning strategy that chooses $f$ to minimize $\hat e$¹ by establishing the following probabilistic guarantee:

$$P^n\big(e(f) - e^* > \epsilon\big) \le 8\, n^{V(F)} e^{-n\epsilon^2/128} \qquad (1)$$

where $V(F)$ is the Vapnik–Chervonenkis dimension of the function class $F$.
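For intuition, the right-hand side of a bound of this form can be evaluated numerically. A sketch assuming VC dimension 3 (the effective complexity cited for LPP above) and the standard constant 128 in the exponent:

```python
import math

def vc_bound(n, eps, vc_dim):
    """Right-hand side of the VC estimation-error bound:
    8 * n**V(F) * exp(-n * eps**2 / 128)."""
    return 8 * n**vc_dim * math.exp(-n * eps**2 / 128)

# With V(F) = 3 the bound is vacuous (> 1) for moderate n and only
# becomes meaningful once n is quite large:
for n in (10**4, 10**5, 10**6):
    print(f"n={n:>7}: bound = {vc_bound(n, eps=0.1, vc_dim=3):.3g}")
```

The polynomial factor $n^{V(F)}$ grows slowly compared with the exponential decay in $n\epsilon^2$, which is why the guarantee eventually bites for any fixed $\epsilon$.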

This result is only applicable when the hypothesis class is chosen before data is observed. Cannon et al. (2002) showed that one could allow the hypothesis class to depend on the data and still obtain a bound on estimation error similar to (1). Following Cannon et al. (2002) we outline this result now. Let $z_n$ denote the $n$-sample consisting of individual samples $z_n(i)$, $i = 1, \ldots, n$. Given an $n$-sample $z_n$, we consider functions from a hypothesis space $F_{z_n}$, which can depend on the $n$-sample and so is defined by a class $F_n$ of functions on $(Z^n, Z)$. We define a data-dependent class $\mathcal{F} = \{F_n\}$ to be a collection of such classes $F_n$. Next we define shatter coefficients for data-dependent function classes.

**Definition 2.1.** Let $\Delta_n(z_{2n}, \mathcal{F})$ be the number of distinct dichotomies of $2n$ points $z_{2n}$ generated by the function classes $F_{z_n}$, where $z_n \subset z_{2n}$ varies over all size-$n$ subsets of $z_{2n}$. That is, let $I_f = \{z : f(z) = 1\}$ denote the indicator set where the function is equal to

---
¹ In this paper we ignore questions of whether minima or maxima are actually attained. This detail is easy to include by introducing approximation parameters and approximate minima/maxima, but obscures the presentation.
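Definition 2.1, which is cut off at this page boundary, is built on counting distinct dichotomies. As a warm-up, the count is easy to compute for an ordinary, data-independent class; a sketch using 1-D threshold classifiers (an illustrative stand-in, not the LPP class):

```python
def dichotomies(points, classifiers):
    """Distinct labelings (dichotomies) the classifiers induce on the points."""
    return {tuple(int(f(x)) for x in points) for f in classifiers}

# 1-D threshold classifiers f_t(x) = I(x > t) on three points: sweeping t
# produces only the "suffix" labelings, so at most n+1 dichotomies on n points.
pts = [0.0, 1.0, 2.0]
fs = [lambda x, t=t: x > t for t in (-0.5, 0.5, 1.5, 2.5)]
print(sorted(dichotomies(pts, fs)))  # [(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)]
```

The data-dependent definition above generalizes this count by additionally letting the class itself vary with the size-$n$ subset of the points.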


## Citing and Sharing


### Reference the current page of this Article.

Cannon, A. (Adam); Howse, J. W. (James W.); Hush, D. R. (Donald R.) & Scovel, James C. Simple classifiers from data dependent hypothesis classes, article, January 1, 2003; United States. (https://digital.library.unt.edu/ark:/67531/metadc932925/m1/3/: accessed April 21, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.