Features in Deep Learning Architectures with Unsupervised Kernel k-Means Page: 3 of 6
This article is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to Digital Library by the UNT Libraries Government Documents Department.
Learning Features in Deep Architectures with
Unsupervised Kernel k-Means
Karl Ni, Ryan Prenger
Video Laboratory Directed Research and Development
Lawrence Livermore National Laboratory
Abstract: Deep learning technology and related algorithms
have dramatically broken landmark records for a broad range of
learning problems in vision, speech, audio, and text processing.
Meanwhile, kernel methods have found commonplace usage due
to their nonlinear expressive power and elegant optimization
formulation. Based on recent progress in learning high-level,
class-specific features in unlabeled data, we improve upon the
result by combining nonlinear kernels and multi-layer (deep)
architecture, which we apply at scale. In particular, our
experimentation is based on k-means with an RBF kernel, though it
is a straightforward extension to other unsupervised clustering
techniques and other reproducing kernel Hilbert spaces. With the
proposed method, we discover features distilled from unorganized
images. We augment high-level feature invariance by pooling
Tasks in computer vision, audio and multimedia research,
and natural language and text processing require features that
saliently describe a semantic concept. One approach is to
directly learn low-level features to hierarchically construct
classifiers with Haar wavelets and deformable parts, but
generalization has suffered and labels are costly to produce.
Meanwhile, theoretical features, such as the oft-cited SIFT
and GIST in vision and standard MFCCs for speech,
speaker recognition, and diarization, have been shown to be
somewhat successful, though they are often not as salient for
recognition tasks as more class-specific high-level features.
Instead, we have seen a recent push to learn high-level features
in a completely unsupervised fashion, given large enough data
sets, with deep learning architectures.
The success of high-level features in deep learning
architectures has been demonstrated in audio
and vision, working especially well at scale
with sparse autoencoders. While previous computer vision
techniques focused on labeled training data sets, features
learned with deep learning methods have shown potential for
building class specificity in an unsupervised setting. Meanwhile,
the expressive power of nonlinear kernels, particularly in kernel
machines like SVMs, has been used regularly with much success.
In addition to constructing high-dimensional discrimination
boundaries, kernels can easily be used to generatively
approximate densities (kernel density estimates), parameterize
nonlinear regression algorithms (kernel ridge regression and
support vector regression), perform dimensionality reduction (KPCA),
and, in general, construct spaces that offer more flexibility.
This work was performed under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract DE-
The proposed algorithm seeks to unify the expressiveness of
nonlinear kernels with the learning ability of deep networks.
Specifically, we explore an architecture that can best be
described as deep kernel k-means. The nomenclature deliberately
denotes the methodology: iteratively determine multiple
layers of centroids using clustering, where comparisons are
made based on a similarity function. In our case, the similarity
function is the radial basis function (RBF) kernel, and the
clustering algorithm is k-means.
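As a rough illustration of a single layer of this idea, the sketch below runs kernel k-means on a precomputed RBF kernel matrix. It is a minimal example under our own assumptions and not the authors' implementation: the function names `rbf_kernel` and `kernel_kmeans`, the bandwidth parameter `gamma`, the optional `init` argument, and the toy data are all hypothetical. It relies on the standard identity that the squared feature-space distance from a point to a cluster mean can be written using kernel evaluations only.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """RBF kernel matrix with K[i, j] = exp(-gamma * ||X[i] - Y[j]||^2)."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negative round-off

def kernel_kmeans(K, k, init=None, n_iter=50, seed=0):
    """Cluster n points given only their n x n kernel matrix K.

    `init` (hypothetical convenience argument) is an optional starting
    label vector; otherwise labels are drawn at random.
    """
    n = K.shape[0]
    if init is None:
        labels = np.random.default_rng(seed).integers(0, k, size=n)
    else:
        labels = np.asarray(init).copy()
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue  # an empty cluster stays at infinite distance
            # ||phi(x_i) - mu_c||^2
            #   = K_ii - (2/m) sum_j K_ij + (1/m^2) sum_{j,l} K_jl,  j, l in cluster c
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new_labels = dist.argmin(1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels

# Toy data (assumed for illustration): two well-separated groups of three points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
K = rbf_kernel(X, X, gamma=0.5)
labels = kernel_kmeans(K, k=2, init=[0, 0, 1, 0, 1, 1])
```

Because the cluster means are never formed explicitly, the same loop works for any positive semidefinite kernel matrix; only `rbf_kernel` would need to change.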
Our choice of the k-means algorithm, and by extension,
kernel k-means, was primarily due to its simple
implementation, but the generalized deep learning paradigm extends
to any unsupervised clustering method. Moreover, although
recent successful deep architectures have been implemented with
sparse autoencoders, it has been shown that, at least on a
per-layer basis, certain variants of k-means, i.e., soft k-means or
"triangle" k-means, are at least commensurate in quality and
often outperform sparse autoencoders and sparse RBMs. The
trend has, in fact, inspired some deep implementations with
variations on a modified k-means optimization algorithm.
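For readers unfamiliar with the "triangle" variant, the sketch below shows the encoding as it is commonly formulated in the single-layer feature learning literature: each feature is max(0, mean(z) - z_k), where z_k is the Euclidean distance from the input to centroid k. This is our reading of that formulation, not code from this paper; the function name `triangle_encode` and the toy inputs are hypothetical.

```python
import numpy as np

def triangle_encode(X, centroids):
    """Soft ("triangle") k-means features for each row of X.

    z[i, k] is the Euclidean distance from sample i to centroid k.
    A feature is active only when its centroid is closer than the
    sample's average distance to all centroids.
    """
    z = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    mu = z.mean(axis=1, keepdims=True)  # per-sample mean distance
    return np.maximum(0.0, mu - z)

# Toy example (assumed): one sample sitting exactly on the first centroid.
X_demo = np.array([[0.0, 0.0]])
centroids_demo = np.array([[0.0, 0.0], [3.0, 4.0]])
features = triangle_encode(X_demo, centroids_demo)
```

Unlike a hard one-of-k assignment, this encoding is sparse but keeps graded activations for every centroid that is closer than average.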
The remainder of this paper is devoted to explaining the
deep kernel k-means algorithm and its application to learning
high-level features. Discussion of the algorithm in the next
section includes the construction and application of kernel
space, description of the deep architecture, and methods for
data augmentation and aggregation. Because the proposed
algorithm is derived from several related works, we also explore
their relationship to our technique, and compare recognition
Deep kernel k-means consists of (1) alternating processes
of kernel construction and kernel application, (2) layering with
attention to patch size, and (3) a good set of training dimensions.
We discuss the three aspects of this problem presently:
Ni, K S & Prenger, R J. Features in Deep Learning Architectures with Unsupervised Kernel k-Means, article, September 26, 2013; Livermore, California. (digital.library.unt.edu/ark:/67531/metadc866741/m1/3/: accessed May 24, 2018), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT Libraries Government Documents Department.