CROification: Practical Kernel Classification
Kernel methods have been shown to be effective for many machine learning tasks, such as classification, clustering, and regression. The standard way to apply kernel methods is the ‘kernel trick’, where the inner product of vectors in the feature space is computed via the kernel function, without constructing the feature vectors explicitly. Using the kernel trick for support vector machines, however, leads to training time that is quadratic in the number of input vectors and classification time that is linear in the number of support vectors.
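To make the cost argument concrete, here is a minimal sketch of the kernel trick with an RBF kernel. The function names `rbf_kernel` and `gram_matrix` and the parameter `gamma` are illustrative choices, not taken from the talk. The point is only that the Gram matrix has one entry per pair of input vectors, so it grows quadratically with the number of inputs.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Kernel value for one pair: exp(-gamma * ||x - y||^2).
    diff = x - y
    return np.exp(-gamma * np.dot(diff, diff))

def gram_matrix(X, gamma=1.0):
    # All pairwise kernel values at once; the result has n * n entries.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # 5 toy input vectors in 3 dimensions
K = gram_matrix(X)            # 5 x 5 matrix of kernel values
```

With n input vectors, `K` holds n² entries, which is where the quadratic training cost of a kernelized SVM comes from. Classifying a new point likewise needs one kernel evaluation per support vector.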
We introduce a new kernel, the CRO (Concomitant Rank Order) kernel, which approximates the RBF kernel for unit-length input vectors. We also introduce a new randomized feature map, based on concomitant rank order hashing, that produces binary, sparse, high-dimensional feature vectors whose inner product asymptotically equals the CRO kernel. For unit-length input vectors, this gives the accuracy of the RBF kernel with the efficiency of a sparse, high-dimensional linear kernel.
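The abstract does not spell out the feature map, so the following is only a simplified stand-in for the general idea, not the authors' construction. It assumes a dense Gaussian random projection followed by keeping the indices of the largest projected values. The name `cro_like_features` and the parameters `U` (output dimension), `tau` (number of nonzeros), and `seed` are hypothetical. It shows how a randomized map can turn unit-length vectors into sparse binary vectors whose inner product (the number of shared indices) grows with the similarity of the inputs.

```python
import numpy as np

def cro_like_features(X, U=4096, tau=64, seed=0):
    # Hypothetical sketch: sparse binary features from rank order.
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Normalize rows to unit length, matching the abstract's setting.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Assumption: a dense Gaussian projection into U dimensions.
    P = rng.normal(size=(X.shape[1], U))
    Z = X @ P
    # Indices of the tau largest projected values in each row.
    top = np.argpartition(-Z, tau - 1, axis=1)[:, :tau]
    # Binary, sparse, high-dimensional output: tau ones out of U.
    F = np.zeros((X.shape[0], U), dtype=np.uint8)
    np.put_along_axis(F, top, 1, axis=1)
    return F

rng = np.random.default_rng(1)
a = rng.normal(size=16)
near = a + 0.1 * rng.normal(size=16)   # small perturbation of a
far = rng.normal(size=16)              # unrelated vector
F = cro_like_features(np.stack([a, near, far]))
# Inner products count shared nonzero positions (cast avoids uint8 overflow).
G = F.astype(np.int64) @ F.astype(np.int64).T
```

Here `G[0, 1]` counts the positions shared by `a` and `near`, and `G[0, 2]` those shared by `a` and `far`. Because each feature vector has only `tau` nonzeros, a linear classifier trained on `F` can use sparse linear algebra instead of evaluating a kernel against every support vector.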
Mehran Kafai is a senior research scientist at Hewlett Packard Labs, where he leads the Commercial Analytics team in exploring novel algorithms and alternative system designs for scalable analytics. His recent research has focused on Boolean expression matching, large-scale search and information retrieval, dense vector analytics, and real-time analytics on high-dimensional data.
He received the M.Sc. degree in computer science from San Francisco State University in 2009, and the Ph.D. degree in computer science from the Center for Research in Intelligent Systems (CRIS), University of California, Riverside, in 2013.