Improved Nystr÷m Low-rank Approximation for

Scalable Manifold Learning

Manifold learning (or dimension reduction) algorithms rely heavily on the eigenvalue decomposition of the kernel matrix. The cubical complexity is a big hindrance to large scale problems. The Nystr÷m method [1,2] is a popular sampling-based approach for computing large scale kernel eigen-systems. However, the optimal sampling is still an open problem.

We find that the Nystr÷m low-rank approximation quality depends crucially on the encoding powers of the landmak points in representing the data [3]. Our error analysis suggests the use of k-means clustering centers as the landmark points. We call this Improved Nystr÷m low-rank approximation and apply it to several important manifold learning algorithms, i.e., kernel PCA, Laplacian Eigenmap, and Multidimensional Scaling. The matlab codes can be downloaded here. A brief summary is as follows.

             Table1: Improved Nystr÷m low-rank approximation for manifold learning and dimension reduction.

Algorithms Matlab function Specifications
Kernel PCA V = INys_KPCA(data, kernel, m ); V: top m eigenvectors/embeddings computed;
n-by-dim dta matrix;
kernel.type: 'pol' or 'rbf';
kernel.para: d in the polynomial kernel <x,y>^d;
b in the rbf kernel exp(-||x||^2/b);
number of landmark points (m << n).    
Laplacian Eigenmap    Spectral Clustering Normalized Cut V = INys_SpectrEmbed(data, kernel, m );
Multi-dimensional Scaling V = INys_MDS(data, m );

Results: all results are computed by using only m = 30 landmark points. Codes are in matlab (not fully optimized) and run on a core(TM)-dual PC with 2.13GHz CPU.

Data1: IJCNN1 (100000 x 22) Data2: MNIST digit1 (6742 x 784) Data3: Isomap face image (4096 x 698)
Kernel PCA  3.7sec Kernel PCA  3.8sec Kernel PCA  3.5sec
Laplacian Eigenmap  3.5sec Laplacian Eigenmap  4.1sec   Laplacian Eigenmap  4.3sec
Multidimensional Scaling  3.7sec Multidimensional Scaling  2.5sec Multidimensional Scaling  1.4sec


[1] C. Fowlkes, S. Belongie, F. Chung, and J. Malik. Spectral grouping using the Nystr÷m method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2):214ľ225, 2004.

[2] C. Williams, M, Seeger. Using the Nystr÷m method to speed up kernel machines. Advances in Neural Information Processing Systems 13 (pp. 682 ľ 688), 2001.

[3] K. Zhang, I. Tsang, J. Kwok. Improved Nystr÷m Low Rank Approximation and Error Analysis. In the 25th International Conference on Machine Learning (ICML 2008).