2011.3.28 Academic report : Manifold Elastic Net via Backward-Forward Least Angle Shrinkage
Presenter: Dacheng Tao（OPTIMAL）
Time: 9:30 a.m. March 28, 2011
Location: OPTIMAL meeting room, 3rd Floor, Building Three
Recently, the success of compressed sensing in signal processing and lasso type model selection in statistics inspires the development of diverse sparse learning methods, e.g., sparse representation for face recognition, multi-label prediction via compressed sensing and sparse PCA. However, it is always difficult to find the optimal solution of a manifold learning based sparse dimensionality reduction without any relaxation or approximation. This is because that it is an L2 norm constrained sparse quadratic optimization, while most existing compressed sensing algorithms are developed to solve L1 penalized least square regression.
In this talk, we will introduce a sparse dimension reduction method termed "manifold elastic net (MEN)", its optimization algorithms and real-world applications. MEN incorporates the merits of both the manifold learning based dimensionality reduction and the sparse learning based dimensionality reduction. By using a series of equivalent transformations, we show MEN has a similar optimization form of the original sparse PCA problem. One of MEN's variant is equivalent to the L1 penalized least square regression, which can be solved by existing compressed sensing algorithms.
In particular, MEN has the following advantages for subsequent classification after dimension reduction: 1) the local geometry of samples is well preserved for low dimensional data representation, 2) both the margin maximization and the classification error minimization are considered for sparse projection calculation, 3) the projection matrix of MEN improves the parsimony in computation, 4) the elastic net penalty reduces the over-fitting problem, and 5) the projection matrix of MEN can be interpreted psychologically and physiologically.
For solving the original MEN problem, we propose backward-forward least angle shrinkage (BF-LAS), which also provides a scheme to solve general constrained sparse quadratic optimization. BF-LAS starts from the dense solution, iteratively shrinks unimportant variables' magnitudes to zeros in the backward step for minimizing the L1 norm, decreases important variables' gradients in the forward step for optimizing the objective, and projects the solution on the feasible set defined by the constraints. The importance of a variable is asured by its correlation w.r.t the objective and is updated via least angle shrinkage (LAS). We prove the sign consistency of BF-LAS in two situations: 1) large n, small p and q; and 2) large n, p and q, wherein n is the number of observations, p is the number of variables and q is the cardinality of the solution.
Experimental evidence on face recognition over various popular datasets suggests that MEN is superior to top level dimension reduction algorithms, and brings novel interesting properties such as feature interpretability to dimension reduction.
Introduction on the presenter:
Dacheng Tao mainly applies statistics and mathematics for data analysis problems in data mining, computer vision, machine learning, multimedia, and video surveillance. He has authored and co-authored more than 100 scientific articles at top venues including IEEE T-PAMI,T-KDE, T-IP, NIPS, ICDM, AISTATS, AAAI, CVPR, ECCV; ACM T-KDD, and KDD.