On the Local Correctness of ℓ¹-Minimization for Dictionary Learning
Abstract
The idea that many important classes of signals can be well-represented by linear combinations of a small set of atoms selected from a given dictionary has had dramatic impact on the theory and practice of signal processing. For practical problems in which an appropriate sparsifying dictionary is not known ahead of time, a very popular an…
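The truncated abstract refers to the popular heuristic of searching for a dictionary under which the data admit sparse, ℓ¹-minimal representations. As a point of reference, here is a minimal sketch of generic alternating minimization for the penalized form min over (A, X) of ½‖Y − AX‖²_F + λ‖X‖₁ with unit-norm atoms; the penalized (rather than equality-constrained) objective, the function names, and the step-size choices are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def soft_threshold(Z, tau):
    """Entrywise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def dictionary_learning_l1(Y, n_atoms, lam=0.1, outer_iters=50, ista_iters=20, seed=0):
    """Alternating minimization for
        min_{A, X}  0.5 * ||Y - A X||_F^2 + lam * ||X||_1
    with unit-norm columns of A. A generic heuristic sketch of l1-based
    dictionary learning, not the paper's exact (equality-constrained) program.
    Y: (m, p) data matrix; returns A: (m, n_atoms), X: (n_atoms, p)."""
    rng = np.random.default_rng(seed)
    m, p = Y.shape
    A = rng.standard_normal((m, n_atoms))
    A /= np.linalg.norm(A, axis=0, keepdims=True)      # unit-norm atoms
    X = np.zeros((n_atoms, p))
    for _ in range(outer_iters):
        # Sparse coding: ISTA steps on X with the dictionary A held fixed.
        t = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)  # step <= 1 / Lipschitz
        for _ in range(ista_iters):
            X = soft_threshold(X - t * (A.T @ (A @ X - Y)), t * lam)
        # Dictionary update: one gradient step on A, then renormalize columns
        # (the norm constraint removes the scale ambiguity between A and X).
        s = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-12)
        A -= s * ((A @ X - Y) @ X.T)
        A /= np.maximum(np.linalg.norm(A, axis=0, keepdims=True), 1e-12)
    return A, X
```

On synthetic data Y = A0 @ X0 with sparse X0, one can check whether the learned A matches A0 up to column permutation and sign, which is the kind of recovery question the paper studies locally.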
Introduction
- Progress in signal processing over the past four decades has been driven by the quest for ever more effective signal representations.
- One competing train of thought, dating at least back to the advent of the Karhunen-Loève transform in the 1970s, suggests that rather than meticulously designing an appropriate representation for each class of signals the authors encounter, it may be possible to learn an appropriate representation from large sets of sample data.
- This idea is appealing for several reasons: given the recent proliferation of new and exotic types of data, it may not be possible to invest the intellectual effort required to develop optimal representations for each new class of signal the authors encounter.
- Moreover, an automatic procedure may discover useful structure in the data that is not readily apparent to human designers.
Highlights
- To a great extent, progress in signal processing over the past four decades has been driven by the quest for ever more effective signal representations.
- It is possible that the gap between the two orders of growth might be further closed with a more refined analysis of the construction proposed in this paper; while we find these results quite encouraging, there is still much to do.
- One natural question is whether the assumption of hard sparsity in X can be relaxed to a Bernoulli-Gaussian model, in which each coefficient is nonzero independently with probability ρ ≈ k/n (see the sampling sketch after this list).
- Care will need to be taken, because a small number of columns of X may be so dense as to no longer be optimal.
- We see no essential obstacle to extending the approach used here to deal with this case.
- More work will need to be done to ensure that the balancedness condition in Theorem 5.1 still holds.
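To make the Bernoulli-Gaussian relaxation above concrete, here is a minimal sampling sketch, assuming X has n rows (one per dictionary atom) and p columns (one per sample); the function name and dimensions are illustrative. Because the number of nonzeros per column is Binomial(n, ρ) rather than exactly k, a few columns can come out markedly denser than the mean ρn, which is the complication the bullet above anticipates.

```python
import numpy as np

def bernoulli_gaussian(n, p, rho, seed=0):
    """Sample an n x p coefficient matrix X in the Bernoulli-Gaussian model:
    each entry is nonzero independently with probability rho (rho ~ k/n in
    the hard-sparsity analogy), with a standard Gaussian value when nonzero."""
    rng = np.random.default_rng(seed)
    support = rng.random((n, p)) < rho            # Bernoulli(rho) support mask
    return support * rng.standard_normal((n, p))

# Unlike hard sparsity (exactly k nonzeros per column), column densities here
# fluctuate: a few columns may be dense enough that their l1-optimality fails.
X = bernoulli_gaussian(n=100, p=500, rho=0.05)
counts = (X != 0).sum(axis=0)
print(counts.mean(), counts.max())  # mean ~ rho*n, but the max can be far larger
```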
Conclusion
- While the authors find these results quite encouraging, there is still much to do; there remains a wealth of fascinating open problems just involving the linearized subproblem.
- One natural question is whether the assumption of hard sparsity in X can be relaxed to a Bernoulli-Gaussian model, in which each coefficient is nonzero independently with probability ρ ≈ k/n.
- In this case, care will need to be taken, because a small number of columns of X may be so dense as to no longer be optimal.
- The framework of Negahban and collaborators may be relevant here [NRWY09].
References
- M. Aharon, M. Elad, and A. Bruckstein. The K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11):4311–4322, 2006.
- N. Ahmed, T. Natarajan, and K. Rao. Discrete Cosine Transform. IEEE Transactions on Computers, pages 90–93, 1974.
- R. Ahlswede and A. Winter. Strong converse for identification via quantum channels. IEEE Transactions on Information Theory, 48(3):569–579, 2002.
- A. M. Bruckstein, D. L. Donoho, and M. Elad. From sparse solutions of systems of equations to sparse modeling of signals and images. SIAM Review, 51(1):34–81, 2009.
- O. Bryt and M. Elad. Compression of facial images using the K-SVD algorithm. Journal of Visual Communication and Image Representation, 19(4):270–283, 2008.
- E. Candès. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9-10):589–592, 2008.
- E. Candès, L. Demanet, D. Donoho, and L. Ying. Fast discrete curvelet transforms. Multiscale Modeling and Simulation, 5:861–899, 2006.
- E. Candès, X. Li, Y. Ma, and J. Wright. Robust principal component analysis? Available at http://arxiv.org/abs/0912.3599, 2009.
- E. Candès and Y. Plan. Near-ideal model selection by ℓ¹ minimization. Annals of Statistics, 37:2145–2177, 2009.
- E. Candès and Y. Plan. A probabilistic RIP-less theory of compressed sensing. Available at http://arxiv.org/abs/1011.3854, 2010.
- E. Candès and B. Recht. Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9:717–772, 2009.
- E. Candès and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12):4203–4215, 2005.
- E. Candès and T. Tao. The power of convex relaxation: Near-optimal matrix completion. IEEE Transactions on Information Theory, 56(5):2053–2080, 2010.
- D. Donoho and M. Elad. Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ¹ minimization. Proceedings of the National Academy of Sciences of the United States of America, 100(5):2197–2202, March 2003.
- D. Donoho and J. Tanner. Counting faces of randomly projected polytopes when the projection radically lowers dimension. Journal of the American Mathematical Society, 22(1):1–53, 2009.
- M. Elad and M. Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 15(12):3736–3745, 2006.
- K. Engan, S. Aase, and J. Hakon-Husoy. Method of optimal directions for frame design. In ICASSP, volume 5, pages 2443–2446, 1999.
- J. Fuchs. On sparse representations in arbitrary redundant bases. IEEE Transactions on Information Theory, 50(6), 2004.
- R. Gribonval and M. Nielsen. Sparse decompositions in unions of bases. IEEE Transactions on Information Theory, 49:3320–3325, 2003.
- D. Gross. Recovering low-rank matrices from a few coefficients in any basis. Available at http://arxiv.org/abs/0910.1879, 2009.
- R. Gribonval and K. Schnass. Dictionary identification - sparse matrix factorization via ℓ¹-minimization. IEEE Transactions on Information Theory, 56(7):3523–3539, 2010.
- Q. Geng, H. Wang, and J. Wright. Algorithms for exact dictionary learning by ℓ¹-minimization. Technical report, 2011.
- K. Jittorntrum and M. Osborne. Strong uniqueness and second order convergence in nonlinear discrete approximation. Numerische Mathematik, 34:439–455, 1980.
- J. Kahane. Sur les sommes vectorielles ∑±uₙ. Comptes Rendus Mathematique, 259:2577–2580, 1964.
- K. Kreutz-Delgado, J. Murray, B. Rao, K. Engan, T. Lee, and T. Sejnowski. Dictionary learning algorithms for sparse representation. Neural Computation, 15(2):349–396, 2003.
- M. Ledoux. The Concentration of Measure Phenomenon, Mathematical Surveys and Monographs 89. American Mathematical Society, Providence, RI, 2001.
- R. Latała and K. Oleszkiewicz. On the best constant in the Khintchine-Kahane inequality. Studia Mathematica, 109(1):101–104, 1994.
- M. Ledoux and M. Talagrand. Probability in Banach Spaces: Isoperimetry and Processes. Springer, 1991.
- J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman. Discriminative learned dictionaries for local image analysis. In Computer Vision and Pattern Recognition (CVPR), 2008.
- J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 11:19–60, 2010.
- J. Morlet and A. Grossmann. Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM Journal on Mathematical Analysis, 15:723–736, 1984.
- N. Meinshausen and B. Yu. Lasso-type recovery of sparse representations for high-dimensional data. Annals of Statistics, 37(1):246–270, 2009.
- [NRWY09] S. Negahban, P. Ravikumar, M. Wainwright, and B. Yu. A unified framework for the analysis of regularized M-estimators. In Advances in Neural Information Processing Systems (NIPS), 2009.
- B. Olshausen and D. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6538):607–609, 1996.
- R. Rubinstein, A. Bruckstein, and M. Elad. Dictionaries for sparse representation modeling. Proceedings of the IEEE, 98(6):1045–1057, 2010.
- F. Rodriguez and G. Sapiro. Sparse representations for image classification: learning discriminative and reconstructive non-parametric dictionaries. Available at http://www.ima.umn.edu/preprints/jun2008/2213.pdf, 2008.
- J. Tropp. Norms of random submatrices and sparse approximation. Comptes Rendus Mathematique, 346:1271–1274, 2008.
- J. Tropp. User-friendly tail bounds for matrix martingales. Available at http://arxiv.org/abs/1004.4389v4, 2010.
- G. Wallace. The JPEG still picture compression standard. Communications of the ACM, 34(4):30–44, 1991.
- J. Wright and Y. Ma. Dense error correction via ℓ¹-minimization. IEEE Transactions on Information Theory, 56(7):3540–3560, 2010.
- J. Yang, J. Wright, T. Huang, and Y. Ma. Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11):2861–2873, 2010.
- P. Zhao and B. Yu. On model selection consistency of Lasso. Journal of Machine Learning Research, 7:2541–2563, 2006.