This paper presents a new generic text summarization method that uses Non-negative Matrix Factorization (NMF) to estimate sentence relevance. The proposed relevance estimate is based on normalizing the NMF topic space and then weighting each topic using the sentence representations in that space. The method shows better summarization quality and performance than state-of-the-art methods on the standard DUC 2002 dataset. In addition, we study how the method can improve performance on supervised and unsupervised text classification tasks. In our experiments with the Reuters-21578 and Classic4 benchmark datasets, we apply the developed summarization method as a preprocessing step for subsequent multi-label classification and clustering. As a result, the quality of both classification and clustering improves significantly.
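The abstract does not give the exact formulas, so the following is only a minimal sketch of how NMF-based sentence relevance could be computed, not the paper's method. It assumes a raw term-frequency term-by-sentence matrix, a plain multiplicative-update NMF, unit-length topic vectors as the normalization, and topic weights taken as each topic's share of the total sentence activation. All function names (`term_sentence_matrix`, `nmf`, `sentence_relevance`, `summarize`) are hypothetical.

```python
import random
import re


def term_sentence_matrix(sentences):
    """Build a raw term-frequency matrix A (terms x sentences)."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    A = [[0.0] * len(sentences) for _ in vocab]
    for j, toks in enumerate(tokenized):
        for t in toks:
            A[index[t]][j] += 1.0
    return A


def transpose(X):
    return [list(row) for row in zip(*X)]


def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def nmf(A, k, iters=200, seed=0, eps=1e-9):
    """Factor A ~ W H (W: terms x k, H: k x sentences) with
    multiplicative updates; all entries stay non-negative."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + eps for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[t][j] * num[t][j] / (den[t][j] + eps) for j in range(n)]
             for t in range(k)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][t] * num[i][t] / (den[i][t] + eps) for t in range(k)]
             for i in range(m)]
    return W, H


def sentence_relevance(W, H):
    """Assumed scoring: normalize topics, weight them, score sentences."""
    k, n = len(H), len(H[0])
    # Normalization (assumption): give each topic vector (column of W) unit
    # L2 norm and move the scale into the matching row of H.
    norms = [sum(row[t] ** 2 for row in W) ** 0.5 for t in range(k)]
    Hn = [[H[t][j] * norms[t] for j in range(n)] for t in range(k)]
    # Topic weight (assumption): the topic's share of total activation.
    total = sum(sum(row) for row in Hn) or 1.0
    weights = [sum(row) / total for row in Hn]
    # Sentence relevance: weighted sum of the sentence's topic activations.
    return [sum(weights[t] * Hn[t][j] for t in range(k)) for j in range(n)]


def summarize(sentences, k=2, top_n=2):
    """Return the top_n highest-scoring sentences in original order."""
    W, H = nmf(term_sentence_matrix(sentences), k)
    scores = sentence_relevance(W, H)
    best = sorted(range(len(sentences)), key=lambda j: scores[j], reverse=True)
    return [sentences[j] for j in sorted(best[:top_n])]
```

Calling `summarize(sentences, k=2, top_n=2)` on a list of sentence strings returns an extractive summary. The same call could serve as the preprocessing step before classification or clustering mentioned above, with each document replaced by its summary before feature extraction.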