Feature selection for OPLS discriminant analysis of cancer tissue lipidomics data

Tokareva, A.O.; Chagovets, V.V.; Starodubtseva, N.L.; Nazarova, N.M.; Nekrasova, M.E.; Kononikhin, A.S.; Frankevich, V.E.; Nikolaev, E.N.; Sukhikh, G.T.

Authors: Tokareva Alisa O., Chagovets Vitaliy V., Starodubtseva Natalia L., Nazarova Niso M., Nekrasova Maria E., Kononikhin Alexey S., Frankevich Vladimir E., Nikolaev Evgeny N., Sukhikh Gennady T.
Journal: Journal of Mass Spectrometry
Volume: 55
Issue: 1
Year of publication: 2020
Publisher: IM Publications
Publisher location: United Kingdom
First page: e4457
DOI: 10.1002/jms.4457
Abstract: The mass spectrometry‐based molecular profiling can be used for better differentiation between normal and cancer tissues and for the detection of neoplastic transformation, which is of great importance for diagnostics of a pathology, prognosis of its evolution trend, and development of a treatment strategy. The aim of the present study is the evaluation of tissue classification approaches based on various data sets derived from the molecular profile of the organic solvent extracts of a tissue. A set of possibilities are considered for the orthogonal projections to latent structures discriminant analysis: all mass spectrometric peaks over 300 counts threshold, subset of peaks selected by ranking with support vector machine algorithm, peaks selected by random forest algorithm, peaks with the statistically significant difference of the intensity determined by the Mann‐Whitney U test, peaks identified as lipids, and both identified and significantly different peaks. The best predictive potential is obtained for OPLS‐DA model built on nonpolar glycerolipids (Q2 = 0.64, area under curve [AUC] = 0.95); the second one is OPLS‐DA model with lipid peaks selected by random forest algorithm (Q2 = 0.58, AUC = 0.87). Moreover, models based on particular molecular classes are more preferable from biological point of view, resulting in new explanatory mechanisms of pathophysiology and providing a pathway analysis. Another promising features for OPLS‐DA modeling are phosphatidylethanolamines (Q2 = 0.48, AUC = 0.86).
Added to the system by: Frankevich Vladimir Evgenievich
