Abstract. The development of new methods for the analysis of data of huge dimension is of great importance. A challenging example is identifying the genetic and non-genetic (environmental) factors that can increase the risk of complex diseases such as diabetes and myocardial infarction. In this regard, recall that the human genome contains more than a billion nucleotide bases. The vast research domain called genome-wide association studies (GWAS) requires new techniques for handling large arrays of biostatistical data.
The plan of the talk is as follows. After a brief introduction, we concentrate on modern methods such as multifactor dimensionality reduction (MDR) and its modifications, logic regression, and machine learning. We deal with optimization problems for random functions defined on various graphs. Model selection is discussed as well. We also apply K-fold cross-validation and permutation tests. Along with the survey, we present our quite recent results. We propose a basis for applying the MDR method when an arbitrary penalty function is used to describe the prediction error of the binary response variable by means of a function of the factors. We also establish the asymptotic normality of appropriately normalized statistics used to justify the optimal choice of a subcollection of the explanatory variables. Moreover, we consider self-normalization in this variant of the CLT.
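To make the setting concrete, the following minimal sketch estimates the prediction error of a binary response via K-fold cross-validation with a user-supplied penalty function, in the spirit of the MDR framework described above. It is an illustrative assumption, not the authors' estimator: the function names (`kfold_prediction_error`, `majority_fit`, `zero_one`) and the toy majority-vote predictor over a single factor are invented here for demonstration.

```python
import random
from collections import defaultdict

def kfold_prediction_error(X, y, fit, penalty, k=5, seed=0):
    """Estimate the prediction error of a binary response y by K-fold
    cross-validation, averaging an arbitrary penalty over held-out folds.
    `fit(X_train, y_train)` must return a predictor f: x -> {0, 1};
    `penalty(y_true, y_pred)` is the user-chosen loss (e.g. 0-1 loss)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)          # random fold assignment
    folds = [idx[i::k] for i in range(k)]     # k roughly equal folds
    total, n = 0.0, 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        f = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:                        # accumulate penalty on held-out data
            total += penalty(y[i], f(X[i]))
            n += 1
    return total / n

def majority_fit(X, y):
    """Toy predictor (hypothetical): for each value of the first factor,
    predict the majority class observed in the training sample."""
    counts = defaultdict(lambda: [0, 0])
    for x, label in zip(X, y):
        counts[x[0]][label] += 1
    table = {v: (0 if c[0] >= c[1] else 1) for v, c in counts.items()}
    return lambda x: table.get(x[0], 0)

zero_one = lambda y_true, y_pred: float(y_true != y_pred)  # 0-1 penalty
```

Because the penalty is passed in as a function, the same cross-validation loop accommodates any loss describing the prediction error, which is the flexibility the arbitrary-penalty formulation provides.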