A screening selection procedure for nonparametric regression and survival analysis / Sara Milito, 18 May 2020, academic year 2018-2019. [10.14273/unisa-4645].
A screening selection procedure for nonparametric regression and survival analysis
Milito, Sara
2020
Abstract
This thesis proposes a new method for solving the nonparametric, non-additive regression problem in the presence of ultra-high dimensional data. In this context, two aspects are relevant: variable selection, that is, identifying the variables that affect the response, and structure discovery, that is, identifying the type of each effect (linear or nonlinear). We propose a nonparametric variable selection method that works in two stages. In the first stage, a screening procedure selects a subset of variables that contains the true covariates with probability 1. In the second stage, we turn the screening step into variable selection using a non-penalized approach. In this way we take advantage of the simplicity of screening and avoid having to estimate penalty parameters. Furthermore, our screening approach has the potential to distinguish linear from nonlinear covariates, so it also achieves structure discovery. Chang et al. (2016) proposed a screening method based on empirical likelihood and local polynomials that does not require a specific parametric form for the underlying data model. Once the marginal function relating a given variable to the response has been estimated, they use empirical likelihood to test whether this function is significantly different from zero. Despite the excellent results in terms of the dimensionality achieved, the authors did not address variable selection or structure discovery. To address these problems, we extend their approach by estimating the first marginal derivative rather than the marginal function. This yields a new, fully nonparametric screening method called Derivative Empirical Likelihood Sure Independence Screening (D-ELSIS). To turn our screening procedure into a variable selection procedure, we use the subsampling technique.
In particular, we propose applying the subsampling idea not to the results of a variable selection procedure, as in Meinshausen and Bühlmann (2010), but after a screening procedure. With this tool, the variables selected by D-ELSIS are further evaluated by their probability, measured as a relative frequency, of being chosen when the data are randomly subsampled. Furthermore, although this approach uses thresholds, they do not need to be estimated. [edited by Author]
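The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not the D-ELSIS procedure itself: the empirical likelihood test is replaced here by a much simpler stand-in score (the mean squared local linear estimate of the first marginal derivative), and every name, the bandwidth `h`, the screening size `d`, the number of subsamples and the 0.5 frequency threshold are hypothetical choices made only for illustration.

```python
import numpy as np

def local_linear_slope(x, y, x0, h):
    """Local linear estimate of the first derivative of E[y | x] at x0."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)        # Gaussian kernel weights
    sw = np.sqrt(w)
    design = np.column_stack([np.ones_like(x), x - x0])
    # Weighted least squares: the coefficient on (x - x0) is the slope at x0.
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return coef[1]

def derivative_score(x, y, h=0.3, n_grid=10):
    """Stand-in marginal score (assumption): mean squared estimated derivative."""
    grid = np.quantile(x, np.linspace(0.1, 0.9, n_grid))
    slopes = np.array([local_linear_slope(x, y, g, h) for g in grid])
    return float(np.mean(slopes ** 2))

def screen(X, y, d):
    """Stage 1: keep the d variables with the largest marginal scores."""
    scores = np.array([derivative_score(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:d]

def selection_frequencies(X, y, d, n_draws=20, seed=0):
    """Stage 2: relative frequency with which each variable survives
    screening on random half-size subsamples drawn without replacement."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_draws):
        idx = rng.choice(n, size=n // 2, replace=False)
        counts[screen(X[idx], y[idx], d)] += 1
    return counts / n_draws

# Simulated data (assumption): variable 0 has a linear effect,
# variable 1 a nonlinear effect, the remaining 28 are noise.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(300, 30))
y = 2.0 * X[:, 0] + 2.0 * np.sin(np.pi * X[:, 1]) + rng.normal(0.0, 0.5, 300)

freq = selection_frequencies(X, y, d=5)
selected = np.flatnonzero(freq >= 0.5)   # fixed, illustrative threshold
print(selected, freq[selected])
```

In this sketch the threshold on the selection frequency is fixed in advance rather than tuned, which loosely mirrors the abstract's remark that the thresholds do not need to be estimated; how the thesis actually sets them is not stated here.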
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


