3.2 Regularization Methods for Categorical Predictors

January 2, 2017
Duration: 00:53:32

Most regularization methods in regression analysis have been designed for metric predictors and cannot be used for categorical predictors. A rare exception is the group lasso, which allows for categorical predictors, or factors. We consider alternative approaches based on penalized likelihood and boosting techniques; typically the operating model is a generalized linear model. We start with ordered categorical predictors, which are unfortunately often treated as metric variables simply because software for that case is available. We show how difference penalties on adjacent dummy coefficients can be used to obtain smooth effect curves that can be estimated even in cases where simple maximum likelihood fails. The difference penalty turns out to be highly competitive with methods often seen in practice, namely simple linear regression on the group labels and pure dummy coding. In a second step, L1-penalty-based methods that enforce variable selection and clustering of categories are presented and investigated. We distinguish between ordered predictors, where clustering refers to the fusion of adjacent categories, and nominal predictors, for which arbitrary categories can be fused. The methods make it possible to identify which categories actually differ with respect to the dependent variable. Finally, interaction effects are modeled within the framework of varying-coefficient models. Properties of the proposed estimators are investigated, and the methods are illustrated and compared in simulation studies and applied to real-world data.
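As a rough illustration of the difference-penalty idea (a minimal sketch, not the speaker's implementation), the Gaussian case with a quadratic first-order difference penalty on adjacent dummy coefficients has a ridge-type closed form: minimizing ||y - Xb||² + λ||Db||², with D the first-difference matrix, gives b = (XᵀX + λDᵀD)⁻¹Xᵀy. The data, level effects, and penalty weight below are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated ordinal predictor with k ordered levels and a smooth true effect
# (all values here are illustrative assumptions, not from the talk).
n, k = 200, 8
levels = rng.integers(0, k, size=n)
true_effect = np.sqrt(np.arange(k))            # smooth, monotone level effects
y = true_effect[levels] + rng.normal(scale=0.5, size=n)

# Pure dummy coding: one indicator column per level (no intercept, for simplicity).
X = np.eye(k)[levels]

# First-order difference matrix: (D @ b)[j] = b[j+1] - b[j],
# so the penalty shrinks adjacent dummy coefficients toward each other.
D = np.diff(np.eye(k), axis=0)                 # shape (k-1, k)

def fit(lam):
    """Penalized least squares: argmin_b ||y - X b||^2 + lam * ||D b||^2."""
    return np.linalg.solve(X.T @ X + lam * D.T @ D, X.T @ y)

beta_mle = fit(0.0)    # plain dummy coding (group means)
beta_pen = fit(10.0)   # smoothed adjacent-category effects

# The penalized coefficients vary less between adjacent categories,
# giving a smoother effect curve over the ordered levels.
print(np.abs(np.diff(beta_mle)).sum(), np.abs(np.diff(beta_pen)).sum())
```

Replacing the quadratic penalty ||Db||² with an L1 penalty ||Db||₁, as in the second step of the talk, sets some adjacent differences exactly to zero and thereby fuses neighboring categories.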
