linearmodels.iv.absorbing.AbsorbingLS
- class AbsorbingLS(dependent, exog=None, *, absorb=None, interactions=None, weights=None)
Linear regression with high-dimensional effects
- Parameters
- dependent
ArrayLike
Endogenous variables (nobs by 1)
- exog
ArrayLike, optional
Exogenous regressors (nobs by nexog)
- absorb
{DataFrame, Interaction}
The effects or continuous variables to absorb. When using a DataFrame, effects must be categorical variables. Other variable types are treated as continuous variables that should be absorbed. When using an Interaction, variables in the cat argument are treated as effects and variables in the cont argument are treated as continuous.
- interactions
{DataFrame, Interaction, List[DataFrame, Interaction]}, optional
Interactions containing both categorical and continuous variables. Each interaction is constructed using the Cartesian product of the categorical variables to produce dummy variables, which are then separately interacted with each continuous variable.
- weights
ArrayLike, optional
Observation weights used in estimation
Notes
Capable of estimating models with millions of effects.
Estimates models of the form
\[y_i = x_i \beta + z_i \gamma + \epsilon_i\]
where \(\beta\) are parameters of interest and \(\gamma\) are not. z may be high-dimensional, although it must have fewer variables than the number of observations in y.
The syntax simplifies specifying high-dimensional z when z consists of categorical (factor) variables, also known as effects, or when z contains interactions between continuous variables and categorical variables, also known as fixed slopes.
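As a rough illustration of what z contains, the sketch below builds the effect columns and the fixed-slope columns by hand with NumPy. The variable names (`cat_a`, `cat_b`, `cont`) and the helper `dummies` are hypothetical, and this dense construction is only meant to show the layout. It is not how the class stores or builds z internally.

```python
import numpy as np

rng = np.random.default_rng(0)
nobs = 12
cat_a = rng.integers(0, 2, size=nobs)  # hypothetical factor with 2 levels
cat_b = rng.integers(0, 3, size=nobs)  # hypothetical factor with 3 levels
cont = rng.standard_normal(nobs)       # hypothetical continuous variable


def dummies(codes, nlevels):
    """Return one indicator column per level (nobs by nlevels)."""
    out = np.zeros((codes.shape[0], nlevels))
    out[np.arange(codes.shape[0]), codes] = 1.0
    return out


# Effects: one block of indicator columns per categorical variable
effects = np.hstack([dummies(cat_a, 2), dummies(cat_b, 3)])

# Fixed slopes: indicators for the Cartesian product of the categoricals,
# each multiplied by the continuous variable
cell = cat_a * 3 + cat_b  # unique code for every (a, b) pair
slopes = dummies(cell, 6) * cont[:, None]

z = np.hstack([effects, slopes])
print(z.shape)
```

With realistic numbers of levels, z becomes far too wide to hold as a dense array, which is why the sparse treatment described in these notes matters.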
The high-dimensional effects are fit using LSMR, which avoids inverting or even constructing the inner product of the regressors. This is combined with the Frisch-Waugh-Lovell theorem to orthogonalize x and y from z.
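The idea can be sketched with SciPy's `lsmr` on a sparse dummy matrix: residualize y and each column of x on z, then run least squares on the residuals. This is a minimal sketch on simulated data, assuming a single categorical effect. The helper `residualize`, the tolerances, and the data-generating process are illustrative choices, not the library's implementation.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(0)
nobs, nlevels = 2000, 50
codes = rng.integers(0, nlevels, size=nobs)

# Sparse indicator matrix for one categorical effect (nobs by nlevels)
z = sparse.csr_matrix(
    (np.ones(nobs), (np.arange(nobs), codes)), shape=(nobs, nlevels)
)

# Simulated data where x is correlated with the effects
group_shift = rng.standard_normal(nlevels)
effects = rng.standard_normal(nlevels)
beta = np.array([1.0, -0.5])
x = rng.standard_normal((nobs, 2)) + group_shift[codes][:, None]
y = x @ beta + effects[codes] + rng.standard_normal(nobs)


def residualize(z, v):
    """Remove the part of v explained by z without forming z'z."""
    coef = lsmr(z, v, atol=1e-10, btol=1e-10, maxiter=1000)[0]
    return v - z @ coef


# Frisch-Waugh-Lovell: regress residualized y on residualized x
y_res = residualize(z, y)
x_res = np.column_stack([residualize(z, x[:, j]) for j in range(x.shape[1])])
beta_fwl = np.linalg.lstsq(x_res, y_res, rcond=None)[0]

# Reference: dense regression of y on [x, z], feasible only for small z
full = np.hstack([x, z.toarray()])
beta_full = np.linalg.lstsq(full, y, rcond=None)[0][:2]
print(beta_fwl, beta_full)
```

The two estimates of \(\beta\) should agree up to the solver tolerance, while the first approach never needs a dense copy of z.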
z can contain factors that are perfectly linearly dependent. LSMR estimates a particular restricted set of parameters that captures the effect of non-redundant components in z.
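A small sketch of why redundancy in z is harmless for the absorbed variables: with two full sets of dummies, the columns of z are linearly dependent, so the coefficients are not unique, but the residuals are. The data and names below are simulated and hypothetical, and the example assumes SciPy's `lsmr` as a stand-in for the solver.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(1)
nobs = 500
a = rng.integers(0, 4, size=nobs)  # hypothetical factor with 4 levels
b = rng.integers(0, 5, size=nobs)  # hypothetical factor with 5 levels
rows = np.arange(nobs)
za = sparse.csr_matrix((np.ones(nobs), (rows, a)), shape=(nobs, 4))
zb = sparse.csr_matrix((np.ones(nobs), (rows, b)), shape=(nobs, 5))
z = sparse.hstack([za, zb]).tocsr()
y = rng.standard_normal(nobs)

# Each dummy block sums to a column of ones, so z is rank deficient
rank = np.linalg.matrix_rank(z.toarray())

coef = lsmr(z, y, atol=1e-12, btol=1e-12, maxiter=1000)[0]
resid = y - z @ coef

# Moving along the null space changes the coefficients but not the fit
shifted = coef + np.r_[np.ones(4), -np.ones(5)]
resid_shifted = y - z @ shifted

# Dense least squares gives the same residuals
dense_coef = np.linalg.lstsq(z.toarray(), y, rcond=None)[0]
resid_dense = y - z.toarray() @ dense_coef
print(z.shape[1], rank)
```

Only the residuals enter the second-stage regression of y on x, so the particular coefficient vector that LSMR selects does not affect the estimate of \(\beta\).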
Examples
Estimate a model by absorbing 2 categoricals and 2 continuous variables
>>> import numpy as np
>>> import pandas as pd
>>> from linearmodels.iv import AbsorbingLS, Interaction
>>> dep = np.random.standard_normal((20000, 1))
>>> exog = np.random.standard_normal((20000, 2))
>>> cats = pd.DataFrame({i: pd.Categorical(np.random.randint(1000, size=20000))
...                      for i in range(2)})
>>> cont = pd.DataFrame({i + 2: np.random.standard_normal(20000) for i in range(2)})
>>> absorb = pd.concat([cats, cont], axis=1)
>>> mod = AbsorbingLS(dep, exog, absorb=absorb)
>>> res = mod.fit()
Add interactions between the Cartesian product of the categorical variables and each continuous variable
>>> iaction = Interaction(cat=cats, cont=cont)
>>> absorb = Interaction(cat=cats)  # Other encoding of categoricals
>>> mod = AbsorbingLS(dep, exog, absorb=absorb, interactions=iaction)
Methods
fit(self, *, cov_type, debiased, …)
Estimate model parameters
resids(self, params)
Compute model residuals
wresids(self, params)
Compute weighted model residuals
Properties
Dependent variable with effects absorbed
Exogenous variables with effects absorbed