linearmodels.iv.absorbing.AbsorbingLS
- class AbsorbingLS(dependent, exog=None, *, absorb=None, interactions=None, weights=None, drop_absorbed=False)
Linear regression with high-dimensional effects
- Parameters:
- dependent: array_like
Endogenous variables (nobs by 1)
- exog: array_like
Exogenous regressors (nobs by nexog)
- absorb: {DataFrame, Interaction}
The effects or continuous variables to absorb. When using a DataFrame, effects must be categorical variables. Other variable types are treated as continuous variables that should be absorbed. When using an Interaction, variables in the cat argument are treated as effects and variables in the cont argument are treated as continuous.
- interactions: {DataFrame, Interaction, list[DataFrame, Interaction]}
Interactions containing both categorical and continuous variables. Each interaction is constructed using the Cartesian product of the categorical variables to produce the dummies, which are then separately interacted with each continuous variable.
- weights: array_like
Observation weights used in estimation
- drop_absorbed: bool
Flag indicating whether to drop absorbed variables
Notes
Capable of estimating models with millions of effects.
Estimates models of the form
\[y_i = x_i \beta + z_i \gamma + \epsilon_i\]
where \(\beta\) are the parameters of interest and \(\gamma\) are not. z may be high-dimensional, but it must have fewer variables than the number of observations in y.
The syntax simplifies specifying high-dimensional z when z consists of categorical (factor) variables, also known as effects, or when z contains interactions between continuous variables and categorical variables, also known as fixed slopes.
The high-dimensional effects are fit using LSMR, which avoids inverting or even constructing the inner product of the regressors. This is combined with the Frisch-Waugh-Lovell theorem to orthogonalize x and y with respect to z.
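In matrix terms, the orthogonalization step can be written as follows. The symbols \(\hat{\delta}_y\), \(\hat{\delta}_x\), \(\tilde{y}\) and \(\tilde{x}\) are notation introduced here for illustration only, and weights are ignored for simplicity. Let \(\hat{\delta}_y\) and \(\hat{\delta}_x\) be least-squares coefficients from regressing y and each column of x on z. Then
\[\tilde{y} = y - z \hat{\delta}_y, \qquad \tilde{x} = x - z \hat{\delta}_x, \qquad \hat{\beta} = (\tilde{x}^\prime \tilde{x})^{-1} \tilde{x}^\prime \tilde{y}\]
so \(\beta\) is estimated from the residualized data without ever estimating \(\gamma\) jointly with it.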
z can contain factors that are perfectly linearly dependent. LSMR estimates a particular restricted set of parameters that captures the effect of non-redundant components in z.
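The approach described above can be checked numerically with a small sketch. It assumes SciPy's scipy.sparse.linalg.lsmr as the solver and builds the dummy matrix by hand. The helper residualize and all variable names are hypothetical and chosen for illustration, and this is not the library's internal implementation. The two dummy blocks each sum to a column of ones, so z is rank deficient by construction, yet the estimate of \(\beta\) agrees with a dense least-squares fit of the full model.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(0)
nobs, ncat = 2000, 20

# Two categorical effects. Each dummy block sums to a column of ones,
# so the stacked matrix z has linearly dependent columns.
codes_a = rng.integers(ncat, size=nobs)
codes_b = rng.integers(ncat, size=nobs)
rows = np.arange(nobs)
ones = np.ones(nobs)
z = sparse.hstack(
    [
        sparse.csr_matrix((ones, (rows, codes_a)), shape=(nobs, ncat)),
        sparse.csr_matrix((ones, (rows, codes_b)), shape=(nobs, ncat)),
    ],
    format="csr",
)

# Simulate y = x beta + z gamma + noise.
x = rng.standard_normal((nobs, 2))
beta = np.array([1.0, -0.5])
gamma = rng.standard_normal(2 * ncat)
y = x @ beta + z @ gamma + rng.standard_normal(nobs)


def residualize(z, v):
    """Return v minus its least-squares fit on the columns of z (hypothetical helper)."""
    coef = lsmr(z, v, atol=1e-10, btol=1e-10, maxiter=1000)[0]
    return v - z @ coef


# Frisch-Waugh-Lovell: partial z out of y and of each column of x,
# then regress the residualized y on the residualized x.
y_tilde = residualize(z, y)
x_tilde = np.column_stack([residualize(z, x[:, j]) for j in range(x.shape[1])])
beta_fwl = np.linalg.lstsq(x_tilde, y_tilde, rcond=None)[0]

# Reference: dense least squares on [x, z]; only feasible for small problems.
full = np.hstack([x, z.toarray()])
beta_full = np.linalg.lstsq(full, y, rcond=None)[0][:2]

print(beta_fwl)
print(beta_full)
```

The individual coefficients on z are not unique when z is rank deficient, but the fitted values and therefore the residuals are, which is why the estimate of \(\beta\) is unaffected.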
Examples
Estimate a model by absorbing 2 categoricals and 2 continuous variables
>>> import numpy as np
>>> import pandas as pd
>>> from linearmodels.iv import AbsorbingLS, Interaction
>>> dep = np.random.standard_normal((20000,1))
>>> exog = np.random.standard_normal((20000,2))
>>> cats = pd.DataFrame({i: pd.Categorical(np.random.randint(1000, size=20000))
...                      for i in range(2)})
>>> cont = pd.DataFrame({i+2: np.random.standard_normal(20000) for i in range(2)})
>>> absorb = pd.concat([cats, cont], axis=1)
>>> mod = AbsorbingLS(dep, exog, absorb=absorb)
>>> res = mod.fit()
Add interactions between the Cartesian product of the categorical variables and each continuous variable
>>> iaction = Interaction(cat=cats, cont=cont)
>>> absorb = Interaction(cat=cats)  # Other encoding of categoricals
>>> mod = AbsorbingLS(dep, exog, absorb=absorb, interactions=iaction)
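To make the structure of such an interaction concrete, the sketch below builds the equivalent dense columns by hand with pandas on a tiny data set. The column names (firm, year, age, wage) and the variables cell, dummies, blocks and fixed_slopes are hypothetical. This only illustrates the construction described in the parameter documentation (one dummy per observed combination of the categoricals, each multiplied by every continuous variable) and is not how the library stores or computes interactions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
nobs = 8

# Two small categorical variables and two continuous variables (made-up names).
cats = pd.DataFrame(
    {
        "firm": pd.Categorical(rng.integers(2, size=nobs)),
        "year": pd.Categorical(rng.integers(3, size=nobs)),
    }
)
cont = pd.DataFrame(
    {"age": rng.standard_normal(nobs), "wage": rng.standard_normal(nobs)}
)

# One integer label per observed (firm, year) combination.
cell = cats.groupby(["firm", "year"], observed=True).ngroup()

# Dummy columns for the combined categorical.
dummies = pd.get_dummies(cell, prefix="cell", dtype=float)

# Interact every dummy with each continuous variable separately.
blocks = [
    dummies.mul(cont[name], axis=0).add_suffix(f":{name}")
    for name in cont.columns
]
fixed_slopes = pd.concat(blocks, axis=1)

print(fixed_slopes.shape)
```

Each row of fixed_slopes has one nonzero entry per continuous variable, located in the column for that observation's cell. With thousands of categories a dense frame like this becomes impractical, which is the situation the class is designed for.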
- Attributes:
absorbed_dependent
Dependent variable with effects absorbed
absorbed_exog
Exogenous variables with effects absorbed
- dependent
- exog
- has_constant
- instruments
- weights
Methods
fit(*[, cov_type, debiased, method, ...])
Estimate model parameters
resids(params)
Compute model residuals
wresids(params)
Compute weighted model residuals
Properties
absorbed_dependent
Dependent variable with effects absorbed
absorbed_exog
Exogenous variables with effects absorbed