linearmodels.iv.absorbing.AbsorbingLS
class linearmodels.iv.absorbing.AbsorbingLS(dependent: ndarray | DataArray | DataFrame | Series, exog: ndarray | DataArray | DataFrame | Series | None = None, *, absorb: DataFrame | Interaction | None = None, interactions: DataFrame | Interaction | Iterable[DataFrame | Interaction] | None = None, weights: ndarray | DataArray | DataFrame | Series | None = None, drop_absorbed: bool = False)
Linear regression with high-dimensional effects
- Parameters:
- dependent: ndarray | DataArray | DataFrame | Series
Endogenous variables (nobs by 1)
- exog: ndarray | DataArray | DataFrame | Series | None = None
Exogenous regressors (nobs by nexog)
- absorb: DataFrame | Interaction | None = None
The effects or continuous variables to absorb. When using a DataFrame, effects must be categorical variables. Other variable types are treated as continuous variables that should be absorbed. When using an Interaction, variables in the cat argument are treated as effects and variables in the cont argument are treated as continuous.
- interactions: DataFrame | Interaction | Iterable[DataFrame | Interaction] | None = None
Interactions containing both categorical and continuous variables. Each interaction is constructed using the Cartesian product of the categorical variables to produce the dummies, which are then separately interacted with each continuous variable.
- weights: ndarray | DataArray | DataFrame | Series | None = None
Observation weights used in estimation
- drop_absorbed: bool = False
Flag indicating whether to drop absorbed variables
Notes
Capable of estimating models with millions of effects.
Estimates models of the form
\[y_i = x_i \beta + z_i \gamma + \epsilon_i\]
where \(\beta\) are parameters of interest and \(\gamma\) are not. z may be high-dimensional, although it must have fewer variables than the number of observations in y.
The syntax simplifies specifying high-dimensional z when z consists of categorical (factor) variables, also known as effects, or when z contains interactions between continuous variables and categorical variables, also known as fixed slopes.
The high-dimensional effects are fit using LSMR, which avoids inverting or even constructing the inner product of the regressors. This is combined with the Frisch-Waugh-Lovell theorem to orthogonalize x and y from z.
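The combination of LSMR and Frisch-Waugh-Lovell described above can be illustrated with a minimal sketch. It is not the library's implementation: it uses only NumPy and SciPy, a single simulated categorical effect, and a hypothetical helper named residualize. It shows that regressing the LSMR residuals of y on the LSMR residuals of x recovers the same \(\beta\) as a full regression on x and the dummies.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(0)
nobs, ngroups = 2000, 50

# One categorical effect encoded as a sparse dummy matrix z (nobs by ngroups)
codes = rng.integers(0, ngroups, size=nobs)
z = sparse.csr_matrix((np.ones(nobs), (np.arange(nobs), codes)),
                      shape=(nobs, ngroups))

# Simulated data: x is correlated with the effect, so ignoring z would bias beta
x = rng.standard_normal((nobs, 2)) + codes[:, None] / ngroups
beta = np.array([1.0, -0.5])
gamma = rng.standard_normal(ngroups)
y = x @ beta + z @ gamma + 0.1 * rng.standard_normal(nobs)


def residualize(z, v, tol=1e-10):
    """Hypothetical helper: v minus its least-squares fit on z, using LSMR."""
    coef = lsmr(z, v, atol=tol, btol=tol)[0]
    return v - z @ coef


# Frisch-Waugh-Lovell: partial z out of y and out of each column of x
y_tilde = residualize(z, y)
x_tilde = np.column_stack([residualize(z, x[:, j]) for j in range(x.shape[1])])
beta_fwl = np.linalg.lstsq(x_tilde, y_tilde, rcond=None)[0]

# Reference: dense regression on [x, z], feasible only because ngroups is small
full = np.column_stack([x, z.toarray()])
beta_full = np.linalg.lstsq(full, y, rcond=None)[0][:2]
```

In this sketch z'z is never formed or inverted; LSMR only needs products with z and its transpose, which is what keeps the approach usable when z has very many columns.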
z can contain factors that are perfectly linearly dependent. LSMR estimates a particular restricted set of parameters that captures the effect of non-redundant components in z.
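A small sketch of why redundancy in z is harmless for the absorbed fit, assuming two simulated categorical variables whose dummy sets are encoded without dropping a level. The variable names are illustrative only. The stacked dummy matrix is rank deficient, yet LSMR still returns a solution, and the fitted values do not depend on which solution is chosen.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(1)
nobs = 500
rows = np.arange(nobs)

# Two full sets of dummies: each set sums to a column of ones,
# so the stacked matrix has one redundant column.
a = rng.integers(0, 5, size=nobs)
b = rng.integers(0, 4, size=nobs)
za = sparse.csr_matrix((np.ones(nobs), (rows, a)), shape=(nobs, 5))
zb = sparse.csr_matrix((np.ones(nobs), (rows, b)), shape=(nobs, 4))
z = sparse.hstack([za, zb]).tocsr()

rank = np.linalg.matrix_rank(z.toarray())  # 9 columns, one of them redundant

y = rng.standard_normal(nobs)
coef = lsmr(z, y, atol=1e-10, btol=1e-10)[0]
fitted = z @ coef

# Adding a constant to every level of a and subtracting it from every level
# of b changes the coefficients but leaves the fitted values unchanged.
shift = np.concatenate([np.ones(5), -np.ones(4)])
fitted_shifted = z @ (coef + shift)

# Dense reference fit for comparison
dense = z.toarray()
fitted_ref = dense @ np.linalg.lstsq(dense, y, rcond=None)[0]
```

Because only the projection onto the column space of z enters the residualized x and y, the individual \(\gamma\) values need not be identified for \(\beta\) to be estimated.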
See also
Interaction, linearmodels.iv.model.IVLIML, linearmodels.iv.model.IV2SLS, scipy.sparse.linalg.lsmr
Examples
Estimate a model by absorbing 2 categoricals and 2 continuous variables
>>> import numpy as np
>>> import pandas as pd
>>> from linearmodels.iv import AbsorbingLS, Interaction
>>> dep = np.random.standard_normal((20000,1))
>>> exog = np.random.standard_normal((20000,2))
>>> cats = pd.DataFrame({i: pd.Categorical(np.random.randint(1000, size=20000))
...                      for i in range(2)})
>>> cont = pd.DataFrame({i+2: np.random.standard_normal(20000) for i in range(2)})
>>> absorb = pd.concat([cats, cont], axis=1)
>>> mod = AbsorbingLS(dep, exog, absorb=absorb)
>>> res = mod.fit()
Add interactions between the Cartesian product of the categorical variables and each continuous variable
>>> iaction = Interaction(cat=cats, cont=cont)
>>> absorb = Interaction(cat=cats)  # Other encoding of categoricals
>>> mod = AbsorbingLS(dep, exog, absorb=absorb, interactions=iaction)
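The construction of an interaction (dummies from the Cartesian product of the categorical variables, each then multiplied by every continuous variable) can be sketched by hand. This is an illustration under stated assumptions rather than what Interaction does internally: it uses plain NumPy, dense arrays, made-up variable names, and creates dummies only for the combinations that actually occur in the data.

```python
import numpy as np

rng = np.random.default_rng(2)
nobs = 12
cat_a = rng.integers(0, 2, size=nobs)   # categorical with 2 levels
cat_b = rng.integers(0, 3, size=nobs)   # categorical with 3 levels
cont = rng.standard_normal((nobs, 2))   # two continuous variables

# Cartesian product: every observed (a, b) pair becomes one group
pairs = np.column_stack([cat_a, cat_b])
_, group = np.unique(pairs, axis=0, return_inverse=True)
group = group.ravel()                   # flatten for consistent shape across NumPy versions
ngroups = int(group.max()) + 1

# One dummy column per group
dummies = np.zeros((nobs, ngroups))
dummies[np.arange(nobs), group] = 1.0

# Interact the dummies separately with each continuous variable
blocks = [dummies * cont[:, [j]] for j in range(cont.shape[1])]
interacted = np.hstack(blocks)          # nobs by (ngroups * number of continuous variables)
```

Each row of interacted has one nonzero entry per continuous variable, located in the column of that row's group, which is why such matrices are natural candidates for sparse storage.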
Methods
fit(*[, cov_type, debiased, method, ...]): Estimate model parameters
resids(params): Compute model residuals
wresids(params): Compute weighted model residuals
Properties
Dependent variable with effects absorbed
Exogenous variables with effects absorbed