Introduction
Panel data includes observations on multiple entities – individuals, firms, countries – over multiple time periods. In most classical applications of panel data the number of entities, N, is large and the number of time periods, T, is small (often between 2 and 5). Most asymptotic theory for these estimators has been developed under an assumption that N will diverge while T is fixed.
Most panel models are designed to estimate the parameters of a model which can be described by
\[y_{it} = x_{it}\beta + \alpha_i + \epsilon_{it}\]
where \(i\) indexes the entities and \(t\) indexes time. \(\beta\) contains the parameters of interest. \(\alpha_i\) are entity-specific components that are not usually identified in the standard setup and so cannot be consistently estimated. \(\epsilon_{it}\) are idiosyncratic errors uncorrelated with \(\alpha_i\) and the covariates \(x_{it}\).
All models require two inputs:
- dependent: The variable to be modeled, \(y_{it}\) in the model.
- exog: The regressors, \(x_{it}\) in the model.
The models use different techniques to address the presence of \(\alpha_i\).
In particular,
- PanelOLS uses fixed effects (i.e., entity effects) to eliminate the entity-specific components. This is mathematically equivalent to including a dummy variable for each entity, although the implementation does not do this for performance reasons.
- BetweenOLS averages within an entity and then regresses the time-averaged values using OLS.
- FirstDifferenceOLS takes the first difference to eliminate the entity-specific effect.
- RandomEffects uses a quasi-difference to efficiently estimate \(\beta\) when the entity effect is independent of the regressors. It is, however, not consistent when there is dependence between the entity effect and the regressors.
- PooledOLS ignores the entity effect and is consistent but inefficient when the effect is independent of the regressors.
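The transformations described above can be illustrated on simulated data. The following is a minimal sketch that uses only NumPy, not the estimators in this package; the simulated panel, the helper `ols` and all variable names are assumptions made for illustration. The entity effect is deliberately correlated with the regressor, so pooled OLS is biased, while demeaning within each entity, including one dummy per entity, and first differencing all remove the entity effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_periods, beta = 200, 5, 0.5

# Entity effect alpha_i, deliberately correlated with the regressor x_it
alpha = rng.normal(size=(n_entities, 1))
x = alpha + rng.normal(size=(n_entities, n_periods))
eps = 0.1 * rng.normal(size=(n_entities, n_periods))
y = beta * x + alpha + eps


def ols(X, Y):
    """Least-squares coefficients of Y on the columns of X."""
    return np.linalg.lstsq(X, Y, rcond=None)[0]


# Pooled OLS with a constant: ignores alpha_i, so it is biased in this setup
X_pooled = np.column_stack([np.ones(x.size), x.ravel()])
b_pooled = ols(X_pooled, y.ravel())[1]

# Within (fixed effect) transformation: subtract each entity's time average
x_within = (x - x.mean(axis=1, keepdims=True)).ravel()
y_within = (y - y.mean(axis=1, keepdims=True)).ravel()
b_within = ols(x_within[:, None], y_within)[0]

# Dummy-variable regression: one indicator column per entity
entity_ids = np.repeat(np.arange(n_entities), n_periods)
dummies = (entity_ids[:, None] == np.arange(n_entities)).astype(float)
b_lsdv = ols(np.column_stack([x.ravel(), dummies]), y.ravel())[0]

# First differences: y_it - y_i,t-1 no longer contains alpha_i
dx = np.diff(x, axis=1).ravel()
dy = np.diff(y, axis=1).ravel()
b_fd = ols(dx[:, None], dy)[0]

print(f"true beta:        {beta:.4f}")
print(f"pooled OLS:       {b_pooled:.4f}")
print(f"within:           {b_within:.4f}")
print(f"entity dummies:   {b_lsdv:.4f}")
print(f"first difference: {b_fd:.4f}")
```

The within and dummy-variable slopes agree up to floating-point error, which is the equivalence noted for PanelOLS. The demeaned version avoids building a design matrix with one column per entity.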
PanelOLS is somewhat more general than the other estimators and can be used to model 2 effects (e.g., entity and time effects).
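To make the two-effect case concrete, here is a minimal NumPy sketch on a simulated balanced panel. It is an illustration under stated assumptions, not the package's implementation: the data, the helper `demean_two_way` and the variable names are hypothetical, and the simple double-demeaning formula used here is only exact when every entity is observed in every period. In PanelOLS, both effects are requested with `entity_effects=True` and `time_effects=True`.

```python
import numpy as np

rng = np.random.default_rng(1)
n_entities, n_periods, beta = 100, 6, 0.5

alpha = rng.normal(size=(n_entities, 1))  # entity effects
gamma = rng.normal(size=(1, n_periods))   # time effects
x = alpha + gamma + rng.normal(size=(n_entities, n_periods))
y = beta * x + alpha + gamma + 0.1 * rng.normal(size=(n_entities, n_periods))


def demean_two_way(a):
    """Remove entity and time means from a balanced (entity x time) array."""
    entity_mean = a.mean(axis=1, keepdims=True)
    time_mean = a.mean(axis=0, keepdims=True)
    return a - entity_mean - time_mean + a.mean()


# Slope from the doubly demeaned data
x_dd = demean_two_way(x).ravel()
y_dd = demean_two_way(y).ravel()
b_two_way = np.linalg.lstsq(x_dd[:, None], y_dd, rcond=None)[0][0]

# Same slope from a regression on entity dummies and time dummies
entity_ids = np.repeat(np.arange(n_entities), n_periods)
time_ids = np.tile(np.arange(n_periods), n_entities)
entity_dummies = (entity_ids[:, None] == np.arange(n_entities)).astype(float)
# Drop the first period's dummy to avoid perfect collinearity
time_dummies = (time_ids[:, None] == np.arange(1, n_periods)).astype(float)
design = np.column_stack([x.ravel(), entity_dummies, time_dummies])
b_dummy = np.linalg.lstsq(design, y.ravel(), rcond=None)[0][0]

print(f"true beta:               {beta:.4f}")
print(f"two-way demeaned:        {b_two_way:.4f}")
print(f"entity and time dummies: {b_dummy:.4f}")
```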
Model specification is similar to statsmodels. This example estimates a fixed effect regression on a panel of the wages of working men, modeling the log wage as a function of squared experience, a dummy indicating whether the man is married, and a dummy indicating whether he is a union member.
import statsmodels.api as sm
from linearmodels.datasets import wage_panel
from linearmodels.panel import PanelOLS

# Load the wage panel and index it by entity (nr) and time (year)
data = wage_panel.load()
data = data.set_index(['nr', 'year'])
dependent = data.lwage
# Add a constant column to the regressors
exog = sm.add_constant(data[['expersq', 'married', 'union']])
# Fixed effect regression with entity effects
mod = PanelOLS(dependent, exog, entity_effects=True)
res = mod.fit(cov_type='unadjusted')
res
While the result has many properties that hold specific quantities of interest (e.g., params or tstats), the string representation of the result is a summary table.
                          PanelOLS Estimation Summary
================================================================================
Dep. Variable:                   lwage   R-squared:                       0.1365
Estimator:                    PanelOLS   R-squared (Between):            -0.0674
No. Observations:                 4360   R-squared (Within):              0.1365
Date:                 Wed, Apr 19 2017   R-squared (Overall):             0.0270
Time:                         17:48:58   Log-likelihood                  -1439.0
Cov. Estimator:             Unadjusted
                                         F-statistic:                     200.87
Entities:                          545   P-value                          0.0000
Avg Obs:                        8.0000   Distribution:                 F(3,3812)
Min Obs:                        8.0000
Max Obs:                        8.0000   F-statistic (robust):            200.87
                                         P-value                          0.0000
Time periods:                        8   Distribution:                 F(3,3812)
Avg Obs:                        545.00
Min Obs:                        545.00
Max Obs:                        545.00

                             Parameter Estimates
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
const          1.3953     0.0123     113.50     0.0000      1.3712      1.4194
expersq        0.0037     0.0002     19.560     0.0000      0.0033      0.0041
married        0.1073     0.0182     5.8992     0.0000      0.0717      0.1430
union          0.0828     0.0198     4.1864     0.0000      0.0440      0.1215
==============================================================================

F-test for Poolability: 9.3360
P-value: 0.0000
Distribution: F(544,3812)

Included effects: Entity
Like statsmodels, panel models can be specified using an R-like formula. This model is identical to the previous one. Note the use of the special variable EntityEffects to include the fixed effects.
mod = PanelOLS.from_formula('lwage ~ 1 + expersq + union + married + EntityEffects', data)