# Introduction

Panel data includes observations on multiple entities – individuals, firms, countries – over multiple time periods. In most classical applications of panel data the number of entities, N, is large and the number of time periods, T, is small (often between 2 and 5). Most asymptotic theory for panel estimators has been developed under the assumption that N diverges while T is fixed.

Most panel models are designed to estimate the parameters of a model that can be described as

\[y_{it} = x_{it}\beta + \alpha_i + \epsilon_{it}\]

where i indexes the entities and t indexes time. \(\beta\) contains the parameters of interest. \(\alpha_i\) are entity-specific components that are usually not identified in the standard setup and so cannot be consistently estimated. \(\epsilon_{it}\) are idiosyncratic errors uncorrelated with \(\alpha_i\) and the covariates \(x_{it}\).

All models require two inputs

- `dependent`: The variable to be modeled, \(y_{it}\) in the model.
- `exog`: The regressors, \(x_{it}\) in the model.

and use different techniques to address the presence of \(\alpha_i\).
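The two inputs are indexed by entity and time. A minimal sketch of that layout with pandas, assuming a two-level index with the entity as the outer level and the time period as the inner level; the column names and values here are hypothetical:

```python
import pandas as pd

# Hypothetical long-format data: one row per (entity, time) observation.
raw = pd.DataFrame(
    {
        "firm": ["A", "A", "B", "B", "C", "C"],
        "year": [2001, 2002, 2001, 2002, 2001, 2002],
        "y": [1.0, 1.2, 0.4, 0.7, 2.1, 2.0],
        "x1": [0.5, 0.6, 0.1, 0.3, 1.0, 0.9],
        "x2": [1, 1, 0, 1, 0, 0],
    }
)

# Entity first, time second, so every row is identified by (i, t).
panel = raw.set_index(["firm", "year"])

dependent = panel["y"]       # y_it
exog = panel[["x1", "x2"]]   # x_it
print(exog)
```

Keeping both inputs on the same index means each value of `dependent` lines up with its row of `exog`.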

In particular,

- `PanelOLS` uses fixed effects (i.e., entity effects) to eliminate the entity-specific components. This is mathematically equivalent to including a dummy variable for each entity, although the implementation does not do this for performance reasons.
- `BetweenOLS` averages within an entity and then regresses the time-averaged values using OLS.
- `FirstDifferenceOLS` takes the first difference to eliminate the entity-specific effect.
- `RandomEffects` uses a quasi-difference to efficiently estimate \(\beta\) when the entity effect is independent of the regressors. It is, however, not consistent when there is dependence between the entity effect and the regressors.
- `PooledOLS` ignores the entity effect and is consistent but inefficient when the effect is independent of the regressors.
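To see how these strategies behave, the sketch below simulates the model with a single regressor that is correlated with the entity effect and computes four of the estimators by hand in plain Python. It does not call the library; the sample sizes, the true slope, and the helper `slope` are illustrative assumptions. `RandomEffects` is left out because its quasi-difference weight depends on estimated variance components.

```python
import random
from statistics import mean

random.seed(0)
N, T, beta = 500, 5, 1.0  # hypothetical panel size and true slope

# Simulate y_it = beta * x_it + alpha_i + eps_it, with x_it correlated with alpha_i.
panel = []  # one list of (x, y) pairs per entity
for _ in range(N):
    alpha = random.gauss(0, 1)
    rows = []
    for _ in range(T):
        x = alpha + random.gauss(0, 1)
        y = beta * x + alpha + random.gauss(0, 0.5)
        rows.append((x, y))
    panel.append(rows)


def slope(xs, ys, center=True):
    """OLS slope for one regressor; center=True also fits an intercept."""
    mx, my = (mean(xs), mean(ys)) if center else (0.0, 0.0)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Pooled OLS: stack all observations and ignore alpha_i.
pooled = slope([x for r in panel for x, _ in r], [y for r in panel for _, y in r])

# Within (fixed effects): subtract each entity's own mean.
xw, yw = [], []
for r in panel:
    mx, my = mean(x for x, _ in r), mean(y for _, y in r)
    xw += [x - mx for x, _ in r]
    yw += [y - my for _, y in r]
within = slope(xw, yw, center=False)

# First difference: subtract the previous period within each entity.
xd = [r[t][0] - r[t - 1][0] for r in panel for t in range(1, T)]
yd = [r[t][1] - r[t - 1][1] for r in panel for t in range(1, T)]
first_diff = slope(xd, yd, center=False)

# Between: regress entity means on entity means.
between = slope([mean(x for x, _ in r) for r in panel],
                [mean(y for _, y in r) for r in panel])

print(f"pooled={pooled:.3f} within={within:.3f} "
      f"first_diff={first_diff:.3f} between={between:.3f}")
```

In this design the within and first-difference estimates should land near the true slope, while the pooled and between estimates are pulled away from it because \(\alpha_i\) moves with the regressor.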

`PanelOLS` is somewhat more general than the other estimators and can be used to model 2 effects (e.g., entity and time effects).
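What two effects mean can be sketched without the library. For a balanced panel, removing both entity and time effects amounts to subtracting the entity mean and the time mean and adding back the grand mean. The simulation below is a hand-rolled illustration under assumed values (a linear time trend as the time effect, one regressor), not the library's implementation:

```python
import random
from statistics import mean

random.seed(1)
N, T, beta = 200, 6, 1.0               # hypothetical sizes and true slope
gamma = [0.5 * t for t in range(T)]    # assumed time effects: a common trend

# y_it = beta * x_it + alpha_i + gamma_t + eps_it, x_it correlated with both effects.
x = [[0.0] * T for _ in range(N)]
y = [[0.0] * T for _ in range(N)]
for i in range(N):
    alpha = random.gauss(0, 1)
    for t in range(T):
        x[i][t] = alpha + gamma[t] + random.gauss(0, 1)
        y[i][t] = beta * x[i][t] + alpha + gamma[t] + random.gauss(0, 0.5)


def demean(z, time_effects):
    """Remove entity means, and optionally time means (balanced panel only)."""
    entity_means = [mean(row) for row in z]
    if not time_effects:
        return [[z[i][t] - entity_means[i] for t in range(T)] for i in range(N)]
    time_means = [mean(z[i][t] for i in range(N)) for t in range(T)]
    grand = mean(entity_means)
    return [[z[i][t] - entity_means[i] - time_means[t] + grand
             for t in range(T)] for i in range(N)]


def slope(xt, yt):
    """No-intercept OLS slope on already-demeaned data."""
    num = sum(a * b for rx, ry in zip(xt, yt) for a, b in zip(rx, ry))
    den = sum(a * a for rx in xt for a in rx)
    return num / den


entity_only = slope(demean(x, False), demean(y, False))
two_way = slope(demean(x, True), demean(y, True))
print(f"entity effects only: {entity_only:.3f}, entity and time effects: {two_way:.3f}")
```

With only entity effects removed, the common trend is still in both variables and distorts the slope; removing both effects should bring the estimate back near the true value.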

Model specification is similar to statsmodels. The example below estimates a fixed effects regression on a panel of the wages of working men, modeling the log wage as a function of squared experience, a dummy indicating if the man is married, and a dummy indicating if the man is a union member.

```
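import statsmodels.api as sm

from linearmodels.datasets import wage_panel
from linearmodels.panel import PanelOLS

# Load the wage panel and index it by entity (nr) and time (year)
data = wage_panel.load()
data = data.set_index(['nr', 'year'])

dependent = data.lwage
# Add a constant column to the regressors
exog = sm.add_constant(data[['expersq', 'married', 'union']])

# entity_effects=True includes a fixed effect for each entity
mod = PanelOLS(dependent, exog, entity_effects=True)
res = mod.fit(cov_type='unadjusted')
print(res)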
```

While the result has many properties that hold specific quantities of interest (e.g., `params` or `tstats`), the string representation of the result is a summary table.

```
                          PanelOLS Estimation Summary
================================================================================
Dep. Variable:                  lwage   R-squared:                        0.1365
Estimator:                   PanelOLS   R-squared (Between):             -0.0674
No. Observations:                4360   R-squared (Within):               0.1365
Date:                Wed, Apr 19 2017   R-squared (Overall):              0.0270
Time:                        17:48:58   Log-likelihood                   -1439.0
Cov. Estimator:            Unadjusted
                                        F-statistic:                      200.87
Entities:                         545   P-value                           0.0000
Avg Obs:                       8.0000   Distribution:                  F(3,3812)
Min Obs:                       8.0000
Max Obs:                       8.0000   F-statistic (robust):             200.87
                                        P-value                           0.0000
Time periods:                       8   Distribution:                  F(3,3812)
Avg Obs:                       545.00
Min Obs:                       545.00
Max Obs:                       545.00

                             Parameter Estimates
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
const          1.3953     0.0123     113.50     0.0000      1.3712      1.4194
expersq        0.0037     0.0002     19.560     0.0000      0.0033      0.0041
married        0.1073     0.0182     5.8992     0.0000      0.0717      0.1430
union          0.0828     0.0198     4.1864     0.0000      0.0440      0.1215
==============================================================================

F-test for Poolability: 9.3360
P-value: 0.0000
Distribution: F(544,3812)

Included effects: Entity
```

Like statsmodels, panel models can be specified using an R-like formula. This model is identical to the previous one. Note the use of the *special* variable `EntityEffects` to include the fixed effects.

```
mod = PanelOLS.from_formula('lwage ~ 1 + expersq + union + married + EntityEffects',data)
```