linearmodels.panel.data.PanelData

class PanelData(x, var_name='x', convert_dummies=True, drop_first=True, copy=True)

Abstraction to handle alternative formats for panel data

Parameters
x : {ndarray, Series, DataFrame, DataArray}

Input data

var_name : str

Variable name to use when naming variables in NumPy arrays or xarray DataArrays

convert_dummies : bool

Flag indicating whether pandas categoricals or string input data should be converted to dummy variables

drop_first : bool

Flag indicating whether to drop the first dummy category when converting

copy : bool

Flag indicating whether to copy the input. Only has an effect when x is a DataFrame

cast : bool

Flag indicating whether to cast the data to double precision.
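
As a rough illustration of what drop_first means for dummy conversion, the sketch below uses pandas.get_dummies on a categorical Series. This only shows the concept; it is an assumption that the conversion performed inside PanelData behaves the same way, and the region variable is made up for the example.

```python
import pandas as pd

# Hypothetical categorical variable with three categories
region = pd.Series(
    ["north", "south", "west", "south"], dtype="category", name="region"
)

# One indicator column per category
all_dummies = pd.get_dummies(region, prefix="region")

# Dropping the first category avoids perfect collinearity with a constant
dropped = pd.get_dummies(region, prefix="region", drop_first=True)

print(list(all_dummies.columns))
print(list(dropped.columns))
```

With drop_first the omitted category becomes the reference level against which the remaining indicators are measured.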

Raises
TypeError

If the input type is not supported

ValueError

If the input has the wrong number of dimensions or a MultiIndex DataFrame does not have 2 levels

Notes

Data can be either 2- or 3-dimensional. The three key dimensions are

  • nvar - number of variables

  • nobs - number of time periods

  • nentity - number of entities

All 3-d inputs should be in the form (nvar, nobs, nentity). With one exception, 2-d inputs are treated as (nobs, nentity), so the input is handled as if it had shape (1, nobs, nentity).

If the 2-d input is a pandas DataFrame with a 2-level MultiIndex then the input is treated differently. Index level 0 is assumed to be entity. Index level 1 is time. The columns are the variables. MultiIndex Series are also accepted and treated as single-column MultiIndex DataFrames.

Attributes
dataframe

pandas DataFrame view of data

entities

List of entity index names

entity_ids

Get array containing entity group membership information

index

Return the index of the multi-index dataframe view

isnull

Locations with missing observations

ndim

Number of dimensions of panel view of data

nentity

Number of entities

nobs

Number of time observations

nvar

Number of variables

panel

pandas Panel view of data

shape

Shape of panel view of data

time

List of time index names

time_ids

Get array containing time membership information

values2d

NumPy ndarray view of dataframe

values3d

NumPy ndarray view of panel

vars

List of variable names

Methods

copy()

Return a deep copy

count([group])

Count number of observations by entity or time

demean([group, weights, return_panel, ...])

Demeans data by either entity or time group

drop(locs)

Drop observations from the panel.

dummies([group, drop_first])

Generate entity or time dummies

first_difference()

Compute first differences of variables

general_demean(groups[, weights])

Multi-way demeaning using only groupby

mean([group, weights])

Compute data mean by either entity or time group
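
To make the quantities behind mean, count, demean, and first_difference concrete, the sketch below computes the analogous results by hand with plain pandas on an entity-time MultiIndex DataFrame. It is not the library's implementation, and treating first differences as dropping the first period of each entity is an assumption made for this illustration.

```python
import pandas as pd

# Small balanced panel: 2 entities observed over 3 periods
index = pd.MultiIndex.from_product(
    [["a", "b"], [2000, 2001, 2002]], names=["entity", "time"]
)
df = pd.DataFrame({"y": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]}, index=index)

by_entity = df.groupby(level="entity")

# Mean and number of observations for each entity
entity_mean = by_entity.mean()
entity_count = by_entity.count()

# Entity demeaning: subtract each entity's own mean from its observations
entity_demeaned = df - by_entity.transform("mean")

# First differences within each entity; the first period has no
# predecessor, so it is removed here
first_diff = by_entity.diff().dropna()

print(entity_demeaned)
print(first_diff)
```

Grouping on level="time" instead of level="entity" gives the time-group versions of the mean, count, and demeaning steps.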
