linearmodels.panel.utility.generate_panel_data

linearmodels.panel.utility.generate_panel_data(nentity: int = 971, ntime: int = 7, nexog: int = 5, const: bool = False, missing: float = 0, other_effects: int = 2, ncats: int | list[int] = 4, rng: RandomState | None = None) → PanelModelData

Simulate panel data for testing.

Parameters:
nentity: int = 971

The number of entities in the panel.

ntime: int = 7

The number of time periods in the panel.

nexog: int = 5

The number of explanatory variables in the dataset.

const: bool = False

Flag indicating that the model should include a constant.

missing: float = 0

The percentage of values that are missing. Should be between 0 and 100.

other_effects: int = 2

The number of other effects generated.

ncats: int | list[int] = 4

The number of categories to use in other_effects and variance clusters. If list-like, then it must have as many elements as other_effects.

rng: RandomState | None = None

A NumPy RandomState instance. If not provided, one is initialized using a fixed seed.

Returns:

An instance of a namedtuple-derived class containing four DataFrames:

  • data - Simulated data with variables y and x#, where # runs over 0,…,4 under the default nexog = 5. If const is True, then also contains a column named const.

  • weights - Simulated non-negative weights.

  • other_effects - Simulated effects.

  • clusters - Simulated data to use in clustered covariance estimation.

Return type:

linearmodels.panel.utility.PanelModelData