Panel Data

Panel data (or longitudinal data) typically refers to datasets that contain measurements of multiple “panel members” (objects) over multiple time periods. In other words, they are tables with more than one index: an object index and a time index.

Panel data can be presented in many ways. To illustrate the definitions above with stock-market data, the same dataset is presented below in two forms:

Long and wide forms of panel data
  • Long Form

  • Wide Form

Table 1. Share prices of companies in March 2020
Date      Company  Value ($)  Volume (M)
02/03/20  AAPL     470.89     10.5
02/03/20  AMZN     1981.03    2.3
02/03/20  CLDR     8.21       35.1
03/03/20  AAPL     471.03     11.2
03/03/20  AMZN     1981.10    1.9
03/03/20  CLDR     8.21       37.7
04/03/20  AAPL     471.02     9.4
04/03/20  AMZN     1981.11    2.2
04/03/20  CLDR     8.36       29.8

Table 2. Share prices of companies in March 2020
Date      AAPL Value  AAPL Vol  AMZN Value  AMZN Vol  CLDR Value  CLDR Vol
02/03/20  470.89      10.5      1981.03     2.3       8.21        35.1
03/03/20  471.03      11.2      1981.10     1.9       8.21        37.7
04/03/20  471.02      9.4       1981.11     2.2       8.36        29.8
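
The relationship between the long and wide forms can be sketched in code. The example below is a minimal illustration that assumes the pandas library, which this page does not itself prescribe; the names long_df, wide_df and round_trip are made up for the example. It rebuilds the long-form table, pivots it into a wide form, and then stacks it back.

```python
import pandas as pd

# Long form: one row per (date, company) pair, as in Table 1.
long_df = pd.DataFrame(
    {
        "date": ["02/03/20"] * 3 + ["03/03/20"] * 3 + ["04/03/20"] * 3,
        "company": ["AAPL", "AMZN", "CLDR"] * 3,
        "value": [470.89, 1981.03, 8.21, 471.03, 1981.10, 8.21, 471.02, 1981.11, 8.36],
        "volume": [10.5, 2.3, 35.1, 11.2, 1.9, 37.7, 9.4, 2.2, 29.8],
    }
)

# Wide form: one row per date, one column per (variable, company) pair.
wide_df = long_df.pivot(index="date", columns="company", values=["value", "volume"])

# Back to long form: move the company level of the columns into the rows.
round_trip = wide_df.stack("company").reset_index()
```

The wide table produced here groups its columns by variable first and company second, so the column order differs from Table 2 even though the cell values are the same.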

Slices of panel data

We can also take different "slices" of panel data. These produce commonly seen table forms such as time-series and cross-sectional data, which can be thought of as special cases of panel data with only one index: one object and many time points for time-series data, one time point and many objects for cross-sectional data. Examples of both are shown below:

  • Cross-sectional

  • Time-series

Table 3. Share price of companies on 02/03/20
Company  Value ($)  Volume (M)
AAPL     470.89     10.5
AMZN     1981.03    2.3
CLDR     8.21       35.1

Table 4. Share price of AAPL in March 2020
Date      Value ($)  Volume (M)
02/03/20  470.89     10.5
03/03/20  471.03     11.2
04/03/20  471.02     9.4
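
The two slices shown in Tables 3 and 4 can be sketched as selections on a two-level index. This is only an illustration, again assuming pandas; the variable names panel, cross_section and time_series are hypothetical.

```python
import pandas as pd

# Panel data indexed by both date and company (the long form of Table 1).
panel = pd.DataFrame(
    {
        "date": ["02/03/20"] * 3 + ["03/03/20"] * 3 + ["04/03/20"] * 3,
        "company": ["AAPL", "AMZN", "CLDR"] * 3,
        "value": [470.89, 1981.03, 8.21, 471.03, 1981.10, 8.21, 471.02, 1981.11, 8.36],
        "volume": [10.5, 2.3, 35.1, 11.2, 1.9, 37.7, 9.4, 2.2, 29.8],
    }
).set_index(["date", "company"])

# Cross-sectional slice: fix one date, keep every company (Table 3).
cross_section = panel.xs("02/03/20", level="date")

# Time-series slice: fix one company, keep every date (Table 4).
time_series = panel.xs("AAPL", level="company")
```

Each slice drops the index level that was held fixed, leaving a table with a single index.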

The index of time-series data is continuous, whilst the index of cross-sectional data is categorical. The continuous nature of a time-series index can be thought of as imposing adjacency, or order, on the data. This adds complexity, because rows can become dependent on each other.
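
One way to see this dependence is to compute a quantity that needs adjacent rows. The sketch below, which assumes pandas and uses illustrative names, takes the AAPL values from Table 4 and computes the day-over-day change. Each result depends on the previous row, so reordering the rows changes the answer.

```python
import pandas as pd

# AAPL share values from Table 4, indexed by date (day/month/year).
aapl = pd.Series(
    [470.89, 471.03, 471.02],
    index=pd.to_datetime(["02/03/20", "03/03/20", "04/03/20"], format="%d/%m/%y"),
    name="value",
)

# Day-over-day change: every entry is computed from the row before it.
daily_change = aapl.sort_index().diff()

# The same calculation on reversed rows gives different changes,
# because adjacency in the index is part of the data's meaning.
reversed_change = aapl.iloc[::-1].diff()
```

No comparable calculation exists for the cross-sectional slice, where the order of the companies carries no meaning.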

Modelling and generating panel data

The way in which we choose to model the panel datasets will depend on a few key traits:

Time-index

  • Total length

  • Absolute or relative

Object-index

  • Total number

Variables

  • Total number

For example, the panel data above has 2 variables for 3 panel members over 3 dates.
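
These counts can be read directly off a long-form table. The following is a minimal sketch assuming pandas; the column names and the index_columns list are choices made for this example.

```python
import pandas as pd

# Long-form panel data from Table 1.
long_df = pd.DataFrame(
    {
        "date": ["02/03/20"] * 3 + ["03/03/20"] * 3 + ["04/03/20"] * 3,
        "company": ["AAPL", "AMZN", "CLDR"] * 3,
        "value": [470.89, 1981.03, 8.21, 471.03, 1981.10, 8.21, 471.02, 1981.11, 8.36],
        "volume": [10.5, 2.3, 35.1, 11.2, 1.9, 37.7, 9.4, 2.2, 29.8],
    }
)

# Columns that act as indices rather than measured variables.
index_columns = ["date", "company"]

n_time_points = long_df["date"].nunique()    # length of the time index
n_objects = long_df["company"].nunique()     # number of panel members
n_variables = len([c for c in long_df.columns if c not in index_columns])
```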

When we then synthesize new panel data, we can either generate values within the original index domain or extrapolate outside it.
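
The distinction between the two cases can be made concrete by checking requested index values against the observed ones. The sketch below is only illustrative: it assumes pandas, and the requested dates and companies (including the ticker MSFT) are invented for the example rather than taken from the dataset above.

```python
import pandas as pd

# Index values observed in the original panel data.
observed_dates = pd.to_datetime(
    ["02/03/20", "03/03/20", "04/03/20"], format="%d/%m/%y"
)
observed_companies = {"AAPL", "AMZN", "CLDR"}

# Hypothetical index values for which synthetic data is requested.
requested_dates = pd.to_datetime(["03/03/20", "05/03/20"], format="%d/%m/%y")
requested_companies = {"AAPL", "MSFT"}

# Time index: inside the observed range, or beyond it (extrapolation).
inside = (requested_dates >= observed_dates.min()) & (
    requested_dates <= observed_dates.max()
)
within_domain = requested_dates[inside]
extrapolated = requested_dates[~inside]

# Object index: panel members that never appeared in the original data.
new_members = requested_companies - observed_companies
```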

Common types of panel datasets

Panel Data Form                    # Object Indices  # Time Indices
Type 1: Cross-sectional Data       1+                0
Type 2: Absolute Time-series Data  0                 1
Type 3: Event Data                 0+                1