Panel Data
Panel data (or longitudinal data) refers to datasets that contain measurements of multiple "panel members" (objects) over multiple time periods. In other words, they are tables with more than one index, typically an object index and a time index.
Panel data can be presented in many ways. Using the stock market as an illustration, the same dataset can be presented in two forms: a long form with one row per date and company, and a wide form with one row per date.

Date | Company | Value ($) | Volume (M) |
---|---|---|---|
02/03/20 | AAPL | 470.89 | 10.5 |
02/03/20 | AMZN | 1981.03 | 2.3 |
02/03/20 | CLDR | 8.21 | 35.1 |
03/03/20 | AAPL | 471.03 | 11.2 |
03/03/20 | AMZN | 1981.10 | 1.9 |
03/03/20 | CLDR | 8.21 | 37.7 |
04/03/20 | AAPL | 471.02 | 9.4 |
04/03/20 | AMZN | 1981.11 | 2.2 |
04/03/20 | CLDR | 8.36 | 29.8 |

Date | AAPL Value | AAPL Vol | AMZN Value | AMZN Vol | CLDR Value | CLDR Vol |
---|---|---|---|---|---|---|
02/03/20 | 470.89 | 10.5 | 1981.03 | 2.3 | 8.21 | 35.1 |
03/03/20 | 471.03 | 11.2 | 1981.10 | 1.9 | 8.21 | 37.7 |
04/03/20 | 471.02 | 9.4 | 1981.11 | 2.2 | 8.36 | 29.8 |
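The two presentations hold the same information, and one can be derived from the other. As a minimal sketch, assuming pandas is available, the long form can be given a two-level index and then pivoted into the wide form. The column labels `Value` and `Volume` and the variable names are illustrative choices, and the dates are kept as plain strings to avoid assuming a date format.

```python
import pandas as pd

# Long form: one row per (date, company) pair, values copied from the example table.
long_df = pd.DataFrame(
    {
        "Date": ["02/03/20"] * 3 + ["03/03/20"] * 3 + ["04/03/20"] * 3,
        "Company": ["AAPL", "AMZN", "CLDR"] * 3,
        "Value": [470.89, 1981.03, 8.21, 471.03, 1981.10, 8.21, 471.02, 1981.11, 8.36],
        "Volume": [10.5, 2.3, 35.1, 11.2, 1.9, 37.7, 9.4, 2.2, 29.8],
    }
)

# The two indices of the panel: time (Date) and object (Company).
panel = long_df.set_index(["Date", "Company"])

# Wide form: move the Company level into the columns, leaving one row per date.
wide_df = panel.unstack("Company")

# Flatten the (variable, company) column pairs into labels such as "AAPL Value",
# then order the columns so each company's variables sit next to each other.
wide_df.columns = [f"{company} {variable}" for variable, company in wide_df.columns]
wide_df = wide_df[sorted(wide_df.columns)]

print(wide_df)
```

Going the other way (wide to long) is the reverse operation; which form is more convenient depends on the tooling that consumes the data.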
Slices of panel data
We can also take different "slices" of panel data. These produce commonly seen table forms such as time-series and cross-sectional data, which can be thought of as special cases of panel data with only one index: one object and many time points for time-series data, one time point and many objects for cross-sectional data. In the stock example above, the AAPL rows across all three dates form a time series, while the rows for all three companies on 02/03/20 form a cross-section.
The index of time-series data is continuous, whilst the index of cross-sectional data is categorical. The continuous nature of a time-series index can be thought of as imposing adjacency, or order, on the data. This adds complexity, as rows can become dependent on each other: a value at one time point is often related to the values just before it.
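To make the two kinds of slice concrete, here is a small sketch, assuming pandas and the stock example from earlier on this page. It fixes one level of the two-level index at a time, and then computes a day-over-day change, which only makes sense because the time index is ordered. All variable names are illustrative.

```python
import pandas as pd

# Rebuild the example panel with a (Date, Company) index.
panel = pd.DataFrame(
    {
        "Date": ["02/03/20"] * 3 + ["03/03/20"] * 3 + ["04/03/20"] * 3,
        "Company": ["AAPL", "AMZN", "CLDR"] * 3,
        "Value": [470.89, 1981.03, 8.21, 471.03, 1981.10, 8.21, 471.02, 1981.11, 8.36],
        "Volume": [10.5, 2.3, 35.1, 11.2, 1.9, 37.7, 9.4, 2.2, 29.8],
    }
).set_index(["Date", "Company"])

# Time-series slice: one object (AAPL), every time point. Indexed by Date only.
aapl_series = panel.xs("AAPL", level="Company")

# Cross-sectional slice: one time point, every object. Indexed by Company only.
cross_section = panel.xs("02/03/20", level="Date")

# Order matters for a time series: each row can be compared with the one before it.
# The first entry is NaN because it has no predecessor.
aapl_change = aapl_series["Value"].diff()

print(aapl_series)
print(cross_section)
print(aapl_change)
```

The same `diff` call on the cross-section would run, but its result would depend on an arbitrary ordering of the companies, which is one way to see the difference between an ordered and a categorical index.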
Modelling and generating panel data
The way in which we choose to model the panel datasets will depend on a few key traits:
Time-index
- Total length
- Absolute or relative
Object-index
- Total number
Variables
- Total number
For example, the panel data above has 2 variables for 3 panel members over 3 dates.
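These counts can be read directly off a table with a two-level index. The sketch below assumes pandas and the stock example from earlier on this page. It also shows one possible reading of "absolute or relative": calendar dates versus offsets from the first observation. That reading, the day/month/year date format, and all variable names are assumptions made for illustration.

```python
import pandas as pd

# Rebuild the example panel with a (Date, Company) index.
panel = pd.DataFrame(
    {
        "Date": ["02/03/20"] * 3 + ["03/03/20"] * 3 + ["04/03/20"] * 3,
        "Company": ["AAPL", "AMZN", "CLDR"] * 3,
        "Value": [470.89, 1981.03, 8.21, 471.03, 1981.10, 8.21, 471.02, 1981.11, 8.36],
        "Volume": [10.5, 2.3, 35.1, 11.2, 1.9, 37.7, 9.4, 2.2, 29.8],
    }
).set_index(["Date", "Company"])

# Time-index: total length (number of distinct time points).
n_dates = panel.index.get_level_values("Date").nunique()

# Object-index: total number of panel members.
n_members = panel.index.get_level_values("Company").nunique()

# Variables: total number of measured columns.
n_variables = panel.shape[1]

# Absolute time index: calendar dates (ASSUMPTION: the strings are day/month/year).
absolute_dates = pd.to_datetime(
    panel.index.get_level_values("Date").unique(), format="%d/%m/%y"
)

# Relative time index: whole days elapsed since the first observation.
relative_days = (absolute_dates - absolute_dates.min()).days

print(n_variables, "variables,", n_members, "panel members,", n_dates, "dates")
print(list(relative_days))
```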
When we then synthesize new panel data, we can either synthesize values within the original index domain or extrapolate beyond it.
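The distinction is easiest to see in terms of which index rows the new data would occupy. The sketch below, assuming pandas, builds only the indices and does not synthesize any values. It treats extrapolation as extending the time index past the last original date, which is one possible case; the day/month/year date format, the two-day horizon, and all names are illustrative assumptions.

```python
import pandas as pd

companies = ["AAPL", "AMZN", "CLDR"]

# ASSUMPTION: the example dates are day/month/year.
original_dates = pd.to_datetime(["02/03/20", "03/03/20", "04/03/20"], format="%d/%m/%y")

# Within the original index domain: the same dates and the same panel members.
in_domain_index = pd.MultiIndex.from_product(
    [original_dates, companies], names=["Date", "Company"]
)

# Outside the original index domain: dates after the last observed one.
# The two-day horizon is arbitrary and chosen only for illustration.
future_dates = pd.date_range(
    original_dates.max() + pd.Timedelta(days=1), periods=2, freq="D"
)
extrapolated_index = pd.MultiIndex.from_product(
    [future_dates, companies], names=["Date", "Company"]
)

print(in_domain_index)
print(extrapolated_index)
```

A generator would then be asked to fill in `Value` and `Volume` for the rows of whichever index is chosen.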