Data
Each row of the dataset describes an investment vehicle at a certain date.
Here follows a concise description of the columns of the three files comprising the dataset: X_train, y_train, and X_test.
X_train:
date: A sequentially increasing integer representing a date. The time between subsequent dates is constant, denoting an unknown but fixed frequency at which the data is sampled. The initial training dataset is composed of 268 dates.
id: A unique identifier representing the investment vehicle at a given date. Note that the same asset has a different id at each date.
0,...,460: Anonymized features describing an investment vehicle at a given date. Derived from high-quality market data.
y_train:
date: Same as in X_train.
id: Same as in X_train.
y: The target value to predict. It is related to the future performance of the investment vehicle at the given date. The value is normalized between -1 and 1.
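Since X_train and y_train share the date and id columns, the two files can be joined row by row on that pair. The following is a minimal sketch with pandas, not an official loader: the helper names load_training_data and align, the default file paths, the string-typed id values, and the string feature column names "0" and "1" are all assumptions made for illustration. A tiny synthetic frame stands in for the real files so the example runs on its own.

```python
import pandas as pd

KEYS = ["date", "id"]  # key columns named in the column descriptions


def load_training_data(x_path="X_train.parquet", y_path="y_train.parquet"):
    """Read both training files. The default paths are assumptions."""
    return pd.read_parquet(x_path), pd.read_parquet(y_path)


def align(X, y):
    """Join features and targets on (date, id).

    validate="one_to_one" makes pandas raise if a (date, id) pair is
    duplicated on either side; the length check catches rows without a target.
    """
    merged = X.merge(y, on=KEYS, how="inner", validate="one_to_one")
    if len(merged) != len(X):
        raise ValueError("some (date, id) rows in X have no target in y")
    return merged.sort_values(KEYS).reset_index(drop=True)


# Synthetic stand-in for the real files: two dates, two vehicles per date.
# The id values and feature values here are made up.
X_demo = pd.DataFrame({
    "date": [0, 0, 1, 1],
    "id": ["a0", "b0", "a1", "b1"],
    "0": [0.1, -0.3, 0.2, 0.0],
    "1": [1.5, 0.7, -0.9, 0.4],
})
y_demo = pd.DataFrame({
    "date": [1, 0, 1, 0],
    "id": ["b1", "a0", "a1", "b0"],
    "y": [0.5, -1.0, 0.25, 1.0],
})

train = align(X_demo, y_demo)

# The target is described as normalized between -1 and 1.
assert train["y"].between(-1, 1).all()
```

With the real data, the same call would be `train = align(*load_training_data())`, assuming the parquet files sit in the working directory and a parquet engine such as pyarrow is installed.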
X_test:
Same structure as X_train but comprises only a few dates. This file is used to simulate the submission process locally via crunch.test() or the crunch test command. The aim is to help participants debug their code and have successful submissions. A successful local test usually means no errors during execution on the submission platform.
The dataset is obfuscated.
Files
X_train.parquet
y_train.parquet