Regression Model Performance Evaluation #494

thehanggit · 2024-12-05T00:13:38Z

The regression models are trained every three months, using data from the third day of each season. Because imputed data has no ground truth, we cannot evaluate performance directly by comparing predictions against imputed values. Instead, the evaluation assesses how well the trained model generalizes to missing data.

Evaluation Steps:

1. Run exploratory data analysis (EDA) on the training data to check whether the distribution, correlation, and normality assumptions are satisfied.
2. Choose one month of data within a season and randomly drop a subset of the observed (ground truth) data to create a test set.
3. Use the fixed coefficients from the trained model to make predictions for the test set.
4. Compare the predictions against the actual values in the test set using MAE, RMSE, and R^2.
5. Repeat the random sampling and evaluation through k-fold cross-validation to reduce the bias from any single random split.
6. Compare the results of the different regression models.
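The hold-out and scoring steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual pipeline: it assumes a plain linear model whose coefficients are already fitted and stay fixed, it interprets the k-fold repetition as scoring that fixed model on k random held-out subsets of the observed rows, and every name in it (`predict`, `regression_metrics`, `kfold_holdout_eval`, the toy data) is hypothetical.

```python
import math
import random


def predict(coefs, x_row):
    """Linear prediction from fixed coefficients: (intercept, slope_1, ...)."""
    intercept, *slopes = coefs
    return intercept + sum(b * x for b, x in zip(slopes, x_row))


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R^2 for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R^2 is undefined when the held-out values have zero variance.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


def kfold_holdout_eval(X, y, coefs, k=5, seed=0):
    """Score a fixed model on k disjoint random subsets of the observed rows.

    Each fold plays the role of the "randomly dropped" ground truth: its
    observed values are hidden, predicted from the fixed coefficients, and
    then compared with what was actually observed.
    """
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    results = []
    for fold in folds:
        held_out = [y[i] for i in fold]
        predicted = [predict(coefs, X[i]) for i in fold]
        results.append(regression_metrics(held_out, predicted))
    return results


# Toy data (made up for illustration): one feature, exact relation y = 2 + 3x.
X = [[float(i)] for i in range(20)]
y = [2.0 + 3.0 * row[0] for row in X]

# Hypothetical "trained" coefficients; a real run would load the fitted ones.
results = kfold_holdout_eval(X, y, coefs=(2.0, 3.0), k=5)
```

Running the same function once per candidate model, then averaging each metric across the folds, gives one comparable row per model for the final comparison step.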

ian-r-rose · 2024-12-05T00:16:13Z

Love this!

thehanggit · 2024-12-05T00:56:19Z

> Love this!

Thank you, @ian-r-rose! Will see what happens.

thehanggit added this to the Data Quality Checks milestone Dec 5, 2024

thehanggit self-assigned this Dec 5, 2024

jkarpen added the unplanned label (Unplanned work added to current sprint, after sprint planning) Dec 5, 2024
