Regression Model Performance Evaluation #494

Open
thehanggit opened this issue Dec 5, 2024 · 2 comments

@thehanggit
Contributor

The regression models are retrained every 3 months, using data from the 3rd day of each season. Because imputed values have no ground truth, we cannot evaluate performance directly by checking predictions at the imputed points. Instead, the evaluation hides observed values and assesses how well the trained model generalizes to missing data.

Evaluation Steps:

  1. Run exploratory data analysis (EDA) on the training data to check distributions, correlations, and whether the normality assumption is satisfied.
  2. Choose one month of data within a season and randomly drop a subset of the observed (ground-truth) values to create a test set.
  3. Use the fixed coefficients from the trained model to predict the dropped values in the test set.
  4. Compare the predictions against the actual values in the test set using MAE, RMSE, and R^2.
  5. Repeat the random sampling and evaluation with k-fold cross-validation so the results do not depend on a single random split.
  6. Compare the results of different regression models.
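
Steps 2 to 5 could be sketched roughly as below. This is only an illustration under stated assumptions, not the project's implementation: it assumes a simple linear model with one predictor, the names `regression_metrics` and `evaluate_fixed_model` are hypothetical, and the data is synthetic. Because the coefficients are fixed, the folds here only vary which observed values are hidden; a conventional k-fold would also refit the model on the remaining folds.

```python
import math
import random


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R^2 for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when the observed values have no variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


def evaluate_fixed_model(x, y, intercept, slope, k=5, seed=0):
    """Score fixed coefficients on k disjoint held-out folds of observed data.

    Hypothetical helper: assumes y ~ intercept + slope * x.
    """
    indices = list(range(len(y)))
    random.Random(seed).shuffle(indices)           # random assignment to folds
    folds = [indices[i::k] for i in range(k)]      # k disjoint subsets
    results = []
    for fold in folds:
        y_true = [y[i] for i in fold]              # observed values we hide
        y_pred = [intercept + slope * x[i] for i in fold]  # no refitting
        results.append(regression_metrics(y_true, y_pred))
    return results


# Synthetic stand-in for one month of observed data (assumption).
rng = random.Random(42)
x = [rng.uniform(0, 100) for _ in range(200)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

fold_scores = evaluate_fixed_model(x, y, intercept=2.0, slope=0.5, k=5)
mean_mae = sum(s["mae"] for s in fold_scores) / len(fold_scores)
print(f"mean MAE across folds: {mean_mae:.3f}")
```

Running the same function with each candidate model's coefficients and comparing the per-fold metrics would cover step 6.
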
@thehanggit added this to the Data Quality Checks milestone Dec 5, 2024
@thehanggit self-assigned this Dec 5, 2024
@ian-r-rose
Member

Love this!

@thehanggit
Contributor Author

> Love this!

Thank you, @ian-r-rose! Will see what happens.

@jkarpen added the unplanned label (Unplanned work added to current sprint, after sprint planning) Dec 5, 2024