plot.model_performance_explainer outliers' labels depend on the order of model input #49
Comments
Closed due to lack of human resources.
We still have the same problem in R:
library("DALEX")
#> Welcome to DALEX (version: 2.4.3).
#> Find examples and detailed introduction at: http://ema.drwhy.ai/
library("randomForest")
#> randomForest 4.7-1.1
#> Type rfNews() to see new features/changes/bug fixes.
model_apart_lm <- archivist::aread("pbiecek/models/55f19")
explain_apart_lm <- DALEX::explain(model = model_apart_lm,
data = apartments_test[,-1],
y = apartments_test$m2.price,
label = "Linear Regression")
#> Preparation of a new explainer is initiated
#> -> model label : Linear Regression
#> -> data : 9000 rows 5 cols
#> -> target variable : 9000 values
#> -> predict function : yhat.lm will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package stats , ver. 4.2.3 , task regression ( default )
#> -> predicted values : numerical, min = 1792.597 , mean = 3506.836 , max = 6241.447
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -257.2555 , mean = 4.687686 , max = 472.356
#> A new explainer has been created!
model_apart_rf <- archivist::aread("pbiecek/models/fe7a5")
explain_apart_rf <- DALEX::explain(model = model_apart_rf,
data = apartments_test[,-1],
y = apartments_test$m2.price,
label = "Random Forest")
#> Preparation of a new explainer is initiated
#> -> model label : Random Forest
#> -> data : 9000 rows 5 cols
#> -> target variable : 9000 values
#> -> predict function : yhat.randomForest will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package randomForest , ver. 4.7.1.1 , task regression ( default )
#> -> predicted values : numerical, min = 1985.837 , mean = 3506.107 , max = 5788.052
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -762.3422 , mean = 5.416971 , max = 1318.093
#> A new explainer has been created!
mr_lm <- DALEX::model_performance(explain_apart_lm)
mr_rf <- DALEX::model_performance(explain_apart_rf)
# Works well
plot(mr_rf, mr_lm,
geom = "boxplot",
show_outliers = 1)
# Doesn't assign the outliers correctly
plot(mr_lm, mr_rf,
geom = "boxplot",
show_outliers = 1)
Created on 2024-06-08 with reprex v2.0.2
Hi,
Following the example at https://pbiecek.github.io/DALEX/reference/plot.model_performance_explainer.html, if you rearrange the order of arguments from plot(mp_rf, mp_glm, mp_lm, geom = "boxplot", show_outliers = 1) to plot(mp_glm, mp_lm, mp_rf, geom = "boxplot", show_outliers = 1), you will get a graph where the outlier labels don't match the model.
It seems the models have to be passed in order from best to worst root mean square of residuals for the outlier labels to match the model.
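One way this kind of order dependence can arise is sketched below in base R. This is an assumption for illustration, not something confirmed against the DALEX source: the toy data frame, the labels "A" and "B", and the variable names are all hypothetical. If per-model ranks are computed with tapply() after the labels have been reordered by RMSE, the result comes back in factor-level order rather than row order, so it only lines up with the rows when the models were supplied best to worst.

```r
# Toy residuals for two hypothetical models, stacked in input order:
# the worse model "B" comes first, then the better model "A".
df <- data.frame(
  label = c("B", "B", "B", "A", "A", "A"),
  diff  = c(1, -9, 2, 0.5, 0.2, -3)
)

# Reorder the label levels by RMSE, so the better model "A" becomes level 1.
df$label <- reorder(df$label, df$diff, function(x) sqrt(mean(x^2)))

# Order-dependent: tapply() returns the groups in level order (A, then B),
# but the rows are stored as B, then A, so the ranks land on the wrong rows.
rank_by_level <- unlist(tapply(-abs(df$diff), df$label, rank, ties.method = "min"))

# Order-independent: ave() writes each group's ranks back to the original rows.
rank_by_row <- ave(-abs(df$diff), df$label,
                   FUN = function(x) rank(x, ties.method = "min"))

# Residual picked as the top outlier per model under each approach.
df$diff[rank_by_level == 1]  # 2.0 0.2  (neither is the largest residual)
df$diff[rank_by_row == 1]    # -9 -3    (largest absolute residual per model)
```

Swapping the two groups of rows in df, so that "A" comes first, makes both approaches agree, which matches the behaviour described above.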