Lower results than in the paper for a model (probably doing something wrong) #27
Comments
To further update: I realized I had constrained my search to only sLCWA runs, so the run above does not correspond to the one presented in the paper (Table 18). However, switching to the one in the paper gives me 0.94 instead of 0.98 on ComplEx, which I think should be good enough, given that results can never be reproduced exactly. RotatE also seems to give decent results on Kinship now (0.98 hits@10). But I still do not know why my results are remarkably lower for the settings above. The validation curves for the different models are also very strange, but I guess that is a consequence of extensive hyperparameter tuning :)
Hi @Filco306, if you are looking at the validation curves generated by

```python
if "callbacks" not in config["pipeline"]["training_kwargs"]:
    config["pipeline"]["training_kwargs"]["callbacks"] = ["evaluation-loop"]
if "callback_kwargs" not in config["pipeline"]["training_kwargs"]:
    config["pipeline"]["training_kwargs"]["callback_kwargs"] = {
        "prefix": "validation"
    }
```

you may be missing filtering with the training triples, too. To do so, you would need to pass the additional key:

```python
config["pipeline"]["training_kwargs"]["callback_kwargs"] = {
    "prefix": "validation",
    "additional_filter_triples": dataset.training,
}
```

This is a bit hidden, since this parameter is passed from the callback kwargs through to the underlying evaluation.
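For reference, the effect of this additional filter can also be seen by calling an evaluator directly after training. The following is a minimal sketch on a toy setup (the Nations dataset, a ComplEx model, and five epochs are chosen purely for illustration and are not the configuration discussed in this thread):

```python
from pykeen.datasets import get_dataset
from pykeen.evaluation import RankBasedEvaluator
from pykeen.pipeline import pipeline

# Toy setup purely for illustration; not the YAGO3-10 configuration from the thread.
dataset = get_dataset(dataset="nations")
result = pipeline(
    dataset=dataset,
    model="complex",
    training_kwargs=dict(num_epochs=5),
)

evaluator = RankBasedEvaluator(filtered=True)

# Filtered only against the evaluated (validation) triples themselves: known
# training triples still compete in the ranking, which can deflate the metrics.
without_training_filter = evaluator.evaluate(
    model=result.model,
    mapped_triples=dataset.validation.mapped_triples,
)

# Additionally filtered against the training triples, as suggested above.
with_training_filter = evaluator.evaluate(
    model=result.model,
    mapped_triples=dataset.validation.mapped_triples,
    additional_filter_triples=[dataset.training.mapped_triples],
)

print(without_training_filter.get_metric("hits@10"))
print(with_training_filter.get_metric("hits@10"))
```

The gap between the two printed hits@10 values illustrates why validation curves look deflated when the training triples are not passed to the filter.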
Hi there, thank you for your reply! :D I will re-run the experiment in question with your comment in mind and see if that fixes the results. If not, I'll get back to you :) Thanks! :D
One more thing I noticed: https://pykeen.readthedocs.io/en/stable/api/pykeen.training.callbacks.EvaluationLoopTrainingCallback.html also needs the factory on which to evaluate, i.e.,

```python
config["pipeline"]["training_kwargs"]["callback_kwargs"] = {
    "prefix": "validation",
    "factory": dataset.validation,
    "additional_filter_triples": dataset.training,
}
```
Hi again @mberr, if I add what you wrote,

```python
config["pipeline"]["training_kwargs"]["callback_kwargs"] = {
    "prefix": "validation",
    "factory": dataset.validation,
    "additional_filter_triples": dataset.training,
}
```

I get the error:

Would you know what the issue is here? Note that I still get the warning.
Okay, this seems to be a bug in PyKEEN. I used this smaller snippet to reproduce your error:

```python
from pykeen.pipeline import pipeline
from pykeen.datasets import get_dataset

dataset = get_dataset(dataset="nations")
result = pipeline(
    dataset=dataset,
    model="mure",
    training_kwargs=dict(
        num_epochs=5,
        callbacks="evaluation-loop",
        callback_kwargs=dict(
            frequency=1,
            prefix="validation",
            factory=dataset.validation,
            additional_filter_triples=dataset.training,
        ),
    ),
)
```

EDIT: I opened a ticket here: pykeen/pykeen#1213
Hello again @mberr, thank you for this! Yes, I believe it is a bug. Thank you for flagging it!
Hello!
Thank you for a nice study and a nice repository! :D I am currently trying to re-use some of the hyperparameters from the study, e.g., those for ComplEx on the YAGO3-10 dataset. However, when trying to use the config files with the current version of PyKEEN, I run into the error that `owa` is not an option and that only `['lcwa', 'slcwa']` are valid options. I saw that you renamed OWA to SLCWA, so I switched from OWA to SLCWA as instructed (see also the sketch after this message).

However, training locally with PyKEEN 1.9.0 and `slcwa` gives me very different results: on the validation set I get extremely low numbers, although the test-set results seem pretty decent (though still far from the reported metrics). For this specific run, I got the following for the corresponding metrics:

I attach my training script below; I am most likely doing something wrong or not considering some specific setting that was updated in the more recent version of PyKEEN. Thanks again for a nice tool! :)
The results from the database can be seen below.
Config file (originally this one):
Running the following gives me great output metrics:
Version:
Is there some change in the packages since it was last run that causes this mismatch, or am I perhaps using the package incorrectly? Thank you for your time, and thank you for your package!
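As a side note on the `owa` → `slcwa` rename mentioned in this message, here is a minimal sketch of how a legacy config file might be adapted before running it with a recent PyKEEN. The file name and the assumption that the training approach is stored under the `pipeline.training_loop` key are illustrative and not taken from the attached config:

```python
import json

from pykeen.pipeline import pipeline_from_config

# Hypothetical file name; use the path of the published config you want to re-run.
with open("complex_yago310.json") as f:
    config = json.load(f)

# Older configs from the study predate the OWA -> sLCWA rename, so map the
# legacy value onto the name that current PyKEEN versions expect.
if config["pipeline"].get("training_loop", "").lower() == "owa":
    config["pipeline"]["training_loop"] = "slcwa"

result = pipeline_from_config(config)
```

An equivalent one-off fix is to edit the JSON file by hand and replace the value "owa" with "slcwa", as described above.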