Performance on CUHK03 #14
Regarding the scores from the third-party re-implementation, I quickly skimmed their code and they do actually load the original splits and have the option to train with each of the splits. However, it is a little unclear whether the scores in their benchmark were obtained by actually doing the 20 trainings and reporting the average. If the results come from a single split, that would explain why the scores differ, since the performance on the different splits varies quite a bit. It's also important not to look at the CMC allshot evaluation: this is not the typical CUHK03 evaluation protocol and thus not comparable to numbers you find in the literature. When comparing their CUHK03 results (85.4) with ours (89.6/87.6), I think that slightly different implementations and test-time augmentation can explain the difference. For our CUHK03 experiments we combined the training and validation sets and used all hyperparameters exactly as for our Market-1501 and MARS training (hence we did not need a validation set). The only thing we changed was the input size, where we used 256x96 instead of 256x128 to better match the original CUHK03 aspect ratio.
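As a rough illustration of what "doing the 20 trainings and reporting the average" means under the single-shot CUHK03 protocol, here is a minimal NumPy sketch; `load_split` and `embed` are hypothetical placeholders for split loading and embedding extraction, not functions from this repository:

```python
import numpy as np

def rank1_single_shot(query_emb, gallery_emb, query_ids, gallery_ids):
    # Euclidean distance from every query embedding to every gallery embedding.
    dists = np.linalg.norm(query_emb[:, None] - gallery_emb[None, :], axis=-1)
    # Rank-1: the nearest gallery image has the same identity as the query.
    nearest = dists.argmin(axis=1)
    return float(np.mean(gallery_ids[nearest] == query_ids))

# Original CUHK03 protocol: train/evaluate on each of the 20 provided splits
# and report the mean over them (hypothetical helpers shown commented out).
# scores = [rank1_single_shot(*embed(load_split(i))) for i in range(20)]
# print("rank-1: %.1f%%" % (100 * np.mean(scores)))
```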
Dear authors,
Almost same parameters, yes. The main difference is H = 256, W = 96. As for the testing procedure, we follow the "original" 20-split one, which is detailed in the original CUHK03 paper. We have never gotten anything nearly as low as 30% rank-1; that is a very bad score, indicating you're doing something very wrong or have a bug hidden somewhere. The most frequent mistake we see is that people forget to load the pre-trained ImageNet weights.
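A minimal sketch of these two points (pre-trained weights and the 256x96 input size), using torchvision purely for illustration rather than the code from this repository:

```python
import torch
import torchvision

# Load a ResNet-50 backbone WITH its ImageNet weights; forgetting this step
# is the most common cause of very low CUHK03 scores.
backbone = torchvision.models.resnet50(pretrained=True)

# CUHK03 crops are fed in at H=256, W=96 (instead of the 256x128 used for
# Market-1501/MARS) to better match the original aspect ratio.
dummy = torch.randn(1, 3, 256, 96)
out = backbone(dummy)  # ResNet's adaptive pooling handles non-square inputs
print(out.shape)       # torch.Size([1, 1000])
```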
Thank you for the quick response. I will double-check and come back with details.
Actually, I am interested in the mAP score of TriNet on CUHK03.
I don't fully understand what you mean @liangbh6. In case you didn't notice, we have included CUHK03 scores in the latest arXiv version of the paper.
I have found the rank-1 and rank-5 scores on CUHK03 in the latest arXiv version of the paper, but mAP is a different measure.
@liangbh6 aaah! In fact both @lucasb-eyer and I were a bit confused by your comment since we do provide CUHK03 results, but only now do I realize that we do not provide the mAP score. This is simply because the mAP score is not meaningful on the CUHK03 dataset, since you can only retrieve a single ground-truth match. Some of the more recent papers stopped using the original evaluation protocol and instead created a single new train-test split, for which mAP seems to make more sense. It should be noted, though, that these scores are not comparable, and you should always pay attention to the evaluation protocol when looking at CUHK03 scores in a paper. To be honest, even within the original evaluation protocol there are some ambiguities, and a lot of papers seem to evaluate in a slightly different way. I have always wondered how comparable the scores are at all; the new split might actually fix this to some extent.
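To make the "single ground-truth match" point concrete, here is a small sketch (not from our code) of the standard re-id average precision; with only one correct match in the gallery, AP degenerates to 1 divided by the rank of that match, so it adds little beyond the CMC curve:

```python
import numpy as np

def average_precision(ranked_gallery_ids, query_id):
    # Standard re-id AP for a single query over a ranked gallery list.
    matches = np.asarray(ranked_gallery_ids) == query_id
    if not matches.any():
        return 0.0
    hits = np.cumsum(matches)
    precision_at_hits = hits[matches] / (np.flatnonzero(matches) + 1)
    return float(precision_at_hits.mean())

# With a single ground-truth match, AP is simply 1 / (rank of that match):
print(average_precision(["x", "x", "q", "x"], "q"))  # 0.333... (match at rank 3)
```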
Well, thanks for your explanation! |
Hello, authors.
I was wondering if you could provide some extra details about training on CUHK03. There is a third-party re-implementation of your work. This implementation shows almost the same performance on Market-1501, and according to their benchmarks they did not use test-time data augmentation. However, your performance on CUHK03 is quite different from theirs. Why? Can test-time augmentation influence the final result that much? By the way, did you use only one GPU for training?
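For context, test-time augmentation in re-id usually means averaging the embeddings of an image and its augmented copies. A minimal PyTorch-style sketch of the common flip-averaging variant, assuming a hypothetical `model` that maps image batches to embeddings (not the exact scheme used in the paper, which also averages over several crops):

```python
import torch

def embed_with_flip_tta(model, images):
    # Average the embedding of each image and of its horizontal flip.
    emb = model(images)
    emb_flipped = model(torch.flip(images, dims=[3]))  # flip along the width axis
    return (emb + emb_flipped) / 2.0
```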