Randomness introduced when loading cache files in JSON #13
Comments
This behavior cannot be controlled by manual seeding, which is what makes me uncomfortable. I think everything should be in order before e.g. |
toolkit/got10k/datasets/vid.py Line 41 in c5d3cb1
A simple solution is to change this line to |
Thanks for reporting the randomness issue and also proposing a solution. Using |
Hi @ZhouYzzz, the seq_dict = json.load(f, object_pairs_hook=OrderedDict) You could access the revision using |
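For reference, a minimal sketch of what the `object_pairs_hook=OrderedDict` call quoted above does. The JSON text and the key names are made-up stand-ins, not the toolkit's real cache file.

```python
import io
import json
from collections import OrderedDict

# Hypothetical stand-in for the JSON cache; the real file content is not shown in this thread.
cache_text = '{"seq_b.0": [3, 4], "seq_a.0": [1, 2], "seq_c.1": [5, 6]}'

with io.StringIO(cache_text) as f:
    # object_pairs_hook receives the key/value pairs in file order,
    # so the OrderedDict keeps that order on every Python version.
    seq_dict = json.load(f, object_pairs_hook=OrderedDict)

seq_names = list(seq_dict)
print(seq_names)  # ['seq_b.0', 'seq_a.0', 'seq_c.1']
```

With a plain `json.load(f)` the same keys can come back in a different order on Python 3.5, which matches the behavior reported in this issue.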
That is great, thank you! |
I have been troubled by the randomness of ImageNetVID, and finally found the reason. In some versions of Python, e.g. Python 3.5, the cache files in JSON format are loaded in an arbitrary order that changes between runs. This does not happen with Python 2.7 or Python 3.6. It has greatly hindered me from reproducing my experiments, since a different order of the training data in each run leads to different gradients in early epochs when training SiamFC. I suggest caching the dataset in a more stable way, e.g. using numpy or cPickle, and using OrderedDict or similar.
Details: the same check gives ILSVRC2015_train_00000000.0 twice when using Python 3.6 (completely in order), gives ILSVRC2015_train_00646001.0 twice when using Python 2.7 (not in order but repeatable), but gives ILSVRC2015_train_00053009.1 and ILSVRC2015_train_00047000.2 when using Python 3.5.
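One possible way to make the cache order independent of the Python version, sketched under assumptions: store the entries as a sorted JSON array of [name, value] pairs, since JSON arrays keep their order on load. The helper names `to_cache_text` and `from_cache_text` and the sample keys are hypothetical, not part of the toolkit.

```python
import json
from collections import OrderedDict


def to_cache_text(seq_dict):
    # Hypothetical helper: serialize as a list of [name, value] pairs sorted by name.
    # A JSON array is ordered, so the order no longer depends on dict internals.
    return json.dumps(sorted(seq_dict.items()))


def from_cache_text(text):
    # Hypothetical helper: rebuild an OrderedDict from the ordered pair list.
    return OrderedDict(json.loads(text))


# Made-up sequence names and values, for illustration only.
seq_dict = {"seq_b.0": [3, 4], "seq_a.0": [1, 2]}
restored = from_cache_text(to_cache_text(seq_dict))
print(list(restored))  # ['seq_a.0', 'seq_b.0']
```

Sorting at write time means every run iterates the sequences in the same order, whatever order the dict had when the cache was built.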