What's new?
This release incorporates additional training data, specifically aiming to improve our coverage of:
- Boats and trains in the "vehicle" class
- Artificial objects (e.g. bait stations, traps, lures) that frequently overlap with animals
- Rodents, particularly at close range
- Reptiles and small birds
This release also represents a change in MegaDetector's architecture, from Faster-RCNN to YOLOv5. Our inference scripts have been updated to support both architectures, so the transition should be mostly seamless. Users will need to update to the latest version of the repo and follow our updated setup instructions in order to use MDv5, but the inference scripts work the same way they always have.
Why are there two files?
MDv5 is actually two models (MDv5a and MDv5b), differing only in their training data (see the training data section for details). Both appear to be more accurate than MDv4, and both are 3x-4x faster than MDv4, but each MDv5 model can outperform the other slightly, depending on your data. When in doubt, for now, try them both. If you really twist our arms to recommend one... we recommend MDv5a. But try them both and tell us which works better for you! The pro tips section in the MegaDetector User Guide contains some additional thoughts on when to try multiple versions of MD.
A word on confidence thresholds
MDv5 uses the full range of confidence values much more than MDv4 does, so don't apply the MDv4 thresholds you're accustomed to when using MDv5 results. Typical confidence thresholds for MDv4 were in the 0.7-0.8 range; for MDv5, they are in the 0.15-0.25 range. Does that make MDv5 better-calibrated? Who knows, but what we know for sure is that if you apply a confidence threshold like 0.8 to MDv5 results, you'll miss some animals.
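To make the threshold difference concrete, here is a minimal sketch of filtering detections by confidence. The detection dictionaries, the `filter_detections` helper, and the specific threshold values (picked from within the typical ranges quoted above) are illustrative assumptions, not part of the MegaDetector scripts.

```python
# Illustrative thresholds picked from the typical ranges quoted above;
# always tune these for your own data.
TYPICAL_THRESHOLDS = {"MDv4": 0.8, "MDv5": 0.2}


def filter_detections(detections, threshold):
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d["conf"] >= threshold]


# Hypothetical MDv5 detections for a single image (made-up values)
detections = [
    {"category": "1", "conf": 0.92},
    {"category": "1", "conf": 0.31},
    {"category": "2", "conf": 0.07},
]

kept_v5 = filter_detections(detections, TYPICAL_THRESHOLDS["MDv5"])
kept_v4_style = filter_detections(detections, TYPICAL_THRESHOLDS["MDv4"])

# Applying an MDv4-style threshold to MDv5 output drops the 0.31 detection
print(len(kept_v5), len(kept_v4_style))
```

In this made-up example, the MDv4-style threshold discards a detection that the MDv5-appropriate threshold keeps, which is the failure mode described above.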
Will MDv5 files work with Timelapse?
Short answer: yes!
Long answer...
Many users read MegaDetector results into Timelapse. MDv4 and MDv5 results generated with our batch inference script are compatible with old versions of Timelapse, but as noted above, MDv4 and MDv5 operate in different confidence ranges, and the default confidence thresholds in old versions of Timelapse were set to be reasonable for MDv4. We want users to have reasonable default values regardless of which model version they're using, so we've updated the MegaDetector output format to include recommended defaults. As of the just-released Timelapse version 2.2.5.2, Timelapse reads those defaults, so users will see reasonable default thresholds for either MDv4 or MDv5 files. You should be prompted to update Timelapse when you next start it, or you can visit the Timelapse download page to get the latest version. If you are using Timelapse version 2.2.5.1 or earlier and for any reason you can't upgrade, just remember to lower those thresholds if you're working with MDv5 results; per the typical ranges given earlier, MDv5 thresholds sit well below the MDv4-oriented defaults.
But of course, as always, never trust default confidence values; always spend some time choosing thresholds that are appropriate for your data!
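For anyone reading results files in their own code, a rough sketch of picking up a recommended default might look like the following. The field names used here (`info`, `detector_metadata`, `typical_detection_threshold`) are assumptions about the output format, and `recommended_threshold` is a hypothetical helper; check the output format documentation for the actual names before relying on this.

```python
import json


def recommended_threshold(results, fallback=0.8):
    """Return the recommended threshold stored in a results dict, if any.

    The field names below are assumptions about the output format. The
    fallback applies to files without recommended defaults, which we
    assume here are older MDv4-era files.
    """
    info = results.get("info") or {}
    metadata = info.get("detector_metadata") or {}
    return metadata.get("typical_detection_threshold", fallback)


# Hypothetical file contents, inlined as strings so the example is self-contained
new_style = json.loads(
    '{"info": {"detector_metadata": {"typical_detection_threshold": 0.2}},'
    ' "images": []}'
)
old_style = json.loads('{"info": {}, "images": []}')

print(recommended_threshold(new_style))
print(recommended_threshold(old_style))
```

Whatever value comes back is only a starting point; as noted above, thresholds should still be chosen per data set.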
And remember to thank your friendly neighborhood Timelapse developer (Saul) for all the work he put in to supporting this release.
Is it really more accurate and faster than MDv4? That seems too good to be true.
Well, it's definitely faster, that part is easy.
Out of what we think is a healthy paranoia, we've spent a bunch of time over the last few weeks trying to find cases where MDv5 is less accurate than MDv4. Although we have found a few individual images where MDv4 finds something that MDv5 misses (because deep learning can be maddening that way), on every data set we've tried with any realistic number of images, we've failed to find cases where MDv4 is the more accurate option, and on many data sets that challenged MDv4, MDv5 is substantially more accurate. But we're sure that if we keep looking, we will find a case where MDv4 is better for some esoteric reason, because... see above, re: deep learning being maddening.
So as much as we want to hear about your successes with MDv5, we're also eager to hear about the cases where it fails, or isn't as good as MDv4. Email us to let us know how things compare for you, or file issues on this repo, or post your experiences to the AI for Conservation forum at WILDLABS or the AI for Conservation Slack group.
What if I'm still cautious and/or skeptical? Can I compare MDv4 and MDv5 results?
Although we wouldn't exactly call this "user-friendly" yet, in the course of our own comparisons on mostly-unlabeled data, we put together a script to compare sets of batch results, which we've primarily used to compare MDv4/MDv5a/MDv5b. This produces pages like this one, which asks "what does each model find at a reasonable confidence threshold that the other models miss?". If all models are equal, each model will have around the same number of detections that the others don't have, and they will all be junk / false detections. Hopefully this script is helpful to others who will be doing similar comparisons.
(Take that particular results link with a grain of salt; it's a very practical data set for demonstrating this comparison script, but the data it contains is from the ENA24 data set, which is included in MDv5's training data, and not MDv4's.)
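The core question the comparison asks ("what does each model find that the other misses?") can be sketched at the image level as below. This is not the comparison script itself, just a simplified illustration; the `images`/`file`/`detections`/`conf` field names are assumptions about the batch results format, and the helper names and data are made up.

```python
def images_with_detections(results, threshold):
    """Return the set of files with at least one detection at or above threshold."""
    return {
        im["file"]
        for im in results["images"]
        if any(d["conf"] >= threshold for d in (im.get("detections") or []))
    }


def compare_results(results_a, threshold_a, results_b, threshold_b):
    """Split images into those found only by A, only by B, or by both."""
    found_a = images_with_detections(results_a, threshold_a)
    found_b = images_with_detections(results_b, threshold_b)
    return {
        "only_a": sorted(found_a - found_b),
        "only_b": sorted(found_b - found_a),
        "both": sorted(found_a & found_b),
    }


# Hypothetical results for the same three images from two models
results_v4 = {"images": [
    {"file": "a.jpg", "detections": [{"category": "1", "conf": 0.85}]},
    {"file": "b.jpg", "detections": [{"category": "1", "conf": 0.40}]},
    {"file": "c.jpg", "detections": []},
]}
results_v5 = {"images": [
    {"file": "a.jpg", "detections": [{"category": "1", "conf": 0.60}]},
    {"file": "b.jpg", "detections": [{"category": "1", "conf": 0.35}]},
    {"file": "c.jpg", "detections": [{"category": "1", "conf": 0.05}]},
]}

# Each model gets its own threshold, since the confidence ranges differ
comparison = compare_results(results_v4, 0.8, results_v5, 0.2)
print(comparison)
```

Images that land in `only_a` or `only_b` are the ones worth reviewing by eye: if the models are equally good, those lists should be similar in size and mostly false detections.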
Along the same lines, in some cases, users may want to run more than one model, and take, e.g., only the very-high-confidence detections from MDv5b (or MDv4) and merge them into an otherwise-MDv5a results file. We have a new script to do this as well, although we think this will rarely be necessary.
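As a rough illustration of the merging idea (not the merge script itself), the sketch below copies very-high-confidence detections from one results set into another, skipping any that overlap a box already present. The `[x, y, width, height]` box layout, the field names, the helper names, and both thresholds are assumptions made for the example.

```python
import copy


def iou(box_a, box_b):
    """Intersection over union for boxes given as [x, y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    overlap_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = overlap_w * overlap_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def merge_high_confidence(target, source, min_conf=0.9, max_iou=0.5):
    """Return a copy of target with confident, non-overlapping source detections added."""
    merged = copy.deepcopy(target)
    source_by_file = {im["file"]: im for im in source["images"]}
    for im in merged["images"]:
        src = source_by_file.get(im["file"])
        if src is None:
            continue
        existing = im.get("detections") or []
        added = [
            copy.deepcopy(det)
            for det in (src.get("detections") or [])
            if det["conf"] >= min_conf
            and all(iou(det["bbox"], e["bbox"]) < max_iou for e in existing)
        ]
        im["detections"] = existing + added
    return merged


# Hypothetical results for one image from two models
target = {"images": [{"file": "a.jpg", "detections": [
    {"category": "1", "conf": 0.30, "bbox": [0.1, 0.1, 0.2, 0.2]},
]}]}
source = {"images": [{"file": "a.jpg", "detections": [
    {"category": "1", "conf": 0.95, "bbox": [0.1, 0.1, 0.2, 0.2]},  # overlaps an existing box
    {"category": "1", "conf": 0.97, "bbox": [0.6, 0.6, 0.2, 0.2]},  # new and confident
    {"category": "1", "conf": 0.40, "bbox": [0.0, 0.7, 0.1, 0.1]},  # below min_conf
]}]}

merged = merge_high_confidence(target, source)
print(len(merged["images"][0]["detections"]))
```

One design choice to note: an overlapping source detection is skipped even when it is more confident than the box already in the target, because confidence values from different models are not directly comparable.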
Tell us how it goes!
We are really excited about this release... but really what we're even more excited about is hearing from users about how it can help your work. So please email us to let us know how MDv5 works for you, or file issues on this repo. No detail is too small to let us know about, especially as it pertains to the updated setup/execution instructions. If you're confused, someone else is too.
And if you want to discuss your MDv5 experiences with the community, consider posting to the AI for Conservation forum at WILDLABS or the AI for Conservation Slack group.
We look forward to hearing from you!
Release cloned from Microsoft/CameraTraps, original release posted by agentmorris on Jun 20, 2022.