First lookup for an ID is slow #376
Comments
Yeah, it seems possible that the page cache just isn't warm and that prefault, for whatever reason, isn't working on your machine. If you want to warm up the page cache, consider doing a sequential scan or a few thousand random lookups first.
Looking at the code, it seems like prefault only triggers if MAP_POPULATE is defined. It does not seem to be set anywhere in the library. Is that something defined on the system in question? Part of compilation? Something I should have set in my code calling the library? In short, is there a way for me to check on a particular computer/installation if that flag is enabled?
Edit: Thank you so much for the quick response, by the way! I really appreciate the library and documentation.
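For anyone wanting a quick check on a given machine, here is a minimal sketch using only Python's standard library. It relies on `mmap.MAP_POPULATE`, which Python exposes from version 3.10 onward on platforms whose C headers define the flag. Note the assumption: this reports what the Python build sees, which is a strong hint but not proof of what the Annoy extension itself was compiled with. The function name `map_populate_available` is made up for this example.

```python
import mmap
import sys


def map_populate_available() -> bool:
    """Return True if Python's mmap module exposes MAP_POPULATE on this system."""
    # The constant is only present on Python 3.10+ and only where the
    # platform's headers define it (for example Linux, but not macOS).
    return hasattr(mmap, "MAP_POPULATE")


if __name__ == "__main__":
    print(f"platform={sys.platform}, MAP_POPULATE available: {map_populate_available()}")
```

On an older Python the check returns False even on Linux, so a False result there is inconclusive.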
There's some doc here: http://man7.org/linux/man-pages/man2/mmap.2.html We should probably throw a warning or similar if MAP_POPULATE isn't defined and |
I ended up moving to a different ANN library because of this. My best guess about the problem is that something about the cloud/Kubernetes configuration where the code was running caused MAP_POPULATE to be undefined, but I did not confirm.
@scottbreyfogle why is this a problem, though? You can just do some random lookups to warm up the page cache (or just loop through all vectors). Anyway, closing for now.
That is a possibility that I considered. It was really a cost-benefit. A 15 second latency hit was really not acceptable for us, and I decided that it would be more reliable to switch to something I understood more. If I can't understand fully what is going wrong (only guess), then I'm not comfortable making guarantees about the comprehensiveness of a solution. Are there multiple data structures that need to be warmed? If I call a It was just simpler to move to a solution where I didn't have to test all that, especially when validation needed to be done mostly on a remote machine with a very large dataset, since the issue is hard to detect locally or with a small dataset. Closing seems reasonable though. |
Seems fair. Generally, I don't think the Linux kernel guarantees that mmapped pages stay resident in primary memory, but I could be wrong. I'll look into the mmap flags again at some point.
Hello @erikbern, |
Yeah, basically you need to scan through the index and make sure every page is hit (I think the Linux page size is 4 KB?). So scan through and hit maybe every 100 vectors, and that should be fine (it probably won't be much slower to hit every vector, actually).
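A rough sketch of that scan, for reference. It assumes an already loaded index object that offers `get_n_items()` and `get_item_vector(i)`, as Annoy's Python `AnnoyIndex` does. The helper name `warm_up` and the `stride` parameter are made up for this example.

```python
def warm_up(index, stride=1):
    """Read every `stride`-th item vector so its pages get pulled into the page cache.

    `index` is any object with get_n_items() and get_item_vector(i).
    Returns the number of items touched.
    """
    n_items = index.get_n_items()
    touched = 0
    for i in range(0, n_items, stride):
        # The returned vector is discarded; the point is the read itself.
        index.get_item_vector(i)
        touched += 1
    return touched
```

A note on the stride: assuming 32-bit floats, a 200-dimensional vector takes at least 800 bytes, so only about five vectors fit in a 4 KB page, and a stride of 100 would skip most pages. `stride=1` is the safe default. This also only touches item vectors; whether that covers the pages holding tree nodes is not confirmed anywhere in this thread, so treat it as a partial warm-up.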
@erikfox what about the distribution of the hits across the index? Should we pick the hits at random (one per 100 vectors) so that they are likely to be spread uniformly over the data?
A comment as an SRE: relying on the disk cache makes performance unstable.
@scottbreyfogle which solution did you move to in order to address this problem? I am facing similar issues.
you can load it with |
@erdtman I get the following error (I am working on a Mac): prefault is set to true, but MAP_POPULATE is not defined on this platform.
Yeah, I believe prefault doesn't work on OS X, unfortunately.
as a workaround, you could iterate over all indices and run |
@erikbern what will happen when doing |
@loretoparisi Typically the kernel will cache that file in memory, meaning subsequent random access to it will be very fast.
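The same effect can be sketched from Python by reading the index file front to back once and throwing the data away. This is only an illustration under the assumption that the kernel keeps recently read file pages cached, which it usually does but does not guarantee under memory pressure. The helper name `read_through` is made up for this example.

```python
def read_through(path, chunk_size=1 << 20):
    """Read the file at `path` sequentially, discarding the data.

    Returns the total number of bytes read. The read is what asks the
    kernel to bring the file's pages into the page cache.
    """
    total = 0
    # buffering=0 avoids an extra Python-level buffer; we only want the reads.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

Calling it once on the index file before serving queries costs one full sequential read of the file. It has to be repeated if the pages are later evicted.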
https://serverfault.com/a/43391 confirms this :)
I'm running into an interesting issue -- when I query locally (1.5m items, 128-d, num_trees = 100, k = 1000, search_k = 500000), I always get sub 100ms. This is on OSX, where |
I've confirmed with vmtouch that my local environment is correctly mmapping the index, whereas on GAE it is not (even with both
I don't think prefault is supported on all platforms.
I believe |
This worked like magic. Thanks!
I'm seeing a problem when using the library where calling get_nns_by_vector is very slow on first execution for a given value (15s for an index with ~2m vectors of length 200), and then much faster in subsequent calls (subsecond). I've also seen similar behavior in get_item_vector, but I'm not sure if it's related. I've set prefault to true when loading the index from disk.
The strangest part is that it only occurs on some computers, and I'm not able to repro on my local machine to get details on what is going on. If I load the same index on my computer, the execution time is always subsecond. I'm continuing to look into this to see if I can find a reliable method of reproducing.
Has anyone seen performance patterns like this? Do you have thoughts on what the problem may be? I'm not sure that it's Annoy specifically, but would be good to know if it is.
My current thinking is that it has to do with the mmap: the index is not fully loaded into RAM, and the disk lookups are slow on some machines. I'm not very familiar with mmap and would appreciate outside thoughts on whether that's reasonable and/or how to verify if it's the problem.
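One simple way to gather evidence is to time the same call several times in a row and compare the first duration with the rest. Below is a minimal sketch using only the standard library. The helper name `time_calls` is made up for this example, and the callable passed in is whatever query you want to measure.

```python
import time


def time_calls(fn, repeats=5):
    """Call fn() `repeats` times and return each call's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations
```

For example, `time_calls(lambda: index.get_nns_by_vector(v, 10))`, where `index` is a loaded index and `v` a query vector (both placeholders here). A first duration that is far larger than the following ones is consistent with pages being read from disk on first access, though it does not prove that mmap is the cause.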