-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Optimize astar #142
Optimize astar #142
Conversation
valid_exit() in the benchmark will receive negative x and y deltas from get_available_exist() which when the point they are examining a loc which is x=0 or y=0 would create a negative search position. This in turn would panic in algorithm2d when trying to coerce them into usize.
[Note: I am making a larger PR fixing performance on astar but I will break them up into individual commits so the relative impact can be seen and reasoned with vs one big commit] I noticed a missing 'break' in the original code but then converted it to use find (code generated seemingly has same performance). Here is the output from bench: Benchmarking a_star_test_map: Collecting 100 samples in estimated 8.4233s (10k iterations) Benchmarking a_star_test_map: Analyzing a_star_test_map time: [821.89 us 847.62 us 877.18 us] change: [-10.059% -6.2216% -2.1936%] (p = 0.00 < 0.05) Performance has improved.
This is very minor perfwise but it seems reasonable to not make Node until you know you should_add.
Reduces checking to see if an element exists with a second retrieval with a single get. Get will return None if it does not exist so in essence this halves hashing/bounds checks. Benchmarking a_star_test_map: Analyzing a_star_test_map time: [733.21 us 748.35 us 766.88 us] change: [-11.883% -9.4497% -6.8425%] (p = 0.00 < 0.05) Performance has improved.
This is a very minor perf change but it just avoids making an object unless you need it.
We know how many elements will be in answer so use with_capcity instead of new.
The check and remove is not needed because an insert happens right after that.
Err the first commit is the commit is the other PR so I guess only merging this PR would fix the other open PR as well. |
I just realized this benchmark is making a mistake on the last point of get_available_exits. This ends up causing a lot more searching but it this is ok since it is now properly determining all possible valid exits. Technically, this commit should have been another PR... |
I just realized that I made an error in a294e8c. Using parents obviously works but way over allocates. I will think about this for a little bit and revert that change if I do not come up with something more reasonable. |
This reverts commit a294e8c.
A small gain in this benchmark by trading n memcpy from insert 1 vs n memory location swaps (pushes + reverse). If we could know size this would be even bigger as I think LLVM would basically perform the reverse for us for free.
I wanted to make sure I was actually not breaking astar based on amethyst#142.
BinaryHeap iter() will return elements in an arbitrary order so since we cannot take advantage of f being in an order we may as well use idx as the first comparison operation.
Compiler seems to figure out this need not be on the heap perhaps? Pretty decent perf win
Any news about this old P.R. @thebracket? |
As author of this PR I am willing to resolve merge conflicts but another idea would be to just use pathfinding crate for astar (and pretty much any algo for searching). In my own project which has a similar map (this was a long time ago so my memory is fuzzy) but I think using an iterator for proposed moves (get_available_exits) and pathfinding was like 20x faster than this. It might be better to open a newer PR which makes those changes...or not. I just remembered this is a public repo: https://github.com/enebo/mappy/blob/main/src/map.rs#L244 . This just gives an idea and mappy is not exactly the same as bracket-lib but the benchmark itself is the same so it can be compared. |
This is a series of small discrete changes which improves the performance of astar:
I have a second PR for getting the bench running due to some sign issue making usizes which is needed to be able to reproduce these results.
I have a third PR which adds some minimal unit tests
*Note: This was updated to reflect latest updates and to use the fixed benchmark master:
3072985