Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Area LibGEOS Polygon problems #32

Open
skygering opened this issue Dec 27, 2023 · 8 comments
Open

Area LibGEOS Polygon problems #32

skygering opened this issue Dec 27, 2023 · 8 comments

Comments

@skygering
Copy link
Collaborator

I think the LibGEOS memory leak is causing havok far and wide. I was just doing some quick benchmarks on the new area code and just about had a heart attack. I was using this polygon:

import LibGEOS as LG
import GeoInterface as GI
import GeometryOps as GO
using BenchmarkTools
pLG = LG.Polygon([
    [
        [0.0, 5.0], [2.0, 2.0], [5.0, 2.0], [2.0, -2.0], [5.0, -5.0],
        [0.0, -2.0], [-5.0, -5.0], [-2.0, -2.0], [-5.0, 2.0], [-2.0, 2.0],
        [0.0, 5.0],
    ],
])

pGI = GI.Polygon([
    [
        (0.0, 5.0), (2.0, 2.0), (5.0, 2.0), (2.0, -2.0), (5.0, -5.0),
        (0.0, -2.0), (-5.0, -5.0), (-2.0, -2.0), (-5.0, 2.0), (-2.0, 2.0),
        (0.0, 5.0),
    ],
])

GO.equals(pLG, pGI)  # true

@benchmark LG.area(pLG)
@benchmark GO.area(pLG)
@benchmark GO.area(pGI)

I got the following:
Screenshot 2023-12-27 at 2 49 01 PM

So it seems like the method itself isn't bad, but it does NOT play nice with LibGEOS.

@rafaqz
Copy link
Member

rafaqz commented Dec 27, 2023

It could be type instability too

Or an inneficient GeoInterface path

(nope its just LibGEOS)

@evetion
Copy link
Member

evetion commented Jan 1, 2024

Screenshot 2024-01-01 at 14 49 21

This was runtime dispatch on GeoInterface.x and y. Solved by JuliaGeo/LibGEOS.jl#191, which brings the benchmark to 4μs for me (still slow, but it's copying points from C, so expected).

@rafaqz
Copy link
Member

rafaqz commented Jan 1, 2024

That's not time for copying points from C - it's mostly for allocating arrays in Julia, the yellow Array

@evetion
Copy link
Member

evetion commented Jan 1, 2024

That's not time for copying points from C - it's mostly for allocating arrays in Julia, the yellow Array

That picture was the before state. The afterward profile is in the other issue in LibGEOS.

@rafaqz
Copy link
Member

rafaqz commented Jan 1, 2024

Ok right, I think there are still a bunch of allocations happening though when each point is loaded.

I can get it 7x faster changing area in the benchmark to use getpoint with getgeom(linestring) implemented in JuliaGeo/LibGEOS.jl#193.

See #36 for faster area and benchmarks

@rafaqz rafaqz mentioned this issue Jan 1, 2024
@rafaqz
Copy link
Member

rafaqz commented Jan 1, 2024

Also see:
JuliaGeo/LibGEOS.jl#196

Turns out most of the time is making clones. Without them we are down to:

julia> @benchmark GO.area($pLG)
BenchmarkTools.Trial: 10000 samples with 337 evaluations.
 Range (min  max):  256.193 ns   10.465 μs  ┊ GC (min  max): 0.00%  96.32%
 Time  (median):     260.564 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   277.483 ns ± 167.507 ns  ┊ GC (mean ± σ):  1.02% ±  1.67%

  ▅█▄▅▄▄▃▂▁  ▁        ▂▁ ▂  ▁▃  ▂▁                              ▂
  ████████████▇▇██▇█▇█████▇▆██████▇▇█▇▅▅▇▇▃▅▄▇▃▃▄▃▆▃▁▁▃▇▃▁▁▄▄▅▅ █
  256 ns        Histogram: log(frequency) by time        418 ns <

 Memory estimate: 48 bytes, allocs estimate: 2.

@rafaqz
Copy link
Member

rafaqz commented Jan 2, 2024

With some fixes to context:

julia> @benchmark GO.area($pLG)
BenchmarkTools.Trial: 10000 samples with 535 evaluations.
 Range (min  max):  210.733 ns    6.821 μs  ┊ GC (min  max): 0.00%  96.09%
 Time  (median):     213.314 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   220.193 ns ± 135.810 ns  ┊ GC (mean ± σ):  1.36% ±  2.14%

  ▄██▄▂▃▄▁  ▁▁                                     ▂      ▁     ▂
  ████████▇████▇▅▆▅▆▆▅▆▅▅▄▅▆▅▅▃▃▅▅▁▁▃▅█▇▄▁▃▅█▇▄▁▅▃██▇▁▄▅▄██▅▁▁▄ █
  211 ns        Histogram: log(frequency) by time        284 ns <

 Memory estimate: 48 bytes, allocs estimate: 2.

@rafaqz
Copy link
Member

rafaqz commented Jan 2, 2024

And fixing how we get points a bit more:

julia> @benchmark GO.area($pLG)
BenchmarkTools.Trial: 10000 samples with 870 evaluations.
 Range (min  max):  128.115 ns    4.316 μs  ┊ GC (min  max): 0.00%  95.92%
 Time  (median):     130.037 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   138.450 ns ± 117.333 ns  ┊ GC (mean ± σ):  2.64% ±  3.03%

  ▅█▆▃▅▄ ▃▃▁▁▃▃                                  ▁              ▁
  █████████████▇▆▆▆▆▆▇▇▇█▇▇▆▇▇▇▇▇▇▅▅▅▄▅▇▆▅▅▅██▅▅▆███▅▆▆▇▇▆▃▃▂▃▆ █
  128 ns        Histogram: log(frequency) by time        185 ns <

 Memory estimate: 64 bytes, allocs estimate: 3.

So under 4x using LibGEOS directly is possible.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants