Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Using bounding_rect() on a GeometryCollectionArray causes a stack overflow #979

Closed
BlakeOrth opened this issue Jan 28, 2025 · 6 comments · Fixed by #984
Closed

Using bounding_rect() on a GeometryCollectionArray causes a stack overflow #979

BlakeOrth opened this issue Jan 28, 2025 · 6 comments · Fixed by #984

Comments

@BlakeOrth
Copy link
Contributor

I recently upgraded to the most recently published 0.4.0-beta.3 branch so I could use geoarrow with the updated geo dependency. However, as the title states, using bounding_rect() on a GeometryCollectionArray causes a stack overflow. Reproduction code below:

    #[test]
    fn stack_overflow_repro() {
        let points: Vec<geo::Point<f64>> = vec![
            point! {x: -104.991, y: 39.7392},
            point! {x: -73.7897, y: 40.7379},
        ];
        let point_array: PointArray = (points.as_slice(), Dimension::XY).into();
        assert!(point_array.is_valid(0));
        assert!(point_array.is_valid(1));
        let _ = point_array.bounding_rect();

        let gc_array: GeometryCollectionArray = point_array.into();
        assert!(gc_array.is_valid(0));
        assert!(gc_array.is_valid(1));
        let _ = gc_array.bounding_rect();

        panic!("unreachable code");
    }
@kylebarron
Copy link
Member

Thanks; this is a helpful bug report! Is there a traceback you can include?

@BlakeOrth
Copy link
Contributor Author

Unfortunately I couldn't coerce a backtrace from either this reproduction test, or my actual test case that started failing after updating. It seems the normal RUST_BACKTRACE doesn't apply to stack overflow panics like it does to normal panics. If you know something I don't I'm happy to try alternative options to get one though.

I guess I should probably be a bit more clear with the bug report. I can't guarantee that the issue only exists in 0.4.0-beta.3. The removal of the MixedGeometryArray from that version forced me to refactor to using a GeometryCollectionArray so there's a possibility this behavior has existed prior to 0.4.0-beta.3.

@kylebarron
Copy link
Member

Unless you specifically have geometry collections, you should use GeometryArray if you were previously using MixedGeometryArray.

That said, we should fix this because it'll likely pop up in other places where we use GeometryCollections.

I don't have a guess as to what's happening off the top of my head

@BlakeOrth
Copy link
Contributor Author

Thanks for the tip on using GeometryArray, I'll refactor into using it instead. It wasn't immediately obvious which array I should use to migrate, so that might be a good item to include in the release notes of 0.4.0 whenever it's ready!

As far as this bug, I've done some digging for us and I think I've found it. In short, geo::Geometry::from(value: GeometryCollection) ends up calling itself. The rough callstack from bounding_rect() is through the iter_geo() method which ends up invoking the above method.

If you place this test in src/scalar/geometrycollection/scalar.rs it shows the behavior:

#[cfg(test)]
mod tests {
    use arrow_buffer::OffsetBufferBuilder;

    use crate::array::PointArray;

    use super::*;

    #[test]
    fn stack_overflow_repro() {
        let array: MixedGeometryArray =
            PointArray::from((vec![geo::point!(x: 0., y: 0.)].as_slice(), Dimension::XY)).into();
        let mut offsets = OffsetBufferBuilder::new(1);
        offsets.push_length(1);
        let offsets = offsets.finish();
        let gc = GeometryCollection::new(&array, &offsets, 0);

        let _: geo::Geometry = gc.into();

        panic!("unreachable code")
    }
}

@kylebarron
Copy link
Member

#984 should fix this.

By the way

using bounding_rect()

Since bounding rect is such a simple algorithm to implement (well, excluding the antimeridian), we have a "native" impl as well, that doesn't convert coordinates to geo objects first. That should be slightly faster.

Now that I'm looking at that code again, the naming could probably be better though. BoundingRect being a trait in the geo impl and a struct in the native impl isn't great. (Any naming suggestions are welcome)

@BlakeOrth
Copy link
Contributor Author

Thanks for the tip on the native impl, I'll move to using that instead.

As far as naming goes (one of the truly hard problems in code), after looking at both the trait and the struct, I'm wondering if renaming the struct to Bounds might end up being a bit more appropriate. It seems like keeping geo::BoundingRect in line with the geo crate is important for discoverability/parity between this crate and geo. Additionally, the native BoundingRect struct includes an optional Z dimension (and I suspect potentially M at some point in the future) which, if we're being pedantic, makes it optionally not a "Rectangle." Other options might be something like Bbox or Extent, but those already feel like overloaded terms in the various "geo" code bases that exist.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants