Project future? #121

Open
drwelby opened this issue Nov 7, 2024 · 3 comments

Comments

@drwelby

drwelby commented Nov 7, 2024

For @geospatial-jeff but also @vincentsarago, @dmahr1, @kylebarron and other interested parties.

Are any of you still interested in working on this project? Async COG reading is quite useful to us Maxar folks, so if we become more active in using and maintaining it, would any of you still want to handle the top-level management: steering the project, accepting PRs, and so on? We can certainly work in our own fork, but if there's interest in "handing this off", I'd be happy to discuss.

@kylebarron

I think there's significant potential in an async COG reader for Python. But personally I think the greatest potential would be to implement this in Rust with bindings to Python.

Rust has proven its potential as a language that's stable, easy to maintain, easy to bind to Python, and really fast.

Look at our project obstore. It's a Rust-powered Python library for interacting with AWS/GCS/Azure object storage (GET/HEAD/LIST/DELETE, etc.). It's useful in its own right, but also as a benchmarking tool to see how fast the async Rust-Python integration can be. In https://github.com/geospatial-jeff/pyasyncio-benchmark @geospatial-jeff is working on benchmarks for obstore, and his early results indicate that it may enable significantly higher throughput than aioboto3.

This is consistent with the initial results that Earthmover found with Icechunk: Zarr v3 with Icechunk as the IO layer can be 2x faster than Zarr v3 on its own. The likely mechanism is that Icechunk also uses async Rust for its IO layer.

A Rust COG reader could potentially be even faster by decoding image data on a separate thread from the coroutine, so we could stack improvements in the IO and CPU layers.
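To illustrate the idea of keeping decoding off the coroutine's thread, here is a minimal Python sketch. It is not aiocogeo or aiocogeo-rs code: `fetch_tile`, `read_tile`, `read_tiles`, and `make_store` are hypothetical names, the "network" is an in-memory dict with a simulated delay, and `zlib` stands in for whatever image codec a real COG tile would use.

```python
import asyncio
import zlib


def make_store(n: int) -> dict[int, bytes]:
    """Build a hypothetical in-memory 'object store' of n compressed tiles."""
    return {i: zlib.compress(bytes([i]) * 1024) for i in range(n)}


async def fetch_tile(tile_id: int, store: dict[int, bytes]) -> bytes:
    """Stand-in for an async range request against remote storage."""
    await asyncio.sleep(0.01)  # simulated latency; yields to the event loop
    return store[tile_id]


async def read_tile(tile_id: int, store: dict[int, bytes]) -> bytes:
    compressed = await fetch_tile(tile_id, store)
    # Decode in a worker thread so the event loop stays free to drive
    # other in-flight requests while this tile is decompressed.
    return await asyncio.to_thread(zlib.decompress, compressed)


async def read_tiles(tile_ids, store: dict[int, bytes]) -> list[bytes]:
    """Fetch and decode several tiles concurrently, preserving input order."""
    return await asyncio.gather(*(read_tile(t, store) for t in tile_ids))


if __name__ == "__main__":
    store = make_store(4)
    tiles = asyncio.run(read_tiles(range(4), store))
    print([len(t) for t in tiles])
```

In pure Python, the benefit depends on the decoder releasing the GIL during the work; a Rust implementation would not have that constraint, which is part of the appeal described above.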

I spent some time prototyping a Rust port of aiocogeo in https://github.com/developmentseed/aiocogeo-rs. I was able to fully read COG metadata but didn't yet implement data reads, so there's no benchmark yet. There's a separate geotiff Rust crate, but it builds on the tiff crate, which doesn't have async support. There's some ongoing discussion of this in georust/geotiff#13.
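For readers unfamiliar with what "reading COG metadata" starts from, here is a rough sketch of the very first step: parsing the TIFF header from the leading bytes of the file to find the first IFD (image file directory). This is not taken from aiocogeo-rs; `parse_tiff_header` is a hypothetical helper, and a real reader would go on to fetch and parse the IFD entries themselves.

```python
import struct


def parse_tiff_header(header: bytes) -> dict:
    """Parse the TIFF/BigTIFF header and return where the first IFD lives."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes for a TIFF header")

    # Bytes 0-1 give the byte order: "II" is little-endian, "MM" is big-endian.
    order = header[:2]
    if order == b"II":
        endian, name = "<", "little"
    elif order == b"MM":
        endian, name = ">", "big"
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")

    # Bytes 2-3 are the magic number: 42 for classic TIFF, 43 for BigTIFF.
    (magic,) = struct.unpack(endian + "H", header[2:4])
    if magic == 42:
        # Classic TIFF: 4-byte offset to the first IFD.
        (ifd_offset,) = struct.unpack(endian + "I", header[4:8])
        return {"byte_order": name, "bigtiff": False, "first_ifd_offset": ifd_offset}
    if magic == 43:
        # BigTIFF: offset size (always 8), a reserved zero, then an 8-byte offset.
        if len(header) < 16:
            raise ValueError("need at least 16 bytes for a BigTIFF header")
        offset_size, reserved, ifd_offset = struct.unpack(endian + "HHQ", header[4:16])
        if offset_size != 8 or reserved != 0:
            raise ValueError("malformed BigTIFF header")
        return {"byte_order": name, "bigtiff": True, "first_ifd_offset": ifd_offset}
    raise ValueError(f"not a TIFF file: unexpected magic number {magic}")


if __name__ == "__main__":
    # A little-endian classic TIFF header whose first IFD starts at byte 8.
    print(parse_tiff_header(b"II*\x00\x08\x00\x00\x00"))
```

Because the header and IFDs sit at the front of a COG, an async reader can typically get all of this from one small range request before deciding which tile byte ranges to fetch.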

I'd love to push aiocogeo-rs forward to at least get some benchmarks and see if it's worth continuing development, but I'm working on a lot of projects and it's hard to dedicate time to it without funding.

I'm looking forward to hearing others' thoughts as well!

@drwelby
Author

drwelby commented Nov 7, 2024

Thanks Kyle, there's some very exciting potential here. Let me catch up on all your links. Maybe the best route here is that we all leapfrog ahead and support the Rust work where we can.

@rdenham

rdenham commented Feb 20, 2025

I would look forward to something in Rust, and I may forward this project's information to a Rust colleague of mine who would be better placed to contribute.

In case it's useful, I hacked together a similar Python approach based on this project: https://gitlab.com/jrsrp/sys/asyncog. It's not as polished as the original, but it could be useful for those who need to stay within the Python ecosystem.
