Automatically calculate country codes per quadkey & remove country_iso flag #29
Labels
get_buildings
Issues related to the get_buildings operations
help wanted
Extra attention is needed
interesting
Issues that are interesting problems that could be fun to take on
It seems like it should be possible to determine the country_iso automatically, which would substantially speed up the query. The current method of having the user supply it is error-prone and annoying.
The idea would be to calculate the list of country_iso values for every quadkey. It has to be a list, since quadkeys can cross borders, and big ones could contain a hundred or more countries. But most should cover one or a handful of countries, which will almost always speed up the query.
There are about 16.8 million quadkeys at level 12 (4^12), but many are likely in the ocean. We could likely use quadkeys at level 10 or even 8 instead: a coarser quadkey would pull in a couple more hive partitions per lookup, but that should still be worth it.
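To make the size trade-off concrete, here is a quick sketch of how many quadkeys exist at each candidate level. The helper name `quadkey_count` is just for illustration; the only fact used is that each level splits every tile into four.

```python
def quadkey_count(level: int) -> int:
    """Number of quadkeys (tiles) at a zoom level: each level splits a tile in 4."""
    return 4 ** level


for level in (8, 10, 12):
    print(f"level {level}: {quadkey_count(level):,} quadkeys")
```

These are upper bounds on the lookup table size, since tiles that touch no country (open ocean) would not need a row.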
So I think the main task is to write a script that generates the list of country_iso codes for every quadkey, then store the result as a parquet file. If it's not too big, we could likely just include it in the open_buildings package.
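A minimal sketch of what such a script might look like, using only the standard library. Everything here is an assumption for illustration: the `TOY_BOXES` regions and their codes (`AAA`, `BBB`) are made up, and bounding boxes stand in for real country polygons, which would over-report countries near borders. A real version would intersect tiles with actual country geometries and write the table to parquet; that step is left out.

```python
import math
from itertools import product

# Made-up regions as (west, south, east, north) in degrees. NOT real countries.
TOY_BOXES = {
    "AAA": (-10.0, 0.0, 10.0, 20.0),
    "BBB": (5.0, 10.0, 30.0, 40.0),
}


def quadkey_to_tile(quadkey):
    """Decode a quadkey into (x, y, zoom): bit 0 of each digit is x, bit 1 is y."""
    x = y = 0
    for digit in quadkey:
        d = int(digit)
        x = (x << 1) | (d & 1)
        y = (y << 1) | ((d >> 1) & 1)
    return x, y, len(quadkey)


def tile_bounds(x, y, zoom):
    """Return (west, south, east, north) of a Web Mercator tile in degrees."""
    n = 2 ** zoom

    def lon(tx):
        return tx / n * 360.0 - 180.0

    def lat(ty):
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * ty / n))))

    return lon(x), lat(y + 1), lon(x + 1), lat(y)


def boxes_overlap(a, b):
    """True if two (west, south, east, north) boxes share interior area."""
    return a[0] < b[2] and a[2] > b[0] and a[1] < b[3] and a[3] > b[1]


def countries_for_quadkey(quadkey, boxes=TOY_BOXES):
    """Sorted list of region codes whose box overlaps the quadkey's tile."""
    bounds = tile_bounds(*quadkey_to_tile(quadkey))
    return sorted(code for code, box in boxes.items() if boxes_overlap(bounds, box))


def build_lookup(level, boxes=TOY_BOXES):
    """Map every quadkey at `level` to its code list, skipping empty tiles."""
    lookup = {}
    for digits in product("0123", repeat=level):
        quadkey = "".join(digits)
        codes = countries_for_quadkey(quadkey, boxes)
        if codes:
            lookup[quadkey] = codes
    return lookup
```

Skipping tiles with no match keeps ocean quadkeys out of the table, which should help with the "not too big" concern.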
If we had this then we could remove the country_iso flag, as we'd be able to always use a hive partition.
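As a rough sketch of how the lookup could replace the flag, the function below turns a user-supplied quadkey into a partition filter. The names `partition_filter`, `lookup`, and `lookup_level` are hypothetical, and the lookup is assumed to be a dict from quadkeys at one fixed level to code lists. It relies on the fact that a parent tile's quadkey is a prefix of its children's quadkeys.

```python
def partition_filter(quadkey, lookup, lookup_level):
    """Build a SQL predicate on country_iso for a quadkey, or None if no match.

    `lookup` maps quadkeys at `lookup_level` to lists of country_iso codes.
    """
    if len(quadkey) >= lookup_level:
        # Finer (or equal) quadkey: its ancestor at lookup_level is a prefix.
        codes = set(lookup.get(quadkey[:lookup_level], []))
    else:
        # Coarser quadkey: union the codes of every lookup tile beneath it.
        codes = set()
        for key, values in lookup.items():
            if key.startswith(quadkey):
                codes.update(values)
    if not codes:
        return None
    quoted = ", ".join(f"'{code}'" for code in sorted(codes))
    return f"country_iso IN ({quoted})"


# Toy lookup at level 2 with made-up codes.
toy_lookup = {"12": ["AAA"], "13": ["AAA", "BBB"]}
print(partition_filter("1203", toy_lookup, 2))  # country_iso IN ('AAA')
print(partition_filter("1", toy_lookup, 2))     # country_iso IN ('AAA', 'BBB')
```

Returning `None` when nothing matches lets the caller decide whether to skip the query entirely or fall back to an unfiltered scan.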