add option to download reduced size images from Mapillary #56

Open
ericfrench2015 opened this issue Jul 3, 2024 · 7 comments
Assignees
Labels
enhancement New feature or request

Comments

@ericfrench2015

ericfrench2015 commented Jul 3, 2024

[EDIT from @danbjoseph added here so it is visible at the top of this thread]

this issue is to add an optional flag to the assign_images step when using the MAPILLARY option to download the reduced size images instead of the full-size originals. this will save on disk space and doesn't seem to negatively impact the results of the facebook_mask2former-swin-large-cityscapes-semantic segmentation. we want to keep the default of downloading the full-size originals because we don't know what resolution images work best with future models.
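A minimal sketch of how such a flag might look, assuming an argparse-based command line. The flag name `--image-size` and the helpers `build_parser` and `thumbnail_field` are hypothetical and not taken from the project; only the two field names `thumb_original_url` and `thumb_2048_url` come from this thread. The default stays on the full-size originals, as described above.

```python
import argparse

# Field names mentioned in this thread; the size keys are illustrative.
SIZE_TO_FIELD = {
    "original": "thumb_original_url",
    "2048": "thumb_2048_url",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical --image-size flag."""
    parser = argparse.ArgumentParser(description="assign_images (sketch)")
    parser.add_argument(
        "--image-size",
        choices=sorted(SIZE_TO_FIELD),
        default="original",  # keep full-size originals as the default
        help="which Mapillary image size to download",
    )
    return parser


def thumbnail_field(image_size: str) -> str:
    """Map the flag value to the field name requested from Mapillary."""
    return SIZE_TO_FIELD[image_size]
```

For example, `thumbnail_field(build_parser().parse_args(["--image-size", "2048"]).image_size)` would select `thumb_2048_url`, while passing no flag keeps `thumb_original_url`.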


Per Dan's request...

Running the sample code a while ago I ended up with 2600 images on disk at 12-13MB each, roughly 32GB. The hypothesis is that if you can optionally download smaller versions of the images (the thumb_2048_url field instead of thumb_original_url in the request), that would significantly reduce the compute and network resources brought to bear on the task while still being accurate enough for it.
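As a quick check of the disk estimate, here is the arithmetic for the figures quoted above (2600 images at 12 to 13 MB each), assuming decimal units where 1 GB = 1000 MB. The helper name `estimate_disk_gb` is made up for illustration.

```python
def estimate_disk_gb(image_count: int, mb_per_image: float) -> float:
    """Total size in GB, assuming 1 GB = 1000 MB."""
    return image_count * mb_per_image / 1000


low_gb = estimate_disk_gb(2600, 12)   # lower bound of the quoted range
high_gb = estimate_disk_gb(2600, 13)  # upper bound of the quoted range
print(f"{low_gb:.1f} to {high_gb:.1f} GB")
```

This gives a range of about 31 to 34 GB, consistent with the "roughly 32GB" figure.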

I did try just swapping out the params as mentioned above for a quick test and got the following error, but if I had to bet I'd say it's because I'm doing something wrong. Hoping an expert can weigh in.

400 Client Error: Bad Request for url:
https://graph.mapillary.com/images?access_token=--REDACTED--&fields=id%2Cthumb_2048_url%2Cgeometry&is_pano=true&bbox=-85.65528860431792%2C41.95015961908728%2C-85.65510860413792%2C41.950339619267275
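For reference, a sketch of how a request URL with this shape could be assembled. It only reproduces the query string seen in the error above (same endpoint, parameter names, and bbox ordering) and makes no claim about what caused the 400 response. The function name `build_images_url` is hypothetical, and no network call is made.

```python
from urllib.parse import urlencode

GRAPH_IMAGES_URL = "https://graph.mapillary.com/images"


def build_images_url(access_token, bbox, thumb_field="thumb_original_url", is_pano=True):
    """Assemble the images query URL; bbox uses the same ordering as the URL above."""
    params = {
        "access_token": access_token,
        # Request the image id, one thumbnail URL field, and the geometry.
        "fields": ",".join(["id", thumb_field, "geometry"]),
        "is_pano": "true" if is_pano else "false",
        "bbox": ",".join(str(value) for value in bbox),
    }
    # urlencode percent-encodes the commas as %2C, matching the logged URL.
    return f"{GRAPH_IMAGES_URL}?{urlencode(params)}"


url = build_images_url("TOKEN", (-85.6, 41.9, -85.5, 42.0), thumb_field="thumb_2048_url")
print(url)
```

Swapping `thumb_field` between `thumb_original_url` and `thumb_2048_url` is the only change between the two download modes in this sketch.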

@dragonejt dragonejt self-assigned this Jul 10, 2024
@dragonejt dragonejt added the enhancement New feature or request label Jul 10, 2024
@dragonejt
Contributor

dragonejt commented Jul 10, 2024

Do you want the scripts to download just thumb_2048_url, or to have the option to switch between thumb_original_url and thumb_2048_url? I am currently able to download thumb_2048_url images.

@ericfrench2015
Author

Channeling Dan, I imagine he would want it as an option

@danbjoseph
Member

i think the first thing would be to test whether running the segmentation on the 2 different image sizes ends up with different results?

@ericfrench2015
Author

that can be done, but depending on your definition of "different" it's pretty much a foregone conclusion you're going to get a +/- some small percentage just due to different pixel counts in the images. I suggest we would get far bigger bang for the buck at this stage by validating and taking action on my comments re: replacing Otsu thresholding with a fixed value.

@danbjoseph
Member

danbjoseph commented Jul 14, 2024

this is a very old post but i think it suggests that there might be more than just a small +/- ?

"Since you probably don't know exactly how these networks were trained I would suggest that you resize your images to match the approximate shape that the network you are using was trained on."

also, if we're pulling 360 degree images, do they capture more of a scene and so are compressed more by shrinking them down to a set width than doing so with something like an image from a regular GoPro with a field of vision of about 170 degrees?
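A rough way to put numbers on this question, assuming both image types are scaled to the same 2048px width and that pixels are spread evenly across the horizontal field of view (this ignores lens projection and distortion). The function name `pixels_per_degree` is made up for illustration.

```python
def pixels_per_degree(width_px: int, horizontal_fov_deg: float) -> float:
    """Horizontal pixels available per degree of scene, under the assumptions above."""
    return width_px / horizontal_fov_deg


pano = pixels_per_degree(2048, 360)   # 360 degree panorama
gopro = pixels_per_degree(2048, 170)  # camera with roughly 170 degree field of view
ratio = gopro / pano

print(f"pano: {pano:.1f} px/deg, gopro: {gopro:.1f} px/deg, ratio: {ratio:.2f}")
```

Under these assumptions the panorama gets about 5.7 pixels per degree versus about 12.0 for the 170 degree camera, so a given object would be covered by roughly half as many horizontal pixels in the panorama.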

@ericfrench2015
Author

Sure, my comment was scoped to the green pixel counting approach implemented in this project.

Here's a sample of results variation using facebook_mask2former-swin-large-cityscapes-semantic - interestingly the sky and vegetation portions are very similar, it's in the jumble of stuff at street level where there are big differences.

The pretty intense backlighting I'm sure doesn't help, but I agree a larger scale analysis of the effect of image resolution is absolutely worth the effort. I'll aim for an analysis of 100 images quantifying the sky and vegetation percentages by ... let's say Wednesday?

As you also point out, there are many other variables - focal length, camera orientation, distance to objects in the photo, orientation of sun and clouds. It would be good to have a discussion on your priorities for assessing each based on most likely kinds of photo inputs you expect.

As a side-note, I believe the mapillary segmentation is done on one image (probably the highest resolution) and then the user has to scale the polygons accordingly.

image
image

@danbjoseph danbjoseph changed the title from "look at possibility for optionally downloading 2048px wide images" to "add option to download reduced size images from Mapillary" Aug 26, 2024
@ericfrench2015
Author

ericfrench2015 commented Aug 31, 2024

Update on this - I performed a comparison on about 1200 street-level images (orig vs smaller size) downloaded for Medan, Indonesia. The average difference in the percent of the image attributed to vegetation is 0.4%, with extreme outliers in the +/- 10% range. Looking at those outliers, the primary source of the deviation appears to be flipping between classifying pixels as vegetation vs terrain (which includes grassy fields) or vs the dash/hood of the car, which if masked would reduce this issue greatly. Based on this I expect that the reduction in compute resources from operating on smaller files is worth it. Note that an assessment of which size is more accurate was out of scope for this analysis, but anecdotally I think they're about the same. Both make mistakes, just different ones.

Summary and examples below. Happy to share more detailed data if anyone is interested.

image
mapillary_compare_orig_vs_small-471867191708434
mapillary_compare_orig_vs_small-122944657251916
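A sketch of how a comparison like the one described above could be computed, not the code actually used for it. It assumes each segmentation result is a 2D grid of class IDs and that vegetation has ID 8, as in the usual Cityscapes train-ID ordering; check the model's own label mapping before relying on that. The function names, the 5 point outlier threshold, and the sample numbers are all made up for illustration.

```python
from statistics import mean

VEGETATION_ID = 8  # assumed Cityscapes train ID for "vegetation"


def vegetation_percent(label_map, vegetation_id=VEGETATION_ID):
    """Percent of pixels in a 2D label map (list of rows) labelled vegetation."""
    total = sum(len(row) for row in label_map)
    hits = sum(row.count(vegetation_id) for row in label_map)
    return 100.0 * hits / total


def summarize(pairs, outlier_threshold=5.0):
    """Compare per-image vegetation percentages given as {id: (orig, small)}."""
    diffs = {image_id: small - orig for image_id, (orig, small) in pairs.items()}
    return {
        "mean_diff": mean(diffs.values()),
        "mean_abs_diff": mean(abs(d) for d in diffs.values()),
        "outliers": sorted(i for i, d in diffs.items() if abs(d) >= outlier_threshold),
    }


# Made-up percentages for illustration only, not results from this thread.
example = {"a": (30.0, 30.5), "b": (12.0, 11.5), "c": (40.0, 50.0)}
summary = summarize(example)
print(summary)
```

Reporting both the signed and the absolute mean helps separate a systematic shift between the two sizes from differences that cancel out across images.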

3 participants