
The visual grounding results don't seem to work very well #25

Open
MingkunLishigure opened this issue Sep 3, 2024 · 2 comments

@MingkunLishigure

Hello, thank you for your outstanding work!
I am trying to test LHRS-Bot on some other datasets, such as HRSC-2016, a remote sensing dataset containing different types of ships.
I use the FINAL.pt from the development checkpoint.
The inference results are shown in the images below:
test1
test2
test3
test4

From the results, the model seems able to describe the images reasonably well, but its performance on the visual grounding task is not satisfactory. Is this because the dataset I am using is too far out of domain, because there is a problem with the prompt I used, or because there is an error in my visualization process?
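For reference, this is roughly the coordinate conversion I use when visualizing the predicted boxes. It is a minimal sketch that assumes the model emits `[x1, y1, x2, y2]` normalized to `[0, 100]`, which may not match LHRS-Bot's actual output format:

```python
def to_pixel_box(box, width, height, scale=100.0):
    """Convert a predicted box [x1, y1, x2, y2], assumed to be
    normalized to [0, scale], into integer pixel coordinates
    for an image of the given width and height."""
    x1, y1, x2, y2 = box
    return (round(x1 / scale * width), round(y1 / scale * height),
            round(x2 / scale * width), round(y2 / scale * height))

# Example: a box covering the center of an 800x400 image
print(to_pixel_box([10, 20, 50, 60], 800, 400))  # (80, 80, 400, 240)
```

If the model's boxes are actually normalized to a different range (e.g. `[0, 1]` or `[0, 1000]`), a wrong `scale` here would produce exactly the kind of misplaced boxes shown above.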

I also tested on everyday images and found the same problem. Could you tell me whether the current model can be expected to ground objects reliably in images of this type?

Image:
demo

[VG] Bus:

test5

Thank you very much!

@pUmpKin-Co
Collaborator

Hi!

Thank you for pointing this out.

I believe the root problem is related to #27.

We will continue to improve the model's VG ability.

Thanks!

@MingkunLishigure
Author

Thank you for your answer! Hope everything goes well!
