Question about VG detection #27
Hello, our IoU code was adapted from here. Thanks for pointing this out!
This was referenced Sep 27, 2024
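The DIOR-RSVG results have the same problem. MiniGPT-v2 reaches 80.65 without any fine-tuning on remote sensing data, whereas in another paper, H2RSVLM, models without fine-tuning score roughly 30-40 and stay below 50 even after fine-tuning on the training set. I suspect the evaluation code of another paper, SkyEyeGPT, has the same bug, so its RSVG and DIOR-RSVG results are wrong as well.
Using the RSVG and DIOR-RSVG predictions the author provided in #18, I measured 0.31% and 11.91%, respectively. Using the FINAL.pt checkpoint provided by the author, I measured 1.72% and 12.93%, respectively. The figure below shows the score difference before and after the bug fix:
I patched the current evaluation code so that the normalized coordinates are converted back before the IoU is computed.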
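The patch itself is not reproduced in this thread, so the following is only a minimal sketch of the idea, under stated assumptions: boxes are `(x1, y1, x2, y2)` normalized to `[0, 1]`, and the image width and height are known at evaluation time. The names `denormalize_box` and `pixel_iou` are hypothetical and do not come from the repository.

```python
def denormalize_box(box, width, height):
    """Scale a normalized (x1, y1, x2, y2) box back to pixel coordinates."""
    x1, y1, x2, y2 = box
    return (x1 * width, y1 * height, x2 * width, y2 * height)


def pixel_iou(box1, box2):
    """IoU with the pixel-inclusive +1 convention; valid only for pixel boxes."""
    x1, y1, x2, y2 = box1
    x3, y3, x4, y4 = box2
    inter_w = max(0.0, min(x2, x4) - max(x1, x3) + 1)
    inter_h = max(0.0, min(y2, y4) - max(y1, y3) + 1)
    inter = inter_w * inter_h
    area1 = (x2 - x1 + 1) * (y2 - y1 + 1)
    area2 = (x4 - x3 + 1) * (y4 - y3 + 1)
    return inter / (area1 + area2 - inter)


# Hypothetical example: two side-by-side boxes in an 800x800 image.
width, height = 800, 800
pred = denormalize_box((0.10, 0.10, 0.30, 0.30), width, height)
gt = denormalize_box((0.32, 0.10, 0.52, 0.30), width, height)
print(pixel_iou(pred, gt))  # the boxes are disjoint, so the IoU is zero
```

Once the boxes are in pixel units, the +1 terms only add one pixel to each side length, so they no longer dominate the result.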
This was referenced Sep 30, 2024
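Hello, when I ran the model on the RSVG dataset, I found that the visualized detection results were very poor, for example:
However, running main_vg.py, I also reproduced the reported score of 71: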
[09/25 16:00:08 train]: Full config saved to /gpfsdata/home/zhangchenkai/LHRS-Bot/output_vg/config.json
[09/25 16:00:08 train]: accelerator: gpu
adjust_norm: false
alignment_dim: 768
batch_size: 1
bf16: true
bits: 16
config: null
data_path: /gpfsdata/home/zhangchenkai/LHRS-Bot/rsvg/images
data_target: /gpfsdata/home/zhangchenkai/LHRS-Bot/rsvg/RSVG_test.json
double_quant: true
dtype: float16
enable_amp: true
entity: pumpkinn
epochs: 2
eval:
  dataset: AID
fp16: false
generate: false
gpus: 0
inf_sampler: false
is_distribute: false
local_rank: 0
lora:
  enable: false
  lora_alpha: 256
  lora_bias: none
  lora_dropout: 0.05
  lora_r: 128
lr: 0.0002
max_grad_norm: 0.3
model_path: /gpfsdata/home/zhangchenkai/LHRS-Bot/Stage3/FINAL.pt
optimizer: adanp
opts: null
output: /gpfsdata/home/zhangchenkai/LHRS-Bot/output_vg
project: MaskIndexNet
prompt_template: llava_llama_2
quant_type: nf4
rank: 0
rgb_vision:
  arch: vit_large
  attn_pooler:
    num_attn_heads: 16
    num_layers: 6
    num_query: 144
  input_patchnorm: false
  input_size:
  patch_dropout: 0.0
  tune_pooler: true
  vit_name: openai/clip-vit-large-patch14
sar_vision:
  activate: sigmoid
  alpha: 0.2
  arch: base
  branch_temp: 0.07
  decoder:
    heads: 12
    hidden_size: 768
    layers: 12
    mask_color: mean
    mask_ratio: 0.6
  focal_gamma: 1.0
  in_chans: 2
  input_size:
  loss_weight: 1.0
  n_queries: 256
  online_temp: 0.1
  reduction: none
  residual: false
  unmask_weight: 0.0
  warmup_branch_temp: 0.04
  warmup_branch_temp_epochs: 2
schedule:
  decay_epochs: 30
  decay_rate: 0.1
  gamma: 0.1
  min_lr: 2.0e-05
  multisteps: []
  name: cosine
  warmup_epochs: 100
  warmup_factor: 0.01
  warmup_method: linear
seed: 322
stage: 0
text:
  bos_token_id: 1
  eos_token_id: 2
  hidden_act: silu
  hidden_size: 4096
  initializer_range: 0.02
  intermediate_size: 11008
  max_position_embeddings: 2048
  num_attention_heads: 32
  num_hidden_layers: 32
  pad_token_id: 0
  path: /gpfsdata/home/zhangchenkai/download/Llama-2-7b-chat-hf
  rms_norm_eps: 1e-5
  tie_word_embeddings: false
  use_cache: true
  vocab_size: 32000
transform:
  input_size:
  rand_aug: rand-m5-n2-mstd0.5-inc1
tune_im_patch: false
tune_im_start: false
tune_rgb_bk: false
tune_rgb_pooler: false
use_checkpoint: false
wandb: false
wd: 0.0
workers: 1
world_size: 1
[09/25 16:00:08 train]: Creating model
/gpfsdata/home/zhangchenkai/miniconda3/envs/lhrs/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:13<00:00, 6.95s/it]
[09/25 16:00:42 train]: Data Length: 1227
[09/25 16:00:42 train]: Loading pretrained checkpoint from /gpfsdata/home/zhangchenkai/LHRS-Bot/Stage3/FINAL.pt
[09/25 16:00:48 train]: Loading RGB encoder.
[09/25 16:00:48 train]: After loading RGB encoder: Missing: []. Unexpected: []
[09/25 16:00:48 train]: Loadding LoRA parameters.
Evaluating: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.23k/1.23k [13:36<00:00, 1.50it/s]
[09/25 16:14:37 train]: result file saved to /gpfsdata/home/zhangchenkai/LHRS-Bot/output_vg/eval_save_file.json
[09/25 16:14:37 train]: Accuracy: 71.54088050314465
[09/25 16:14:37 train]: Fail Sample: 3
[09/25 16:14:37 train]: Accuracy With Fail Sample: 71.20500782472612
Inspecting the code, I found the following in the IoU function in main_vg.py:
def calculate_iou(box1, box2):
    x1, y1, x2, y2 = box1
    x3, y3, x4, y4 = box2
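You add +1 when computing intersection_area, box1_area, and box2_area. If I understand correctly, the arguments box1 and box2 are already normalized, so adding 1 makes the result completely wrong: intersection_area comes out far too large, which inflates the IoU. A quick manual check of the rsvg_eval_save_file.json you provided in #18 shows that the first few predicted boxes do not overlap the ground-truth boxes at all: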
With the IoU computed correctly, the result on RSVG is only 1.88%.
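To make the effect of the +1 terms concrete, here is a small sketch. It is reconstructed from the description above rather than copied from main_vg.py, and the function names `iou_plus_one` and `iou_plain` as well as the sample boxes are made up for illustration. It assumes both boxes are `(x1, y1, x2, y2)` with coordinates normalized to `[0, 1]`.

```python
def iou_plus_one(box1, box2):
    """IoU with +1 added to every side length (pixel-inclusive convention)."""
    x1, y1, x2, y2 = box1
    x3, y3, x4, y4 = box2
    inter_w = max(0.0, min(x2, x4) - max(x1, x3) + 1)
    inter_h = max(0.0, min(y2, y4) - max(y1, y3) + 1)
    inter = inter_w * inter_h
    area1 = (x2 - x1 + 1) * (y2 - y1 + 1)
    area2 = (x4 - x3 + 1) * (y4 - y3 + 1)
    return inter / (area1 + area2 - inter)


def iou_plain(box1, box2):
    """IoU without the +1 terms, suitable for normalized coordinates."""
    x1, y1, x2, y2 = box1
    x3, y3, x4, y4 = box2
    inter_w = max(0.0, min(x2, x4) - max(x1, x3))
    inter_h = max(0.0, min(y2, y4) - max(y1, y3))
    inter = inter_w * inter_h
    union = (x2 - x1) * (y2 - y1) + (x4 - x3) * (y4 - y3) - inter
    return inter / union if union > 0 else 0.0


# Two normalized boxes that sit side by side and do not overlap at all.
pred = (0.10, 0.10, 0.30, 0.30)
gt = (0.32, 0.10, 0.52, 0.30)

print(iou_plus_one(pred, gt))  # above 0.5 although the boxes are disjoint
print(iou_plain(pred, gt))     # zero, as expected for disjoint boxes
```

Because every normalized side length is below 1, the +1 dominates each term, so nearly any pair of boxes ends up with a large apparent overlap.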