
not-edge loss value is too small, edge loss is nan/inf #14

Open
zhengtianyu1996 opened this issue Jan 7, 2019 · 4 comments

@zhengtianyu1996

Hi, I tried your affinity loss (the non-adaptive version) as my loss function. My network is DeepLabV3+ with a MobileNet backbone, trained on my own dataset, and I set margin=3.0, lambda1=1.0, lambda2=1.0.
But something is wrong with the loss: the not-edge loss is really small and does not converge.

Here is part of the not-edge loss log during training:

Mean Aff Loss is:[6.15826357e-05]
Mean Aff Loss is:[7.15486458e-05]
Mean Aff Loss is:[4.56848611e-05]
Mean Aff Loss is:[5.51421945e-05]
Mean Aff Loss is:[7.94407606e-05]
Mean Aff Loss is:[0.000143873782]
Mean Aff Loss is:[6.04316447e-05]
Mean Aff Loss is:[9.94381699e-05]
Mean Aff Loss is:[0.000107184518]
Mean Aff Loss is:[6.87552383e-05]
Mean Aff Loss is:[7.98113e-05]
Mean Aff Loss is:[0.000122067388]
Mean Aff Loss is:[5.42108719e-05]

As for the edge loss, it reports NaN or Inf right at the beginning of training. It troubles me so much :(

Could anyone give some advice?
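For reference, here is a minimal sketch of how I understand the two terms, with margin used as a hinge for edge pairs. The function, tensor names, and distance are my own placeholders (in the actual repo the distance would be a KL-style divergence between neighbouring pixels' predictions), not the repo's code:

```python
import tensorflow as tf

def affinity_terms_sketch(dist, edge_mask, margin=3.0):
    """Hedged sketch of the two affinity-loss terms.

    dist:      distance between a pixel's prediction and a neighbour's.
    edge_mask: boolean map, True where the pixel pair crosses a label edge.
    """
    zeros = tf.zeros_like(dist)
    # Not-edge pairs (same label): pull neighbouring predictions together.
    not_edge_loss = tf.where(edge_mask, zeros, dist)
    # Edge pairs (different labels): hinge loss, push apart up to the margin.
    edge_loss = tf.where(edge_mask, tf.maximum(0.0, margin - dist), zeros)
    return not_edge_loss, edge_loss
```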

@zhengtianyu1996
Author

Okay, here is an update on the NaN/Inf problem:

In the losses.affinity_loss function, edge looks fine and not_ignore looks fine. But after it runs tf.logical_and:

edge = tf.logical_and(edge, not_ignore)

the output edge is an all-zero matrix, meaning no valid values are left, so the final edge_loss runs into problems.

I will keep debugging; I hope this helps someone.
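A quick way to see this is to count how many valid edge pixels survive the AND (a debugging sketch in TF 1.x graph mode; the constants stand in for the real edge and not_ignore maps):

```python
import tensorflow as tf

# Stand-ins for the boolean maps inside losses.affinity_loss.
edge = tf.constant([[True, False], [False, False]])
not_ignore = tf.constant([[False, True], [True, True]])

edge = tf.logical_and(edge, not_ignore)            # all False in this example
num_edges = tf.reduce_sum(tf.cast(edge, tf.int32))

with tf.Session() as sess:
    # Prints 0: the downstream gather/reduction over edge pixels is empty,
    # which is where the NaN/Inf comes from.
    print(sess.run(num_edges))
```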

@zhengtianyu1996
Author

I think the problem is caused by:
edge_indices = tf.where(tf.reshape(edge, [-1]))
Because edge is sometimes an all-zero matrix, edge_indices sometimes has shape (0, 1). Then
edge_loss = tf.gather(edge_loss, edge_indices)
gathers an empty tensor, and the subsequent reduction over it produces the Inf values.

So edge and not_ignore should be checked carefully. However, I still don't know whether this is a common problem; maybe it is related to the dataset itself. What do you think? @twke18
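Until the masks are fixed, one possible guard (just a sketch, not the repo's code) is to average over the edge pixels only when there are any, via tf.cond:

```python
import tensorflow as tf

# Stand-ins for the tensors in losses.affinity_loss (TF 1.x style).
edge_loss = tf.random_uniform([16])
edge = tf.zeros([16], dtype=tf.bool)   # worst case: no edge pixels at all

edge_indices = tf.where(tf.reshape(edge, [-1]))   # shape (0, 1) here
num_edges = tf.shape(edge_indices)[0]

# Only reduce over the gathered values when the index list is non-empty;
# otherwise skip the term instead of averaging an empty tensor.
safe_edge_loss = tf.cond(
    num_edges > 0,
    lambda: tf.reduce_mean(tf.gather(edge_loss, edge_indices)),
    lambda: tf.constant(0.0))
```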

@arc144

arc144 commented Jan 17, 2019

> Okay, here is an update on the NaN/Inf problem: in the losses.affinity_loss function, edge and not_ignore both look fine, but after edge = tf.logical_and(edge, not_ignore) the output edge is an all-zero matrix, so the final edge_loss runs into problems.

I was looking at how ignores_from_label and edges_from_label compute the edge map, and it seems they compute it differently: ignores_from_label iterates backwards, i.e. for st_y in range(2*size, -1, -size):, whereas edges_from_label iterates forward, i.e. for st_y in range(0, 2*size+1, size):.

Is this intentional? Could it be the source of the zero-matrix issue when edge = tf.logical_and(edge, not_ignore) is computed?
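For what it's worth, the two loops do visit the same offsets, just in opposite order (quick check with size=1):

```python
size = 1
print(list(range(2 * size, -1, -size)))    # [2, 1, 0]  -> ignores_from_label
print(list(range(0, 2 * size + 1, size)))  # [0, 1, 2]  -> edges_from_label
```

So the neighbourhood coverage is the same, but if each function stacks its per-neighbour outputs in loop order, the two maps could pair different neighbours at the same index, which would make the AND come out mostly false.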

@xychenunc

Have you guys obtained improved results using the affinity field loss? I have tried many times, but I can hardly improve on my baseline. I also hit the same issue as you; I simply leave the NaN term out when computing the total loss.
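Concretely, what I do looks roughly like this (a sketch with placeholder names and values, TF 1.x style):

```python
import tensorflow as tf

def finite_or_zero(term):
    # Replace NaN/Inf entries with 0 so they cannot poison the total loss.
    return tf.where(tf.is_finite(term), term, tf.zeros_like(term))

# Placeholder values: edge_loss hits Inf when no edge pixels survive.
seg_loss = tf.constant(0.7)                          # usual cross-entropy term
edge_loss = tf.constant([0.3, float('inf'), 0.5])    # per-pixel edge term
not_edge_loss = tf.constant([6.2e-05, 4.5e-05, 7.9e-05])
lambda1 = lambda2 = 1.0

total_loss = (seg_loss
              + lambda1 * tf.reduce_mean(finite_or_zero(edge_loss))
              + lambda2 * tf.reduce_mean(finite_or_zero(not_edge_loss)))
```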
