About temporal_loss: it only needs 2 frames. Would it be better to compute it across many frames? #4

Open
zhanghongyong123456 opened this issue Oct 11, 2022 · 9 comments

Comments

@zhanghongyong123456

I looked at the basic loss calculation and it only takes two frames. Most of our actual videos are 30 fps, so how well does the two-frame calculation work? Is it necessary to add multiple consecutive frames to the calculation?
[image]

@daipengwa
Member

I agree with you; maybe you can apply some long-term constraints across more frames. In our experiments, using two frames improved temporal consistency.
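For readers following along, here is a minimal NumPy sketch of what a two-frame relation-based loss could look like. It assumes the loss compares the change between two output frames with the change between the matching ground-truth frames, which is what the later remark about `gt_error` in this thread suggests. The function name `relation_loss` and the toy data are hypothetical and are not taken from the repository.

```python
import numpy as np

def relation_loss(pred_a, pred_b, gt_a, gt_b):
    """L1 distance between the predicted and the ground-truth frame-to-frame change."""
    pred_change = pred_b - pred_a
    gt_change = gt_b - gt_a
    return float(np.mean(np.abs(pred_change - gt_change)))

rng = np.random.default_rng(0)
gt = rng.random((2, 3, 8, 8))      # two ground-truth frames, shape (T, C, H, W)

# Same offset on both frames: per-frame values are off, but the change over time matches.
pred_consistent = gt + 0.1

# Offset on the second frame only: this is flicker, so the change over time differs.
pred_flicker = gt.copy()
pred_flicker[1] += 0.1

loss_consistent = relation_loss(pred_consistent[0], pred_consistent[1], gt[0], gt[1])
loss_flicker = relation_loss(pred_flicker[0], pred_flicker[1], gt[0], gt[1])
print(loss_consistent, loss_flicker)
```

In this sketch the loss ignores a constant error shared by both frames and only penalizes errors that change from one frame to the next, which is why it acts as a temporal-consistency term rather than a per-frame reconstruction term.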

@zhanghongyong123456
Author

I just have a simple idea; I don't fully understand this consistency loss (especially the Multi-Scale Region-Level Relation Loss). Could you give a general idea of how to implement multi-frame temporal consistency? For example, if a sample has 10 frames, what should I do? Thank you very much.

@zhanghongyong123456
Author

  1. I debugged the code and noticed that it is a little different from the paper:
    [image]
    [image]

@daipengwa
Member

First, the motivation for the multi-scale design is that the distance between the eye and the screen is not fixed (e.g., the screen covers a small area of your field of view when you stand far away, and vice versa).

Second, I think you can choose frames with random steps (t, t+?), or propagate it: (t, t+2) = (t, t+1) + (t+1, t+2)? I am not sure.

Third, thanks for pointing this out; the lambda should be applied to the first part. You can freely change these hyperparameters.
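The first two points can be sketched roughly as follows, in NumPy. This is only an illustration under stated assumptions: the multi-scale part is approximated here by average pooling over non-overlapping regions, which may differ from the region-level formulation in the paper, and the names `avg_pool`, `multiscale_relation_loss`, `random_step_pair`, `scales`, and `max_step` are hypothetical.

```python
import numpy as np

def avg_pool(frame, k):
    """Average a (C, H, W) frame over non-overlapping k x k regions."""
    c, h, w = frame.shape
    h2, w2 = (h // k) * k, (w // k) * k          # crop so H and W divide evenly by k
    x = frame[:, :h2, :w2].reshape(c, h2 // k, k, w2 // k, k)
    return x.mean(axis=(2, 4))

def multiscale_relation_loss(pred_a, pred_b, gt_a, gt_b, scales=(1, 2, 4)):
    """Average, over several region sizes, of the L1 gap between predicted and GT change."""
    total = 0.0
    for k in scales:
        pred_change = avg_pool(pred_b, k) - avg_pool(pred_a, k)
        gt_change = avg_pool(gt_b, k) - avg_pool(gt_a, k)
        total += float(np.mean(np.abs(pred_change - gt_change)))
    return total / len(scales)

def random_step_pair(num_frames, max_step, rng):
    """Pick a frame pair (t, t + step) with a random step in [1, max_step]."""
    step = int(rng.integers(1, max_step + 1))    # upper bound is exclusive
    t = int(rng.integers(0, num_frames - step))  # keeps t + step inside the clip
    return t, t + step

rng = np.random.default_rng(0)
gt = rng.random((10, 3, 16, 16))                 # a 10-frame clip, shape (T, C, H, W)
pred = gt + rng.normal(0.0, 0.05, gt.shape)      # toy prediction with per-frame noise

t, s = random_step_pair(num_frames=len(gt), max_step=3, rng=rng)
loss = multiscale_relation_loss(pred[t], pred[s], gt[t], gt[s])
print(t, s, loss)
```

Sampling a new pair at each training iteration is one way to expose the model to both short-term and longer-term changes without processing every frame of the clip at once.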

@zhanghongyong123456
Author

OK. For the second point, I will try using (t, t+n) or (t, t+2) = (t, t+1) + (t+1, t+2). Thanks for your idea.

@zhanghongyong123456
Author

Hi, I would appreciate your guidance. Thank you very much.

Regarding the second point, I use temporal_loss for video matting, but the test results are not good. This is my configuration: temporal_loss_mode = 1, weight_t = 50.

  1. Is my design correct? I first compute the differences between consecutive images in the sequence, add them up, and then apply a single L1 loss to the sum.
    <0> mode == 0
    [image]
    <1> mode == 1
    [image]
  2. For output = alpha, is mode 0 (basic relation-based loss) better than mode 1 (multi-scale relation-based loss)?
    The alpha output is just a black-and-white image, so a multi-scale image may not be needed.

@daipengwa
Member

  1. In your design, if img_count = 40, you end up with gt_error = img(39) - img(0). You skip all the frames in between (0~39) and keep only the long-term change between frames 0 and 39.
  2. I suggest you begin with the basic one, then add the other designs and see whether the quality improves.
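The cancellation described in point 1 can be checked numerically. The NumPy sketch below uses toy data and hypothetical names (`l1`, `loss_summed`, `loss_per_pair`); it contrasts summing the consecutive differences before a single L1 with computing an L1 per consecutive pair and averaging. The per-pair variant is only one possible alternative and is not confirmed in this thread as the intended implementation.

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference between two arrays."""
    return float(np.mean(np.abs(a - b)))

rng = np.random.default_rng(0)
gt = rng.random((10, 1, 4, 4))     # 10 ground-truth frames, shape (T, C, H, W)

# Toy prediction: frames 1, 3, 5, 7 are too bright by 0.1 (flicker),
# while the first and last frames match the ground truth exactly.
pred = gt.copy()
pred[1:-1:2] += 0.1

num_pairs = len(gt) - 1

# Design A: add up all consecutive differences, then apply one L1.
# The intermediate frames cancel, leaving only last frame minus first frame.
gt_sum = sum(gt[i + 1] - gt[i] for i in range(num_pairs))
pred_sum = sum(pred[i + 1] - pred[i] for i in range(num_pairs))
loss_summed = l1(pred_sum, gt_sum)

# Design B: one L1 per consecutive pair, averaged over the pairs.
loss_per_pair = float(np.mean([
    l1(pred[i + 1] - pred[i], gt[i + 1] - gt[i]) for i in range(num_pairs)
]))

print(np.allclose(gt_sum, gt[-1] - gt[0]))
print(loss_summed, loss_per_pair)
```

With this toy data, design A reports almost no error even though the intermediate frames flicker, while design B penalizes every inconsistent pair.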

@zhanghongyong123456
Author

OK, sorry, I made a mistake. I see. If there are multiple frames, it should be like this, right?
[image]

@onlyinheaven

Your discussion is very interesting. I am currently experimenting with similar things. I would like to know whether you figured out in the end how to implement a temporal loss across multiple images.
