
Add different swish implementations #88

Merged
merged 1 commit into from
Oct 15, 2019

qubvel
Contributor

@qubvel qubvel commented Oct 14, 2019

Add different swish implementations:

  1. Memory efficient swish
  • GPU memory friendly
  • Less computationally efficient while training
  • Not supported by torch.jit / torch.onnx
  2. Original swish (x * torch.sigmoid(x))
  • Less memory efficient
  • More computationally efficient while training
  • Model can be saved with torch.jit / torch.onnx

Default: memory efficient
The model's swish implementation can be changed with the .set_swish(memory_efficient=True/False) method.
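
For readers who want to see the trade-off concretely, here is a minimal sketch of the two variants and a switch method. It is not the code merged in this PR: `SwishFunction`, `TinyBlock`, and the `_swish` attribute are hypothetical names used for illustration, and the sketch assumes the memory-efficient variant works by saving only the input and recomputing the sigmoid in a custom backward pass.

```python
import torch
from torch import nn


class SwishFunction(torch.autograd.Function):
    """Swish with a hand-written backward pass that saves only the input."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(x)  # recomputed here instead of being stored
        # d/dx [x * sigmoid(x)] = s * (1 + x * (1 - s))
        return grad_output * (s * (1 + x * (1 - s)))


class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        return SwishFunction.apply(x)


class Swish(nn.Module):
    """Original swish; autograd keeps the intermediates it needs."""

    def forward(self, x):
        return x * torch.sigmoid(x)


class TinyBlock(nn.Module):
    """Hypothetical stand-in for a model that exposes set_swish."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        self._swish = MemoryEfficientSwish()  # default, as described in the PR

    def set_swish(self, memory_efficient=True):
        self._swish = MemoryEfficientSwish() if memory_efficient else Swish()

    def forward(self, x):
        return self._swish(self.linear(x))
```

Calling `TinyBlock().set_swish(memory_efficient=False)` swaps in the plain module, which is the form the PR describes as usable with torch.jit / torch.onnx.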

@lukemelas lukemelas merged commit 8a5da1d into lukemelas:master Oct 15, 2019
@glenn-jocher

@qubvel thanks for the function! I've tried to implement this in our repo: https://github.com/ultralytics/yolov3, but I get worse results (lower mAP and higher loss) compared to a default Swish() class. Do you know why this might be? See ultralytics/yolov3#441 (comment)

@cswwp

cswwp commented Aug 5, 2020

@qubvel If I train with the memory efficient swish and then export model.pt with model.set_swish(memory_efficient=False) + torch.jit.trace(model, example), will it hurt the score?
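
One way to check this empirically is to compare outputs before and after the switch. The sketch below assumes both variants compute x * sigmoid(x) in the forward pass, as the PR description suggests, and uses a hypothetical `TinyModel` in place of a real network; only `torch.jit.trace` and `torch.allclose` are actual PyTorch calls here.

```python
import torch
from torch import nn


class _SwishFn(torch.autograd.Function):
    """Assumed memory-efficient swish: saves the input, recomputes sigmoid."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(x)
        return grad_output * (s * (1 + x * (1 - s)))


class TinyModel(nn.Module):
    """Hypothetical model with a set_swish switch, for illustration only."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 8)
        self.memory_efficient = True

    def set_swish(self, memory_efficient=True):
        self.memory_efficient = memory_efficient

    def forward(self, x):
        h = self.linear(x)
        if self.memory_efficient:
            return _SwishFn.apply(h)
        return h * torch.sigmoid(h)


torch.manual_seed(0)
model = TinyModel().eval()
example = torch.randn(2, 8)

with torch.no_grad():
    out_trained = model(example)             # variant used during training
    model.set_swish(memory_efficient=False)  # switch before export
    out_export = model(example)
    traced = torch.jit.trace(model, example)
    out_traced = traced(example)

# The same weights and the same forward expression should give matching outputs.
print(torch.allclose(out_trained, out_export))
print(torch.allclose(out_export, out_traced))
```

If both comparisons hold on a real model with real inputs, the switch changes only how gradients are computed during training, not the predictions of the exported model.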
