Abstract layers #525

Closed
datnamer opened this issue Dec 20, 2018 · 8 comments

Comments

@datnamer

datnamer commented Dec 20, 2018

I was looking over some TensorFlow source code for some models I was thinking of porting to Flux, and I noticed that there is a fair amount of concrete layer inheritance, for example: https://github.com/stanfordnlp/mac-network/blob/master/mac_cell.py

The layers in Flux seem to be concrete structs, so inheritance is obviously disallowed. For code reuse, would it make sense to move to an abstract layer hierarchy? Or would composition work just as well?

For example, I'd like to be able to subclass RNN and write my own MAC cell.

@MikeInnes
Member

Yes, composition should be just fine here. I don't think there's any great benefit to having e.g. an abstract RNN cell. You can look at the existing RNN layers in Flux if you want to see how we reuse infrastructure for statefulness and such.
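
For concreteness, here is a rough sketch of what composition-based reuse could look like. All the names (MACLayer, proj) are made up, and it assumes Flux's RNN, Dense, @functor and reset!:

using Flux

# a layer built by composition: it wraps Flux's own recurrent layer
# (which already handles statefulness) plus some model-specific computation
struct MACLayer{R,D}
  rnn::R     # reuses Flux's stateful recurrence machinery
  proj::D    # extra projection, purely for illustration
end

MACLayer(in::Integer, hidden::Integer) = MACLayer(RNN(in, hidden), Dense(hidden, hidden))

Flux.@functor MACLayer

# forward pass: delegate state handling to the wrapped RNN, then post-process
(m::MACLayer)(x) = m.proj(m.rnn(x))

layer = MACLayer(32, 64)
y = layer(rand(Float32, 32))   # behaves like any other Flux layer
Flux.reset!(layer.rnn)         # statefulness comes from Flux, not from inheritance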

@datnamer
Author

Ok cool, I'll take a look at doing that.

datnamer reopened this Feb 26, 2019
@datnamer
Author

Reopening this issue per the recommendation of @MikeInnes to discuss how the changes in #628 will interact with abstract inheritance of layers. I don't see Flux moving to an inheritance-based design, but I'd like to have the option to do so in my own packages.

Would this not dovetail with the whole differentiable programming push since it is necessary to work with more of Julia's design paradigms for conventional programming?

@MikeInnes
Member

> Would this not dovetail with the whole differentiable programming push since it is necessary to work with more of Julia's design paradigms for conventional programming?

I think this issue is essentially independent of #628, for exactly this reason. The AD affects the interface at the boundary where you get gradients, but it doesn't have anything to say about what the differentiated function f contains or how you express models. So if you can write down the forward pass using composition, abstract types, traits, or even metaprogramming, then that should be fine.

Is there a specific concern about using abstract types? I suspect the biggest issue really is:

julia> (::AbstractRNN)(x) = x
ERROR: cannot add methods to an abstract type

but again if you were to work around this with an apply function or something, the AD will happily work with it.
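
For concreteness, a rough sketch of that workaround (AbstractRNN, MyCell and apply are made-up names here, and the gradient call assumes the Zygote-based Flux.gradient from #628):

using Flux

abstract type AbstractRNN end

struct MyCell{A} <: AbstractRNN
  W::A
end

# call syntax can't be added to the abstract type, but an ordinary generic
# function can; each concrete subtype adds its own method
apply(m::MyCell, x) = tanh.(m.W * x)

# shared fallbacks and defaults can still live at the abstract level
apply(m::AbstractRNN, x) = error("apply not implemented for $(typeof(m))")

# the AD only sees an ordinary function call, so gradients work as usual
m = MyCell(randn(Float32, 3, 3))
x = rand(Float32, 3)
grads = Flux.gradient(m -> sum(apply(m, x)), m)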

@darsnack
Member

Flux should support all the inheritance available in Julia. Deeply nested type hierarchies are discouraged in Julia, but if you wish to do them, then it should be no problem for Flux. Re-open if there is a specific issue.
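
As a rough sketch of what that can look like (every name here is made up; it assumes Flux's Dense and @functor):

using Flux

# abstract supertypes are for dispatch and shared helpers;
# the layers themselves stay concrete structs
abstract type AbstractLayer end

struct ScaledDense{D,S} <: AbstractLayer
  inner::D
  scale::S
end

Flux.@functor ScaledDense

# call overloading is fine on the concrete type
(m::ScaledDense)(x) = m.scale .* m.inner(x)

# shared behaviour can be written once against the abstract type
describe(m::AbstractLayer) = "layer of type $(nameof(typeof(m)))"

layer = ScaledDense(Dense(4, 4, relu), 2.0f0)
layer(rand(Float32, 4))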

@churchofthought
Contributor

churchofthought commented Mar 2, 2021

Can anyone give me an example of best practices for inheriting from the Conv layer?
I am new to Julia and am just trying to make an altered convolution (deformable convolution) without rolling everything myself.

It seems I must use the @forward macro from Lazy.jl along with composition?

@darsnack
Member

darsnack commented Mar 2, 2021

Julia doesn't allow inheriting from concrete types like Conv. The correct way to reuse the code will depend on the final use-case. I am not familiar with deformable convolutions, so perhaps you can explain exactly what you are trying to do a bit more (FYI, discourse.julialang.org might be better for this kind of discussion...I think you'll get more opinions).

A quick search tells me that deformable convolutions sample from parts of the image offset from the sliding-window location? There are two ways I can think of implementing this.

First is the obvious way: change the indexing into the input array to get the "deformed" pixel. This sits fairly deep in the lower levels of the forward pass. In Flux, this is handled by NNlib.conv (a separate package), and the actual Conv layer is a thin wrapper around calling that function. You would need to write your own version of that function, so in this case there isn't much reuse even if you could inherit from Conv.

Second, you could shuffle/sample the input array before passing it to a standard convolution layer so that the "deformed" pixels end up in the correct locations when a standard sliding window is applied. This would be the cleanest way to do this in Julia IMO. You would have:

using Flux
using Flux: @functor

# define a struct that wraps a conv layer and a sample function
struct DeformedConv{T, F}
  sample::F
  conv::T
end

# simple syntactic sugar to make constructing the layer more intuitive
DeformedConv(sample, args...; kwargs...) = DeformedConv(sample, Conv(args...; kwargs...))

# this will make your struct behave like other layers in Flux
# see https://fluxml.ai/Flux.jl/stable/models/advanced/
@functor DeformedConv (conv,)

# define the forward pass
# first re-sample the input array
# then apply the convolution to that array
(m::DeformedConv)(x) = m.conv(m.sample(x))
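
Continuing the sketch above, a quick made-up usage example, with identity standing in for the real offset-sampling function:

# `identity` is only a placeholder for the real deformable sampling step
deform = DeformedConv(identity, (3, 3), 3 => 8, relu)

x = rand(Float32, 32, 32, 3, 1)   # a WHCN image batch
y = deform(x)                     # re-sample, then apply the wrapped Conv
ps = Flux.params(deform)          # only the Conv's parameters, per the @functor line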

Hope that helps! If you have more questions, please ask on Discourse, which is better suited for such discussions.

@churchofthought
Contributor

Thank you @darsnack . This is exactly what I needed. 👍💯
