Abstract layers #525

Closed
datnamer opened this issue Dec 20, 2018 · 8 comments

Comments

@datnamer

datnamer commented Dec 20, 2018

I was looking over some TensorFlow source code for some models I was thinking of porting to Flux, and I noticed that there is a fair amount of concrete layer inheritance, for example: https://github.com/stanfordnlp/mac-network/blob/master/mac_cell.py

The layers in Flux seem to be concrete structs, so inheritance is obviously disallowed. For code reuse, would it make sense to move to an abstract layer hierarchy? Or would composition work just as well?

For example, I'd like to be able to subclass RNN and write my own MAC cell.

@MikeInnes
Member

Yes, composition should be just fine here. I don't think there's any great benefit to having e.g. an abstract RNN cell. You can look at the existing RNN layers in Flux if you want to see how we reuse infrastructure for statefulness and such.
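
For concreteness, here is a rough sketch of what composition-based reuse could look like. All the names (MACLayer, proj) are made up, and it assumes Flux's RNN, Dense, @functor and reset!:

using Flux

# a layer built by composition: it wraps Flux's own recurrent layer
# (which already handles statefulness) plus some model-specific computation
struct MACLayer{R,D}
  rnn::R     # reuses Flux's stateful recurrence machinery
  proj::D    # extra projection, purely for illustration
end

MACLayer(in::Integer, hidden::Integer) = MACLayer(RNN(in, hidden), Dense(hidden, hidden))

Flux.@functor MACLayer

# forward pass: delegate state handling to the wrapped RNN, then post-process
(m::MACLayer)(x) = m.proj(m.rnn(x))

layer = MACLayer(32, 64)
y = layer(rand(Float32, 32))   # behaves like any other Flux layer
Flux.reset!(layer.rnn)         # statefulness comes from Flux, not from inheritance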

@datnamer
Author

Ok cool, I'll take a look at doing that.

datnamer reopened this Feb 26, 2019
@datnamer
Author

Reopening this issue per the recommendation of @MikeInnes to discuss how the changes in #628 will interact with abstract inheritance of layers. I don't see Flux moving to an inheritance-based design, but I'd like to have the option to do so in my own packages.

Would this not dovetail with the whole differentiable programming push since it is necessary to work with more of Julia's design paradigms for conventional programming?

@MikeInnes
Member

> Would this not dovetail with the whole differentiable programming push since it is necessary to work with more of Julia's design paradigms for conventional programming?

I think this issue is essentially independent of #628, for exactly this reason. The AD affects the interface at the boundary where you get gradients, but it doesn't have anything to say about what the differentiated function f contains or how you express models. So if you can write down the forward pass using composition, abstract types, traits, or even metaprogramming, then that should be fine.

Is there a specific concern about using abstract types? I suspect the biggest issue really is:

julia> (::AbstractRNN)(x) = x
ERROR: cannot add methods to an abstract type

but again if you were to work around this with an apply function or something, the AD will happily work with it.
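
For concreteness, a rough sketch of that workaround (AbstractRNN, MyCell and apply are made-up names here, and the gradient call assumes the Zygote-based Flux.gradient from #628):

using Flux

abstract type AbstractRNN end

struct MyCell{A} <: AbstractRNN
  W::A
end

# call syntax can't be added to the abstract type, but an ordinary generic
# function can; each concrete subtype adds its own method
apply(m::MyCell, x) = tanh.(m.W * x)

# shared fallbacks and defaults can still live at the abstract level
apply(m::AbstractRNN, x) = error("apply not implemented for $(typeof(m))")

# the AD only sees an ordinary function call, so gradients work as usual
m = MyCell(randn(Float32, 3, 3))
x = rand(Float32, 3)
grads = Flux.gradient(m -> sum(apply(m, x)), m)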

@darsnack
Member

Flux should support all the inheritance available in Julia. Deeply nested type hierarchies are discouraged in Julia, but if you wish to do them, then it should be no problem for Flux. Re-open if there is a specific issue.
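
As a rough sketch of what that can look like (every name here is made up; it assumes Flux's Dense and @functor):

using Flux

# abstract supertypes are for dispatch and shared helpers;
# the layers themselves stay concrete structs
abstract type AbstractLayer end

struct ScaledDense{D,S} <: AbstractLayer
  inner::D
  scale::S
end

Flux.@functor ScaledDense

# call overloading is fine on the concrete type
(m::ScaledDense)(x) = m.scale .* m.inner(x)

# shared behaviour can be written once against the abstract type
describe(m::AbstractLayer) = "layer of type $(nameof(typeof(m)))"

layer = ScaledDense(Dense(4, 4, relu), 2.0f0)
layer(rand(Float32, 4))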

@churchofthought
Contributor

churchofthought commented Mar 2, 2021

Can anyone give me an example of best practices for inheriting from the Conv layer?
I am new to Julia and am just trying to make an altered convolution (deformable convolution) without rolling everything myself.

It seems I must use the @forward macro from Lazy.jl along with composition?

@darsnack
Member

darsnack commented Mar 2, 2021

Julia doesn't allow inheriting from concrete types like Conv. The correct way to reuse the code will depend on the final use-case. I am not familiar with deformable convolutions, so perhaps you can explain exactly what you are trying to do a bit more (FYI, discourse.julialang.org might be better for this kind of discussion...I think you'll get more opinions).

A quick search tells me that deformable convolutions sample from parts of the image offset from the sliding-window location? There are two ways I can think of implementing this.

First is the obvious way: change the indexing into the input array to get the "deformed" pixel. This sits fairly deep in the lower levels of the forward pass. In Flux, this is handled by NNlib.conv (a separate package), and the actual Conv layer is a thin wrapper around calling that function. You would need to write your own version of that function, so in this case there isn't much reuse even if you could inherit from Conv.

Second, you could shuffle/sample the input array before passing it to a standard convolution layer so that the "deformed" pixels end up in the correct locations when a standard sliding window is applied. This would be the cleanest way to do this in Julia IMO. You would have:

using Flux
using Flux: @functor

# define a struct that wraps a conv layer and a sample function
struct DeformedConv{T, F}
  sample::F
  conv::T
end

# simple syntactic sugar to make constructing the layer more intuitive
DeformedConv(sample, args...; kwargs...) = DeformedConv(sample, Conv(args...; kwargs...))

# this will make your struct behave like other layers in Flux
# see https://fluxml.ai/Flux.jl/stable/models/advanced/
@functor DeformedConv (conv,)

# define the forward pass
# first re-sample the input array
# then apply the convolution to that array
(m::DeformedConv)(x) = m.conv(m.sample(x))
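
Continuing the sketch above, a quick made-up usage example, with identity standing in for the real offset-sampling function:

# `identity` is only a placeholder for the real deformable sampling step
deform = DeformedConv(identity, (3, 3), 3 => 8, relu)

x = rand(Float32, 32, 32, 3, 1)   # a WHCN image batch
y = deform(x)                     # re-sample, then apply the wrapped Conv
ps = Flux.params(deform)          # only the Conv's parameters, per the @functor line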

Hope that helps! If you have more questions, please ask on Discourse, which is better suited for such discussions.

@churchofthought
Contributor

Thank you @darsnack . This is exactly what I needed. 👍💯
