Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

NO_FIELDS #175

Closed
willtebbutt opened this issue Jun 28, 2020 · 15 comments · Fixed by #358
Closed

NO_FIELDS #175

willtebbutt opened this issue Jun 28, 2020 · 15 comments · Fixed by #358
Labels
design Requires some desgin before changes are made Structural Tangent Related to the `Tangent` type for structured (composite) values
Milestone

Comments

@willtebbutt
Copy link
Member

Currently NO_FIELDS is equal to Zero(). Arguably, a more natural thing would be to make it an empty Composite and ensure that empty Composites can be added to pretty much anything. Thoughts @oxinabox ?

@oxinabox
Copy link
Member

Yes, arguably that would be more natural.
But it seemed like a lot of effort at the time.

Also because Composites carry their primals it would need to do that also.

@willtebbutt
Copy link
Member Author

willtebbutt commented Jun 28, 2020

Yeah, this is really low priority in my opinion.

@nickrobinson251 nickrobinson251 added the Structural Tangent Related to the `Tangent` type for structured (composite) values label Jul 6, 2020
@nickrobinson251 nickrobinson251 added the design Requires some desgin before changes are made label Apr 8, 2021
@nickrobinson251
Copy link
Contributor

Maybe it should be DoesNotExist

JuliaDiff/ChainRules.jl#396 (comment)
JuliaDiff/FiniteDifferences.jl#153 (comment)

@willtebbutt
Copy link
Member Author

Yeah, this is really low priority in my opinion.

This seems like it's higher priority now 😂

@willtebbutt
Copy link
Member Author

willtebbutt commented Apr 8, 2021

There appear to be several different opinions on this.

For a structtype T, they are

  1. Zero() (the status quo)
  2. DoesNotExist()
  3. Composite{T, NamedTuple{(), Tuple{}}}

I am in favour of 3 because it's consistent, and AD likes consistency because it means that you need fewer special cases, and can derive more things automatically.

edit: updated the second type parameter in the Composite example. I think I've derived the correct NamedTuple subtype, but I might not have it perfectly correct. Either way, the point is that it's empty.

@oxinabox
Copy link
Member

oxinabox commented Apr 8, 2021

3 is annoying amounts of work to type, in almost every rule. Even as Composite{T}() somce tjuat os Composite{typeof(foo)}()

The main virtue of composite charrying an a primal type with a Composite is to prevent adding things that are just happening agree on field names. That isn't so much of a problem if there are no fields/it is an AbstractZero. Since even if adding the wrong things you will still get the right answer.

Aside: we should add iszero(::Composite{T}) = all(iszero, backing(x)) which will be true if it's backing is empty

@willtebbutt
Copy link
Member Author

willtebbutt commented Apr 8, 2021

3 is annoying amounts of work to type, in almost every rule. Even as Composite{T}() somce tjuat os Composite{typeof(foo)}()

Can we not just make NO_FIELDS this rather than Zero? Should only need to type it once :)

Aside: we should add iszero(::Composite{T}) = all(iszero, backing(x)) which will be true if it's backing is empty

Seems reasonable to me

edit: we'd need to modify NO_FIELDS. Maybe NO_FIELDS{T}?

@oxinabox
Copy link
Member

I still think either Zero or DoesNotExist() is the best type for this.
Writing NO_FIELDS{typeof(sin)} etc is just annoying and not helpful since it is never going to protect us from incorrect behavour.

While I agree consistency is nice, we expressly do allow multiple differential types for the same primal type.
And this is the least painful case for it, since it will never lead to ambiguities.
Even if some places Composite{typeof(sin)}() is used, and others Zero() is used, it is still fine and leads to the correct behavour -- not ambiguity error or a MethodError.
Unlike other cases like mixing structual and natural differentials e.g. MyArray(...) and Composiite{MyArray}(data=[...]), which can sometimes lead to ambiguities errors (i think that one is actually fine, which is cool).

@willtebbutt
Copy link
Member Author

While I don't disagree with anything in particular that you've said, I think there are other reasons to like consistency. Specifically:

  1. it's easy to explain why it is the way it is, because it follows logically from our decision to use Composites. Conversely, the Zero / DoesNotExist choice is harder to justify -- you could go with "it doesn't really matter" or "feels reasonable", but that feels less satisfying. Phrased differently, I would argue that there's really nothing to explain about using an empty Composite -- it's the logical choice -- whereas Zero / DoesNotExist requires at least some explanation.
  2. it doesn't introduce an additional edge-case which simplifies code in a number of places e.g. we could probably remove a conditional from our rand_tangent implementation. I'm not arguing that e.g. the total number of characters introduced will be less than having to change NO_FIELDS to NO_FIELDS{T}, I'm just saying that there is a trade-off to be made.
  3. it should make debugging AD easier at times, because you can trace where things have come from more straightforwardly when e.g. looking at @code_warntype for one of Zygote's pullbacks (where it's really helpful to have as much info as possible inside the types of the differentials).

AFAICT, the only advantage of Zero or DoesNotExist is the removal of the need to write NO_FIELDS{T} occassionally.

@nickrobinson251
Copy link
Contributor

I don't think writing Composite{T}() is so bad, and in the case where we renamed things (#305) we would probably spell this Tangent{T}(). And, other than Tangent arguably being a better name than Composite, i think Tangent{T}() might also be less mysterious than NO_FIELDS. (I dislike the spelling NO_FIELDS{T}() or @NO_FIELDS{T} or anything like that, as that seems most mysterious of all).

@oxinabox
Copy link
Member

I don't think we would name Composite as just Tangent: that is too generical, all the things are tangents.
#305 proposes naming it CompositeTangent (it is the tangent for composite types, and itself is a composite type).

I am still pretty strongly opposed to it.
It is CompositeTangent{typeof(besselj}() it is so long.
It is annoying to write, it makes defintions longer and harder to read.
And it is going to make the compiler have to work so much harder; lets not increase the compile times.
Remember it show up in basically every single rule.

julia> @time (()->Composite{typeof(sin)}())()
  0.000560 seconds (622 allocations: 44.603 KiB, 94.05% compilation time)
Composite{typeof(sin)}NamedTuple()

julia> @time (()->Zero())()
  0.000197 seconds (490 allocations: 37.922 KiB, 81.38% compilation time)
Zero()

@willtebbutt
Copy link
Member Author

The compilation time is a pretty compelling reason to like Zero.

@nickrobinson251
Copy link
Contributor

And it is going to make the compiler have to work so much harder

Yes, that's a much more compelling reason than the spelling (to me, since i think NO_FIELDSis already pretty ugly and more mysterious).

@oxinabox
Copy link
Member

oxinabox commented Apr 29, 2021

Thought about this more.
It should be DoesNotExist(). (rather than Zero())
Which is the type we use for if the input can not be perturbed.
This input can not be perturbed, normally we mean that as there is no epsion small change possible as it is discretely valued, or that perturbing it would lead to an errror (like in indexing) but in this case we mean it in that it is impossible to perturb it because there is nothing there to perturb

It is also what we will want it to be for purposes of filling in the right default type for (non-functor) functions in JuliaDiff/ChainRulesTestUtils.jl#117 so as to prevent our testing tools trying to pertube it.

Zero() can never be the differential for all instances of a type, only for the output of a pullback -- it represents that perturbing that input would have no effect on the output, it is disconnected from the computational graph.

@oxinabox
Copy link
Member

when we do this we might want to @deprecate_binding NO_FIELDS NoTangent()?

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
design Requires some desgin before changes are made Structural Tangent Related to the `Tangent` type for structured (composite) values
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants