## Background
Our `grad` mechanism currently only handles differentiation with respect to a single `Variable`. For certain use cases (computing Jacobians or Hessians, or typical backprop on a large network), we may want partial derivatives with respect to all parameters in one pass.
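To make the goal concrete, here is a self-contained reverse-mode sketch; the `Node`/`backward` names are illustrative stand-ins, not our actual `Variable`/`grad` API. One backward sweep over a topological order accumulates a partial into every leaf simultaneously:

```python
# Independent sketch: one reverse pass yields partials w.r.t. every leaf.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Node:
    value: float
    parents: list[tuple["Node", float]] = field(default_factory=list)  # (parent, local grad)
    grad: float = 0.0

def mul(a: Node, b: Node) -> Node:
    return Node(a.value * b.value, [(a, b.value), (b, a.value)])

def add(a: Node, b: Node) -> Node:
    return Node(a.value + b.value, [(a, 1.0), (b, 1.0)])

def backward(out: Node) -> None:
    # Build a topological order, then sweep once, accumulating into .grad.
    order, seen = [], set()
    def visit(n: Node) -> None:
        if id(n) in seen:
            return
        seen.add(id(n))
        for p, _ in n.parents:
            visit(p)
        order.append(n)
    visit(out)
    out.grad = 1.0
    for n in reversed(order):
        for p, local in n.parents:
            p.grad += local * n.grad

x, y = Node(3.0), Node(4.0)
f = add(mul(x, y), mul(x, x))   # f = x*y + x^2
backward(f)
print(x.grad, y.grad)            # 10.0 (= y + 2x), 3.0 (= x)
```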
## Potential Approaches
- Extend the `Derivative` class to store a tuple/list of variables (see the first sketch below).
- Reuse single-variable derivatives, but memoize shared computations so we don't re-evaluate sub-expressions multiple times.
- Exploit the isomorphic-hashing approach to avoid redundant work by caching repeated sub-graphs (the second sketch below combines this with memoization).
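A minimal, runnable stand-in for the first approach, assuming nothing about the real `Derivative` internals (`SymDerivative` and the hand-written callable partials are hypothetical): the object records a tuple of variables and returns one partial per variable.

```python
# Hypothetical sketch of approach 1: track a tuple of target variables
# instead of a single one. Not our real Derivative API.
class SymDerivative:
    def __init__(self, partials, variables):
        self.variables = tuple(variables)       # was: a single Variable
        self._partials = {v: partials[v] for v in self.variables}

    def evaluate(self, point):
        # One result per tracked variable, preserving the requested order.
        return tuple(self._partials[v](point) for v in self.variables)

# f(x, y) = x*y + x^2, with hand-written partials standing in for real ones.
d = SymDerivative({"x": lambda p: p["y"] + 2 * p["x"],
                   "y": lambda p: p["x"]},
                  variables=("x", "y"))
print(d.evaluate({"x": 3.0, "y": 4.0}))  # (10.0, 3.0)
```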
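And a sketch combining the second and third approaches: a memo table keyed by (sub-expression, variable) ensures each repeated sub-graph is differentiated once. Python tuples hash structurally, which stands in for the isomorphic-hashing idea; the tuple encoding of expressions is illustrative, not our real node layout.

```python
def diff(expr, var, memo=None):
    """Differentiate a tuple-encoded expression w.r.t. `var`, memoized.

    Expressions are ("+"|"*", left, right), variable names (str), or numbers.
    Tuples hash structurally, so isomorphic sub-expressions share one cache
    entry -- a stand-in for hashing sub-graphs in a real node-based graph.
    """
    if memo is None:
        memo = {}
    key = (expr, var)
    if key in memo:
        return memo[key]
    if isinstance(expr, str):                # variable leaf
        d = 1.0 if expr == var else 0.0
    elif isinstance(expr, (int, float)):     # constant leaf
        d = 0.0
    else:
        op, a, b = expr
        da, db = diff(a, var, memo), diff(b, var, memo)
        if op == "+":
            d = ("+", da, db)
        else:                                # "*": product rule
            d = ("+", ("*", da, b), ("*", a, db))
    memo[key] = d
    return d

sq = ("*", "x", "x")                         # x^2, a shared sub-expression
f = ("+", ("*", sq, "y"), sq)                # x^2*y + x^2: sq appears twice
print(diff(f, "x"))                          # sq is differentiated only once
```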
## Additional Context
This is important for scenarios like second-order optimization (Hessian-based methods) or certain advanced autodiff use cases.