Allow multiple outputs for agg_mode=True in Feature Ablation #425
This pull request was exported from Phabricator. Differential Revision: D22416476
…#425)

Summary:
Pull Request resolved: pytorch#425

## Description

What is aggregation output mode? It can be defined as:

When there is no 1:1 correspondence with the `num_examples` (`batch_size`) and the amount of outputs your model produces, i.e. the model output size does not grow in size as the `batch_size` becomes larger.

This allows for an arbitrary sized tensor to be output from the `forward_func` for feature ablation.

---

## Implementation Details

We assume `aggregation_output_mode` to be the case if: `perturbations_per_eval == 1` and [ `feature_mask is None` __or__ is of length 1 (i.e. associated to all inputs) ]. This is not perfect but for feature ablation the underlying logic is the same if there is a 1:1 correspondence (i.e. the model has `batch_size` outputs) and `agg_output_mode=True`

If `agg_output_mode == True`:

- Feature ablation will output a tensor of shape `1xOxF` where `O` is the number of output features and `F` is the number of input features under aggregation mode. Thus, if the model outputs a tensor > 2D the user must reshape it (as we treat the output as a 2D tensor in the implementation); thus it is recommended to only output a 2D tensor (i.e. the implementation allows for >2D).

If we are not in `agg_output_mode` we must ensure the number of elements is `n` (`batch_size`). If it is not, we output an error to the user. Here we could actually check if the element size is at least `n`, but for simplicity I am not doing this.

## Tests

Added tests to check for:

`agg_mode=True`:

- Incorrect feature mask (i.e. where `fm.shape[0] > 1`)
- Output a `Fx1` tensor where `F` is the number of features in the input
- The above but for a feature mask with the first two features treated as one feature
- Output a `2x3x5` constant tensor (not associated to outputs)
  - internally this will be interpreted as a `1x30` 2D tensor

`agg_mode=False`:

- Check there is exactly `n` outputs where `n == batch_size` => if not then check that we throw an exception (assertion error). **This already exists in `test_error_perturbations_per_eval_limit_batch_scalar`**

## Notes

I created a new function rather than modifying `_find_output_mode_and_verify`; as otherwise this breaks shapley value sampling. Will have to fix this in a separate PR.

Differential Revision: D22416476

fbshipit-source-id: 786acb543c9249465e132f65713693ad3d89101d
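To make the `O` x `F` attribution shape concrete, here is a minimal, dependency-free sketch of feature ablation in aggregation output mode. It is not Captum's implementation: `column_sums`, `ablate_aggregate`, and the zero baseline are hypothetical names and choices made for illustration, plain Python lists stand in for tensors, and the leading dimension of `1xOxF` is omitted.

```python
from typing import Callable, List

Batch = List[List[float]]


def column_sums(batch: Batch) -> List[float]:
    """Hypothetical forward function: returns one value per input feature,
    so the output size stays the same however large the batch is."""
    return [sum(column) for column in zip(*batch)]


def ablate_aggregate(
    forward: Callable[[Batch], List[float]],
    batch: Batch,
    baseline: float = 0.0,
) -> List[List[float]]:
    """Return an O x F matrix where entry [o][f] is the drop in output o
    when feature f is replaced by the baseline in every example."""
    original = forward(batch)
    num_features = len(batch[0])
    attributions = [[0.0] * num_features for _ in original]
    for f in range(num_features):
        # Ablate feature f across the whole batch (one perturbation per eval).
        ablated = [row[:f] + [baseline] + row[f + 1:] for row in batch]
        perturbed = forward(ablated)
        for o, (before, after) in enumerate(zip(original, perturbed)):
            attributions[o][f] = before - after
    return attributions


batch = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
attr = ablate_aggregate(column_sums, batch)
print(attr)  # [[5.0, 0.0, 0.0], [0.0, 7.0, 0.0], [0.0, 0.0, 9.0]]
```

Adding more rows to `batch` changes the values but not the 3 x 3 shape of `attr`, which is the property that distinguishes aggregation output mode from the per-example case.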
This pull request has been merged in eb3e758.
…#425)
Reviewed By: vivekmig
Differential Revision: D22416476
fbshipit-source-id: d9094754ec31152a0a2199403a8b709b39a92d04
## Description

What is aggregation output mode? It can be defined as:

When there is no 1:1 correspondence between `num_examples` (`batch_size`) and the number of outputs your model produces, i.e. the model output size does not grow as the `batch_size` becomes larger.

This allows the `forward_func` to output an arbitrarily sized tensor for feature ablation.

## Implementation Details

We assume `aggregation_output_mode` if `perturbations_per_eval == 1` and [ `feature_mask is None` __or__ is of length 1 (i.e. associated to all inputs) ]. This is not perfect, but for feature ablation the underlying logic is the same whether there is a 1:1 correspondence (i.e. the model has `batch_size` outputs) or `agg_output_mode=True`.

If `agg_output_mode == True`:

- Feature ablation will output a tensor of shape `1xOxF`, where `O` is the number of output features and `F` is the number of input features under aggregation mode. Thus, if the model outputs a tensor > 2D the user must reshape it (as we treat the output as a 2D tensor in the implementation), so it is recommended to output only a 2D tensor (although the implementation allows for >2D).

If we are not in `agg_output_mode`, we must ensure the number of elements is `n` (`batch_size`). If it is not, we output an error to the user. Here we could actually check whether the element size is at least `n`, but for simplicity I am not doing this.

## Tests

Added tests to check for:

`agg_mode=True`:

- Incorrect feature mask (i.e. where `fm.shape[0] > 1`)
- Output an `Fx1` tensor, where `F` is the number of features in the input
- The above, but for a feature mask with the first two features treated as one feature
- Output a `2x3x5` constant tensor (not associated to outputs)
  - internally this will be interpreted as a `1x30` 2D tensor

`agg_mode=False`:

- Check there are exactly `n` outputs, where `n == batch_size`; if not, check that we throw an exception (assertion error). **This already exists in `test_error_perturbations_per_eval_limit_batch_scalar`**

## Notes

I created a new function rather than modifying `_find_output_mode_and_verify`, as otherwise this breaks Shapley value sampling. Will have to fix this in a separate PR.

Differential Revision: D22416476
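
The detection rule and the element-count check described under Implementation Details can be sketched roughly as follows. This is an illustration under stated assumptions rather than the function added in this PR: `is_aggregation_mode`, `checked_num_outputs`, and the `feature_mask_rows` parameter (standing for `fm.shape[0]`, or `None` when no mask is given) are hypothetical names, and shapes are passed as plain tuples instead of tensors.

```python
from math import prod
from typing import Optional, Sequence


def is_aggregation_mode(
    perturbations_per_eval: int,
    feature_mask_rows: Optional[int],
) -> bool:
    """Heuristic from the description: one perturbation per evaluation, and
    either no feature mask or a mask whose first dimension is 1."""
    mask_covers_all_inputs = feature_mask_rows is None or feature_mask_rows == 1
    return perturbations_per_eval == 1 and mask_covers_all_inputs


def checked_num_outputs(
    output_shape: Sequence[int],
    batch_size: int,
    agg_mode: bool,
) -> int:
    """Return the number of output elements after validating them.

    In aggregation mode any shape is accepted and treated as 1 x num_elements.
    Otherwise the output must contain exactly batch_size elements.
    """
    num_elements = prod(output_shape)
    if not agg_mode:
        assert num_elements == batch_size, (
            f"expected {batch_size} outputs, got {num_elements}"
        )
    return num_elements


# A 2x3x5 output in aggregation mode is handled as a 1x30 2D tensor.
agg = is_aggregation_mode(perturbations_per_eval=1, feature_mask_rows=None)
print(agg, checked_num_outputs((2, 3, 5), batch_size=4, agg_mode=agg))  # True 30
```

Outside aggregation mode the same `2x3x5` output with `batch_size=4` would fail the assertion, which mirrors the error case the description says is already covered by an existing test.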