Allow multiple outputs for agg_mode=True in Feature Ablation #425

Closed

miguelmartin75 commented Jul 9, 2020

Description

What is aggregation output mode? It can be defined as:

The case where there is no 1:1 correspondence between num_examples (batch_size) and the number of outputs your model produces, i.e. the model's output size does not grow as batch_size becomes larger.

This allows forward_func to return an arbitrarily sized tensor for feature ablation.


Implementation Details

We assume aggregation output mode (agg_output_mode) when perturbations_per_eval == 1 and feature_mask is either None or of length 1 (i.e. a single mask associated with all inputs). This heuristic is not perfect, but for feature ablation the underlying logic is the same whether there is a 1:1 correspondence (i.e. the model has batch_size outputs) or agg_output_mode=True.
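As a rough illustration of the rule above, here is a minimal sketch in plain Python. The helper name `assume_agg_output_mode` and the choice to pass only the mask's shape are assumptions made for this example; they are not the function added in this PR.

```python
from typing import Optional, Sequence


def assume_agg_output_mode(
    perturbations_per_eval: int,
    feature_mask_shape: Optional[Sequence[int]],
) -> bool:
    """Hypothetical helper mirroring the heuristic described above."""
    # The mask is "shared" if it is absent or its first dimension is 1,
    # i.e. one mask applies to every example in the batch.
    mask_is_shared = feature_mask_shape is None or feature_mask_shape[0] == 1
    return perturbations_per_eval == 1 and mask_is_shared
```

For example, `assume_agg_output_mode(1, None)` and `assume_agg_output_mode(1, (1, 3))` both select aggregation mode, while a mask of shape `(4, 3)` or `perturbations_per_eval=2` does not.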

If agg_output_mode == True:

  • Feature ablation will output a tensor of shape 1xOxF, where O is the number of output features and F is the number of input features. The implementation treats the model output as a 2D tensor, so if the model outputs a tensor with more than two dimensions the user must reshape it. Outputs with more than two dimensions are allowed, but returning a 2D tensor is recommended.

If we are not in agg_output_mode, we must ensure the model output has exactly n elements, where n is the batch_size. If it does not, we raise an error. We could instead check that the number of elements is at least n, but for simplicity I am not doing this.
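The element-count check can be sketched as follows. This is an illustrative stand-in that works on shape tuples rather than tensors; `check_output_numel` is a hypothetical name, and the error message is invented for the example.

```python
import math
from typing import Sequence


def check_output_numel(
    output_shape: Sequence[int],
    batch_size: int,
    agg_output_mode: bool,
) -> int:
    """Hypothetical check: return the output's element count, validating it
    against batch_size when we are not in aggregation output mode."""
    numel = math.prod(output_shape)  # total number of elements in the output
    if not agg_output_mode:
        # Outside aggregation mode there must be exactly one output per example.
        assert numel == batch_size, (
            f"expected {batch_size} output elements, got {numel}"
        )
    return numel
```

With `agg_output_mode=True` any output shape is accepted; with `agg_output_mode=False` an output of shape `(4, 3)` for a batch of 4 fails the assertion.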

Tests

Added tests to check for:

agg_mode=True:

  • Incorrect feature mask (i.e. where fm.shape[0] > 1)
  • Output an Fx1 tensor, where F is the number of features in the input
  • The above but for a feature mask with the first two features treated as one feature
  • Output a 2x3x5 constant tensor (not associated to outputs)
    • internally this will be interpreted as a 1x30 2D tensor
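To make the shape arithmetic in the last case concrete, here is a small sketch that works on shape tuples only. Both function names are hypothetical, and the attribution shape simply follows the 1xOxF convention stated under Implementation Details.

```python
import math
from typing import Sequence, Tuple


def flattened_output_shape(output_shape: Sequence[int]) -> Tuple[int, int]:
    """Hypothetical helper: view an arbitrary output shape as a 1xO 2D shape."""
    return (1, math.prod(output_shape))


def attribution_shape(
    output_shape: Sequence[int], num_input_features: int
) -> Tuple[int, int, int]:
    """Hypothetical helper: the 1xOxF attribution shape in aggregation mode."""
    _, num_outputs = flattened_output_shape(output_shape)
    return (1, num_outputs, num_input_features)
```

For the 2x3x5 constant tensor, `flattened_output_shape((2, 3, 5))` gives `(1, 30)`, and with three input features `attribution_shape((2, 3, 5), 3)` gives `(1, 30, 3)`.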

agg_mode=False:

  • Check that there are exactly n outputs, where n == batch_size; if not, check that we throw an exception (an assertion error). This already exists in test_error_perturbations_per_eval_limit_batch_scalar

Notes

I created a new function rather than modifying _find_output_mode_and_verify, as modifying it would break Shapley value sampling. That will have to be fixed in a separate PR.

Differential Revision: D22416476

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D22416476

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request Jul 10, 2020
…#425)

fbshipit-source-id: 786acb543c9249465e132f65713693ad3d89101d

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request Jul 10, 2020
…#425)

fbshipit-source-id: 67ca51aa79de0dee137ac90e3057dc4127a288ad

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request Jul 10, 2020
…#425)

fbshipit-source-id: 1b3d41e8096acb0dbdf0f9fd173c3cf46ecbe680

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request Jul 12, 2020
…#425)

fbshipit-source-id: 0d08ca990a1e999339e51f0a7fa50be197d2f3b9

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request Jul 16, 2020
…#425)

Reviewed By: vivekmig

fbshipit-source-id: eff7da94323e1e3c01d73ea377902df1bc6a4e76

@facebook-github-bot
Contributor

This pull request has been merged in eb3e758.

NarineK pushed a commit to NarineK/captum-1 that referenced this pull request Nov 19, 2020
…#425)

fbshipit-source-id: d9094754ec31152a0a2199403a8b709b39a92d04