Commit 5e8177d

99warriors authored and facebook-github-bot committed
modify tracin self influence helpers
Summary: change `TracInCP._self_influence_batch_tracincp` and `TracInCP._self_influence_batches_tracincp_fast` to be named `self_influence`. The renamed helper is now public and accepts a DataLoader yielding batches, as well as a single batch as before. External functions can call it to compute self influence.

The helper is also changed to improve efficiency by reducing the number of times checkpoints are loaded. Even when computing self influence scores for a DataLoader yielding batches, it loads each checkpoint only once per call. This is because it now has an outer iteration over checkpoints and an inner iteration over batches (the reverse of the previous order). `influence` calls this helper when run in self influence mode.

We cannot simply increase the batch size to reduce the number of checkpoint loads: for large models (precisely those for which loading checkpoints is expensive), the model takes up so much memory that the batch size must stay small.

Minor change: for the `influence_src_dataset` argument of every `__init__`, add a description of the assumptions made about the batches yielded by the dataloader.

Reviewed By: NarineK

Differential Revision: D35603078

fbshipit-source-id: aff397c8278d60f1eb93f126d9703fe447c6ca71
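The loop order described in the summary (checkpoints outside, batches inside) can be illustrated with a small pure-Python sketch. This is not the code from this commit: `self_influence_sketch`, `load_checkpoint`, and `score_batch` are hypothetical names, the "model" is a single scalar weight, and the learning-rate factor is omitted. It assumes `batches` can be iterated more than once (a list or a DataLoader, not a one-shot generator).

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Example = Tuple[float, float]  # (x, y) pair for a toy scalar model


def self_influence_sketch(
    checkpoints: Sequence[float],
    batches: Iterable[Sequence[Example]],
    load_checkpoint: Callable[[float], float],
    score_batch: Callable[[float, Sequence[Example]], List[float]],
) -> List[float]:
    """Hypothetical helper: sum per-example scores over all checkpoints.

    Outer loop over checkpoints, inner loop over batches, so each
    checkpoint is loaded once per call regardless of the batch count.
    """
    totals: List[float] = []
    for i, ckpt in enumerate(checkpoints):
        weight = load_checkpoint(ckpt)  # the expensive step, done once here
        per_ckpt: List[float] = []
        for batch in batches:  # batches must be re-iterable
            per_ckpt.extend(score_batch(weight, batch))
        if i == 0:
            totals = per_ckpt
        else:
            totals = [t + s for t, s in zip(totals, per_ckpt)]
    return totals


# Toy stand-ins (assumptions for illustration only).
load_calls: List[float] = []


def toy_load(ckpt: float) -> float:
    load_calls.append(ckpt)  # record each load so it can be counted
    return ckpt


def toy_score(weight: float, batch: Sequence[Example]) -> List[float]:
    # Loss 0.5 * (w*x - y)**2 has gradient (w*x - y) * x with respect to w;
    # the per-example score is that gradient squared.
    return [((weight * x - y) * x) ** 2 for x, y in batch]


batches = [[(1.0, 2.0), (2.0, 2.0)], [(3.0, 3.0)]]
scores = self_influence_sketch([0.0, 1.0], batches, toy_load, toy_score)
loads = len(load_calls)
```

With two checkpoints and two batches, `toy_load` runs twice; the previous order (batches outside, checkpoints inside) would have called it four times.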
1 parent b84980a commit 5e8177d

5 files changed: +636 -294 lines changed
0 commit comments