Analysis interface design
Currently, users can add and use analysis plug-ins as follows:

```python
analog.add_analysis({"influence": InfluenceFunction})  # add your custom analysis
analog.influence.compute_influence_all(test_log, log_loader)  # data attribution
analog.influence.compute_self_influence(test_log)  # uncertainty
```
Issues
I believe this interface is somewhat unnatural for two reasons:
- Users can arbitrarily set the name of the analysis plug-in, which can be confusing.
- While users access attributes of other handlers directly via `AnaLog` (e.g. `analog.get_hessian_state()` calls `analog.hessian_handler.get_hessian_state()`), users need to explicitly specify the name of the analysis plug-in to use its attributes (e.g. `analog.influence.compute_influence`).
Potential Solutions
- Make `AnaLog` return the analysis plug-in so that users can use it directly:

```python
if_computer = analog.add_analysis(InfluenceFunction)
if_computer.compute_influence_all(test_log, log_loader)  # data attribution
if_computer.compute_self_influence(test_log)  # uncertainty
```
- Similar to other handlers, manually implement proxy attributes in `AnaLog` for all existing analysis plug-ins' attributes:

```python
analog.compute_influence_all(test_log, log_loader)  # data attribution
analog.compute_self_influence(test_log)  # uncertainty
```
- Use a registry, as in fairseq (see https://github.com/facebookresearch/fairseq/blob/main/fairseq/models/transformer/transformer_legacy.py#L157):

```python
@register_analysis_plugin("influence")
class InfluenceFunction(AnalysisBase):
    ...

    @register_analysis("influence", "compute_influence")
    @torch.no_grad()
    def compute_influence(self, src, tgt, preconditioned=False, damping=None):
        ...

analog.compute_influence_all(test_log, log_loader)  # data attribution
analog.compute_self_influence(test_log)  # uncertainty
```
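As an aside, the proxies in solution 2 would not necessarily have to be written one by one: Python's `__getattr__` fallback can forward unknown attribute lookups to registered plug-ins automatically. The following is only a sketch under that assumption; the `AnaLog` and `InfluenceFunction` classes here are simplified stand-ins, not the real implementations.

```python
class InfluenceFunction:
    """Toy stand-in for an analysis plug-in."""
    def compute_self_influence(self, test_log):
        return f"self-influence of {test_log}"

class AnaLog:
    """Sketch: forward unknown attribute lookups to analysis plug-ins,
    so solution 2's proxies need not be implemented manually."""
    def __init__(self):
        self._analyses = {}

    def add_analysis(self, name, plugin_cls):
        self._analyses[name] = plugin_cls()
        return self._analyses[name]  # returning the instance also covers solution 1

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails; search plug-ins.
        for plugin in self._analyses.values():
            if hasattr(plugin, attr):
                return getattr(plugin, attr)
        raise AttributeError(attr)

analog = AnaLog()
analog.add_analysis("influence", InfluenceFunction)
print(analog.compute_self_influence("test_log"))  # forwarded to the plug-in
```

One caveat of this approach is that method-name collisions between plug-ins would be resolved silently by registration order, which may argue for keeping per-plug-in namespaces.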
The first solution is useful when users want to try out their custom analysis tools, while the second solution may provide a more consistent programming interface. If done correctly, the third solution may enjoy the benefits of both 1 and 2, but I am not sure whether this is achievable. Any opinions will be much appreciated!
Solution 1 looks like it would be clunkier when we have many analysis plug-ins running at once. To me, solutions 2/3 seem cleaner here, although again, with a large enough set of plug-ins, they could also pollute the high-level namespace. Actually, I think that if the interface to add plug-ins were just `analog.add_analysis([InfluenceFunction, ...])`, and the plug-ins were defined using a `@register_analysis_plugin("influence")` decorator, then keeping the `.influence` namespace would be a bit cleaner overall.

Although this solves issue 1, it doesn't really solve issue 2. But if the plug-ins do not enforce an API for performing the computations (like a `.compute_<plugin_name>()` method, say), then maybe it is best to keep them in separate namespaces?
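For concreteness, a minimal registry decorator of the fairseq flavor could look like the sketch below. The names (`ANALYSIS_REGISTRY`, `register_analysis_plugin`) are illustrative assumptions, not AnaLog's actual API, and a real registry would also handle things like argument parsing and module discovery.

```python
# Hypothetical minimal registry sketch; names are illustrative only.
ANALYSIS_REGISTRY = {}

def register_analysis_plugin(name):
    """Class decorator that records a plug-in class under a fixed, canonical name."""
    def wrapper(cls):
        if name in ANALYSIS_REGISTRY:
            raise ValueError(f"analysis plug-in {name!r} is already registered")
        ANALYSIS_REGISTRY[name] = cls
        return cls
    return wrapper

@register_analysis_plugin("influence")
class InfluenceFunction:
    def compute_self_influence(self, test_log):
        return f"self-influence of {test_log}"

# add_analysis([...]) could then instantiate classes and expose each one
# under its registered name, e.g. analog.influence.
plugin = ANALYSIS_REGISTRY["influence"]()
print(plugin.compute_self_influence("test_log"))
```

Because the name is fixed at class-definition time by the decorator rather than chosen at `add_analysis` call time, this would address the arbitrary-naming concern (issue 1) while still allowing per-plug-in namespaces.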
For now, we decided to give users both options and see which they prefer. This is implemented in PR #85.