CogComp/saul

training with disconnected nodes to head

kordjamshidi opened this issue · 5 comments

At the moment, when training a set of classifiers, we start from the most global object, i.e. the head, and derive the other objects (those of the destination nodes) from it. This makes us lose all the examples that are not connected to a head object, which brings down the performance of the single classifiers in the jointLearning setting. One way to fix this is to use the edge from the head node to the specific destination node that we need, and then use all the instances of the destination node. This is in contrast to the current implementation, in which we get the instances of the head node first and then go from each head instance to the connected instances of the destination node.
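To make the difference concrete, here is a minimal sketch in plain Scala collections. It does not use Saul's API: `Head`, `Dest`, `headToDest` and the other names are hypothetical stand-ins, and the `distinct` call is only there to keep the toy comparison readable.

```scala
object HeadDerivedVsNodeInstances {
  // Hypothetical toy types; these are not Saul classes.
  final case class Head(id: Int)
  final case class Dest(id: Int)

  // All training instances stored in each node.
  val headInstances: Seq[Head] = Seq(Head(1), Head(2))
  val destInstances: Seq[Dest] = Seq(Dest(10), Dest(11), Dest(12), Dest(13))

  // Instance-level links of the head -> destination edge.
  // Dest(12) and Dest(13) are not linked to any head.
  val headToDest: Map[Head, Seq[Dest]] = Map(
    Head(1) -> Seq(Dest(10)),
    Head(2) -> Seq(Dest(10), Dest(11))
  )

  // Head-first traversal as described above: start from the heads, follow the edge.
  val derivedFromHeads: Seq[Dest] =
    headInstances.flatMap(h => headToDest.getOrElse(h, Seq.empty)).distinct

  // Proposed alternative: take every instance of the destination node.
  val directFromNode: Seq[Dest] = destInstances

  // Examples that the head-first traversal never reaches.
  val lost: Seq[Dest] = directFromNode.filterNot(derivedFromHeads.contains)

  def main(args: Array[String]): Unit =
    println(s"derived=${derivedFromHeads.size}, direct=${directFromNode.size}, lost=${lost.size}")
}
```

In this toy setup the head-first traversal reaches two of the four destination instances, so half of the potential training examples for the single classifier are never used.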

This step is not clear to me:

One way to fix this is to use the edge from the head node to the specific destination node that we need, and then use all the instances of the destination node.

Could you elaborate, or give an example?

I have x = node[HEAD], y = node[T], an edge e = edge(x, y), and cc = ConstrainedClassifier[T, HEAD](base-classifier). This gives us the training instances of y starting from x: x.getTrainingInstances.foreach { t => cc.getCandidates(t) }. But we need to get all instances of y directly. I was wondering whether it is possible to use the edge between the nodes, not the instances, to go to y and call getTrainingInstances there. Now I think another solution is to make ConstrainedClassifier node-based as well, the same as Learnable. If we have access to the nodes in there, we can get all the training instances directly from the destination node.
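A rough sketch of what a node-based variant could look like, assuming the classifier is handed both nodes at construction time. `ToyNode` and `ToyConstrainedClassifier` are hypothetical stand-ins written for this comment, not Saul's `Node` or `ConstrainedClassifier`, and the `candidates` function only plays the role of getCandidates.

```scala
object NodeBasedConstrainedSketch {
  // Hypothetical stand-in for a data-model node; not Saul's Node class.
  final class ToyNode[A](val trainingInstances: Seq[A])

  // Hypothetical stand-in for a constrained classifier that knows its nodes.
  final class ToyConstrainedClassifier[T, HEAD](
      headNode: ToyNode[HEAD],
      destNode: ToyNode[T],
      candidates: HEAD => Seq[T] // plays the role of getCandidates
  ) {
    // Head-first traversal: only instances reachable from some head.
    def instancesViaHead: Seq[T] =
      headNode.trainingInstances.flatMap(candidates).distinct

    // Node-based access: every training instance of the destination node.
    def instancesViaNode: Seq[T] = destNode.trainingInstances
  }

  // Toy data: sentence ids as heads, "sentenceId-token" strings as destinations.
  val sentences = new ToyNode[String](Seq("s1", "s2"))
  val tokens = new ToyNode[String](Seq("s1-a", "s1-b", "s2-a", "orphan"))

  val cc = new ToyConstrainedClassifier[String, String](
    sentences,
    tokens,
    head => tokens.trainingInstances.filter(_.startsWith(head + "-"))
  )
}
```

Here `cc.instancesViaHead` misses the "orphan" instance, while `cc.instancesViaNode` returns all four, which is the behaviour we want for training the single classifier.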

Edge between nodes: The edges between nodes would give ALL the instances, not the relevant subset. Am I misunderstanding?
I'm still not clear. Do you want to delete the definition of pathToHead and just use the node information?

Relevant to #322

Edges will give all the connected instances. But we need to think about it and make it well defined in every situation. I am talking about using the examples that are not necessarily connected to a head object but can still serve as examples for the single classifiers. For example, in CoNLL, relations are not a good starting point for deriving the CoNLLToken examples. We miss many of them if we only follow the edges from relations to tokens.

👍 Agree with the need for simplification. We can still have pathToHead as the edge specification, but internally we should deal directly with node items. We do this for the test method in ConstrainedClassifier. The current test method looks at all instances of head that may not be participating in any constraint.
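One possible reading of "keep pathToHead, but iterate over node items", sketched with toy types. Everything here is hypothetical: `Token`, `Relation` and this `pathToHead` are illustrations rather than Saul definitions, and treating items without a head as unconstrained is an assumption, not something decided in this thread.

```scala
object PathToHeadPerItemSketch {
  // Hypothetical toy types standing in for CoNLL tokens and relations.
  final case class Token(text: String)
  final case class Relation(args: Seq[Token])

  val obama = Token("Obama")
  val hawaii = Token("Hawaii")
  val the = Token("the")

  // All items of the destination node, connected to a head or not.
  val tokens: Seq[Token] = Seq(obama, hawaii, the)

  // Only one relation exists, so the token "the" has no head.
  val relations: Seq[Relation] = Seq(Relation(Seq(obama, hawaii)))

  // Hypothetical pathToHead: from an item to its head, if it has one.
  def pathToHead(t: Token): Option[Relation] =
    relations.find(_.args.contains(t))

  // Iterate over node items directly; the edge only decides whether
  // a constraint can apply to a given item.
  val (constrained, unconstrained) =
    tokens.partition(t => pathToHead(t).isDefined)
}
```

With this split, every token remains available as an example, and only the tokens in `constrained` would need to go through the head object.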