Participant #10: Team Ericsson-RISE, Ericsson & RISE
chenzimin opened this issue · 3 comments
Created for Jesper and Olof from Ericsson and RISE for discussions. Welcome!
Copy+paste from email sent to the organizers
Shared here for the sake of openness:
I am sorry to say that despite our efforts, we have failed to beat the string-distance baseline.
We will therefore not submit a new final version; at best it would perform on par with our previous model while wasting a lot more cycles.
Just some background on what we have done for the sake of openness:
We set a challenging condition upon ourselves: whatever model we trained, we wanted it to be a learned model that did not rely on any parser or AST.
The main model was a character-based bidirectional RNN with which we encoded each replacement line and each candidate line (±1 line for context).
We then used the cosine similarity between these two embeddings to decide whether the candidate line should be replaced.
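The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the actual model: the weights here are random and untrained, and all sizes, names, and the plain tanh RNN cell are assumptions (the email does not specify the architecture in detail).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only.
VOCAB = 128          # ASCII vocabulary for a character-based model
EMB, HID = 16, 32    # embedding and hidden dimensions

E = rng.normal(scale=0.1, size=(VOCAB, EMB))        # character embeddings
W_f = rng.normal(scale=0.1, size=(HID, EMB + HID))  # forward RNN weights
W_b = rng.normal(scale=0.1, size=(HID, EMB + HID))  # backward RNN weights

def rnn_pass(chars, W):
    """Run a plain tanh RNN over a character sequence; return the final hidden state."""
    h = np.zeros(HID)
    for c in chars:
        x = E[min(ord(c), VOCAB - 1)]
        h = np.tanh(W @ np.concatenate([x, h]))
    return h

def encode(line):
    """Bidirectional encoding: concatenate forward and backward final states."""
    return np.concatenate([rnn_pass(line, W_f), rnn_pass(reversed(line), W_b)])

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Rank candidate lines by embedding similarity to the replacement line.
replacement = "return x + 1;"
candidates = ["return x + 2;", "int y = 0;", "y++;"]
scores = [cosine(encode(replacement), encode(c)) for c in candidates]
best = candidates[int(np.argmax(scores))]
```

With trained weights, the highest-scoring candidate would be the line proposed for replacement; with random weights the ranking is of course meaningless.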
Unfortunately, it did not perform very well: this simple model had a recall@1 of about 0.65 on dataset 4 (significantly lower than string distance), and its predictions correlated very strongly with string distance.
We are fairly confident that whatever it learned was closely related to string distance.
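For context, recall@1 here is the fraction of tasks where the top-ranked candidate is the correct one, and the string-distance baseline ranks candidates by edit distance. A minimal sketch of both (the toy lines are hypothetical; the real datasets are much larger):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def recall_at_1(tasks, score):
    """Fraction of tasks whose lowest-scoring candidate is the correct one.

    Each task is (replacement_line, candidate_lines, correct_candidate);
    `score` ranks candidates, lower is better (as with edit distance).
    """
    hits = 0
    for replacement, candidates, correct in tasks:
        best = min(candidates, key=lambda c: score(replacement, c))
        hits += (best == correct)
    return hits / len(tasks)

tasks = [
    ("return x + 1;", ["return x + 2;", "int y = 0;"], "return x + 2;"),
    ("i += 1;", ["j -= 1;", "i += 2;"], "i += 2;"),
]
print(recall_at_1(tasks, levenshtein))  # → 1.0
```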
We also tried to boost it with an ensemble architecture: the input was the replacement line (this time as bag-of-words features), and the output was which model (out of the several we tried) would perform best on that particular replacement line.
But since the RNN's predictions were so similar to string distance, the two were hard to tell apart.
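The shape of that ensemble selector might look like the sketch below. In the actual ensemble the line-to-model mapping was learned; here a trivial hand-written rule, with hypothetical model names, stands in for the learned classifier.

```python
from collections import Counter

def bow(line):
    """Bag-of-words features for a line: token -> count."""
    return Counter(line.split())

def select_model(features):
    """Pick which scorer to trust for this line.

    Stand-in rule (the real selector was learned): route lines containing
    'import' to a hypothetical RNN scorer, everything else to string distance.
    """
    return "rnn_scorer" if "import" in features else "string_distance"

features = bow("import java.util.List ;")
chosen = select_model(features)
```

The observed problem follows directly: if the RNN scorer's rankings nearly always agree with string distance, the selector has no signal for choosing between them.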
This means we have to give up at this point.
Thank you for organizing this challenge; we have a new respect for the difficulty of applying ML to formal languages.
Thanks again,
Jesper & Olof
> We are fairly confident that whatever it learned was closely related to string distance.
This is interesting in itself: that a meaningful string-distance metric, perhaps something equivalent to tf-idf, can be learned in a black-box manner.
Maybe, although we don't really know that; we haven't looked into the details that deeply.
Worth mentioning, though, is that we used a character-based model, so it does not necessarily learn words.
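To illustrate the distinction: a character-based model consumes individual characters, so a token like an identifier is never a single unit to it, whereas tf-idf is usually computed over word-level tokens.

```python
line = "counter += 1"
word_tokens = line.split()   # what a word-level (tf-idf style) model sees
char_tokens = list(line)     # what a character-based model sees
```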