Found two bugs that could cause inferior performance compared with the original paper
jind11 opened this issue · 17 comments
Hi, I have carefully read your code and found two bugs that could potentially cause inferior performance compared with the original paper:
1. In the new_convolution function, the self.tanh() activation after the convolution layer is missing.
2. In the original paper, the convolution kernel size is 1 since the input is already a trigram, so there is no need to use a kernel size of 3 in the new_convolution function.
If you have any doubts about my comments, you are welcome to discuss them with me. Thanks!
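For clarity, here is a minimal sketch of what the fixed layer could look like (the class name, channel sizes, and tensor shapes are my own assumptions for illustration, not the repo's exact code):

```python
import torch
import torch.nn as nn

class NewConvolution(nn.Module):
    """Hypothetical fixed version of new_convolution."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # The input is already a trigram representation, so kernel_size=1
        # suffices (the original code used kernel_size=3).
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size=1)
        self.tanh = nn.Tanh()

    def forward(self, x):
        # x: (batch, in_channels, seq_len)
        # tanh must be applied explicitly; it was previously missing.
        return self.tanh(self.conv(x))
```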
Ohh, I think the conv function includes a tanh activation. That is an interesting point about the kernel; I think you may be right. I will reconsider the kernel implementation. Thanks for the discussion. Let's keep in touch.
Hi, thanks for the quick response. But I am sure the conv function in PyTorch does not include any activation function.
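You can verify this with a throwaway check (assuming a standard PyTorch install):

```python
import torch
import torch.nn as nn

conv = nn.Conv1d(4, 4, kernel_size=1)
x = torch.randn(1, 4, 10) * 100      # deliberately large inputs
y = conv(x)
# If tanh were built into nn.Conv1d, |y| would be bounded by 1; it is not.
print(y.abs().max().item())          # prints a value far greater than 1
```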
I read the paper a second time and found that the kernel size should indeed be 1. And the conv layer does not include an activation; I looked that up. Could we keep in contact? You helped a lot. Thank you.
Sure! My email is jindi15@mit.edu. I also have some other revisions to your code; if you want, I can send them to you for reference.
Copy that. So nice of you.
Hi, what is your email? I have some other questions about understanding the algorithm. Or if you are in the US, we can have a phone call. My phone number is 617-710-6221.
Sorry, I sent you an email 4 days ago. Maybe my Gmail cannot get through the firewall. Oh, I am in China, by the way; the VPN policy restricts us even further.
My working email is dgai_ruc@aliyun.com.
Hi, I sent you an email at dgai_ruc@aliyun.com but did not get a reply. My WeChat is jindi930617; feel free to add me if you want.
Hi everybody,
I cannot reproduce the results at all (~25%). What was your final macro F1-score?
Best
@lawlietAi Hi, I have sent you an email about the same confusion. Looking forward to your reply.
Hi, any progress? What's your F1 score now?
Hi, did you reproduce the performance reported in the original paper? I implemented it with TF, but my performance is much lower.
No, I could never replicate the performance reported in that paper. My performance is not good, around 82.7%.
Why is my model's accuracy only 25%?
Have you found the cause of the 25% accuracy?