A question about the source code (regarding the training datasets)
Opened this issue · 2 comments
Thanks for your amazing work! It helped a lot when I was struggling with the paper.
I noticed that your dataset consists of Python code, so using a Python library to parse the code into an AST structure is very convenient. I am working on a project that requires C/C++ code as training data. Any guidance or thoughts?
I did a bit of digging and found that no single library does all the work. I have tried Clang, but I am having trouble saving the tree-structured data to a local file (this file will be used in training).
If any of this makes sense to you, would you please give me some hints or guidance?
Thanks again!
Please, how did you run the code to build the AST? I found a lot of scripts and I don't know which one to choose, or how I can transform the C data (algorithms) into ASTs. Thanks a lot.
You can refer to my improved version here: https://github.com/bdqnghi/tbcnn-tensorflow, which includes more detailed instructions and reproduces the results of the original paper using the original C dataset.
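On the narrower question of saving a tree to a local file, here is a minimal sketch of one possible approach, not the method used in either repository. It flattens any tree into nested dictionaries and writes them out as JSON. The names `tree_to_dict`, `python_ast_to_dict`, and `save_tree` are hypothetical, and the demo uses Python's built-in `ast` module only because it runs without extra dependencies. For C/C++ the same helper could presumably be driven by a C parser instead, for example by passing `cursor.kind.name` and `cursor.get_children()` from the libclang Python bindings; that pairing is an assumption and has not been tested here.

```python
import ast
import json


def tree_to_dict(node, label_of, children_of):
    """Convert any tree into nested dicts: {"node": label, "children": [...]}.

    label_of(node)    -> string label for a node (hypothetical callback)
    children_of(node) -> iterable of child nodes (hypothetical callback)
    """
    return {
        "node": label_of(node),
        "children": [
            tree_to_dict(child, label_of, children_of)
            for child in children_of(node)
        ],
    }


def python_ast_to_dict(source):
    """Parse Python source and label each AST node with its class name."""
    tree = ast.parse(source)
    return tree_to_dict(tree, lambda n: type(n).__name__, ast.iter_child_nodes)


def save_tree(data, path):
    """Write the nested-dict tree to a JSON file for later use in training."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)


if __name__ == "__main__":
    sample = "def add(a, b):\n    return a + b\n"
    # Print the serialized tree; call save_tree(data, "some_file.json") to store it.
    print(json.dumps(python_ast_to_dict(sample), indent=2))
```

JSON is used here only because it preserves the parent/child nesting and is easy to load again before training; any format that keeps the tree structure would serve the same purpose.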