About the paper.
Opened this issue · 2 comments
anzosasuke commented
Figure 2 of the paper shows an input sequence. What does that sequence contain? It looks like the raw bytes of the disassembled code.
davidepi commented
Yes, just the bytes belonging to the .text section of the binary.
The entire data is divided into chunks of 2048 bytes, and each chunk is submitted to the network.
During training, a random portion of these chunks is replaced by zeroes, for the reason I explained in #1.
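
The chunking and zeroing described above could look roughly like the sketch below. It is not the repository's code: the function names, the zero-padding of the last chunk, the `fraction` parameter, and the choice to zero one contiguous span inside a chunk are all assumptions made for illustration.

```python
import random
from typing import List

CHUNK_SIZE = 2048  # chunk length mentioned in this thread


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Split raw .text bytes into fixed-size chunks.

    Assumption: a short final chunk is padded with zero bytes.
    """
    chunks = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        chunks.append(chunk.ljust(chunk_size, b"\x00"))
    return chunks


def zero_random_span(chunk: bytes, fraction: float, rng: random.Random) -> bytes:
    """Replace one randomly placed contiguous span of the chunk with zeroes.

    Assumption: `fraction` (hypothetical parameter) is the share of the
    chunk to blank out; the real masking scheme may differ.
    """
    span = int(len(chunk) * fraction)
    start = rng.randint(0, len(chunk) - span)  # inclusive bounds
    return chunk[:start] + bytes(span) + chunk[start + span:]
```

For example, `split_into_chunks(text_bytes)` followed by `zero_random_span(chunk, 0.25, random.Random(0))` on each chunk would give fixed-length training inputs with a quarter of each chunk blanked out.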
anzosasuke commented
Hi, I would like to generate the dataset. How do you think I should proceed? There is a binaryds.py file in there, along with similar test files. I think you also used radare to retrieve the .text section. Am I correct?