xiaojunxu/dnn-binary-code-similarity

Where do the features come from?

Opened this issue · 4 comments

Hi,

When I looked at the data, I saw that in each line, you have features of each node. I'm wondering how you generated those features. I didn't find that in your paper, either. I really appreciate your reply. Thanks a lot.

Hello, do you know how to generate the ACFG?

I would also like to know how they generated the input feature JSON file. I have no idea how to convert my binary into a valid input format.

Actually, I noticed that Genius uses the same features as Gemini, and this is the Genius repo: https://github.com/qian-feng/Gencoding

What confuses me now is that Gemini seems to have 7 features, but Genius generates 8. Besides, both papers describe 8 features, but the order of the generated features seems different from the order given there. Moreover, I could not find the Betweenness feature. That's weird.
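
For anyone who wants to check the feature count themselves, here is a minimal sketch that reports the per-node feature vector length in one line of the data. It assumes each line is a JSON object and that the per-node vectors live under a key named `features`; that key name and the sample record are my assumptions for illustration, so adjust them to whatever the actual files contain.

```python
import json

def feature_dims(json_line, key="features"):
    """Return the set of per-node feature vector lengths in one record.

    `key` is an assumed field name; change it to match the real data.
    """
    record = json.loads(json_line)
    return {len(vec) for vec in record[key]}

# Made-up record with two nodes and 7 values per node (not real data).
sample = '{"n_num": 2, "succs": [[1], []], "features": [[0, 1, 2, 3, 4, 5, 6], [1, 0, 0, 2, 1, 0, 0]]}'
print(feature_dims(sample))
```

If every line yields a single length, that number is the feature dimension the dataset was built with.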

I think they removed Betweenness, so the remaining set requires only basic-block-level attributes and the number of offspring (which is cheap to compute).
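
To illustrate the offspring feature, here is a rough sketch of counting offspring from a successor list. It assumes "offspring" means every block reachable from a given block, excluding the block itself; if the authors mean direct children only, the value would simply be `len(succs[i])`. The function name and the toy graph are hypothetical, and this is not taken from either repo.

```python
def offspring_counts(succs):
    """For each node, count the distinct nodes reachable from it.

    `succs[i]` is the list of successor indices of node i.
    The start node itself is not counted, even if it sits on a cycle.
    """
    counts = []
    for start in range(len(succs)):
        seen = set()
        stack = list(succs[start])
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(succs[node])
        seen.discard(start)
        counts.append(len(seen))
    return counts

# Toy diamond-shaped CFG: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
succs = [[1, 2], [3], [3], []]
print(offspring_counts(succs))  # [3, 1, 1, 0]
```

This only needs a reachability traversal per block and no shortest-path computation.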