NSSL-SJTU/HermesSim

model training

yyhhxx opened this issue · 3 comments

When training the model, /Dataset-1/pairs/validation/validation_functions.csv is used. How is this file generated? The lifting and preprocessing steps did not generate it.

Please refer to the previous work binary_function_similarity.

Sorry, I couldn't find a script that generates it. This file path is used in

func_info_csv_path=os.path.join(valdir, "validation_functions.csv"),

but after running binary_function_similarity/blob/main/DBs/Dataset-1/Dataset-1 creation.ipynb, I only got the following files in the pairs folder:
│   └── validation
│       ├── neg_validation_Dataset-1.csv
│       └── pos_validation_Dataset-1.csv

Sorry, I thought you were asking about a different file earlier.
I did indeed forget to upload the script that generates this file.
/Dataset-1/pairs/validation/validation_functions.csv simply contains a list of all functions in the validation dataset, and binary functions with the same function symbol are assigned the same group ID.
(I will try to find this script and upload it later...)
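
Until the original script is uploaded, here is a minimal sketch of how such a file might be rebuilt from the two pair files. It is not the authors' script. It assumes the pair CSVs have the columns idb_path_1, fva_1, func_name_1, idb_path_2, fva_2, func_name_2, and the output column names (idb_path, fva, func_name, group) are hypothetical, so check them against what the HermesSim data loader actually reads. The helper names are also made up for illustration.

```python
import csv


def collect_functions(pair_csv_paths):
    """Collect each unique function that appears on either side of any pair.

    Assumes (not confirmed by the thread) that every pair CSV has the columns
    idb_path_1, fva_1, func_name_1, idb_path_2, fva_2, func_name_2.
    """
    seen = set()
    functions = []
    for path in pair_csv_paths:
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                for side in ("1", "2"):
                    # A function is identified by its binary and its address.
                    key = (row["idb_path_" + side], row["fva_" + side])
                    if key in seen:
                        continue
                    seen.add(key)
                    functions.append({
                        "idb_path": key[0],
                        "fva": key[1],
                        "func_name": row["func_name_" + side],
                    })
    return functions


def assign_group_ids(functions):
    """Give functions that share a symbol name the same integer group ID."""
    group_of = {}
    for func in functions:
        # The first time a name is seen it gets the next unused ID.
        func["group"] = group_of.setdefault(func["func_name"], len(group_of))
    return functions


def write_functions_csv(functions, out_path):
    """Write the function list; the output column names are hypothetical."""
    fieldnames = ["idb_path", "fva", "func_name", "group"]
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(functions)
```

It could be called as write_functions_csv(assign_group_ids(collect_functions([pos_csv, neg_csv])), out_csv), where pos_csv and neg_csv point at pos_validation_Dataset-1.csv and neg_validation_Dataset-1.csv. Grouping by the bare symbol name follows the description above. If the real script also distinguishes projects or libraries, the grouping key would need to include that too.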