ZGC-LLM-Safety/TrafficLLM

NotADirectoryError: [Errno 20] Not a directory: '../datasets/raw_data/ustc-tfc-2016/ustc-tfc-2016_detection_packet_test.json'

Dear author,

I encountered an issue when running the command:

python preprocess_dataset.py --input /Your/Raw/Dataset/Path --dataset_name /Your/Raw/Dataset/Name --traffic_task detection --granularity packet-level --output_path /Your/Output/Dataset/Path --output_name /Your/Output/Dataset/Name

The error is:

NotADirectoryError: [Errno 20] Not a directory: '../datasets/raw_data/ustc-tfc-2016/ustc-tfc-2016_detection_packet_test.json'

I downloaded the ustc-tfc-2016 files from the training datasets link. Could you please confirm whether these files are already preprocessed, or whether they are the raw files that the script expects as input?

Thank you!

The training datasets are already preprocessed and can be used directly to train LLMs in steps 2.4 and 2.5. The preprocessing code only extracts training data from raw traffic (i.e., .pcap files). If you want to reproduce the extraction of training data from the raw USTC TFC 2016 dataset, please download the raw dataset from its released link.
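
If it helps, here is a minimal sketch of a check you could run before calling preprocess_dataset.py, to confirm that the input path is a folder of raw captures rather than one of the preprocessed .json files. It is not part of TrafficLLM; the helper names find_raw_captures and looks_like_raw_dataset are hypothetical, and treating .pcapng as a raw capture alongside .pcap is an assumption.

```python
from pathlib import Path

# Assumption: raw traffic files end in .pcap (per the reply) or .pcapng.
RAW_SUFFIXES = {".pcap", ".pcapng"}


def find_raw_captures(input_path):
    """Return the sorted capture files found under input_path.

    Raises NotADirectoryError if input_path is not a directory,
    for example when it points at a preprocessed .json file.
    """
    root = Path(input_path)
    if not root.is_dir():
        raise NotADirectoryError(
            f"{root} is not a directory; expected a folder of raw captures"
        )
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in RAW_SUFFIXES
    )


def looks_like_raw_dataset(input_path):
    """Return True only if input_path is a directory with at least one capture."""
    try:
        return len(find_raw_captures(input_path)) > 0
    except NotADirectoryError:
        return False
```

For example, looks_like_raw_dataset("/Your/Raw/Dataset/Path") returning False would suggest that the path holds already preprocessed data, which can be used for training directly without running the preprocessing script.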

I hope this reply can help you.

Thank you very much.