cisco/joy

Cannot run model.py

Closed this issue · 2 comments

Hi,

I am trying to run model.py to generate a new set of params, but I am encountering the following issue. I have double-checked that there is indeed a malware.gz under malware_train and a benign.gz under benign_train, both generated by Joy. The numbers of positive and negative samples are both zero; is something going wrong when the generated files are opened?

Thanks very much for your help.
Xiaoban

/joy/analysis$ python model.py -m -l -t -p ../benign_train/ -n ../malware_train/ -o params.txt
Num Positive: 0
Num Negative: 0

Features Used:
Metadata (7)
Packet Lengths (100)
Packet Times (100)
Total Features: 207

/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/logistic.py:433: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
  FutureWarning)
Traceback (most recent call last):
  File "model.py", line 150, in <module>
    main()
  File "model.py", line 146, in main
    learn_param(data, labels, args.output)
  File "model.py", line 49, in learn_param
    logreg.train(data, labels)
  File "/home/acanets/joy/analysis/classifier.py", line 58, in train
    self.logreg.fit(data,labels)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/logistic.py", line 1285, in fit
    accept_large_sparse=solver != 'liblinear')
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 756, in check_X_y
    estimator=estimator)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 552, in check_array
    "if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

I think the file generated by the command "./bin/joy bidir=1 dist=1 ../malware/*.pcap > ../malware_train/malware.gz" is not actually a gzip-compressed file, so "data_parser.py" fails to read it, although the "try" block there does not surface any error.

After I changed "with gzip.open(json_file,'r') as fp:" to "with open(json_file,'r') as fp:", it works fine.
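For anyone who wants to avoid editing the open call by hand, here is a minimal sketch of a reader that accepts both compressed and uncompressed output. It is not code from data_parser.py: the helper names (is_gzip, open_flow_file, read_flows) are hypothetical, and it assumes the Joy output holds one JSON object per line. It checks for the two-byte gzip magic number and picks gzip.open or plain open accordingly.

```python
import gzip
import json

# Every gzip stream begins with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as fp:
        return fp.read(2) == GZIP_MAGIC


def open_flow_file(path):
    """Open a file in binary mode, decompressing only if it is really gzip.

    Hypothetical helper, not part of the joy repository.
    """
    if is_gzip(path):
        return gzip.open(path, "rb")
    return open(path, "rb")


def read_flows(path):
    """Yield one parsed JSON object per non-empty line (assumed layout)."""
    with open_flow_file(path) as fp:
        for line in fp:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            yield json.loads(line.decode("utf-8"))
```

With this, a file named malware.gz is read the same way whether or not Joy was built with gzip support, since the decision is based on the file contents and not on its extension.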

If this is being done with the latest code in the repo, make sure to add the "--enable-gzip" flag when running the configure command so that gzip output is turned on:

./configure --enable-gzip