malware-viz

Malware visualization in 2D using t-SNE

Contents:

  • sally.cfg : configuration file for the sally tool
  • stoptokens.txt : stop tokens to eliminate from the input (00 and ??)
  • preprocess.sh and preprocess_test.sh : for preprocessing .bytes files
  • MalwareFeatExtAndViz.ipynb : notebook for experimenting with feature extraction, t-SNE, and plots
  • FeatSelectionTrainTest.ipynb : notebook we used to generate predictions for the test instances with a t-SNE + SVM classifier
  • MalwareFeatureExtraction-spark-databricks.ipynb : notebook we used on Databricks to experiment with a PySpark implementation
  • test_instances_predictions.csv : predictions file from our late submission to Kaggle (log loss = 0.1719; for comparison, a classifier that assigns equal probability to every class scores 2.1972)
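
As a rough illustration of what the stop tokens in stoptokens.txt accomplish, the sketch below drops 00 and ?? from one line of a .bytes file. It is not how Sally or the preprocess scripts work internally. The line layout (an address followed by hex byte tokens), the sample line, and the name filter_bytes_line are assumptions made for this example only.

```python
# Hypothetical sketch: remove stop tokens from one line of a .bytes file.
# Assumed line layout: "<address> <byte> <byte> ...", e.g. "00401000 56 8D ...".

STOP_TOKENS = {"00", "??"}  # the two tokens named for stoptokens.txt


def filter_bytes_line(line, stop_tokens=STOP_TOKENS):
    """Return the byte tokens of one line, without the address and stop tokens."""
    fields = line.split()
    # Assumption: the first field is the address, not a byte value.
    tokens = fields[1:]
    return [t for t in tokens if t not in stop_tokens]


sample = "00401000 56 8D 00 ?? 44 00 24"  # made-up line for illustration
print(filter_bytes_line(sample))  # ['56', '8D', '44', '24']
```

Removing these tokens before feature extraction keeps padding bytes and unknown bytes from dominating the token counts.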
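
The 2.1972 baseline can be checked by hand. The sketch below computes multiclass log loss and the score of a classifier that gives every class the same probability. The class count of 9 is an assumption inferred from the fact that ln 9 ≈ 2.1972; the function names are made up for this example.

```python
import math


def uniform_log_loss(n_classes):
    """Log loss when every class is predicted with probability 1 / n_classes."""
    return -math.log(1.0 / n_classes)


def log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log probability assigned to the true class.

    true_labels: list of class indices; predicted_probs: list of per-class
    probability lists. Probabilities are clipped to avoid log(0).
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = min(max(probs[label], eps), 1 - eps)
        total += -math.log(p)
    return total / len(true_labels)


# Assumption: 9 classes, chosen because ln(9) matches the quoted baseline.
print(round(uniform_log_loss(9), 4))  # 2.1972
```

Under this assumption, the submission's 0.1719 corresponds to an average probability of about exp(-0.1719) ≈ 0.84 on the true class, in the geometric-mean sense.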