
Magnum-NLC2CMD is the winning solution for the NeurIPS 2020 NLC2CMD challenge.

Requirements

  • numpy
  • six
  • nltk
  • experiment-impact-tracker
  • scikit-learn
  • pandas
  • flake8==3.8.3
  • spacy==2.3.0
  • tb-nightly==2.3.0a20200621
  • tensorboard-plugin-wit==1.6.0.post3
  • torch==1.6.0
  • torchtext==0.4.0
  • torchvision==0.7.0
  • tqdm==4.46.1
  • OpenNMT-py==2.0.0rc2

How it works


  1. Create a virtual environment with Python 3.6 (for example, with virtualenv).
  2. git clone --recursive https://github.com/magnumresearchgroup/Magnum-NLC2CMD.git
  3. Run pip3 install -r requirements.txt for each of the two requirements files.

Data pre-processing

  1. To process the raw data, run the following two commands in order:

    python3 main.py --mode preprocess --data_dir src/data --data_file nl2bash-data.json
    cd src/model && onmt_build_vocab -config nl2cmd.yaml -n_sample 10347 --src_vocab_threshold 2 --tgt_vocab_threshold 2

  2. You can also download the original raw data here

Train

  1. cd src/model && onmt_train -config nl2cmd.yaml
  2. Set world_size in src/model/nl2cmd.yaml to the number of GPUs you are using, and list their IDs under gpu_ranks.
  3. You can also download one of our pre-trained models here
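
As a minimal sketch of step 2, the helper below derives the two values for a given GPU count. The function name gpu_settings is hypothetical and not part of the repository, and it assumes the GPUs are numbered from 0; the resulting values still have to be written into src/model/nl2cmd.yaml by hand.

```python
def gpu_settings(n_gpus):
    """Return the world_size and gpu_ranks values for n_gpus GPUs.

    Hypothetical helper: assumes GPU ids are 0..n_gpus-1.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return {"world_size": n_gpus, "gpu_ranks": list(range(n_gpus))}


# Example: a machine with two GPUs.
settings = gpu_settings(2)
print("world_size:", settings["world_size"])
print("gpu_ranks:", settings["gpu_ranks"])
```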

Inference

  1. onmt_translate -model src/model/run/model_step_2000.pt -src src/data/invocations_proccess_test.txt -output pred_2000.txt -gpu 0 -verbose
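
To compare several checkpoints, the same command can be generated once per training step. The sketch below only builds the argument lists; translate_command is a hypothetical helper, and the step numbers are assumptions about which checkpoints exist in src/model/run. Each list could be passed to subprocess.run to execute it.

```python
def translate_command(step, gpu=0):
    """Build the onmt_translate argument list for one checkpoint step."""
    return [
        "onmt_translate",
        "-model", f"src/model/run/model_step_{step}.pt",
        "-src", "src/data/invocations_proccess_test.txt",
        "-output", f"pred_{step}.txt",
        "-gpu", str(gpu),
        "-verbose",
    ]


# Assumed checkpoint steps; adjust to the files actually saved during training.
for step in (2000, 2400, 2500):
    print(" ".join(translate_command(step)))
```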

Evaluate

  1. python3 main.py --mode eval --annotation_filepath src/data/test_data.json --params_filepath src/configs/core/evaluation_params.json --output_folderpath src/logs --model_dir src/model/run --model_file model_step_2400.pt model_step_2500.pt

  2. To speed up inference, change gpu=-1 to gpu=0 in src/model/predict.py and replace the corresponding code in that file with the following:

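    # Tokenize each invocation and translate the whole batch at once
    invocations = [' '.join(tokenize_eng(i)) for i in invocations]
    translated = translator.translate(invocations, batch_size=n_batch)
    # translated[1] holds the predictions; keep the top result_cnt per invocation
    commands = [t[:result_cnt] for t in translated[1]]
    # translated[0] holds the scores; map them to confidences with exp(score) / 2
    confidences = [np.exp([x.item() for x in t[:result_cnt]]) / 2 for t in translated[0]]
    # Always report full confidence for the top prediction
    for i in range(len(confidences)):
        confidences[i][0] = 1.0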
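
The confidence mapping in the snippet above can be tried in isolation. This is a sketch on plain floats rather than tensors, assuming the translator scores are log-probabilities; scores_to_confidences is a hypothetical name, not a function in predict.py.

```python
import math


def scores_to_confidences(log_scores, result_cnt):
    """Map the top result_cnt scores to confidences as exp(score) / 2.

    The confidence of the first (best) prediction is then forced to 1.0.
    """
    top = log_scores[:result_cnt]
    confidences = [math.exp(s) / 2 for s in top]
    if confidences:
        confidences[0] = 1.0
    return confidences


# Illustrative scores only; real values come from translator.translate.
print(scores_to_confidences([-0.1, -1.2, -2.5], result_cnt=2))
```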


Accuracy metric

$$
Score(A(nlc)) =
\begin{cases}
\max_{p \in A(nlc)} S(p), & \text{if } \exists\, p \in A(nlc) \text{ such that } S(p) > 0 \\
\frac{1}{|A(nlc)|} \sum_{p \in A(nlc)} S(p), & \text{otherwise}
\end{cases}
$$
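
The aggregation rule can be sketched in a few lines. This assumes the per-prediction scores S(p) have already been computed and are passed in as a list of floats; aggregate_score is a hypothetical name and not the challenge's official evaluation code.

```python
def aggregate_score(scores):
    """Combine the per-prediction scores S(p) into Score(A(nlc))."""
    if not scores:
        raise ValueError("at least one prediction is required")
    if any(s > 0 for s in scores):
        # Some prediction scored above zero: take the best one.
        return max(scores)
    # No prediction scored above zero: take the average.
    return sum(scores) / len(scores)


print(aggregate_score([-1.0, 0.5, 0.2]))  # a positive score exists, so the maximum is used
print(aggregate_score([-1.0, -0.5]))      # no positive score, so the mean is used
```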

Reproduce

  1. We used a machine with 2x Nvidia 2080Ti GPUs and 64 GB of memory, running Ubuntu 18.04 LTS.
  2. Set batch_size in nl2cmd.yaml to the largest value your GPU can support without an out-of-memory (OOM) error.
  3. Train multiple models by changing seed in nl2cmd.yaml. Also change save_model each time to avoid overwriting existing models.
  4. Hand-pick the best-performing models on the local test set and put their directories in main.py.
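
Step 3 can be sketched as generating one configuration per seed, each with its own save_model path. The helper seed_variants and the base values below are hypothetical placeholders, not the values shipped in nl2cmd.yaml; writing each variant back to a YAML file is left out.

```python
def seed_variants(base, seeds):
    """Return one config dict per seed, each with a distinct save_model path."""
    variants = []
    for seed in seeds:
        cfg = dict(base)  # copy so the base config is not modified
        cfg["seed"] = seed
        cfg["save_model"] = f"{base['save_model']}_seed{seed}"
        variants.append(cfg)
    return variants


# Placeholder base values for illustration only.
base = {"seed": 1234, "save_model": "run/model"}
for cfg in seed_variants(base, [1, 2, 3]):
    print(cfg["seed"], cfg["save_model"])
```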



This work was supported in part by NSF Award# 1552836, At-scale analysis of issues in cyber-security and software engineering.


See the LICENSE file for license rights and limitations (MIT).