facebookresearch/CodeGen

Could anyone help me with this error: failed to learn BPE?

dinaalaaahmed opened this issue · 0 comments

INFO - 11/18/21 08:01:43 - 0:01:18 - training bpe on /home/dina/CodeGen/data/test_dataset/cpp-java-python.sa-cl.tok.shuf.50gb...
Traceback (most recent call last):
File "/home/dina/miniconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/dina/miniconda3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/dina/CodeGen/codegen_sources/preprocessing/preprocess.py", line 214, in <module>
preprocess(args)
File "/home/dina/CodeGen/codegen_sources/preprocessing/preprocess.py", line 102, in preprocess
dataset.learn_bpe(ncodes=args.ncodes, executor=cluster_train_bpe)
File "/home/dina/CodeGen/codegen_sources/preprocessing/dataset_modes/dataset_mode.py", line 589, in learn_bpe
self._learn_bpe(ncodes, executor)
File "/home/dina/CodeGen/codegen_sources/preprocessing/dataset_modes/monolingual_functions_mode.py", line 123, in _learn_bpe
job.result()
File "/home/dina/.local/lib/python3.8/site-packages/submitit/core/core.py", line 263, in result
r = self.results()
File "/home/dina/.local/lib/python3.8/site-packages/submitit/core/core.py", line 291, in results
raise job_exception # pylint: disable=raising-bad-type
submitit.core.utils.FailedJobError: Job (task=0) failed during processing with trace:

Traceback (most recent call last):
File "/home/dina/.local/lib/python3.8/site-packages/submitit/core/submission.py", line 53, in process_job
result = delayed.result()
File "/home/dina/.local/lib/python3.8/site-packages/submitit/core/utils.py", line 122, in result
self._result = self.function(*self.args, **self.kwargs)
File "/home/dina/CodeGen/codegen_sources/preprocessing/bpe_modes/fast_bpe_mode.py", line 53, in learn_bpe_file
assert (
AssertionError: failed to learn bpe on /home/dina/CodeGen/data/test_dataset/cpp-java-python.sa-cl.tok.shuf.50gb, command: /home/dina/CodeGen/codegen_sources/model/tools/fastBPE/fast learnbpe 50000 /home/dina/CodeGen/data/test_dataset/cpp-java-python.sa-cl.tok.shuf.50gb > /home/dina/CodeGen/data/test_dataset/cpp-java-python.sa-cl.codes


You can check full logs with 'job.stderr(0)' and 'job.stdout(0)'or at paths:

  • /home/dina/CodeGen/data/test_dataset/log/5615_0_log.err
  • /home/dina/CodeGen/data/test_dataset/log/5615_0_log.out
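
The assertion only reports that the fastBPE command failed, not why. As a first debugging step, here is a minimal sketch that prints the end of the two submitit log files listed above and runs a few basic checks on the inputs of the failing command. The helper names `tail` and `check_bpe_inputs` are hypothetical (they are not part of CodeGen or submitit), and the checks cover only a few possible causes, such as a fastBPE binary that was never compiled or an empty training file. They are guesses, not a diagnosis.

```python
import os
from pathlib import Path


def tail(path, n=20):
    """Return the last n lines of a text file, or [] if the file does not exist."""
    p = Path(path)
    if not p.is_file():
        return []
    with p.open(errors="replace") as f:
        return f.read().splitlines()[-n:]


def check_bpe_inputs(binary, corpus):
    """Return a list of readable problems with the fastBPE binary and training file.

    Hypothetical helper: it only tests for existence, permissions and size.
    """
    problems = []
    if not os.path.isfile(binary):
        problems.append(f"fastBPE binary not found: {binary}")
    elif not os.access(binary, os.X_OK):
        problems.append(f"fastBPE binary is not executable: {binary}")
    if not os.path.isfile(corpus):
        problems.append(f"training file not found: {corpus}")
    elif os.path.getsize(corpus) == 0:
        problems.append(f"training file is empty: {corpus}")
    return problems


if __name__ == "__main__":
    # Paths copied from the error message above.
    data = "/home/dina/CodeGen/data/test_dataset"
    binary = "/home/dina/CodeGen/codegen_sources/model/tools/fastBPE/fast"
    corpus = f"{data}/cpp-java-python.sa-cl.tok.shuf.50gb"

    for log in (f"{data}/log/5615_0_log.err", f"{data}/log/5615_0_log.out"):
        print(f"--- {log}")
        print("\n".join(tail(log)) or "(missing or empty)")

    for problem in check_bpe_inputs(binary, corpus):
        print("PROBLEM:", problem)
```

If neither check reports anything, running the `fast learnbpe` command from the assertion message by hand in a shell should show the underlying error directly.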