Missing HDFS.log_structured.csv file
alishan2040 opened this issue · 2 comments
Hello, I was trying to run deeplog on the HDFS dataset but ended up with an error about the missing `HDFS.log_structured.csv` file.
The command I used:
```
!python main_run.py --folder=bgl/ --log_file=HDFS.log --dataset_name=hdfs --model_name=deeplog --window_type=sliding \
    --sample=sliding_window --is_logkey --train_size=0.8 --train_ratio=1 --valid_ratio=0.1 --test_ratio=1 --max_epoch=100 \
    --n_warm_up_epoch=0 --n_epochs_stop=10 --batch_size=1024 --num_candidates=150 --history_size=10 --lr=0.001 \
    --accumulation_step=5 --session_level=hour --window_size=60 --step_size=60 --output_dir=experimental_results/demo/random/ --is_process
```
Are we supposed to run other scripts first to generate such files (for example `data_loader.py` or `synthesize.py`)?
Can we instead run the code on other publicly available formats of the HDFS dataset?
Thanks,
Did you solve this problem? How can we get this structured .csv file?
You can use logparser (available on GitHub) to preprocess the HDFS dataset; it will generate `HDFS.log_structured.csv` for you.
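For anyone unsure what logparser produces: `HDFS.log_structured.csv` is the raw log with each line split into its header fields plus a mined event template (constants kept, variables masked as `<*>`) and an event ID. Below is a minimal, self-contained sketch of that idea using only the standard library — the regexes, the `make_template` masking rules, and the `EventId` hashing here are simplified illustrations, not the actual Drain algorithm logparser implements:

```python
import csv
import hashlib
import io
import re

# Two sample lines in the public HDFS_v1 dataset format.
raw_lines = [
    "081109 203518 143 INFO dfs.DataNode$DataXceiver: Receiving block blk_-1608999687919862906 src: /10.250.19.102:54106 dest: /10.250.19.102:50010",
    "081109 203518 35 INFO dfs.FSNamesystem: BLOCK* NameSystem.allocateBlock: /mnt/hadoop/mapred/system/job_200811092030_0001/job.jar. blk_-1608999687919862906",
]

# HDFS header layout: <Date> <Time> <Pid> <Level> <Component>: <Content>
line_re = re.compile(r"^(\d+) (\d+) (\d+) (\w+) ([\w.$*]+): (.*)$")

def make_template(content: str) -> str:
    """Mask variable fields (block IDs, IP:port pairs, bare numbers) as <*>."""
    t = re.sub(r"blk_-?\d+", "<*>", content)
    t = re.sub(r"/?\d+\.\d+\.\d+\.\d+(:\d+)?", "<*>", t)
    t = re.sub(r"\b\d+\b", "<*>", t)
    return t

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["LineId", "Date", "Time", "Pid", "Level",
                 "Component", "Content", "EventId", "EventTemplate"])
for line_id, line in enumerate(raw_lines, start=1):
    m = line_re.match(line)
    if m is None:
        continue  # skip lines that do not match the header layout
    date, time, pid, level, component, content = m.groups()
    template = make_template(content)
    # Short hash of the template as a stable event ID (illustrative convention).
    event_id = "E" + hashlib.md5(template.encode()).hexdigest()[:8]
    writer.writerow([line_id, date, time, pid, level,
                     component, content, event_id, template])

structured_csv = buf.getvalue()
print(structured_csv)
```

The real tool (logpai/logparser, e.g. its Drain parser) learns the templates from the data instead of using hand-written masking rules, but the output columns are the same shape, which is what deeplog's preprocessing expects to read.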