LogIntelligence/LogADEmpirical

Missing HDFS.log_structured.csv file

alishan2040 opened this issue · 2 comments

Hello, I was trying to run DeepLog on the HDFS dataset but ended up with the following error.

[screenshot of the error reporting the missing HDFS.log_structured.csv file]

The command I used:

!python main_run.py --folder=bgl/ --log_file=HDFS.log --dataset_name=hdfs --model_name=deeplog --window_type=sliding \
  --sample=sliding_window --is_logkey --train_size=0.8 --train_ratio=1 --valid_ratio=0.1 --test_ratio=1 --max_epoch=100 \
  --n_warm_up_epoch=0 --n_epochs_stop=10 --batch_size=1024 --num_candidates=150 --history_size=10 --lr=0.001 \
  --accumulation_step=5 --session_level=hour --window_size=60 --step_size=60 --output_dir=experimental_results/demo/random/ --is_process

Are we supposed to run other scripts first to generate these files (for example, data_loader.py or synthesize.py)?
Can we run the code with other publicly available formats of the HDFS dataset?
Thanks,

Did you solve this problem? How can we get this structured .csv file?

You can use logparser (which can be found on GitHub) to preprocess the HDFS dataset; it will generate HDFS.log_structured.csv.
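
In case it helps others, here is a minimal sketch of how the Drain parser from the logpai/logparser project can be used to produce HDFS.log_structured.csv. The input/output directory paths are placeholders for your own layout, and the import line differs between the PyPI logparser3 package and older clones of the repo, so adjust it to the version you have installed. The log format and regexes below follow the HDFS demo shipped with logparser.

# Sketch assuming the logparser3 package ("pip install logparser3").
# For older clones of logpai/logparser use: from logparser import Drain
# and then Drain.LogParser(...).
from logparser.Drain import LogParser

input_dir = './logs/HDFS/'          # placeholder: directory containing HDFS.log
output_dir = './parsed/'            # placeholder: where the *_structured.csv is written
log_file = 'HDFS.log'
log_format = '<Date> <Time> <Pid> <Level> <Component>: <Content>'

# Regexes for variable parts (block ids, IP addresses, numbers), as in the HDFS demo
regex = [
    r'blk_(|-)[0-9]+',
    r'(/|)([0-9]+\.){3}[0-9]+(:[0-9]+|:)(\d+|)',
    r'(?<=[^A-Za-z0-9])(\-?\+?\d+)(?=[^A-Za-z0-9])|[0-9]+$',
]
st = 0.5    # similarity threshold
depth = 4   # depth of the parse tree

parser = LogParser(log_format, indir=input_dir, outdir=output_dir, depth=depth, st=st, rex=regex)
parser.parse(log_file)
# Writes HDFS.log_structured.csv and HDFS.log_templates.csv into output_dir.

After this finishes, point --output_dir (or wherever the repo expects the preprocessed data) at the directory containing HDFS.log_structured.csv before running main_run.py.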