This project aims to build a comprehensive machine learning and deep learning framework for text mining, covering text classification, text matching, text generation, and information extraction, among other tasks.
- Use show_json_data to briefly review the data:
python3 Data_Processor.py --phase show_json_data
- Use extract_abs_label to extract input and output from the data:
python3 Data_Processor.py --phase extract_abs_label
- Use save_abs_label to save the cleaned input and output to the clean data path:
python3 Data_Processor.py --phase save_abs_label
- Use split_data to split the cleaned data into N folds:
python3 Data_Processor.py --phase split_data+aapr.dl.mlp.norm
- Use get_vocab to build the vocabulary (word dictionary) from the dataset:
python3 Data_Processor.py --phase get_vocab+aapr.dl.mlp.norm
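The last two steps above (splitting into N folds and building a vocabulary) can be pictured with the minimal sketch below. It is not the code in Data_Processor.py: the function names `split_into_folds` and `build_vocab`, the whitespace tokenizer, and the `<pad>`/`<unk>` special tokens are all assumptions made for illustration.

```python
import random
from collections import Counter


def split_into_folds(samples, n_folds, seed=0):
    """Shuffle the samples and deal them round-robin into n_folds lists.

    Hypothetical helper; the project's split_data phase may work differently.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


def build_vocab(texts, min_count=1, specials=("<pad>", "<unk>")):
    """Map each whitespace token seen at least min_count times to an integer id.

    Hypothetical helper; the special tokens are an assumption, not project facts.
    """
    counts = Counter(token for text in texts for token in text.split())
    vocab = {token: idx for idx, token in enumerate(specials)}
    # Sort by descending frequency, then alphabetically, so ids are deterministic.
    for token, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count and token not in vocab:
            vocab[token] = len(vocab)
    return vocab


# Toy data, invented for this example only.
texts = ["deep learning for text", "text matching and text generation"]
folds = split_into_folds(range(10), n_folds=5)
vocab = build_vocab(texts)
```

Each fold can then serve once as the held-out set while the remaining folds are used for training, and unseen tokens can be mapped to the `<unk>` id at lookup time.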
python3 main.py --phase aapr.dl.mlp.norm > aapr.dl.mlp.norm.log
python3 main.py --phase aapr.dl.textcnn.norm > aapr.dl.textcnn.norm.log
python3 main.py --phase aapr.ml.lr.tf > aapr.ml.lr.tf.log
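The three main.py commands above follow one pattern: the phase name is passed to `--phase` and standard output is redirected to `<phase>.log`. A minimal sketch of running them in sequence from Python is shown below. The helper names `build_command` and `run_phase` are hypothetical and not part of this project; only the command line and log file naming come from the commands above.

```python
import subprocess
from pathlib import Path

# Phase names taken from the commands above.
PHASES = ["aapr.dl.mlp.norm", "aapr.dl.textcnn.norm", "aapr.ml.lr.tf"]


def build_command(phase, python="python3", script="main.py"):
    """Return the argument list equivalent to `python3 main.py --phase <phase>`."""
    return [python, script, "--phase", phase]


def run_phase(phase, log_dir="."):
    """Run one phase, writing its standard output to <log_dir>/<phase>.log.

    Hypothetical wrapper; returns the exit code of the child process.
    """
    log_path = Path(log_dir) / f"{phase}.log"
    with open(log_path, "w") as log_file:
        completed = subprocess.run(build_command(phase), stdout=log_file, check=False)
    return completed.returncode


# Building the commands does not execute anything.
commands = [build_command(phase) for phase in PHASES]
```

Calling `run_phase(phase)` for each entry in `PHASES` from the repository root would reproduce the three commands, assuming main.py is in the current directory.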