A simple implementation of decision trees with pruning, including working examples on two datasets for demonstration. All scripts are written in Python 3.5.
decision_tree.py: Implements a decision tree that uses information gain as the criterion for selecting the best attribute.
decision_tree_gain_ratio.py: Implements a decision tree that uses gain ratio as the selection criterion. (Both criteria are sketched in the code example after this file list.)
data_manip.py: Helper functions for creating data partitions, computing accuracy, etc. (illustrated in the cross-validation sketch after the run instructions below).
q1_run_decision_tree_on_lenses.py: Answer to question 1 (please refer to Decision_tree_problems).
q2_run_decision_tree_on_other_dataset.py: Answer to question 2.
run_decision_tree_on_other_dataset_basic_demo.py: Run it to get an idea of the inner procedures.
run_decision_tree_on_other_dataset_using_gain_ratio.py: Answer to question 2 using gain ratio as the selection criterion.
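For reference, here is a minimal sketch of the two selection criteria these scripts implement. The function names below are illustrative only, not the actual identifiers used in decision_tree.py or decision_tree_gain_ratio.py:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(labels, attribute_values):
    """Reduction in entropy from splitting the examples on one attribute.

    labels[i] is the class of example i; attribute_values[i] is that
    example's value for the attribute being evaluated.
    """
    total = len(labels)
    remainder = 0.0
    for value in set(attribute_values):
        subset = [l for l, v in zip(labels, attribute_values) if v == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder

def gain_ratio(labels, attribute_values):
    """Information gain normalised by the attribute's own entropy
    (its split information), which penalises many-valued attributes."""
    split_info = entropy(attribute_values)
    if split_info == 0.0:
        return 0.0  # attribute takes a single value; useless for splitting
    return information_gain(labels, attribute_values) / split_info

if __name__ == "__main__":
    labels = ["yes", "yes", "yes", "no"]
    values = ["a", "a", "b", "b"]
    print("gain = {:.3f}, gain ratio = {:.3f}".format(
        information_gain(labels, values), gain_ratio(labels, values)))
```

Both criteria pick the attribute with the highest score; gain ratio divides information gain by the split information so that attributes with many distinct values are not automatically favoured.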
- Keep all the files in the same folder (do not move any of them; the modules are interdependent).
- Run q1_run_decision_tree_on_lenses.py for question 1. 5-fold cross-validation is used (a sketch of the partitioning follows this list). A node ID such as "{2}_{3} Tear_production" means the node is the 3rd node at level 2 and its best attribute is "Tear_production".
- Run q2_run_decision_tree_on_other_dataset.py for question 2. The pruning parameters used are L = [15, 20] and K = [15, 20, 30, 40, 50], giving 10 (L, K) combinations (enumerated in the second sketch below).
- Run run_decision_tree_on_other_dataset_using_gain_ratio.py to solve question 2 with the gain-ratio-based decision tree.
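The helpers in data_manip.py are not reproduced here; the sketch below shows what 5-fold partitioning and accuracy computation typically look like. Every name in it (k_fold_partitions, accuracy, cross_validate, train_fn, predict_fn, labels_of) is a hypothetical placeholder, not the repository's actual API:

```python
import random

def k_fold_partitions(examples, k=5, seed=0):
    """Shuffle the examples and deal them into k nearly equal folds."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def accuracy(predictions, truths):
    """Fraction of predictions that match the true labels."""
    return sum(1 for p, t in zip(predictions, truths) if p == t) / len(truths)

def cross_validate(examples, labels_of, train_fn, predict_fn, k=5):
    """Average test accuracy over k train/test splits.

    train_fn builds a tree from a list of training examples, predict_fn
    classifies one example with that tree, and labels_of extracts the
    true labels. All three stand in for the real routines in this repo.
    """
    folds = k_fold_partitions(examples, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = [e for j in range(k) if j != i for e in folds[j]]
        tree = train_fn(train)
        predictions = [predict_fn(tree, e) for e in test]
        scores.append(accuracy(predictions, labels_of(test)))
    return sum(scores) / k
```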
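The 10 (L, K) combinations mentioned above can be enumerated with itertools.product; the prune call inside the loop is a hypothetical stand-in for whatever pruning routine q2_run_decision_tree_on_other_dataset.py actually invokes:

```python
from itertools import product

L_values = [15, 20]
K_values = [15, 20, 30, 40, 50]

# len(L_values) * len(K_values) = 2 * 5 = 10 combinations
for L, K in product(L_values, K_values):
    print("pruning with L = {}, K = {}".format(L, K))
    # pruned_tree = prune(tree, L, K)  # hypothetical call, for illustration only
```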
This project is licensed under the MIT License - see the LICENSE.md file for details