Decision_tree_implementation: A Python repository from carlosrcoelho

This code is an implementation of a Decision Tree classifier from scratch using Python. Here's what each part of the code does:

Entropy Calculation (entropy function):
- Calculates the entropy of a given set of labels (y) using the formula for entropy in information theory.
Node Class:
- Represents nodes in the decision tree.
- Stores information about each node, including the feature to split on, threshold value, child nodes (left and right), and value (if the node is a leaf node).
DecisionTree Class:
- Initializes and builds the decision tree.
- Contains methods to fit the tree to the training data (fit), predict labels for new data (predict), and grow the tree recursively (_grow_tree).
fit Method:
- Grows the tree from the root node using the _grow_tree method.
_grow_tree Method:
- Recursively builds the decision tree based on conditions like maximum depth, minimum samples to split a node, and number of unique labels.
- Conducts a greedy search to find the best feature and threshold to split on, creating child nodes accordingly.
Other Helper Methods:
- _best_criteria: Determines the best feature and threshold to split on based on information gain.
- _information_gain: Calculates the information gain using entropy.
- _split: Splits the data based on a threshold value.
- _most_common_label: Finds the most common label in a given set.
predict Method:
- Uses the trained decision tree to predict labels for new data by traversing the tree.
_traverse_tree Method:
- Traverses the tree recursively to predict the label for a single data point.

This implementation constructs a decision tree by recursively choosing splits that maximize information gain, using entropy as a measure of impurity. The tree is then used for prediction based on the learned structure.

carlosrcoelho/Decision_tree_implementation