A Python machine learning project to predict the gender of a given name.
Gender prediction based on a person's name is a common natural language processing (NLP) task. This project uses machine learning to predict whether a given name is male or female. It includes a pre-trained model that can be used for predictions and serves as a starting point for similar NLP tasks.
- Name gender prediction based on machine learning.
- Uses TF-IDF vectorization and a Multinomial Naive Bayes (MultinomialNB) classifier.
- Pre-trained model for quick predictions.
- Easily customizable for your own dataset.
- Python 3.x
- pandas
- scikit-learn
- joblib (for saving and loading models)
- Clone the repository:
  git clone https://github.com/yourusername/name-gender-prediction.git
- Navigate to the project directory:
  cd name-gender-prediction
- Install the required Python packages (pandas, scikit-learn, and joblib).
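After installing, it can help to confirm that the dependencies are importable. The following is a minimal sketch and not part of the project: the helper name `missing_packages` is hypothetical, and the check assumes the import names `pandas`, `sklearn` (the import name of scikit-learn), and `joblib`.

```python
from importlib.util import find_spec

# Import names of the project's dependencies (scikit-learn is imported as "sklearn").
REQUIRED = ["pandas", "sklearn", "joblib"]


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```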
To predict the gender of a name using the pre-trained model:

    import joblib

    # Load the pre-trained model
    loaded_model = joblib.load('gender_predictor.pkl')

    # Example name for prediction; the model expects a list of lowercase strings
    new_name = "Alice"
    new_name = [new_name.lower()]

    # Use the loaded model for prediction
    predicted_gender = loaded_model.predict(new_name)

    # Display the predicted gender (0 for male, 1 for female)
    if predicted_gender[0] == 0:
        print("Predicted Gender: Male")
    else:
        print("Predicted Gender: Female")

Replace "Alice" with the name you want to predict.
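For repeated predictions, the steps in the usage example can be wrapped in a small helper. This is a sketch under assumptions: `predict_gender` is a hypothetical name, not a function provided by the project, and it assumes the loaded model accepts a list of lowercase strings and returns 0 for male and 1 for female, as the usage example states. Stripping surrounding whitespace is an added convenience, not something the original example does.

```python
# Label encoding as described in the usage example: 0 = male, 1 = female.
LABELS = {0: "Male", 1: "Female"}


def predict_gender(model, name):
    """Return "Male" or "Female" for a single name.

    `model` is any object with a scikit-learn style `predict` method
    that takes a list of strings and returns a sequence of 0/1 labels.
    """
    # Normalize the input the same way the usage example does (lowercase),
    # and additionally strip surrounding whitespace.
    cleaned = name.strip().lower()
    if not cleaned:
        raise ValueError("name must not be empty")
    label = int(model.predict([cleaned])[0])
    return LABELS[label]
```

With the model loaded through `joblib.load('gender_predictor.pkl')`, calling `predict_gender(loaded_model, "Alice")` returns the label as a string instead of printing it.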
If you want to retrain the model with your own dataset or fine-tune it:
- Prepare your dataset in CSV format with columns "Name" and "Gender".
- Replace 'name_dataset.csv' in the code with the path to your dataset.
- Run the training code and follow the instructions in the Model Training section of the README.
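The training code itself is not shown in this section, so the following is only a sketch of how a TF-IDF plus MultinomialNB pipeline could be retrained. Everything beyond the "Name" and "Gender" columns is an assumption made for illustration: the helper name `train_gender_model`, the character n-gram settings (`analyzer="char_wb"`, `ngram_range=(1, 3)`), the label strings "male" and "female", and the tiny inline DataFrame that stands in for your own CSV file.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def train_gender_model(df):
    """Fit a TF-IDF + MultinomialNB pipeline on "Name" and "Gender" columns."""
    names = df["Name"].str.lower()
    # Assumed encoding, matching the usage example: 0 = male, 1 = female.
    labels = df["Gender"].str.lower().map({"male": 0, "female": 1})
    if labels.isna().any():
        raise ValueError('Gender values must be "male" or "female"')
    model = make_pipeline(
        # Character n-grams are an assumption; they let short names share features.
        TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
        MultinomialNB(),
    )
    model.fit(names, labels.astype(int))
    return model


# Hypothetical stand-in data; in practice, load your own file instead:
# df = pd.read_csv("name_dataset.csv")
df = pd.DataFrame(
    {
        "Name": ["Alice", "Emma", "Olivia", "John", "Robert", "Michael"],
        "Gender": ["female", "female", "female", "male", "male", "male"],
    }
)
model = train_gender_model(df)

# To save the retrained model for later use:
# import joblib
# joblib.dump(model, "gender_predictor.pkl")
```

Six names are far too few for a useful model; the inline data only keeps the sketch self-contained.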
Contributions are welcome! If you want to contribute to this project, please follow the guidelines in CONTRIBUTING.md.
This project is licensed under the MIT License - see the LICENSE.md file for details.