TextPredict

Advanced Text Classification with Transformer Models

TextPredict is a powerful Python package designed for various text analysis and prediction tasks using advanced NLP models. It simplifies the process of performing sentiment analysis, emotion detection, zero-shot classification, named entity recognition (NER), and more. Built on top of Hugging Face's Transformers, TextPredict allows seamless integration with pre-trained models or custom models for specific tasks.

Features

  • Sentiment Analysis: Determine the sentiment of text (positive, negative, neutral).
  • Emotion Detection: Identify emotions such as happiness, sadness, anger, etc.
  • Zero-Shot Classification: Classify text into custom categories without additional training.
  • Named Entity Recognition (NER): Extract entities like names, locations, and organizations from text.
  • Sequence Classification: Fine-tune models for custom classification tasks.
  • Token Classification: Classify tokens within text for tasks like NER.
  • Sequence-to-Sequence (Seq2Seq): Perform tasks like translation and summarization.
  • Model Comparison: Evaluate and compare multiple models on the same dataset.
  • Explainability: Understand model predictions through feature importance analysis.
  • Text Cleaning: Utilize utility functions for preprocessing text data.

Supported Tasks

  • Sentiment Analysis
  • Emotion Detection
  • Zero-Shot Classification
  • Named Entity Recognition (NER)
  • Sequence Classification
  • Token Classification
  • Sequence-to-Sequence (Seq2Seq)

Installation

You can install the package via pip:

pip install textpredict

Quick Start

Initialization and Simple Prediction

Initialize the TextPredict model and perform simple predictions:

from textpredict import initialize

# Initialize for sentiment analysis.
# Supported tasks: "sentiment", "ner", "zeroshot", "emotion",
# "sequence_classification", "token_classification", "seq2seq", etc.

text = "I hate this product!"  # e.g. "I love this product!" or "I hate this product!"

model = initialize("sentiment")
result = model.analyze(text)
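
The task list in the comment above also includes "zeroshot". The snippet below is a hypothetical sketch of zero-shot classification: the candidate_labels argument is an assumption rather than a documented signature, so check the documentation for the exact call.

zs_model = initialize("zeroshot")
zs_result = zs_model.analyze(
    "The new phone has an impressive camera",
    candidate_labels=["technology", "sports", "politics"],  # assumed parameter name
)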

Using Pre-trained Models from Hugging Face

Utilize a specific pre-trained model from Hugging Face:

model = initialize("emotion", model_name="AnkitAI/reviews-roberta-base-sentiment-analysis", source="huggingface")
result = model.analyze(text)

Using Models from Local Directory

Load and use a model from a local directory:

model = initialize("ner", model_name="./results", source="local")
result = model.analyze(text, return_probs=True)  # with return_probs=True, the result includes labels, scores, and probabilities

Training a Model

Train a model for sequence classification:

from textpredict import SequenceClassificationTrainer
from datasets import load_dataset

# Load dataset
train_data = load_dataset("imdb", split="train")
val_data = load_dataset("imdb", split="test")

# Initialize and train the model
trainer = SequenceClassificationTrainer(
    model_name="bert-base-uncased",
    output_dir="./results",
    train_dataset=train_data,
    val_dataset=val_data,
)
trainer.train()
trainer.save()
metrics = trainer.evaluate(test_dataset=val_data)
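
After training, the saved model in ./results can be reloaded for inference with the same initialize API shown earlier. This is a minimal sketch reusing the calls from the snippets above; the task name "sequence_classification" is assumed from the task list:

from textpredict import initialize

# Load the fine-tuned model from the local output directory
model = initialize("sequence_classification", model_name="./results", source="local")
result = model.analyze("This movie was a pleasant surprise!", return_probs=True)
print(result)  # labels, scores, and probabilities, as described above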

For detailed examples, refer to the examples directory.

Explainability and Feature Importance

Understand model predictions with feature importance:

from textpredict import Explainability

text = "I love this product!"
explainer = Explainability(model_name="bert-base-uncased", task="sentiment", device="cpu")
importance = explainer.feature_importance(text)
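
The returned importance scores can be inspected directly; the exact structure of the output (for example, per-token scores) depends on the library version, so treat this as a sketch:

# Inspect the computed feature importance; output format may vary by version
print(importance)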

For more features and examples, see the documentation.

Documentation

For detailed documentation, please refer to the TextPredict Documentation.

Contributing

Contributions are welcome! Please read our Contributing Guidelines before making a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Credits

This project leverages the Transformers library by Hugging Face. We extend our gratitude to the Hugging Face team and to its developers and contributors for their work in creating and maintaining such a valuable resource for the NLP community.

Links