Human Text Vs AI-Generated Text

Model - distilbert/distilbert-base-uncased
Dataset - AI Vs Human Text from Kaggle, https://www.kaggle.com/datasets/shanegerami/ai-vs-human-text
Database - MongoDB
Web framework - Flask
Deployment - AWS EC2 server

Workflows

Data Ingestion - Data can be downloaded from a server or a database such as MySQL or MongoDB. Here, the data is downloaded from MongoDB.
Data Validation - Checks that the required data is available for further processing and training.
Data Transformation - Uses various tools to process the raw data and make it suitable for training.
Model Training - The tokenizer and model from 'distilbert/distilbert-base-uncased' are used for model training.
Model Evaluation - After training, the model is evaluated using the accuracy score from sklearn.metrics.
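
The validation and evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the field names `text` and `generated`, and the helper names `validate_records` and `accuracy`, are assumptions made for the example. The accuracy helper uses only the standard library and computes the same fraction of correct predictions that `accuracy_score` from sklearn reports by default.

```python
# Assumed field names for each record; the real schema may differ.
REQUIRED_FIELDS = ("text", "generated")


def validate_records(records):
    """Return True if there is at least one record and every record
    has a non-empty value for each required field."""
    if not records:
        return False
    return all(
        rec.get(field) not in (None, "")
        for rec in records
        for field in REQUIRED_FIELDS
    )


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    # Hypothetical records standing in for documents read from MongoDB.
    records = [
        {"text": "I wrote this essay myself.", "generated": 0},
        {"text": "This essay was produced by a model.", "generated": 1},
    ]
    print("valid:", validate_records(records))

    # Hypothetical true labels and model predictions (0 = human, 1 = AI).
    print("accuracy:", accuracy([0, 1, 1, 0], [0, 1, 0, 0]))
```

In the real pipeline the predictions would come from the trained DistilBERT classifier rather than a hard-coded list.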

After evaluation, the model is deployed on an AWS EC2 instance running Ubuntu. Deployment uses a CI/CD pipeline built with GitHub Actions and Docker.

STEPS FOR DEPLOYMENT -