GML

Auto Data Science - Python Library.

License: MIT

GML Brain+Machine Adding AI Revolution


Creators

Muhammad Ahmed
Naman Tuli

Contributors

Mehran Kamal
Rafey Iqbal Rahman

Tired of doing Data Science manually? GML is here for you!

GML is an automatic data science library in Python, built on top of multiple Python packages. The features it offers are listed in the Features section below.


Installation:


pip install GML

https://pypi.org/project/GML
If you run into a PyTorch-related issue during installation, refer to the solution in #6 (comment).

Features:


Auto Feature Engineering



from GML import FeatureEngineering

# Use feateng_steps=0 for feature selection without feature creation
fe = FeatureEngineering(Data, 'target', fill_missing_data=True, encode_data=True,
                        normalize=True, remove_outliers=True,
                        new_features=True, feateng_steps=2)

X_new, y, test = fe.get_new_data()

Click Here for complete DEMO
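
To make the flag names above more concrete, here is a minimal plain-Python sketch of three of the steps they suggest: filling missing values, removing outliers, and normalizing a numeric column. It is an illustration under assumptions, not GML's implementation. The helper names (`fill_missing`, `remove_outliers`, `min_max_normalize`) are hypothetical, and median imputation, the 1.5 × IQR rule, and min-max scaling are choices made only for this example.

```python
from statistics import median, quantiles


def fill_missing(values):
    """Replace None entries with the median of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def remove_outliers(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def min_max_normalize(values):
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy column with one missing value and one implausible entry
ages = [22, 25, None, 31, 28, 24, 120]

filled = fill_missing(ages)
trimmed = remove_outliers(filled)
scaled = min_max_normalize(trimmed)
```

On this toy column the missing entry is filled with the median of the observed ages, the value 120 falls outside the IQR fence and is dropped, and the remaining values are rescaled to [0, 1].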


Auto EDA (Powered by Sweetviz)



from GML import sweetviz

result1 = sweetviz.compare([train,'train'],[test,'test'],'target') 
result2 = sweetviz.analyze([train,'train'])

result1.show_html()
result2.show_html()

Click Here for complete DEMO


Auto Machine Learning



from sklearn.metrics import accuracy_score  # assumed source of the metric passed below

from GML import AutoML

gml_ml = AutoML()

gml_ml.GMLClassifier(X, y, metric = accuracy_score, folds = 10)

Click Here for complete DEMO
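
The `folds` argument suggests k-fold cross-validation, although the source does not describe how `GMLClassifier` uses it. As a rough sketch of the general technique only, the plain-Python example below splits a label list into folds, scores a toy majority-class "model" on each held-out fold, and averages the scores. All names here (`k_fold_indices`, `accuracy`, `majority_class`) are hypothetical and are not part of GML.

```python
def k_fold_indices(n_samples, folds):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    base, extra = divmod(n_samples, folds)
    start = 0
    for i in range(folds):
        # The first `extra` folds take one additional sample
        size = base + (1 if i < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_class(labels):
    """Toy stand-in for a model: always predict the most common training label."""
    return max(set(labels), key=labels.count)


labels = [0, 1, 1, 1, 0, 1, 1, 0, 1, 1]

scores = []
for train_idx, test_idx in k_fold_indices(len(labels), folds=5):
    prediction = majority_class([labels[i] for i in train_idx])
    held_out = [labels[i] for i in test_idx]
    scores.append(accuracy(held_out, [prediction] * len(held_out)))

mean_score = sum(scores) / len(scores)
```

Averaging over folds gives a steadier estimate than a single train/test split, which is presumably why a fold count is exposed as a parameter.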

Auto Text Cleaning



from GML import AutoNLP

nlp = AutoNLP()

cleanX = X.apply(lambda x: nlp.clean(x))

Click Here for complete DEMO
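
The source does not say what `nlp.clean` does internally. To give a rough idea of what a text-cleaning step typically involves, here is a small standard-library sketch. The function `basic_clean` and its rules (unescape HTML entities, drop tags and URLs, lowercase, strip punctuation, collapse whitespace) are assumptions for illustration, not GML's behavior.

```python
import html
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")
SPACE_RE = re.compile(r"\s+")


def basic_clean(text):
    """Hypothetical cleaner: normalize a raw string into lowercase words."""
    text = html.unescape(text)          # turn entities such as &amp; into characters
    text = TAG_RE.sub(" ", text)        # drop HTML tags
    text = URL_RE.sub(" ", text)        # drop URLs
    text = text.lower()
    text = NON_ALNUM_RE.sub(" ", text)  # drop punctuation and symbols
    return SPACE_RE.sub(" ", text).strip()


sample = "<p>Check THIS out:   https://example.com &amp; tell me!!</p>"
cleaned = basic_clean(sample)
```

A function like this can be passed to `apply` on a pandas Series in the same way as the `nlp.clean` call above, for example `X.apply(basic_clean)`.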


Auto Text Classification using transformers



from GML import AutoNLP

nlp = AutoNLP()

nlp.set_params(cleanX, tokenizer_name='roberta-large-mnli', BATCH_SIZE=4,
               model_name='roberta-large-mnli', MAX_LEN=200)

model = nlp.train_model(tokenizedX, y)

Click Here for complete DEMO


Auto Image Classification with Augmentation



from GML import Auto_Image_Processing

gml_image_processing = Auto_Image_Processing()

model = gml_image_processing.imgClassificationcsv(img_path = './covid_image_data/train',
                                                  train_path = './covid_image_data/Training_set_covid.csv',
                                                  model_list = models,
                                                  tfms = True, advance_augmentation = True,
                                                  epochs=1)

Click Here for complete DEMO


Text Augmentation using transformers: GPT-2



from GML import AutoNLP

nlp = AutoNLP()

nlp.augmentation_train('./data.csv')

nlp.set_params(X['Text'])

new_Text = nlp.augmentation_generate(y = y, SENTENCES = 100) 

Click Here for complete DEMO



More features, including support for other data types such as audio, will be added in the future.
Suggestions, bug reports, and contributions are welcome.