Python modules implementing data preprocessing, classification, regression, NLP, and recommendation-engine algorithms
Data preprocessing: printing dimensions, printing unique values for each column, basic data profiling, basic missing-value analysis, removing columns that hold the same value in every row, treating missing values in numeric and categorical columns, min-max scaling, z-score scaling, label encoding, one-hot encoding
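
A few of the preprocessing steps listed above can be sketched with pandas. This is a minimal illustration, not the repository's actual code: the function name `preprocess`, the sample columns, and the choice of median/mode imputation are all assumptions made for the example.

```python
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop constant columns, impute, scale, one-hot encode."""
    out = df.copy()

    # Remove columns that hold a single value in every row.
    constant_cols = [c for c in out.columns if out[c].nunique(dropna=False) <= 1]
    out = out.drop(columns=constant_cols)

    numeric_cols = out.select_dtypes(include="number").columns
    categorical_cols = out.columns.difference(numeric_cols)

    # Assumed imputation strategy: median for numeric, mode for categorical.
    for c in numeric_cols:
        out[c] = out[c].fillna(out[c].median())
    for c in categorical_cols:
        out[c] = out[c].fillna(out[c].mode().iloc[0])

    # Min-max scale each numeric column to the range [0, 1].
    for c in numeric_cols:
        lo, hi = out[c].min(), out[c].max()
        out[c] = (out[c] - lo) / (hi - lo) if hi > lo else 0.0

    # One-hot encode the categorical columns.
    return pd.get_dummies(out, columns=list(categorical_cols))


# Illustrative data only.
raw = pd.DataFrame({
    "age": [20.0, None, 40.0, 30.0],
    "city": ["Paris", "Rome", None, "Paris"],
    "source": ["web", "web", "web", "web"],  # constant column, gets dropped
})
print("shape:", raw.shape)
print("missing values per column:", raw.isna().sum().to_dict())
clean = preprocess(raw)
print(clean)
```

Z-score scaling would replace the min-max step with `(out[c] - out[c].mean()) / out[c].std()`, and label encoding could use `out[c].astype("category").cat.codes` in place of `pd.get_dummies`.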
Classification: train-test split, k-fold cross-validation, variable importance plot, XGBoost, LightGBM, Extra Trees, Random Forest, Logistic Regression, Decision Tree, K-Nearest Neighbours
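
As a rough sketch of the classification workflow listed above, the following uses scikit-learn on synthetic data. The dataset, the hyperparameters, and the choice of Random Forest as the example model are assumptions; the repository's own wrappers may look different.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data stands in for a real dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, random_state=0
)

# Train-test split, stratified so both sets keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=50, random_state=0)

# 5-fold cross-validation on the training set only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print("CV accuracy per fold:", cv_scores.round(3))

# Fit on the full training set and evaluate on the held-out test set.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print("test accuracy:", round(test_accuracy, 3))

# Variable importance, ranked from most to least important.
ranked = sorted(
    enumerate(model.feature_importances_), key=lambda p: p[1], reverse=True
)
for idx, importance in ranked:
    print(f"feature_{idx}: {importance:.3f}")
```

The ranked importances are what a variable importance plot would display, for example as a horizontal bar chart. Swapping in another estimator from the list only changes the `model = ...` line, although linear and nearest-neighbour models do not expose `feature_importances_`.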
Regression: train-test split, k-fold cross-validation, variable importance plot, XGBoost, LightGBM, Extra Trees, Random Forest, Linear Regression, Decision Tree, K-Nearest Neighbours
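
For the regression side, an explicit k-fold loop might look like the sketch below. The synthetic data (generated from known coefficients 3 and -2), the RMSE metric, and the use of `LinearRegression` as the example model are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic data with known coefficients (assumption): y = 3*x0 - 2*x1 + noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Manual 5-fold cross-validation, recording the RMSE of each validation fold.
fold_rmse = []
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, val_idx in kfold.split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[val_idx])
    fold_rmse.append(float(np.sqrt(mean_squared_error(y[val_idx], pred))))

print("RMSE per fold:", [round(r, 3) for r in fold_rmse])

# Refit on all data once the cross-validated error looks acceptable.
final_model = LinearRegression().fit(X, y)
print("coefficients:", final_model.coef_.round(2))
```

Because the data were generated from known coefficients, the fitted values should land close to 3 and -2, which gives a quick sanity check on the loop.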
NLP: word n-grams, word cloud creation, tokenization, lowercasing, punctuation removal, stopword removal, stemming, lemmatization, TF-IDF vectors, count vectors
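
The text-cleaning and vectorization steps above can be illustrated with the standard library alone. This is a simplified sketch: the stopword list is a tiny hypothetical one, the function names are made up for the example, and the TF-IDF weighting is the plain `tf * log(N / df)` form, which differs from the smoothed variant that libraries such as scikit-learn use. Stemming and lemmatization are left out because they need an external library such as NLTK.

```python
import math
import string
from collections import Counter

# Tiny illustrative stopword list (assumption); real lists are much longer.
STOPWORDS = {"the", "a", "is", "on", "and"}


def clean_tokens(text):
    """Lowercase, strip punctuation, split on whitespace, drop stopwords."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [t for t in text.split() if t not in STOPWORDS]


def word_ngrams(tokens, n):
    """Return all contiguous n-token sequences joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [clean_tokens(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)  # the raw count vector for this document
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


docs = ["The cat sat on the mat.", "The dog sat!", "A cat and a dog."]
tokens = clean_tokens(docs[0])
bigrams = word_ngrams(tokens, 2)
vectors = tfidf(docs)
print(tokens)
print(bigrams)
print(vectors[0])
```

In the first document, "mat" receives a higher weight than "cat" because it appears in only one of the three documents.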
Recommendation engine: creating the interaction matrix, creating the user dictionary, creating the item dictionary, running a matrix-factorization algorithm, producing user recommendations, producing a list of the top N interested users for a given item, creating the item-item distance embedding matrix, creating item-item recommendations
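
The recommendation steps above can be sketched end to end with NumPy. Note the substitution: this example factorizes the interaction matrix with a truncated SVD as a stand-in for whatever matrix-factorization library the modules actually use. The toy interactions, the function names, and the rank of 2 are all assumptions.

```python
import numpy as np

# Toy (user, item) interactions (assumption).
interactions = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "D"), ("u3", "E"),
    ("u4", "D"),
]

# User and item dictionaries mapping ids to matrix indices.
users = sorted({u for u, _ in interactions})
items = sorted({i for _, i in interactions})
user_index = {u: k for k, u in enumerate(users)}
item_index = {i: k for k, i in enumerate(items)}

# Binary interaction matrix: rows are users, columns are items.
R = np.zeros((len(users), len(items)))
for u, i in interactions:
    R[user_index[u], item_index[i]] = 1.0

# Matrix factorization via truncated SVD (stand-in technique).
rank = 2
U, s, Vt = np.linalg.svd(R, full_matrices=False)
user_factors = U[:, :rank] * s[:rank]
item_factors = Vt[:rank].T
scores = user_factors @ item_factors.T


def recommend_items(user, n=1):
    """Top-n unseen items for a user, ranked by predicted score."""
    row = scores[user_index[user]].copy()
    row[R[user_index[user]] > 0] = -np.inf  # hide items already seen
    return [items[k] for k in np.argsort(-row)[:n]]


def top_users(item, n=1):
    """Top-n users most likely to be interested in an item they have not seen."""
    col = scores[:, item_index[item]].copy()
    col[R[:, item_index[item]] > 0] = -np.inf
    return [users[k] for k in np.argsort(-col)[:n]]


def similar_items(item, n=2):
    """Top-n items by cosine similarity of their embeddings."""
    unit = item_factors / np.linalg.norm(item_factors, axis=1, keepdims=True)
    sims = unit @ unit[item_index[item]]
    sims[item_index[item]] = -np.inf  # exclude the item itself
    return [items[k] for k in np.argsort(-sims)[:n]]


print(recommend_items("u1"), top_users("C"), similar_items("A"))
```

With this toy data the users fall into two disjoint groups, so u1 is pointed at item C (which the similar user u2 has interacted with) and the items most similar to A are B and C.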