Pinned Repositories
Adding-and-Dropping-columns
Add column, drop column, create a new column, create a where function for column outcomes, combine conditions and rules with select
Adding-text-box-to-data
Add text to data, data point call out, annotate, textcoords, call outs, xytext, bounding box, fill color, edge color, arrowstyle, arrow props
Advanced-K-Means-Clustering
Create data, merge data with inner join, drop rows with non food names, group by, pivot table, missing values, normalize, WCSS to find k value, instantiate and fit model, cluster profile breakdown
Advanced-linear-regression
Pickle file, shuffle data, missing values, drop values, outliers, box plot, split, categorical variables and one hot encoder, feature selection and RFECV, optimal number of features plot, R-squared, cross validation scores, adjusted R-squared, model coefficients table and intercept, final interpretation
Aggregating_Data
Merging, value counts, sum, groupby, quantile, reset_index, sum and mean for multiple columns, aggregate, specify aggregate items for which columns
Airlines-Data-Set-
Running size, column view, replace NaN, data shape, group, and other related functions on the data set.
ANN-with-Desc-null-balance-bootstrap-plots-seab-label-enc-corr-compile-early-stop-hyppar.
ANN with Desc, null, balance (bootstrap), plots, seab, label enc, corr, compile, early stop, hyppar.
Apply-Map-Reduce
Map different dictionary keys to values, run replace, write a function and view length, use function to code column, apply changes to data frame, rows and columns
NLP-index-tokenize-sentence-detection-NER-POS-chunking-tagging-parsing-vectors-similarities.
NLP index, tokenize, sentence detection, NER, POS, chunking, tagging, parsing, vectors, similarities, and pipelines with spaCy words.
Use-of-correlations-heatmap-plotted-logistic-regression-chart.
Use of correlations, heatmap, plotted logistic regression chart in examination of insurance data.
aphd87's Repositories
aphd87/Advanced-K-Means-Clustering
Create data, merge data with inner join, drop rows with non food names, group by, pivot table, missing values, normalize, WCSS to find k value, instantiate and fit model, cluster profile breakdown
aphd87/Advanced-linear-regression
Pickle file, shuffle data, missing values, drop values, outliers, box plot, split, categorical variables and one hot encoder, feature selection and RFECV, optimal number of features plot, R-squared, cross validation scores, adjusted R-squared, model coefficients table and intercept, final interpretation
aphd87/Artificial-neural-network
Artificial neural network, feature scaling, network architecture, compile, summary, training network, plot loss accuracy, make predictions on new data
aphd87/Bank-Churn-Project
Test file, train file, submission file, describe with highlights, outliers, missing values, plots, XGB, RF, GB, CatBoost, column transformers, NN, test data flattened, processed into submission file
aphd87/Basic-regression
Split, instantiate, train, r2 score for accuracy
aphd87/Causal-Impact-Analysis
Import data tables, aggregate by group by, merge, pivot table, frequency, change column order, rename columns, apply causal impact, print out causal impact report
aphd87/Classification-Tree
Classification tree, split, instantiate, train, assess model accuracy, plot decision tree
aphd87/Classification-Tree-Advanced
Classification tree, shuffle data, class balance, missing values, model shape, categorical variables, model training, decision tree plot, max_depth, max_accuracy, model assessment, y_probabilities, confusion_matrix, accuracy_score, recall, F1 score, precision
aphd87/CNN-Fruit-Project-Keras-Tuner
Architecture tuning, random search, tuner search, tuner results, hyperparameters, best models, sequential, plots, accuracy, class probabilities, matrix, accuracy scores
aphd87/Creating-our-regression
Combine excel databases and sheets, merge files, left join, groupby and aggregate new variables, inner join, pickle dump files
aphd87/dsi-streamlit-web-app
aphd87/Fruit-classification-augmentation
Data augmentation to improve accuracy, shape of class probs, class probs, predicted class, predicted label, predictions, test accuracy, confusion matrix
aphd87/Fruit-classification-dropout
Training and validating data, network architecture with layers added, training parameters, plot learning, predict class prob, class index, label, accuracy, and confusion matrix
aphd87/Grid-Search
Grid Search, hyperparameter optimization, get best cv score, best parameters, best estimator, create optimal model object
aphd87/Image-search-engine
VGG16 architecture, pre-trained model, model summary, preprocess image, featurise image, pass in images, pickle dump, load file, search results, cosine scores, image distances, image indices, print out
aphd87/K-Means-Clustering
Plotting cluster, instantiate cluster, adding labels to cluster, value counts, cluster centroid locations, plotting clusters and centroids
aphd87/KNN-Basic-Classification
Train test split, instantiate, basic cluster analysis and accuracy score
aphd87/KNN-cluster-analysis-
Data model, shuffle data, class balance, missing values, model shape, outliers, splits, categorical variables, feature scaling, feature selection, confusion matrix, F1, accuracy, precision, recall, optimal value of k
aphd87/Logistic-Regression-Advanced
Pickle upload, drop data, shuffle, class balance, value counts, missing values, shape, outliers, categorical, feature selection, model training, data probabilities, probability of class, confusion matrix, acc, precision, recall, f1, ROC AUC, optimal threshold, plotting
aphd87/Logistic-Regression-Basic
Logistic regression, train test split, instantiate, train, assess with test, individual probabilities, plt seaborn, confusion matrix
aphd87/PCA
PCA, drop column, shuffle data, class balance, missing values, shape, split variables, feature scaling, instantiate PCA, explained variance, plot variance explained, plot cumulative variance, accuracy score
aphd87/Pipelines
Pipelines, specify numeric and categorical features, transform numeric and categorical features, preprocess, logistic regression, random forest, save pipeline, import pipeline and predict on new data, pass new data and receive predictions
aphd87/Predicting-Missing-Loyalty-Scores
Pickle load, drop columns, drop missing values, apply one hot encoding, concatenate, make loyalty customer predictions
aphd87/Random-Forest-Classification
Random forest, train test split, instantiate, train, accuracy, notes around parameters we can use
aphd87/Random-Forest-Classification-Advanced
Pickle file, drop column, shuffle data, class balance, value counts, dealing with missing values, shape, split, dealing with categorical variables, model training, model assessment, class probabilities, confusion matrix, accuracy, precision, recall, f1 score, feature importance, permutation importance
aphd87/Random_Forest_Regression_Advanced
Pickle file, data drop, shuffle data, missing values, drop data, outlier investigation, split, one hot encoding, R2, cross validation, adjusted R2, feature importance, permutation importance, examine one customer observation, predictions under the hood
aphd87/Regression-Tree-Advanced
Pickle load, column drop, missing values, outliers, test train split, one hot encoding, R square, cross validation, overfitting, max_depth, accuracy score, optimal score, decision tree
aphd87/streamlit
2019 Deloitte data set with Streamlit web app
aphd87/Streamlit-2024-Project
Streamlit dashboard that uses Deloitte media survey data to help predict who will have a streaming service subscription.
aphd87/Transferred-learning-via-vgg-frozen-layers-accuracy-matrix-and-matrix-percentages
Transfer learning with CNN