Pinned Repositories
.Movie-Recommendaton-System-using-Machine-Learning
Built a content-based movie recommender system using cosine similarity, where the recommendations are based on item metadata (e.g., for movies, products, or songs). The underlying idea is that when a user likes an item, similar items are then recommended.
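A minimal sketch of content-based recommendation with cosine similarity, not the repository's actual code. The movie titles and the genre-flag vectors in `MOVIES` are hypothetical, as are the function names `cosine_similarity` and `recommend`.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)

# Hypothetical metadata: one vector per movie,
# columns = [action, comedy, sci-fi, romance]
MOVIES = {
    "Movie A": [1, 0, 1, 0],
    "Movie B": [1, 0, 1, 1],
    "Movie C": [0, 1, 0, 1],
    "Movie D": [1, 0, 0, 0],
}

def recommend(liked_title, movies, top_n=2):
    """Return the top_n titles most similar to the liked one."""
    target = movies[liked_title]
    scored = [
        (title, cosine_similarity(target, vector))
        for title, vector in movies.items()
        if title != liked_title
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [title for title, _ in scored[:top_n]]

print(recommend("Movie A", MOVIES))
```

In practice the vectors would usually come from text metadata (for example, bag-of-words counts over genres, cast, and keywords) rather than hand-written flags.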
Gradient-boosting
What is gradient boosting regression in machine learning? Gradient boosting regression calculates the difference between the current prediction and the known correct target value. This difference is called the residual. Gradient boosting regression then trains a weak model that maps features to that residual.
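The residual-fitting loop described above can be sketched as follows. This is an illustrative toy for one-dimensional inputs with squared-error loss, assuming a single-split "stump" as the weak model; the names `fit_stump` and `gradient_boost`, the learning rate, and the sample data are all assumptions, not taken from the repository.

```python
def fit_stump(xs, residuals):
    """Weak learner: the single threshold split with the lowest squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must put samples on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all inputs identical: predict the mean residual
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # Residual = known target minus current prediction.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

# Hypothetical training data with a jump between x=3 and x=4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = gradient_boost(xs, ys)
print([round(model(x), 2) for x in xs])
```

Each round shrinks the training error a little; the learning rate trades speed of fitting against the risk of overfitting.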
Human-Voice-to-Text-and-Deploy-it-on-Hugging-Face
This project focuses on developing a Human-Voice-to-Text system using speech recognition technology and deploying it on the Hugging Face platform.
Industrial-Equipments-Detection-Yolov8-on-Custom-Dataset-and-deploy-it-on-Hugging-Face
The objective of this project is to build an accurate and efficient computer vision model capable of detecting industrial equipment in images.
K-Fold-Cross-Validation
What is K-fold in cross-validation? K-fold cross-validation splits the dataset into K folds and uses them to evaluate how well the model performs on new data. K refers to the number of groups the data sample is split into. For example, if K is 5, we call it 5-fold cross-validation.
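A minimal sketch of how the K folds can be generated, assuming contiguous, unshuffled folds. The generator name `k_fold_indices` is hypothetical; libraries such as scikit-learn provide a production version of the same idea.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# 5-fold cross-validation over 10 samples: each sample is tested exactly once.
for train, test in k_fold_indices(10, 5):
    print("train:", train, "test:", test)
```

The model is trained and scored once per fold, and the K scores are typically averaged.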
Machine-Learning-Projects
We are required to model the price of cars using the available independent variables. The model will be used by management to understand exactly how prices vary with the independent variables. They can then adjust the design of the cars, the business strategy, and so on to meet certain price levels.
MMuttalib1326
NYC-Taxi-Trip-Duration-Prediction
The task is to build a model that predicts the total ride duration of taxi trips in New York City. The primary dataset is the one released by the NYC Taxi and Limousine Commission, which includes pickup time, geo-coordinates, number of passengers, and many other variables.
Olympic-Games-Data-Analysis
We analyze Olympic Games data using Python. The modern Olympic Games, or Olympics, are leading international sports events featuring summer and winter sports competitions in which thousands of athletes from around the world participate. The Olympic Games are considered the world’s foremost sports competition, with more than 200 nations participating. The total number of events in the Olympics is 339 in 33 sports, and every event has winners, so a large amount of data is generated. Modules used: Pandas, for analyzing the data; NumPy, a general-purpose array-processing package; Matplotlib, a plotting library built on NumPy; seaborn, for statistical graphics plotting in Python.
Quora-Question-Pairs-Similarity
Quora Question Pairs Similarity Problem. In this project I deal with the task of pairing up duplicate questions from Quora. More formally, the problem statement is: identify which questions asked on Quora are duplicates of questions that have already been asked. This could be useful for instantly providing answers to questions that have already been answered. We are tasked with predicting whether a pair of questions are duplicates or not.
MMuttalib1326's Repositories
MMuttalib1326/MMuttalib1326
MMuttalib1326/K-Fold-Cross-Validation
What is K-fold in cross-validation? K-fold cross-validation splits the dataset into K folds and uses them to evaluate how well the model performs on new data. K refers to the number of groups the data sample is split into. For example, if K is 5, we call it 5-fold cross-validation.
MMuttalib1326/Olympic-Games-Data-Analysis
We analyze Olympic Games data using Python. The modern Olympic Games, or Olympics, are leading international sports events featuring summer and winter sports competitions in which thousands of athletes from around the world participate. The Olympic Games are considered the world’s foremost sports competition, with more than 200 nations participating. The total number of events in the Olympics is 339 in 33 sports, and every event has winners, so a large amount of data is generated. Modules used: Pandas, for analyzing the data; NumPy, a general-purpose array-processing package; Matplotlib, a plotting library built on NumPy; seaborn, for statistical graphics plotting in Python.
MMuttalib1326/1st-and-Future---Player-Contact-Detection-Detect-Player-Contacts-from-Sensor-and-Video-Data
The goal of this competition is to detect external contact experienced by players during an NFL football game. You will use video and player tracking data to identify moments with contact to help improve player safety.
MMuttalib1326/bias---variance-tradeoff
In statistics and machine learning, the bias–variance tradeoff is the property of a model that the variance of the parameter estimated across samples can be reduced by increasing the bias in the estimated parameters.
MMuttalib1326/Decision-tree-implementation
A decision tree is a non-parametric supervised learning algorithm, which is utilized for both classification and regression tasks. It has a hierarchical, tree structure, which consists of a root node, branches, internal nodes and leaf nodes.
MMuttalib1326/Ensembles-of-Decision-Trees-Implementation-
Ensemble learning helps improve machine learning results by combining several models. This approach allows the production of better predictive performance compared to a single model.
MMuttalib1326/Industrial-Equipments-Detection-Yolov8-on-Custom-Dataset-and-deploy-it-on-Hugging-Face
The objective of this project is to build an accurate and efficient computer vision model capable of detecting industrial equipment in images.
MMuttalib1326/K-Nearest-Neighbors
The k-nearest neighbors algorithm, also known as KNN or k-NN, is a non-parametric, supervised learning classifier, which uses proximity to make classifications or predictions about the grouping of an individual data point.
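A minimal sketch of the proximity-based vote described above, assuming Euclidean distance and a simple majority among the k closest points. The function name `knn_predict` and the sample points are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training set with two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (7, 8)))
```

There is no training step beyond storing the data, which is why KNN is called non-parametric; all the work happens at prediction time.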
MMuttalib1326/Logistic-regression-implementation
Logistic regression estimates the probability of an event occurring, such as voted or didn't vote, based on a given dataset of independent variables. Since the outcome is a probability, the dependent variable is bounded between 0 and 1.
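To illustrate why the output is bounded between 0 and 1, here is a minimal sketch of the prediction step only, assuming already-fitted coefficients. The weights, bias, and feature values are made up for illustration, and `sigmoid` and `predict_proba` are hypothetical helper names.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

def predict_proba(weights, bias, features):
    """Probability of the positive class for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical fitted coefficients and one sample.
probability = predict_proba([0.8, -0.4], 0.1, [2.0, 1.0])
print(round(probability, 3))
```

A class label is usually obtained by thresholding this probability, commonly at 0.5.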
MMuttalib1326/Logistic-Regression-Practical-Implementation
MMuttalib1326/Machine-Learning-Projects
We are required to model the price of cars using the available independent variables. The model will be used by management to understand exactly how prices vary with the independent variables. They can then adjust the design of the cars, the business strategy, and so on to meet certain price levels.
MMuttalib1326/Model-explainability
Explainability is the ability to take an ML model and explain its behavior in human terms. With complex models (for example, black boxes), you cannot fully understand how and why the inner mechanics impact the prediction.
MMuttalib1326/Quora-Question-Pairs-Similarity
Quora Question Pairs Similarity Problem. In this project I deal with the task of pairing up duplicate questions from Quora. More formally, the problem statement is: identify which questions asked on Quora are duplicates of questions that have already been asked. This could be useful for instantly providing answers to questions that have already been answered. We are tasked with predicting whether a pair of questions are duplicates or not.
MMuttalib1326/Topic-Modeling
Topic modeling is a machine learning technique that automatically analyzes text data to determine cluster words for a set of documents. This is known as 'unsupervised' machine learning because it doesn't require a predefined list of tags or training data that has been previously classified by humans.
MMuttalib1326/Anomaly-detection
Anomaly detection is the process of identifying unexpected items or events in data sets that differ from the norm. It is often applied to unlabeled data, which is known as unsupervised anomaly detection.
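One simple unsupervised approach, shown here only as an illustration and not necessarily what the repository uses, is to flag values whose z-score exceeds a threshold. The function name `zscore_anomalies`, the threshold, and the readings are assumptions.

```python
import statistics

def zscore_anomalies(values, threshold=3.0):
    """Return indices of values more than `threshold` std devs from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical: nothing stands out
    return [i for i, v in enumerate(values)
            if abs(v - mean) / std > threshold]

# Hypothetical unlabeled sensor readings with one obvious outlier.
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
print(zscore_anomalies(readings, threshold=2.0))
```

Note that a large outlier inflates the standard deviation itself, so on small samples a lower threshold or a robust statistic such as the median may be needed.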
MMuttalib1326/General-Modelling-Technique
MMuttalib1326/Human-Voice-to-Text-and-Deploy-it-on-Hugging-Face
This project focuses on developing a Human-Voice-to-Text system using speech recognition technology and deploying it on the Hugging Face platform.
MMuttalib1326/Kaggle-Repository
MMuttalib1326/Netflix-Movies-and-TV-Shows-Clustering
In this project, we worked on a text clustering problem wherein we had to classify/group the Netflix shows into certain clusters such that the shows within a cluster are similar to each other and the shows in different clusters are dissimilar to each other.
MMuttalib1326/portfolio
MMuttalib1326/Principal-component-analysis
Principal component analysis, or PCA, is a dimensionality reduction method that is often used to reduce the dimensionality of large data sets, by transforming a large set of variables into a smaller one that still contains most of the information in the large set.
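A minimal sketch of the idea, assuming only the first principal component is wanted and using power iteration on the covariance matrix instead of a full eigendecomposition. The function name `first_principal_component` and the data are hypothetical; the start vector of all ones is also an assumption and would fail if it happened to be orthogonal to the leading direction.

```python
import math

def first_principal_component(rows, n_iter=200):
    """Return (unit direction, 1-D projection) of the top principal component."""
    n, d = len(rows), len(rows[0])
    # Center each column so the covariance is computed about the mean.
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # zero covariance: no direction of variation
        v = [x / norm for x in w]
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected

# Hypothetical 2-D data lying exactly on the line y = 2x.
data = [(1, 2), (2, 4), (3, 6), (4, 8)]
direction, reduced = first_principal_component(data)
print(direction, reduced)
```

Here two variables are reduced to one coordinate per sample without losing information, because all of the variance lies along a single direction.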
MMuttalib1326/Text-Summarization
MMuttalib1326/Time-series
A time series is a data set that tracks a sample over time. In particular, a time series allows one to see what factors influence certain variables from period to period. Time series analysis can be useful to see how a given asset, security, or economic variable changes over time.
MMuttalib1326/Time-Series-Krish-Naik-
A time series is a data set that tracks a sample over time. In particular, a time series allows one to see what factors influence certain variables from period to period. Time series analysis can be useful to see how a given asset, security, or economic variable changes over time.
MMuttalib1326/Data-Leakage
"A scenario when ML model already has information of test data in training data, but this information would not be available at the time of prediction, called data leakage. It causes high performance while training set, but perform poorly in deployment or production."
MMuttalib1326/Grocery-
MMuttalib1326/Predictive_Analysis_Challenge
MMuttalib1326/SynergyLabs-
MMuttalib1326/Taiyo