Koorimikiran369
Master's in Statistics, Osmania University. Working as a Data Scientist.
PrimEra Medical Technologies, Hyderabad
Pinned Repositories
500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code
500 AI, machine learning, deep learning, computer vision, and NLP projects with code
Big-Mart-Sales
Boosting-Concepts
AdaBoost classifier and Gradient Boosting
Breast-Cancer-Predictions
Built-in dataset from scikit-learn
Cluster-Analysis-for-mall-customers
K-means clustering and agglomerative clustering
Koorimikiran369
LinearRegression-on-Boston-Dataset
oyo-rooms-project
Project-on-German-AFD-political-party-
Examines why the vote share of the German AfD political party changed suddenly and what the main causes of that change were, by identifying the significant columns in the given data.
Statistics-Notes
IPython notebooks on stats
Koorimikiran369's Repositories
Koorimikiran369/Koorimikiran369
Koorimikiran369/Statistics-Notes
IPython notebooks on stats
Koorimikiran369/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code
500 AI, machine learning, deep learning, computer vision, and NLP projects with code
Koorimikiran369/handcalcs
Python library for converting Python calculations into rendered LaTeX.
Koorimikiran369/Innomatics_Internship
Koorimikiran369/LLM-Finetuning
LLM fine-tuning with PEFT
Koorimikiran369/Machine-Learning-with-Python
Python code for common Machine Learning Algorithms
Koorimikiran369/ML-end_to_end_project
Koorimikiran369/Quora-Question-Pairing
Quora Question Pair Similarity

Over 100 million people visit Quora every month, so it's no surprise that many people ask similarly worded questions. Multiple questions with the same intent can cause seekers to spend more time finding the best answer to their question, and make writers feel they need to answer multiple versions of the same question. Quora values canonical questions because they provide a better experience to active seekers and writers, and offer more value to both of these groups in the long term. The main aim of the project is to predict whether a pair of questions is similar or not.

Problem Statement: Identify which questions asked on Quora are duplicates of questions that have already been asked.

Real world/Business Objectives and Constraints:
The cost of a misclassification can be very high.
You would want the probability that a pair of questions is a duplicate, so that you can choose any threshold.
No strict latency concerns.
Interpretability is partially important.

Tasks to perform:
Import the general libraries, NLP modules, and machine learning modules
Load the dataset
Text preprocessing: removing HTML tags, removing punctuation, performing stemming, removing stop words, expanding contractions, etc.
Apply tokenization, stemming, POS tagging, lemmatization, and label encoding
Feature extraction: apply BOW, the TF-IDF vectorizer, the Word2Vec vectorizer, and GloVe
Data preprocessing
Model building
Evaluate the model: confusion matrix, classification report

Data Overview:
Data will be in a file Train.csv
Train.csv contains 5 columns: qid1, qid2, question1, question2, is_duplicate
Size of Train.csv: 60MB
Number of rows in Train.csv: 404,290

Mapping the real world problem to an ML problem
Datalink: https://drive.google.com/file/d/10QDGTSI5PEV9e7CTpfzsXRpUwRIsJA-J/view?usp=sharing
Type of Machine Learning Problem: It is a binary classification problem; for a given pair of questions we need to predict whether they are duplicates or not.
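Some of the text preprocessing and bag-of-words steps listed above can be sketched as follows. This is a minimal, standard-library-only illustration and not the repository's actual code: the stop-word list, the contraction table, and the helper names `preprocess` and `bow_cosine` are all assumptions made for the example, and stemming, lemmatization, TF-IDF, Word2Vec, and GloVe are left out.

```python
import html
import math
import re
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real project would
# likely use a fuller list from an NLP library).
STOP_WORDS = {"a", "an", "the", "is", "are", "do", "does", "i", "to", "of", "in", "what", "how"}

# A few contraction expansions, for illustration only.
CONTRACTIONS = {"what's": "what is", "can't": "cannot", "i'm": "i am", "don't": "do not"}

TAG_RE = re.compile(r"<[^>]+>")
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text):
    """Lowercase, strip HTML tags, expand contractions, drop punctuation and stop words."""
    text = html.unescape(text).lower()
    text = TAG_RE.sub(" ", text)
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = text.translate(PUNCT_TABLE)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def bow_cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words count vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


q1 = "What's the best way to learn <b>Python</b>?"
q2 = "What is the best way to learn Python?"
similarity = bow_cosine(preprocess(q1), preprocess(q2))
print(preprocess(q1), similarity)
```

A similarity score like this could serve as one input feature to a classifier that outputs a duplicate probability, which fits the stated requirement of choosing a threshold afterwards.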
Koorimikiran369/Satistics_Lectures
Koorimikiran369/Testing-Libraries
Koorimikiran369/webscrping
Koorimikiran369/AB_Testing
A/B Testing — A complete guide to statistical testing
Koorimikiran369/Bank_Response
Koorimikiran369/cowin-vaccination-slot-availability
Script to check the available slots at COVID-19 vaccination centers in India using the CoWIN API
Koorimikiran369/Deployment
Koorimikiran369/Hackerrank
Solutions to the practice exercises, coding challenges, and other problems on Hackerrank! www.Hackerrank.com
Koorimikiran369/IPL_Score_Prediction_Deployment
Koorimikiran369/jajaj
Koorimikiran369/Koo
Koorimikiran369/lang2sql
Language to SQL Translator
Koorimikiran369/learn-traffic-crashes
Learn Python Data Analytics by Example - Chicago Traffic Crashes
Koorimikiran369/my-website
Koorimikiran369/portfolio
Koorimikiran369/Project
Koorimikiran369/Python
This repository helps you understand Python from scratch.
Koorimikiran369/Python-ajay
Koorimikiran369/TSF-Internship-EDA-Retail
Analyses the SampleSuperstore dataset, which contains data about a superstore's sales along with some of the contributing factors and their corresponding profits.
Koorimikiran369/webscraping
Koorimikiran369/webscraping-on-cars24