shubhrock777
Intern at INNODATATICS * Innovation || Data Scientist || Data| Tech Lover🔥 || Data Scientist 🦠||
Bangalore
Pinned Repositories
Shubham-Sharma
Multiple-linear-regression
In statistics, linear regression is a linear approach to modelling the relationship between a scalar response and one or more explanatory variables. The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression.
Decision-tree-Random-forest-
A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains conditional control statementsRandom forests or random decision forests are an ensemble learning method for classification, regression and other tasks that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean/average prediction (regression) of the individual trees.
ensemble-models
Ensemble modeling is a process where multiple diverse models are created to predict an outcome, either by using many different modeling algorithms or using different training data sets. The ensemble model then aggregates the prediction of each base model and results in once final prediction for the unseen data.
Natural-language-processing
Natural language processing is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data
kaggle-
ANN-For-Iris-Dataset
This program designs and implements a singal-layer neural network to classify 3 ... First, the program trains the ANN, and then classifies the plants based on user ... This allows the user to test our model's ability on the iris dataset with a greater ... with the use of hidden layers that retain the connection weight of the prior layers
Artificial-neural-network
Artificial neural networks, usually simply called neural networks, are computing systems vaguely inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain.
Association-rule-learning
Association rules are usually required to satisfy a user-specified minimum support and a user-specified minimum confidence at the same time. Association rule generation is usually split up into two separate steps: A minimum support threshold is applied to find all frequent itemsets in a database.
Confidence-interval
In statistics, a confidence interval is a type of estimate computed from the statistics of the observed data. This gives a range of values for an unknown parameter. The interval has an associated confidence level that gives the probability with which the estimated interval will contain the true value of the parameter
shubhrock777's Repositories
shubhrock777/kaggle-
shubhrock777/ANN-For-Iris-Dataset
This program designs and implements a singal-layer neural network to classify 3 ... First, the program trains the ANN, and then classifies the plants based on user ... This allows the user to test our model's ability on the iris dataset with a greater ... with the use of hidden layers that retain the connection weight of the prior layers
shubhrock777/Recommender-system
A recommender system, or a recommendation system, is a subclass of information filtering system that seeks to predict the "rating" or "preference" a user would give to an item.
shubhrock777/Python-basic-code-
Python is an interpreted high-level general-purpose programming language. Python's design philosophy emphasizes code readability with its notable use of significant indentation.
shubhrock777/PCA
The principal components of a collection of points in a real coordinate space are a sequence of p unit vectors, where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i-1 vectors.
shubhrock777/K-Means-Clustring-
k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster.
shubhrock777/data-preprocessing-with-python
Inone of my previous posts, I talked about Data Preprocessing in Data Mining & Machine Learning conceptually. This will continue on that, if you haven’t read it, read it here in order to have a proper grasp of the topics and concepts I am going to talk about in the article. Data Preprocessing refers to the steps applied to make data more suitable for data mining. The steps used for Data Preprocessing usually fall into two categories:
shubhrock777/Text-Mining
Text mining, also referred to as text data mining, similar to text analytics, is the process of deriving high-quality information from text. It involves "the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources.
shubhrock777/Shubham-Sharma
shubhrock777/Recurrent-neural-network
A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes form a directed graph along a temporal sequence. This allows it to exhibit temporal dynamic behavior.
shubhrock777/Convolutional-Neural-Network
In deep learning, a convolutional neural network is a class of deep neural network, most commonly applied to analyze visual imagery
shubhrock777/Artificial-neural-network
Artificial neural networks, usually simply called neural networks, are computing systems vaguely inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain.
shubhrock777/Network-Analysis
Social network analysis is the process of investigating social structures through the use of networks and graph theory. It characterizes networked structures in terms of nodes and the ties, edges, or links that connect them.
shubhrock777/multinomial-logistic-regression
In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes.
shubhrock777/Multiple-linear-regression
In statistics, linear regression is a linear approach to modelling the relationship between a scalar response and one or more explanatory variables. The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression.
shubhrock777/Hierarchical-Clustering
Hierarchical clustering, also known as hierarchical cluster analysis, is an algorithm that groups similar objects into groups called clusters. The endpoint is a set of clusters, where each cluster is distinct from each other cluster, and the objects within each cluster are broadly similar to each other.
shubhrock777/Simple-Linear-Regression
Linear regression is a statistical method for modelling relationship between a dependent variable with a given set of independent variables. Note: In this article, we refer dependent variables as response and independent variables as features for simplicity. In order to provide a basic understanding of linear regression, we start with the most basic version of linear regression, i.e. Simple linear regression. Simple Linear Regression Simple linear regression is an approach for predicting a response using a single feature. It is assumed that the two variables are linearly related. Hence, we try to find a linear function that predicts the response value(y) as accurately as possible as a function of the feature or independent variable(x)
shubhrock777/Survival-Analytics
Survival analysis is a branch of statistics for analyzing the expected duration of time until one or more events happen, such as death in biological organisms and failure in mechanical systems. This topic is called reliability theory or reliability analysis in engineering, duration analysis or duration modelling in economics, and event history analysis in sociology. Survival analysis attempts to answer certain questions, such as what is the proportion of a population which will survive past a certain time? Of those that survive, at what rate will they die or fail? Can multiple causes of death or failure be taken into account? How do particular circumstances or characteristics increase or decrease the probability of survival?
shubhrock777/forecasting-time-series
Time series forecasting is the use of a model to predict future values based on previously observed values. Time series are widely used for non-stationary data, like economic, weather, stock price, and retail sales in this post. We will demonstrate different approaches for forecasting retail sales time series
shubhrock777/Support-vector-machine
In machine learning, support-vector machines are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis.
shubhrock777/Ridge-Regression-and-Lasso-Regression
Ridge Regression : In Ridge regression, we add a penalty term which is equal to the square of the coefficient. The L2 term is equal to the square of the magnitude of the coefficients. We also add a coefficient \lambda to control that penalty term. In this case if \lambda is zero then the equation is the basic OLS else if \lambda \, > \, 0 then it will add a constraint to the coefficient. As we increase the value of \lambda this constraint causes the value of the coefficient to tend towards zero. This leads to both low variance (as some coefficient leads to negligible effect on prediction) and low bias (minimization of coefficient reduce the dependency of prediction on a particular variable) .Lasso Regression : Lasso regression stands for Least Absolute Shrinkage and Selection Operator. It adds penalty term to the cost function. This term is the absolute sum of the coefficients. As the value of coefficients increases from 0 this term penalizes, cause model, to decrease the value of coefficients in order to reduce loss. The difference between ridge and lasso regression is that it tends to make coefficients to absolute zero as compared to Ridge which never sets the value of coefficient to absolute zero.
shubhrock777/logistic-regression
n statistics, the logistic model is used to model the probability of a certain class or event existing such as pass/fail, win/lose, alive/dead or healthy/sick. This can be extended to model several classes of events such as determining whether an image contains a cat, dog, lion, etc.
shubhrock777/Natural-language-processing
Natural language processing is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data
shubhrock777/ensemble-models
Ensemble modeling is a process where multiple diverse models are created to predict an outcome, either by using many different modeling algorithms or using different training data sets. The ensemble model then aggregates the prediction of each base model and results in once final prediction for the unseen data.
shubhrock777/Hypothesis-Testing
A statistical hypothesis is a hypothesis that is testable on the basis of observed data modelled as the realised values taken by a collection of random variables.
shubhrock777/Confidence-interval
In statistics, a confidence interval is a type of estimate computed from the statistics of the observed data. This gives a range of values for an unknown parameter. The interval has an associated confidence level that gives the probability with which the estimated interval will contain the true value of the parameter
shubhrock777/Decision-tree-Random-forest-
A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains conditional control statementsRandom forests or random decision forests are an ensemble learning method for classification, regression and other tasks that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean/average prediction (regression) of the individual trees.
shubhrock777/knn-classifier
k-NN is a type of classification where the function is only approximated locally and all computation is deferred until function evaluation. Since this algorithm relies on distance for classification, if the features represent different physical units or come in vastly different scales then normalizing the training data can improve its accuracy dramatically. A glass manufacturing plant, uses different Earth elements to design a new glass based on customer requirements for that they would like to automate the process of classification as it’s a tedious job to manually classify it, help the company reach its objective by correctly classifying the Earth elements, by using KNN Algorithm
shubhrock777/Naive-Bayes-classifier
n statistics, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong independence assumptions between the features. They are among the simplest Bayesian network models, but coupled with kernel density estimation, they can achieve higher accuracy levels.
shubhrock777/Association-rule-learning
Association rules are usually required to satisfy a user-specified minimum support and a user-specified minimum confidence at the same time. Association rule generation is usually split up into two separate steps: A minimum support threshold is applied to find all frequent itemsets in a database.