garyramah/Predict

Once the data set is understood through intelligent discovery, supervised approaches are applied to predict what will happen in the future. These problems include classification, regression, and ranking. For this pillar, most companies use a standard set of supervised machine learning algorithms, including random forests, gradient boosting, and linear or sparse learners. It should be noted, however, that the unsupervised work from the previous step remains highly useful here. For example, it can generate relevant features for prediction tasks, or find local patches of data where supervised algorithms may struggle (systematic errors). The predict phase is an important part of the business value associated with data science; in predictive analytics, however, there is a common notion that prediction is the sum total of machine learning. This is far from the case. Prediction, while important, is well understood and does not, on its own, qualify as “intelligence.” Further, prediction can go wrong along a number of dimensions, particularly if the groups on which you are predicting are affected by some type of bias. In and of itself, prediction is not AI, and we need to stop calling it that.
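One way to make the idea of "local patches where supervised algorithms may struggle" concrete is to compare a model's error rate inside each segment against its overall error rate. The sketch below is only an illustration under stated assumptions: the segment labels are assumed to come from the earlier unsupervised step, the function names `error_rate_by_segment` and `flag_weak_segments`, the `margin` and `min_rows` parameters, and the toy data are all hypothetical and not part of the original text.

```python
from collections import defaultdict


def error_rate_by_segment(segments, y_true, y_pred):
    """Return {segment: (row_count, error_rate)} for each segment label."""
    counts = defaultdict(int)
    errors = defaultdict(int)
    for seg, truth, pred in zip(segments, y_true, y_pred):
        counts[seg] += 1
        if truth != pred:
            errors[seg] += 1
    return {seg: (counts[seg], errors[seg] / counts[seg]) for seg in counts}


def flag_weak_segments(segments, y_true, y_pred, margin=0.10, min_rows=3):
    """List segments whose error rate exceeds the overall rate by `margin`.

    Segments with fewer than `min_rows` rows are skipped, since their
    error rate is too noisy to interpret (an assumed, arbitrary cutoff).
    """
    if not y_true:
        raise ValueError("need at least one prediction to evaluate")
    overall = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_segment = error_rate_by_segment(segments, y_true, y_pred)
    return sorted(
        seg
        for seg, (n, rate) in per_segment.items()
        if n >= min_rows and rate > overall + margin
    )


# Hypothetical toy data: segment labels (e.g. cluster ids from the discovery
# step), true classes, and the classes a supervised model predicted.
segments = ["a", "a", "a", "a", "b", "b", "b", "b", "c", "c"]
y_true = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1, 1, 1]

weak = flag_weak_segments(segments, y_true, y_pred)
print(weak)
```

A segment flagged this way is a candidate for a systematic error, not proof of one. The same per-group comparison is also a simple first check for the kind of group-level bias mentioned above, although a real audit would need more than a single error-rate threshold.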