/HighDimensionalData

Code used to generate results from Chapter 18 of The Elements of Statistical Learning

Primary language: R

Introduction to High-Dimensional Problems (p >> N)

And when "Less fitting is better"

In this article, we will work through and discuss part of Chapter 18 from "The Elements of Statistical Learning, Second Edition," titled "High-Dimensional Problems: p >> N."

High-dimensional data is data in which the number of columns (features) is large relative to the number of observations. In particular, if we have N observations, each with p features, and the number of features is much larger than the number of observations (p >> N), we call it a high-dimensional problem.
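To make the p >> N setting concrete, here is a minimal R sketch on simulated data. The sizes (N = 20, p = 100), the seed, and the coefficient vector are assumptions chosen for illustration and are not taken from the textbook or from this repository's code. It shows that the rank of the design matrix cannot exceed N, so ordinary least squares cannot determine all of the coefficients.

```r
# Illustrative simulation only: sizes and coefficients are assumed values.
set.seed(1)
N <- 20    # number of observations
p <- 100   # number of features, with p >> N

# Design matrix with N rows and p columns of standard normal noise
X <- matrix(rnorm(N * p), nrow = N, ncol = p)

# Only the first 5 features carry signal in this toy setup
beta <- c(rep(1, 5), rep(0, p - 5))
y <- drop(X %*% beta + rnorm(N))

# The rank of X is bounded by min(N, p), which is N here
rank_X <- qr(X)$rank

# Ordinary least squares with an intercept has p + 1 coefficients to estimate,
# but at most N of them can be determined; lm() reports the rest as NA
fit <- lm(y ~ X)
n_undetermined <- sum(is.na(coef(fit)))

# The coefficients that are estimated interpolate the training data
max_abs_residual <- max(abs(residuals(fit)))

cat("rank of X:", rank_X, "\n")
cat("coefficients lm() could not estimate:", n_undetermined, "\n")
cat("largest absolute training residual:", max_abs_residual, "\n")
```

Because the model can reproduce the training responses exactly while leaving most coefficients undetermined, an unregularized fit says little about new data, which is one way to read the "less fitting is better" theme.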

The associated article is available on Medium.

Here is a quick comparison:

[Figure: comparing regular data to high-dimensional data]

And here is a plot recreated from the textbook (code provided):

[Figure: recreated plot from The Elements of Statistical Learning]