kMeans Portfolio Selection


A security screening technique built on simple K-Means clustering. Orient every input variable so that higher values are preferable to lower ones. For example, use B/P rather than P/B for a value-stock screen. The optimal K is chosen by a simple heuristic rule combined with a constraint on the average number of assets per cluster (see Input below). Values in the x data set are automatically standardized to zero mean and unit variance.
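As a rough illustration of the two preparation steps described above, the sketch below inverts a P/B column to B/P and then standardizes each column. The data and the variable names `raw`, `mu`, `sigma` and `z` are made up for this example, and the use of the sample standard deviation (N-1 normalization) is an assumption; the scaling done inside kPortfolio() may differ in detail.

```matlab
% Hypothetical raw data: 5 assets, columns are P/B ratio and dividend yield.
raw = [1.2 0.03;
       0.8 0.05;
       2.5 0.01;
       1.0 0.04;
       3.1 0.02];

% Invert P/B to B/P so that a higher value is preferable in every column.
x = [1 ./ raw(:, 1), raw(:, 2)];

% Standardize each column to zero mean and unit variance.
% Assumption: sample standard deviation (N-1). Implicit expansion needs R2016b+.
mu = mean(x, 1);
sigma = std(x, 0, 1);
z = (x - mu) ./ sigma;

disp(mean(z, 1));    % approximately zero in each column
disp(var(z, 0, 1));  % approximately one in each column
```

Only the orientation step (P/B to B/P) has to be done by hand before calling the function; the standardization is shown just to make clear what "0 mean and 1 variance" means here.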

### Usage

Run the function kPortfolio() in MATLAB on your dataset. The file NCheck.m, which defines the NCheck() function, is a helper used to find the optimal K.
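A call might look like the sketch below. The argument order `(x, avgN, avgRange)` and the output order `[pIndex, C, sse]` are assumptions taken from the order of the Input and Output lists, not from the function file itself, so check kPortfolio.m before relying on them. The data is synthetic.

```matlab
% Synthetic data: 200 assets by 3 variables (illustrative only).
rng(1);
x = randn(200, 3);

avgN = 25;      % default listed under Input
avgRange = 2;   % default listed under Input

% Assumed signature; check kPortfolio.m for the actual argument order.
if exist('kPortfolio', 'file') == 2
    [pIndex, C, sse] = kPortfolio(x, avgN, avgRange);
    fprintf('%d assets selected, SSE = %.4f\n', numel(pIndex), sse);
else
    disp('kPortfolio.m is not on the MATLAB path.');
end
```

The `exist` guard only keeps the snippet from erroring when the repository has not been added to the path.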

### Input

  • x - an N (assets) by M (variables) matrix
  • avgN - Required average number of assets per clustered portfolio. This rule screens out undiversified, concentrated portfolios. Default: 25
  • avgRange - Tolerance around avgN within which the average number of assets may fall, e.g. [avgN - 2, avgN + 2]. This input guarantees that the final portfolio holds at least N = avgN - avgRange securities. Default: 2
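One way to read the avgN and avgRange inputs is as a filter on candidate values of K: a K is acceptable only if the average cluster size N/K falls within [avgN - avgRange, avgN + avgRange]. The sketch below shows that reading only; it is not taken from kPortfolio.m or NCheck.m, whose heuristic may work differently, and the names `avgPerCluster` and `feasible` are hypothetical.

```matlab
N = 200;        % number of assets (rows of x), illustrative
avgN = 25;
avgRange = 2;

% Average cluster size for every candidate K from 1 to N.
K = 1:N;
avgPerCluster = N ./ K;

% Keep only K whose average cluster size lies in [avgN - avgRange, avgN + avgRange].
inRange = avgPerCluster >= avgN - avgRange & avgPerCluster <= avgN + avgRange;
feasible = K(inRange);

disp(feasible);  % 8, since 200/8 = 25 is the only average inside [23, 27]
```

With the defaults, a universe of 200 assets leaves K = 8 as the only candidate under this reading; larger universes generally leave several.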

### Output

  • pIndex - Vector of indices of the assets in the portfolio, e.g. [1st security, 5th security ...]
  • C - Re-scaled mean of the centroid
  • sse - Total sum of squared cluster errors
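To make the three outputs concrete, here is a minimal sketch of how comparable quantities can be computed with MATLAB's `kmeans` (Statistics and Machine Learning Toolbox). It is not the code in kPortfolio.m. In particular, picking the cluster whose standardized centroid has the highest mean is an assumption made for this example, motivated by the "high values are preferential" convention, and `k = 4` is arbitrary rather than the result of the K selection rule.

```matlab
rng(0);                          % reproducible synthetic data
x = randn(100, 3);               % 100 assets by 3 variables, illustrative
mu = mean(x, 1);
sigma = std(x, 0, 1);
z = (x - mu) ./ sigma;           % standardized inputs

k = 4;                           % illustrative; not the K selection rule
[idx, Cz, sumd] = kmeans(z, k, 'Replicates', 5);

% Total within-cluster sum of squared distances (default sqeuclidean metric).
sse = sum(sumd);

% Assumption: prefer the cluster whose standardized centroid has the highest mean.
[~, best] = max(mean(Cz, 2));
pIndex = find(idx == best)';     % row vector of asset indices in that cluster
C = Cz(best, :) .* sigma + mu;   % centroid mapped back to the original units
```

Mapping the centroid back with `sigma` and `mu` is what "re-scaled" is taken to mean here: the centroid is reported in the units of the original variables rather than in z-scores.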

### Resources

Using K-Means for Value Investors
k-means clustering Matlab function
k-means clustering Wikipedia