karpathy/randomfun

lasso logistic regression vs knn_vs_svm?

myazdani opened this issue · 0 comments

Very cool idea on using SVM for similarity search.

As an alternative, we could also use logistic regression with an L1 penalty (lasso). Because of the induced sparsity, we only need to store the subset of embedding dimensions that receive nonzero weights, which cuts both storage and computation at inference time.
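For concreteness, here is a minimal sketch of what that could look like with scikit-learn, following the same one-positive-vs-everything setup as the SVM version; the array names and hyperparameters (`C`, solver, etc.) are placeholders I picked for illustration, not anything from the notebook:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder data: `embeddings` is an (N, D) array of unit-normalized vectors
# and `query` is a single (D,) vector we want to find neighbors for.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 256)).astype(np.float32)
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)
query = embeddings[0]  # pretend the first vector is our query

# Stack the query as the single positive on top of the (unlabeled) "negatives",
# mirroring the SVM trick, but with an L1-penalized logistic regression.
X = np.concatenate([query[None, :], embeddings])
y = np.zeros(len(X))
y[0] = 1

clf = LogisticRegression(
    penalty="l1",
    solver="liblinear",       # liblinear supports the L1 penalty
    C=0.1,                    # stronger regularization -> sparser weights
    class_weight="balanced",  # offset the 1-vs-many class imbalance
    max_iter=10000,
)
clf.fit(X, y)

# Sparsity payoff: only a subset of embedding dimensions get nonzero weights,
# so at inference we only need to keep and multiply those dimensions.
nonzero = np.flatnonzero(clf.coef_[0])
print(f"{len(nonzero)} of {X.shape[1]} dimensions kept")

# Rank the corpus by the learned decision function (higher = more similar).
scores = clf.decision_function(embeddings)
top_k = np.argsort(-scores)[:10]
print(top_k)
```

How sparse the solution ends up is controlled by `C`, so there is a knob trading retrieval quality against how many dimensions we have to keep around.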

Small nit: technically I think these setups are instances of Positive-Unlabeled (PU) learning, since our "negatives" are really unlabeled examples (some of them may in fact be positive matches for the query).