/ML-NN

Machine Learning and Neural Network category

Primary language: Jupyter Notebook

Custom RBF (Gaussian) string kernel

It is known that an SVM cannot process string inputs in its default configuration. To overcome this, there exist various implementations of string kernels that can operate on text. This kernel is a variant of the RBF (Gaussian) kernel with a slight alteration of the distance (similarity) calculation. Instead of the Euclidean distance between numeric vectors, it evaluates the Levenshtein distance between strings to form the Gram matrix. The strings are defined outside the kernel function (thus, explicitly).

However, the .fit method of SVC receives as input (apart from the target variable) not an array of strings but an array of indices that refer to those strings (see example).
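The idea described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the names STRINGS, LABELS, levenshtein, string_rbf_kernel, and the parameters gamma and squared are all assumptions made for the example, and the toy data is invented. It relies on scikit-learn's support for passing a callable as the kernel argument of SVC.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: the strings live outside the kernel function,
# and the SVC only ever sees integer indices into this list.
STRINGS = ["kitten", "sitting", "mitten", "flask", "flash", "clash"]
LABELS = np.array([0, 0, 0, 1, 1, 1])


def levenshtein(a, b):
    """Edit distance via dynamic programming (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def string_rbf_kernel(X, Y, gamma=0.1, squared=False):
    """Gram matrix with entries exp(-gamma * d), d = Levenshtein distance.

    X and Y are (n, 1) arrays of indices into STRINGS. The `squared` flag
    (an assumption of this sketch) switches to exp(-gamma * d**2).
    """
    rows = X[:, 0].astype(int)
    cols = Y[:, 0].astype(int)
    gram = np.empty((len(rows), len(cols)))
    for r, i in enumerate(rows):
        for c, j in enumerate(cols):
            d = levenshtein(STRINGS[i], STRINGS[j])
            gram[r, c] = np.exp(-gamma * (d ** 2 if squared else d))
    return gram


# fit receives a column of indices, not the strings themselves.
indices = np.arange(len(STRINGS)).reshape(-1, 1)
clf = SVC(kernel=string_rbf_kernel)
clf.fit(indices, LABELS)
predictions = clf.predict(indices)
```

Keeping the strings in a module-level list is what lets the kernel stay a plain function of two index arrays. Whether to square the distance is left as a switch here because the choice is still open.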

Still personally debating whether to square the Levenshtein distance...

Quick guide Q(^,^Q)

- Use sigmoid neurons with cross-entropy loss

- With softmax neurons, though, use log-likelihood cost

The latter converts the raw outputs of the classifier into probabilities. The softmax conversion basically forms a probability distribution (see directory) in which the outputs sum to 1.
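The two pairings from the quick guide can be illustrated with a small NumPy sketch. The function names and the example vector z are assumptions made for illustration and are not taken from the repository.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def cross_entropy(a, y):
    """Cross-entropy cost summed over output neurons.

    a: sigmoid activations in (0, 1); y: targets in {0, 1}.
    """
    return -np.sum(y * np.log(a) + (1 - y) * np.log(1 - a))


def softmax(z):
    """Turn raw outputs into a probability distribution."""
    shifted = z - np.max(z)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / np.sum(e)


def log_likelihood(a, label):
    """Log-likelihood cost: negative log of the probability of the true class."""
    return -np.log(a[label])


# Hypothetical raw outputs of a three-neuron output layer.
z = np.array([2.0, 1.0, 0.1])

# Sigmoid neurons paired with cross-entropy (one-hot target for class 0).
ce = cross_entropy(sigmoid(z), np.array([1.0, 0.0, 0.0]))

# Softmax neurons paired with log-likelihood (true class is 0).
p = softmax(z)
ll = log_likelihood(p, 0)
```

Here p sums to 1, so its entries can be read as class probabilities, whereas the sigmoid activations are independent values in (0, 1) and need not sum to 1.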

...