typedb/typedb-ml

Initial investigation into Random Forests in Grakn

grabl opened this issue · 0 comments

This issue was originally posted by @jmsfltchr on 2018-08-31 17:27.

Is it possible to create a random forest that sits inside Grakn, so that it can be used for classification or regression at query time?

This experiment has not yet gone far enough to determine whether the approach is feasible in terms of speed. The blocker encountered first was the inability to perform aggregations in rules: classifying an example requires a majority vote across the trees in a forest, which means computing an aggregate mode inside a rule. Performing this operation outside Grakn seems to defeat the point of embedding the forest in Grakn at all.

I have made no effort to consider how to build or "train" (*1) the forest. Training could be done in application code, with the resulting trees then translated into Grakn.

*1 By "training" I mean that the trees are not built totally at random: the decision boundary chosen for each node (and the feature to use, chosen from a random subset?) is the one that best divides the data.
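
For reference, the majority vote that would need an aggregate mode inside a rule can be sketched in application code. This is only an illustration of the operation, not a Grakn implementation: the stump trees, feature names, thresholds, and labels below are all hypothetical, and each "tree" is reduced to a single split to keep the example short.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common prediction (the mode).

    Ties are resolved in favour of the label seen first, which is how
    Counter.most_common orders equal counts.
    """
    return Counter(predictions).most_common(1)[0][0]


def make_stump(feature, threshold, left_label, right_label):
    """Build a one-node 'tree' (hypothetical helper for illustration)."""
    def predict(example):
        # Route the example left or right of the decision boundary.
        return left_label if example[feature] <= threshold else right_label
    return predict


def classify(forest, example):
    """Collect one vote per tree, then take the mode of the votes."""
    return majority_vote([tree(example) for tree in forest])


# A hypothetical three-tree forest; all names and thresholds are made up.
forest = [
    make_stump("petal_length", 2.5, "setosa", "other"),
    make_stump("petal_width", 0.8, "setosa", "other"),
    make_stump("sepal_length", 5.0, "setosa", "other"),
]

example = {"petal_length": 1.4, "petal_width": 0.2, "sepal_length": 5.1}
print(classify(forest, example))  # two of the three trees vote "setosa"
```

The `classify` step is the part that would have to happen inside a rule: each tree contributes one vote, and the answer is the mode of those votes.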