This is a library for machine learning in F#.
- A learning algorithm is a tuple (P, i, u, r) consisting of a parameter space P, a parameterized implementation function i : P × A → B, an update rule u : P × A × B → P, and a backward propagation (request) rule r : P × A × B → A; together these form a category Learn (see the Learner sketch after this list).
- Parameterized functions on Hilbert spaces, P × A → B, with function composition as morphism composition form a category Para.
- Neural networks are the morphisms of a category NNet, with composition given by network concatenation (layering).
- A functor L : Para → Learn sends parameterized functions to learning algorithms using gradient descent and backpropagation, given an error function (aka objective function) and a (learning) step size.
- A functor I : NNet → Para maps the size of a network to the dimensionality of a space and networks to parameterized functions, given an activation function.
- Composing the functors I and L gives a functor NNet → Learn, which states that a neural network defines a learning algorithm, given an error function, step size and activation function.
- Differentiation is performed using automatic differentiation.
- Probability spaces are modeled using the Giry monad (a discrete sketch follows after this list).
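As a minimal sketch of the first point, a learning algorithm (P, i, u, r) can be written as an F# record, and composition in Learn pairs up parameter spaces and routes the second learner's request back as the training signal of the first. The names `Learner` and `composeLearner` are illustrative, not necessarily this library's API:

```fsharp
// Minimal sketch: a learning algorithm (P, i, u, r) as an F# record.
// 'P is the parameter space, 'A the input space, 'B the output space.
type Learner<'P, 'A, 'B> =
    { Implement : 'P -> 'A -> 'B          // i : P x A -> B
      Update    : 'P -> 'A -> 'B -> 'P    // u : P x A x B -> P
      Request   : 'P -> 'A -> 'B -> 'A }  // r : P x A x B -> A (backward propagation)

// Composition in Learn: parameter spaces pair up, and the second learner's
// request supplies the training signal for the first learner.
let composeLearner (f : Learner<'P, 'A, 'B>) (g : Learner<'Q, 'B, 'C>) : Learner<'P * 'Q, 'A, 'C> =
    { Implement = fun (p, q) a -> g.Implement q (f.Implement p a)
      Update    = fun (p, q) a c ->
                      let b = f.Implement p a
                      f.Update p a (g.Request q b c), g.Update q b c
      Request   = fun (p, q) a c ->
                      let b = f.Implement p a
                      f.Request p a (g.Request q b c) }
```

Threading each learner's request backwards through the preceding one in this way is the categorical counterpart of backpropagation through stacked layers.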
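For the last point, a finitely supported probability distribution gives a discrete approximation of the Giry monad; the names `Dist`, `dreturn` and `dbind` below are illustrative, not necessarily this library's API:

```fsharp
// Minimal sketch: finitely supported distributions as a discrete stand-in
// for the Giry monad.  dreturn is the monadic unit (Dirac distribution);
// dbind marginalizes over the intermediate value.
type Dist<'A when 'A : comparison> = Map<'A, float>

let dreturn (a : 'A) : Dist<'A> = Map.ofList [ a, 1.0 ]

let dbind (f : 'A -> Dist<'B>) (d : Dist<'A>) : Dist<'B> =
    d
    |> Map.toSeq
    |> Seq.collect (fun (a, p) -> f a |> Map.toSeq |> Seq.map (fun (b, q) -> b, p * q))
    |> Seq.groupBy fst
    |> Seq.map (fun (b, ps) -> b, Seq.sumBy snd ps)
    |> Map.ofSeq
```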
A neural network layer of type (n1, n2) : Nat × Nat is a subset C ⊆ {1, ..., n1} × {1, ..., n2}, where (i, j) ∈ C denotes a connection between nodes i and j. The values n1 and n2 are the number of nodes on each side of the layer and |C| is the number of connections. Given an activation function σ : R → R, a neural network layer defines a parameterized function of the form P × A → B, where P = R^(|C| + n2) (one weight per connection plus one bias per node on the output side), A = R^n1, and B = R^n2.
For example, the layer C = {(1,1), (2,1), (2,2)} ⊆ [2] × [2] has 3 connections and defines a parameterized function:
```fsharp
/// R^5 x R^2 -> R^2, where s : float -> float is the activation function σ
fun (((w11, w21, w22), (w1, w2)), (a1, a2)) ->
    s (w11 * a1 + w1), s (w21 * a1 + w22 * a2 + w2)
```

A neural network is a sequence of layers {(n0,n1), (n1,n2), ..., (nk−1,nk)}, and the composition of these layers defines a composite parameterized function P × R^n0 → R^nk, where P is the product of the individual layers' parameter spaces.
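A minimal sketch of this layer composition in Para terms, assuming parameterized functions are represented as curried F# functions; the names `Para`, `composePara`, `layer1`, `layer2` and the choice of `tanh` as activation are illustrative assumptions, not this library's API:

```fsharp
// Minimal sketch: composition in Para.  A morphism P x A -> B is represented
// as a curried function 'P -> 'A -> 'B; composing with Q x B -> C yields
// a parameterized function (P x Q) x A -> C.
type Para<'P, 'A, 'B> = 'P -> 'A -> 'B

let composePara (f : Para<'P, 'A, 'B>) (g : Para<'Q, 'B, 'C>) : Para<'P * 'Q, 'A, 'C> =
    fun (p, q) a -> g q (f p a)

// The 2-to-2 layer from the example above, with tanh as the activation.
let layer1 : Para<(float * float * float) * (float * float), float * float, float * float> =
    fun ((w11, w21, w22), (w1, w2)) (a1, a2) ->
        let s : float -> float = tanh
        s (w11 * a1 + w1), s (w21 * a1 + w22 * a2 + w2)

// A fully connected 2-to-1 layer.
let layer2 : Para<(float * float) * float, float * float, float> =
    fun ((v1, v2), b) (b1, b2) -> tanh (v1 * b1 + v2 * b2 + b)

// Concatenating the layers: parameter space R^5 x R^3, input R^2, output R.
let network = composePara layer1 layer2
```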
Categorically, a neural network is a method for defining parameterized functions. Smooth parameterized functions form a category Para, in which the objects are spaces, the morphisms are parameterized functions, and composition of morphisms is function composition. A learning algorithm based on a parameterized function is determined by a functor Para → Learn using gradient descent, given an error function e and step size δ.
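As a minimal sketch of what such a functor does to a single morphism, using the `Learner` record from the sketch above: take the quadratic error e(x, y) = ½(x − y)² as one concrete choice of error function, gradient descent with step size δ as the update rule, and as the request the input minus the gradient of the error with respect to the input. The names `grad` and `gradDescentLearner` are illustrative, and the finite-difference gradient below only stands in for the library's automatic differentiation:

```fsharp
// Crude central-difference gradient, standing in for automatic differentiation.
let grad (f : float[] -> float) (x : float[]) : float[] =
    let h = 1e-6
    Array.init x.Length (fun i ->
        let bump delta =
            let x' = Array.copy x
            x'.[i] <- x.[i] + delta
            f x'
        (bump h - bump (-h)) / (2.0 * h))

// Minimal sketch of L : Para -> Learn for quadratic error and step size delta.
// Vectors are float arrays; impl is the parameterized function i : P x A -> B.
let gradDescentLearner (delta : float) (impl : float[] -> float[] -> float[]) : Learner<float[], float[], float[]> =
    // Total error E(p, a, b) = sum_j 1/2 * (i_j(p, a) - b_j)^2
    let error p a (b : float[]) =
        Array.fold2 (fun acc x y -> acc + 0.5 * (x - y) * (x - y)) 0.0 (impl p a) b
    { Implement = impl
      // Gradient descent on the parameters: p - delta * dE/dp
      Update    = fun p a b -> Array.map2 (fun pi gi -> pi - delta * gi) p (grad (fun p' -> error p' a b) p)
      // Request sent back to the preceding layer: a - dE/da (quadratic error case)
      Request   = fun p a b -> Array.map2 (fun ai gi -> ai - gi) a (grad (fun a' -> error p a' b) a) }
```

Composing such learners with `composeLearner` then propagates requests backwards layer by layer, recovering backpropagation through the whole network.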