Mango-ml is a machine learning framework designed for fast experimentation.
Mango provides a set of interfaces and guidelines for building and maintaining machine learning models, including support for advanced logging and for keeping track of parameters between runs.
Mango offers several abstractions that encourage separation of concerns and code reuse:
- Model
- Dataset
- Loader
- Reporter
- Experiment
The following file represents a very simple classification system written with mango.
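
    import pickle

    import mango
    # The reporter, loader and trainer classes below are used unqualified in this
    # example; importing them from mango is an assumption.
    from mango import LogReporter, SimpleLoader, SimpleTrainer, TextReporter
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split


    class Dataset(mango.Dataset):
        seed = mango.Param(int, default=0)

        def build(self):
            # Seed scikit-learn directly so the data and the split are reproducible.
            X, y = make_classification(random_state=self.seed)
            split = train_test_split(X, y, random_state=self.seed)
            self.X_train, self.X_test, self.y_train, self.y_test = split

        def train(self):
            self.data = self.X_train, self.y_train

        def eval(self):
            self.data = self.X_test, self.y_test


    class Model(mango.Model):
        alpha = mango.Param(float)

        def build(self):
            # LogisticRegression has no alpha argument; C is the inverse
            # regularization strength.
            self.clf = LogisticRegression(C=1.0 / self.alpha)

        def train(self, loader):
            X, y = loader.train()
            self.clf.fit(X, y)

        def eval(self, loader):
            X, y = loader.eval()
            self.reporter.add_scalar("score", self.clf.score(X, y))
            self.reporter.log("Saving model")
            with self.context["experiment"].file("file.pickle") as fd:
                pickle.dump(self.clf, fd)  # the object comes first, then the file


    class Main(mango.Experiment):
        reporter = mango.CombinedReporter([LogReporter(), TextReporter()])
        model = Model(alpha=0.5)
        dataset = Dataset()
        loader = SimpleLoader(dataset)
        trainer = SimpleTrainer(model, loader)
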
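The `mango.Param(int, default=0)` declarations read like typed, class-level parameters. As a rough illustration of how such declarations can work in plain Python, the sketch below uses a descriptor. `Param`, `Configurable` and `params()` are hypothetical stand-ins written for this example; they are not mango's implementation and may differ from it.

```python
class Param:
    """Hypothetical typed parameter, declared at class level."""

    _MISSING = object()

    def __init__(self, type_, default=_MISSING):
        self.type_ = type_
        self.default = default

    def __set_name__(self, owner, name):
        # Remember the attribute name the parameter was assigned to.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = instance.__dict__.get(self.name, self.default)
        if value is Param._MISSING:
            raise AttributeError(f"parameter {self.name!r} was never set")
        return value

    def __set__(self, instance, value):
        # Coerce to the declared type so that "0.5" and 0.5 behave the same.
        instance.__dict__[self.name] = self.type_(value)


class Configurable:
    """Hypothetical base class that accepts parameters as keyword arguments."""

    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            if not isinstance(getattr(type(self), name, None), Param):
                raise TypeError(f"unknown parameter {name!r}")
            setattr(self, name, value)

    def params(self):
        """Return the current value of every declared parameter."""
        declared = [
            name
            for klass in type(self).__mro__
            for name, attr in vars(klass).items()
            if isinstance(attr, Param)
        ]
        return {name: getattr(self, name) for name in declared}


class Model(Configurable):
    alpha = Param(float)
    seed = Param(int, default=0)


model = Model(alpha="0.5")
print(model.params())
```

Collecting every parameter into one dictionary is what makes it straightforward to record the configuration of a run next to its results.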
Once we have defined our Experiment, we can run it using the following command:

    mango train example.Main

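The argument `example.Main` reads like a dotted path: a module name followed by a class name. How mango's command line resolves it is not described here; the sketch below shows one common way to do this with the standard library, using a hypothetical `resolve` helper.

```python
import importlib


def resolve(dotted_path):
    """Import `package.module.Name` and return the object called `Name`."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.Name', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr_name)
    except AttributeError:
        raise ImportError(
            f"{module_name!r} has no attribute {attr_name!r}"
        ) from None


# A standard-library class stands in for an Experiment such as example.Main.
cls = resolve("collections.OrderedDict")
print(cls.__name__)
```

With this approach the module must be importable from the directory where the command is run, which is one reason a regular package layout is convenient.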
Many machine learning systems involve long training runs and minibatch training. The following skeleton shows how to set up a simple minibatch trainer using mango.
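
    import mango
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split


    class Dataset(mango.BatchedDataset):
        seed = mango.Param(int, default=0)  # build() reads self.seed

        def build(self):
            X, y = make_classification(random_state=self.seed)
            split = train_test_split(X, y, random_state=self.seed)
            self.X_train, self.X_test, self.y_train, self.y_test = split


    class BatchLogistic(mango.BatchModel):
        def build(self):
            pass  # placeholder: create the underlying model here

        def batch(self, batch, step_info):
            pass  # placeholder: per-minibatch logic goes here

        def epoch(self, loader, step_info):
            pass  # placeholder: per-epoch logic goes here


    class MiniBatch(mango.Experiment):
        dataset = Dataset()
        model = BatchLogistic()  # use the subclass defined above
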
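The `batch` and `epoch` hooks in the skeleton are left empty. To make the idea concrete, here is a minimal sketch of a minibatch training loop in plain Python, independent of mango: the data is reshuffled on every pass, split into fixed-size batches, and a logistic regression is updated with one gradient step per batch. The function names, the learning rate and the toy data are assumptions chosen for illustration; how mango itself calls `batch` and `epoch` is not shown here.

```python
import math
import random


def minibatches(X, y, batch_size, rng):
    """Yield shuffled (X_batch, y_batch) pairs that cover the data once."""
    order = list(range(len(X)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield [X[i] for i in idx], [y[i] for i in idx]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_batch(weights, X_batch, y_batch, lr):
    """One gradient step of logistic regression on a single minibatch."""
    grad = [0.0] * len(weights)
    for row, target in zip(X_batch, y_batch):
        error = sigmoid(sum(w * x for w, x in zip(weights, row))) - target
        for j, x in enumerate(row):
            grad[j] += error * x
    return [w - lr * g / len(X_batch) for w, g in zip(weights, grad)]


def accuracy(weights, X, y):
    """Fraction of rows whose predicted class matches the label."""
    hits = sum(
        (sigmoid(sum(w * x for w, x in zip(weights, row))) >= 0.5) == bool(t)
        for row, t in zip(X, y)
    )
    return hits / len(X)


rng = random.Random(0)
# Toy data: one feature plus a constant bias term; the label is 1 when x > 0.
xs = [rng.uniform(-1, 1) for _ in range(200)]
X = [[x, 1.0] for x in xs]
y = [1 if x > 0 else 0 for x in xs]

weights = [0.0, 0.0]
for epoch in range(20):  # one epoch is one full pass over the data
    for X_batch, y_batch in minibatches(X, y, batch_size=16, rng=rng):
        weights = train_batch(weights, X_batch, y_batch, lr=0.5)

print(f"training accuracy: {accuracy(weights, X, y):.2f}")
```

In this sketch the inner loop body plays the role of a per-batch hook and the outer loop body that of a per-epoch hook.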
In this section we will see how to train a machine learning system with mango. While mango doesn't enforce a particular file structure, a good convention is to use a typical Python package layout. In this example, we will create a package containing the models, experiments and datasets modules.

    hello/
        README
        hello/
            __init__.py
            models.py
            experiments.py
            datasets.py

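For readers who want to try this layout, the following sketch creates the same tree with the standard library. The `scaffold` helper is a hypothetical convenience written for this example, not part of mango, and the files it creates are empty.

```python
from pathlib import Path
import tempfile

MODULES = ("__init__.py", "models.py", "experiments.py", "datasets.py")


def scaffold(root, name="hello"):
    """Create <root>/<name>/README and the inner <name>/ package."""
    project = Path(root) / name
    package = project / name
    package.mkdir(parents=True, exist_ok=True)
    (project / "README").touch()
    for module in MODULES:
        (package / module).touch()
    return project


# Build the tree in a temporary directory and list what was created.
with tempfile.TemporaryDirectory() as tmp:
    project = scaffold(tmp)
    created = sorted(p.relative_to(tmp).as_posix() for p in project.rglob("*"))
    print("\n".join(created))
```

To create the project in the current directory instead, call `scaffold(".")`.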
The entry point for a mango application is the Experiment.