Trainers
Closed this issue · 10 comments
New front-end for training algorithms, which acts as a builder (GoF design pattern) of probabilistic models.

I believe we could have both:

- Static methods that train a model, like:

```cpp
auto hmm = HiddenMarkovModel::trainBaumWelch(obs_training_set, initial_model, 100, 0.01);
```

- A builder:

```cpp
auto model = initial_model.trainer(HiddenMarkovModel::ML, 100, 0.01)
    ->add_training_set(obs1, labels1)
    ->add_training_set(obs2, labels2)
    ->add_training_set(obs3, labels3)
    ->add_training_set(obs4, labels4)
    ->train();

// or

auto model = HiddenMarkovModel::trainer(HiddenMarkovModel::ML, initial_model, 100, 0.01)
    ->add_training_set(obs1, labels1)
    ->add_training_set(obs2, labels2)
    ->add_training_set(obs3, labels3)
    ->add_training_set(obs4, labels4)
    ->train();
```
I believe that's possible. But what is initial_model? Would that be an "empty" model or something like this?

I was thinking about two nice things we could do to simplify the make / train methods:

- Implement make only once

  In the main hierarchy, I believe we could use the class ProbabilisticModelCrtp to define a method make in terms of the Derived template parameter, and then reuse it for all subclasses (see the first sketch below). It has a problem, though: in order to build the new class, constructors will have to be public in all classes. I don't believe this is a problem - we would just allow one to create a class directly, if there was no need to reuse it. That would allow us to do optimizations inside the algorithms, and we'd have no memory management problems, as our algorithms and data structures will still demand shared pointers of models. What do you think?

- Create enum classes with training algorithms for models

  As training algorithms are very different, it's difficult to standardize them under a unique method signature. So, I think we could do the following (see the second sketch below):
  - Create an enum class training_algorithm for each concrete model, with the names of all training algorithms available for that class.
  - Create a Trainer<Decorator> front-end, with the methods add_training_sequence and train.
  - Create a SimpleTrainer<Decorator, Model>, inheriting from the above. It will have a constructor that accepts a training_algorithm and will perfect-forward the other parameters to the method train inside the model. This train would only accept a complete training set, as the trainer would work as a storage of parameters and as a builder that could create a new model in a lazy way.
  - Create an

Looking at this, it seems it's possible to store any kind of parameters using the perfect forwarding technique. It will require a little thought, but I think I can create something to deal with that.
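A minimal sketch of the first idea (the root class ProbabilisticModel and the concrete constructor shown here are just illustrative, not the real hierarchy):

```cpp
#include <memory>
#include <utility>

// Illustrative root of the hierarchy (not the real ToPS class).
class ProbabilisticModel {
 public:
  virtual ~ProbabilisticModel() = default;
};

// CRTP layer: make() is defined once, in terms of the Derived template
// parameter, and inherited by every concrete model.
template <typename Derived>
class ProbabilisticModelCrtp : public ProbabilisticModel {
 public:
  template <typename... Args>
  static std::shared_ptr<Derived> make(Args&&... args) {
    // Requires Derived's constructors to be public, as discussed above.
    return std::make_shared<Derived>(std::forward<Args>(args)...);
  }
};

// A concrete model only declares its constructors; make() comes for free.
class HiddenMarkovModel : public ProbabilisticModelCrtp<HiddenMarkovModel> {
 public:
  HiddenMarkovModel() = default;  // illustrative constructor
};

// auto hmm = HiddenMarkovModel::make();  // -> std::shared_ptr<HiddenMarkovModel>
```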
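And a minimal sketch of the second idea. Only Trainer, SimpleTrainer, add_training_sequence, train, training_algorithm, Decorator and Model come from the proposal above; the concrete HiddenMarkovModel, its enum values and its static train method are made up for the example:

```cpp
#include <memory>
#include <tuple>
#include <utility>
#include <vector>

using Sequence = std::vector<unsigned int>;

// Front-end: only accumulates the training set.
template <typename Decorator>
class Trainer {
 public:
  virtual ~Trainer() = default;

  void add_training_sequence(Sequence sequence) {
    training_set_.push_back(std::move(sequence));
  }

 protected:
  std::vector<Sequence> training_set_;
};

// Illustrative concrete model with its own enum of training algorithms.
class HiddenMarkovModel {
 public:
  enum class training_algorithm { MaximumLikelihood, BaumWelch };

  // Single entry point that dispatches on the enum; algorithm-specific
  // parameters (iterations, thresholds, ...) arrive through the pack.
  template <typename... Args>
  static std::shared_ptr<HiddenMarkovModel>
  train(const std::vector<Sequence>& training_set,
        training_algorithm algorithm, Args&&... args) {
    return std::make_shared<HiddenMarkovModel>();  // placeholder
  }
};

// Stores the algorithm and its parameters; builds the model lazily.
template <typename Decorator, typename Model, typename... Params>
class SimpleTrainer : public Trainer<Decorator> {
 public:
  SimpleTrainer(typename Model::training_algorithm algorithm, Params... params)
      : algorithm_(algorithm), params_(std::move(params)...) {}

  std::shared_ptr<Model> train() {
    // Unpack the stored parameters into the model's static train method.
    return std::apply(
        [this](auto&&... args) {
          return Model::train(this->training_set_, algorithm_,
                              std::forward<decltype(args)>(args)...);
        },
        params_);
  }

 private:
  typename Model::training_algorithm algorithm_;
  std::tuple<Params...> params_;
};

// Usage (names are illustrative):
//   SimpleTrainer<Sequence, HiddenMarkovModel, unsigned int, double>
//       trainer{HiddenMarkovModel::training_algorithm::BaumWelch, 100u, 0.01};
//   trainer.add_training_sequence(Sequence{0, 1, 1, 0});
//   auto hmm = trainer.train();
```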
The initial_model is a valid (non-empty) model that is used as an initial guess for the training algorithm.
Ah, I forgot. @renatocf, I liked your proposal. It will remove a lot of duplication (make methods).
I think we should do it.
Nice. I'll make this change on the trainer branch. I'm also trying to solve a problem with the Makefile, so I've been making several commits trying to upgrade it and fix this little problem. Soon I think everything will be right.
@igorbonadio, I've finished both changes, but I did them on branch master. You can take a look at them.
To formalize and publicly document our idea:

Trainer is a front-end that can be of two kinds: "fixed", whose parameters are set in its construction, through a previously created model; and "plain" (the best name I could think of), whose parameters are estimated from its data. The Trainer interface will have an add_training_set method (to act as a Builder) and a train method, whose first parameter will be an enum of training algorithms.

At the same time, we identified that a Trainer can be a Trainer for sequences or for labels (as with Generator and Estimator).
This way, we will have 4 different methods to create trainers:
|       | Sequence                 | Labeling                 |
|-------|--------------------------|--------------------------|
| Fixed | sequence_fixed_trainer() | labeling_fixed_trainer() |
| Plain | sequence_plain_trainer() | labeling_plain_trainer() |
With the following usage:
```cpp
/* Creating and training an iid */
auto iid_t = IID::sequenceFixedTrainer();
iid_t->addTrainingSet(dataset);
auto model = iid_t->train(IID::sequence_training_algorithm::Burge);

/* Two types of trainers: fixed and plain */
auto iid_trainer_obs = IID::sequenceFixedTrainer(iid_t); /* returns fixed trainer */
auto iid_trainer_dur = IID::sequencePlainTrainer();      /* returns plain trainer */

/* An example of observation and label */
auto lab = Sequence{ 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1 };
auto obs = Sequence{ 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1 };

/* Creating and training a HMM */
auto hmm_sequence_trainer = GHMM::sequencePlainTrainer { state1, state2, ... };
hmm_sequence_trainer->add_training_set(obs);
auto hmm1 = hmm_sequence_trainer->train(GHMM::sequence_training_algorithm::BaumWelch);

auto hmm_labeling_trainer = GHMM::labelingPlainTrainer { state1, state2, ... };
hmm_labeling_trainer->add_training_set(Labeling<Sequence>(obs, lab));
auto hmm2 = hmm_labeling_trainer->train(GHMM::labeling_training_algorithm::MaximumLikelihood);

/* Creating and training a GHMM */
auto ghmm_labeling_trainer = GHMM::labelingPlainTrainer { state1, state2, ... };
ghmm_labeling_trainer->add_training_set(Labeling<Sequence>(obs, lab));
auto ghmm = ghmm_labeling_trainer->train(GHMM::labeling_training_algorithm::MaximumLikelihood);
```
I'm not sure if I like this design, but I could not come up with anything better for now. Evaluators have a simple / cached version, and we decide which one to return through a boolean parameter in their creator methods. This can be done because SimpleEvaluator and CachedEvaluator have the same constructor parameters. However, PlainTrainer and FixedTrainer would have different constructors, wouldn't they? The first would have no parameters and the second would accept a model. Do you believe we could mix them in a way they behave like Evaluators?
If we agree with a good final design, I'll try to implement it in tops-architecture so I can see any technical problems.
In fact, I believe PlainTrainer should receive n parameters. For example, Burge's algorithm for IID should receive double c and unsigned int max_length. So:

```cpp
auto iid_trainer_dur = IID::sequencePlainTrainer(1.2, 100);
```

And FixedTrainer should receive just a model.
You are right. It seems that they will have different constructors... PlainTrainer's constructor should pass its parameters to the right static method (when .train() is called), and FixedTrainer should just store the passed model.
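Roughly like this, reusing the IID / Burge parameters from above (the model, the trainBurge method and the way the training set is passed are just illustrative): FixedTrainer only stores a ready model, while PlainTrainer stores the constructor arguments and hands them to the right static method only when train() is called:

```cpp
#include <memory>
#include <tuple>
#include <utility>
#include <vector>

using Sequence = std::vector<unsigned int>;

// Toy model with one static training routine (Burge's algorithm).
struct IID {
  static std::shared_ptr<IID> trainBurge(const std::vector<Sequence>& data,
                                         double c, unsigned int max_length) {
    return std::make_shared<IID>();  // placeholder
  }
};

// Fixed trainer: just stores an already trained model.
class FixedTrainer {
 public:
  explicit FixedTrainer(std::shared_ptr<IID> model) : model_(std::move(model)) {}

  std::shared_ptr<IID> train(const std::vector<Sequence>& /* ignored */) {
    return model_;
  }

 private:
  std::shared_ptr<IID> model_;
};

// Plain trainer: keeps the algorithm's parameters until train() is called.
class PlainTrainer {
 public:
  PlainTrainer(double c, unsigned int max_length) : params_(c, max_length) {}

  std::shared_ptr<IID> train(const std::vector<Sequence>& data) {
    return std::apply(
        [&data](double c, unsigned int max_length) {
          return IID::trainBurge(data, c, max_length);
        },
        params_);
  }

 private:
  std::tuple<double, unsigned int> params_;
};

// auto fixed = FixedTrainer{std::make_shared<IID>()};
// auto plain = PlainTrainer{1.2, 100};
```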
Ah, and I don't know if I like the name PlainTrainer...
The idea behind "plain" is that it is a synonym for "simple" that has the same number of letters as "fixed". But I don't like it very much either. We can change it.
Now, about the constructors: I think we can pass the parameters when calling train, so we will be able to create different HMMs with the same training set. But I have to think a little more about how I'll do it.
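For example (again with made-up names, reusing the IID / Burge parameters from above for brevity), with the parameters given to train() the same trainer and training set can produce several different models:

```cpp
#include <memory>
#include <utility>
#include <vector>

using Sequence = std::vector<unsigned int>;

// Toy model: the algorithm-specific parameters go straight to train().
struct IID {
  enum class sequence_training_algorithm { Burge, MaximumLikelihood };

  static std::shared_ptr<IID> train(const std::vector<Sequence>& data,
                                    sequence_training_algorithm algorithm,
                                    double c, unsigned int max_length) {
    // ... estimate the distribution from `data` using `c` and `max_length` ...
    return std::make_shared<IID>();  // placeholder
  }
};

// Trainer that only stores the data; parameters arrive at the train() call.
struct IIDTrainer {
  std::vector<Sequence> training_set;

  void add_training_set(Sequence s) { training_set.push_back(std::move(s)); }

  template <typename... Args>
  std::shared_ptr<IID> train(IID::sequence_training_algorithm algorithm,
                             Args&&... args) {
    return IID::train(training_set, algorithm, std::forward<Args>(args)...);
  }
};

int main() {
  IIDTrainer trainer;
  trainer.add_training_set(Sequence{0, 1, 1, 0, 1});

  // Same trainer and training set, two different models:
  auto iid1 = trainer.train(IID::sequence_training_algorithm::Burge, 1.2, 100u);
  auto iid2 = trainer.train(IID::sequence_training_algorithm::Burge, 2.0, 50u);
}
```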
And what do you think about having 4 creator methods?