topsframework/tops

Trainers

Closed this issue · 10 comments

New front-end for training algorithms, which acts as a builder (GoF design pattern) of probabilistic models.

I believe we could have both:

  • static methods that train a model, like:
auto hmm = HiddenMarkovModel::trainBaumWelch(obs_training_set, initial_model, 100, 0.01);
  • A builder:
auto model = initial_model.trainer(HiddenMarkovModel::ML, 100, 0.01)->add_training_set(obs1, labels1)
                                                                    ->add_training_set(obs2, labels2)
                                                                    ->add_training_set(obs3, labels3)
                                                                    ->add_training_set(obs4, labels4)
                                                                    ->train();
// or
auto model = HiddenMarkovModel::trainer(HiddenMarkovModel::ML, initial_model, 100, 0.01)->add_training_set(obs1, labels1)
                                                                                        ->add_training_set(obs2, labels2)
                                                                                        ->add_training_set(obs3, labels3)
                                                                                        ->add_training_set(obs4, labels4)
                                                                                        ->train();

I believe that's possible. But what is initial_model? Would that be an "empty" model or something like that?

I was thinking about two nice things we could do to simplify the make / train methods:

  • Implement make only once

    In the main hierarchy, I believe we could use the class ProbabilisticModelCrtp to define a method make in terms of the Derived template parameter, and then inherit it in all subclasses (see the sketch after this list). It has a problem, though: in order to build the new class, constructors will have to be public in all classes. I don't believe this is a problem - we would just allow one to create a class directly, if there was no need to reuse it. That would allow us to do optimizations inside the algorithms, and we'd have no memory management problems, as our algorithms and data structures will still demand shared pointers of models. What do you think?
  • Create enum classes with training algorithms for models
    As training algorithms are very different, it's difficult to standardize them under a unique method signature. So, I think we could do the following:
    • Create an enum class training_algorithm for each concrete model, with the names of all training algorithms available for that class.
    • Create a Trainer<Decorator> front-end, with methods add_training_sequence and train.
    • Create a SimpleTrainer<Decorator, Model>, inheriting from the above. It will have a constructor that will accept a training_algorithm and will perfect-forward the other parameters to the method train inside the model. This train would only accept a complete training set, as the trainer would work as a storage of parameters and as a builder that could create a new model in a lazy way.
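
To make the first idea concrete, here is a minimal sketch of a shared make, assuming a std::shared_ptr-based hierarchy; everything except the names ProbabilisticModelCrtp and HiddenMarkovModel is illustrative:

#include <memory>
#include <utility>

template <typename Derived>
class ProbabilisticModelCrtp {
 public:
  // Written once here and inherited by every concrete model. It calls
  // Derived's constructor, which therefore has to be public.
  template <typename... Args>
  static std::shared_ptr<Derived> make(Args&&... args) {
    return std::make_shared<Derived>(std::forward<Args>(args)...);
  }
};

class HiddenMarkovModel
    : public ProbabilisticModelCrtp<HiddenMarkovModel> {
 public:
  explicit HiddenMarkovModel(unsigned int /* num_states */) {}  // made-up signature
};

// Usage: every subclass gets make for free.
// auto hmm = HiddenMarkovModel::make(4);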

Looking at this, it seems it's possible to store any kind of parameters using the perfect forwarding technique. It will require a little thought, but I think I can create something to deal with that.
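
A rough sketch of that lazy, parameter-storing trainer (I store the parameters by value for simplicity instead of true perfect forwarding, use C++17's std::apply for brevity, and collapse the Decorator parameter into Model; the Sequence alias and the static Model::train signature are assumptions):

#include <memory>
#include <tuple>
#include <utility>
#include <vector>

using Sequence = std::vector<unsigned int>;

// Front-end: collects training sequences and builds a model on demand.
template <typename Model>
class Trainer {
 public:
  virtual ~Trainer() = default;
  void add_training_sequence(Sequence s) { training_set_.push_back(std::move(s)); }
  virtual std::shared_ptr<Model> train() = 0;
 protected:
  std::vector<Sequence> training_set_;
};

// Stores the algorithm tag and its parameters in a tuple, and only calls
// the model's static train when asked, passing everything along lazily.
template <typename Model, typename... Params>
class SimpleTrainer : public Trainer<Model> {
 public:
  SimpleTrainer(typename Model::training_algorithm algorithm, Params... params)
      : algorithm_(algorithm), params_(std::move(params)...) {}

  std::shared_ptr<Model> train() override {
    return std::apply(
        [this](const auto&... ps) {
          return Model::train(this->training_set_, algorithm_, ps...);
        },
        params_);
  }

 private:
  typename Model::training_algorithm algorithm_;
  std::tuple<Params...> params_;
};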

The initial_model is a valid (non-empty) model that is used as an initial guess for the training algorithm.

Ah, I forgot: @renatocf, I liked your proposal. It will remove a lot of duplication (the make methods).

I think we should do it.

Nice. I'll make this change on the trainer branch. I'm also trying to solve a problem with the Makefile, so I've been making several commits to upgrade it and fix this little issue. I think everything will be right soon.

@igorbonadio, I've finished both changes, but I made them on the master branch. You can take a look at them.

@igorbonadio,

To formalize and publicly document our idea:

Trainer is a front-end that comes in two kinds: "fixed", whose parameters are set at construction through a previously created model; and "plain" (the best name I could think of), whose parameters are estimated from the data. The Trainer interface will have an addTrainingSet method (to act as a Builder) and a train method, whose first parameter will be an enum of training algorithms.

At the same time, we identified that a Trainer can be a Trainer for sequences or labelings (as with Generator and Estimator).

This way, we will have 4 different methods to create trainers:

         Sequence                  Labeling
Fixed    sequenceFixedTrainer()    labelingFixedTrainer()
Plain    sequencePlainTrainer()    labelingPlainTrainer()

With the following usage:

/* Creating and training an iid */
auto iid_t = IID::sequencePlainTrainer();
iid_t->addTrainingSet(dataset);
auto model = iid_t->train(IID::sequence_training_algorithm::Burge);

/* Two kinds of trainers: fixed and plain */
auto iid_trainer_obs = IID::sequenceFixedTrainer(model); /* returns a fixed trainer */
auto iid_trainer_dur = IID::sequencePlainTrainer();      /* returns a plain trainer */

/* An example of observation and label */
auto lab = Sequence{ 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1 };
auto obs = Sequence{ 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1 };

/* Creating and training an HMM */
auto hmm_sequence_trainer = HiddenMarkovModel::sequencePlainTrainer(state1, state2, ...);
hmm_sequence_trainer->addTrainingSet(obs);
auto hmm1 = hmm_sequence_trainer->train(HiddenMarkovModel::sequence_training_algorithm::BaumWelch);

auto hmm_labeling_trainer = HiddenMarkovModel::labelingPlainTrainer(state1, state2, ...);
hmm_labeling_trainer->addTrainingSet(Labeling<Sequence>(obs, lab));
auto hmm2 = hmm_labeling_trainer->train(HiddenMarkovModel::labeling_training_algorithm::MaximumLikelihood);

/* Creating and training a GHMM */
auto ghmm_labeling_trainer = GHMM::labelingPlainTrainer(state1, state2, ...);
ghmm_labeling_trainer->addTrainingSet(Labeling<Sequence>(obs, lab));
auto ghmm = ghmm_labeling_trainer->train(GHMM::labeling_training_algorithm::MaximumLikelihood);

I'm not sure if I like this design, but I could not come up with anything better for now. Evaluators have a simple/cached version, and we decide which one to return through a boolean parameter in their creator methods. This can be done because SimpleEvaluator and CachedEvaluator have the same constructor parameters. However, PlainTrainer and FixedTrainer would have different constructors, wouldn't they? The first would have no parameters and the second would accept a model. Do you believe we could mix them in a way that they behave like Evaluators?
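
For contrast, the Evaluator trick works roughly like this (a minimal sketch with made-up members and signatures, not the actual tops API):

#include <memory>
#include <vector>

using Sequence = std::vector<unsigned int>;

struct Evaluator { virtual ~Evaluator() = default; };

// Both variants share the same constructor parameters...
struct SimpleEvaluator : Evaluator {
  explicit SimpleEvaluator(Sequence s) : sequence(std::move(s)) {}
  Sequence sequence;
};

struct CachedEvaluator : Evaluator {
  explicit CachedEvaluator(Sequence s) : sequence(std::move(s)) {}
  Sequence sequence;  // plus a cache, omitted here
};

// ...so a single creator method can pick between them with a boolean flag.
std::shared_ptr<Evaluator> evaluator(Sequence s, bool cached = false) {
  if (cached) return std::make_shared<CachedEvaluator>(std::move(s));
  return std::make_shared<SimpleEvaluator>(std::move(s));
}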

If we agree with a good final design, I'll try to implement it in tops-architecture so I can see any technical problems.

In fact, I believe PlainTrainer should receive n parameters. For example, IID's Burge algorithm should receive a double c and an unsigned int max_length. So:

auto iid_trainer_dur = IID::sequencePlainTrainer(1.2, 100); 

And FixedTrainer should receive just a model.

You are right. It seems that they will have different constructors... PlainTrainer's constructor should pass its parameters to the right static method (when .train() is called), and FixedTrainer should just store the passed model.
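
A minimal standalone sketch of the two constructors (the class names follow this thread; the Sequence alias and the static Model::train signature are assumptions):

#include <memory>
#include <tuple>
#include <utility>
#include <vector>

using Sequence = std::vector<unsigned int>;

// FixedTrainer: constructed from a previously trained model;
// train() just hands that model back.
template <typename Model>
class FixedTrainer {
 public:
  explicit FixedTrainer(std::shared_ptr<Model> model)
      : model_(std::move(model)) {}

  std::shared_ptr<Model> train() { return model_; }

 private:
  std::shared_ptr<Model> model_;
};

// PlainTrainer: constructed from the algorithm's parameters; train()
// passes them, the stored training set and the chosen algorithm to the
// model's static training method.
template <typename Model, typename... Params>
class PlainTrainer {
 public:
  explicit PlainTrainer(Params... params) : params_(std::move(params)...) {}

  void addTrainingSet(Sequence s) { training_set_.push_back(std::move(s)); }

  std::shared_ptr<Model> train(typename Model::training_algorithm algorithm) {
    return std::apply(
        [&](const auto&... ps) {
          return Model::train(training_set_, algorithm, ps...);
        },
        params_);
  }

 private:
  std::vector<Sequence> training_set_;
  std::tuple<Params...> params_;
};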

Ah, and I don't know if I like the name PlainTrainer...

The idea behind plain is that it is a synonym for simple that has the same number of letters as fixed. But I'm not entirely convinced by it either. We can change it.

Now, about the constructors: I think we can pass the parameters when calling train, so we will be able to create different HMMs with the same training set. But I have to think a little more about how I'll do it.
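
With that change, one trainer could yield several models from the same training set; hypothetical usage, reusing the Burge parameters from above (the values are made up):

/* Same trainer and training set, different parameters at train() time */
auto trainer = IID::sequencePlainTrainer();
trainer->addTrainingSet(dataset);

auto iid1 = trainer->train(IID::sequence_training_algorithm::Burge, 1.2, 100);
auto iid2 = trainer->train(IID::sequence_training_algorithm::Burge, 2.0, 150);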

And what do you think about having 4 creator methods?