initialisation with known inputs and outputs
henrus opened this issue · 11 comments
We want an initialisationStrategy that takes a set of inputs and outputs and fits the model to them. The data will be passed when the model is initialised, either in initModels or in the 'model' module. This strategy may be used with a backward or a forward mapping model.
I think that this can be implemented in user space by deriving a new class from InitialisationStrategy.
The feature is under way (currently implemented in my modelModules branch).
However, it required fairly substantial changes to "SurrogateFunction", e.g. allowing parameter fitting (including methods such as "error", "updateFitDataFromFwSpec", "updateMinMax"). Do you have any reason why I should not add these methods to the parent class "SurrogateModel"?
Ah, actually there is info in the right place. Please ignore my bashing in #9 to some extent.
Well, the idea was that a ForwardMappingModel would not need fitting. Now that we allow initialisation with raw data for inputs and outputs, this is not true any more and the functionality can be moved into the parent class.
I assumed that you planned that a ForwardMappingModel would not need fitting and I wanted to keep it that way as well. My plan was therefore to make an init. strategy that could handle fitting internally. The BackwardMappingModel would be able to use a different fitting (and out of bounds strategy) to initiate the model, as illustrated below.
initialisationStrategy= Strategy.InitialData(
    initialData=
    {
        'input': list,
        'input': list,
        'output': list,
    },
    initialFittingStrategy= Strategy.NonLinFit(),
    initialOutOfBoundsStrategy= Strategy.ExtendSpace(),
)
Moreover, I wanted the initialFittingStrategy and initialOutOfBoundsStrategy to be optional, meaning that they have a default behaviour. I want to be able to do this and reverse the changes I made to the ForwardMappingModel; however, for the time being I needed to make a working example.
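A minimal sketch of how the two optional arguments could fall back to defaults. This is not modena code: `InitialDataSketch` is a hypothetical stand-in, and the strings `'NonLinFit'` and `'ExtendSpace'` only stand in for strategy objects, assuming those would be the defaults (the thread does not say).

```python
# Hypothetical stand-in, not the modena API: optional keyword arguments
# that fall back to assumed defaults.
class InitialDataSketch(dict):
    # Assumed defaults; the real default behaviour is not specified here.
    DEFAULTS = {
        'initialFittingStrategy': 'NonLinFit',
        'initialOutOfBoundsStrategy': 'ExtendSpace',
    }

    def __init__(self, **kwargs):
        # Start from the defaults, then let caller-supplied values override.
        merged = dict(self.DEFAULTS)
        merged.update(kwargs)
        dict.__init__(self, merged)


# Only the data is given, so both strategies take their default value.
s = InitialDataSketch(initialData={'p0': [1, 2], 'rho0': [3, 4]})

# An explicit argument replaces the default.
t = InitialDataSketch(initialData={}, initialFittingStrategy='Custom')
```

With this pattern a ForwardMappingModel user never has to mention fitting unless the default is not wanted.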
Okay, got it. The problem you are facing is that parameterFittingStrategy is not defined for a ForwardMappingModel. I think there are at least two issues we need to tackle after you get a first version up and running. The first issue is code duplication. This should be handled by multiple inheritance. The second issue is the input format. Here, I propose to keep the name and simply say:
initialisationStrategy= Strategy.InitialData(
    initialData=
    {
        'input': list,
        'input': list,
        'output': list,
    },
    parameterFittingStrategy= Strategy.NonLinFit(),
)
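As a rough illustration of the multiple-inheritance idea mentioned above for removing duplicated fitting code: all class names below are hypothetical and the body of `error` is invented for the example; only the method name comes from this thread.

```python
# Hypothetical classes; only the inheritance pattern is the point.
class FittingMixin(object):
    """Fitting helpers written once and shared by every model type."""

    def error(self, predicted, measured):
        # Invented body for illustration: absolute deviation.
        return abs(predicted - measured)


class SurrogateModelSketch(object):
    def __init__(self, _id):
        self._id = _id


class ForwardSketch(SurrogateModelSketch, FittingMixin):
    pass


class BackwardSketch(SurrogateModelSketch, FittingMixin):
    pass


fwd = ForwardSketch('idealGas')
bwd = BackwardSketch('other')
```

Both model types pick up `error` from the mixin, so the method exists in one place only.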
Proposal: Any lookup should operate recursively until the keyword is found. This helps with looking up parameterFittingStrategy in the two cases we consider now. However, the mechanism is quite generic and will also help de-duplicating other properties such as maxError. This is not a problem, yet, but ...
BTW, I think there is no need for an initialOutOfBoundsStrategy.
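A small sketch of the recursive lookup proposed above, assuming the model and its strategies behave like nested dicts. `lookup` is a hypothetical helper, and placing `maxError` inside the fitting strategy is an assumption made for the example.

```python
# Hypothetical helper: depth-first search for a keyword in nested dicts.
_MISSING = object()


def _find(container, key):
    # A hit at the current level wins over anything nested deeper.
    if key in container:
        return container[key]
    for value in container.values():
        if isinstance(value, dict):
            found = _find(value, key)
            if found is not _MISSING:
                return found
    return _MISSING


def lookup(container, key):
    """Return the first value stored under `key`, searching nested dicts."""
    found = _find(container, key)
    if found is _MISSING:
        raise KeyError(key)
    return found


# Assumed layout: the fitting strategy lives inside the init strategy.
model = {
    '_id': 'idealGas',
    'initialisationStrategy': {
        'initialData': {'p0': [1, 2], 'rho0': [3, 4]},
        'parameterFittingStrategy': {'maxError': 1e-3},
    },
}

strategy = lookup(model, 'parameterFittingStrategy')
max_error = lookup(model, 'maxError')
```

The caller asks the model for a keyword and does not need to know whether it is stored on the model itself or on one of its strategies.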
Hi Sigve,
I just talked to UNITS. They will provide data to fit during initialisation, but it will come as a database collection. A script to create the data set is on its way. My idea would be to read the collection name and filter description in the constructor and then build the fitting data from the query result.
Henrik
Hello Henrik,
Sorry for the late reply, there are several things happening simultaneously at the moment. That sounds good. My interpretation of your idea is the following:
class InitialData(...):
    def __init__(self, *args, **kwargs):
        # If a collection name is given, build the fitting data from the database.
        if "Collection" in kwargs:
            kwargs["initialData"] = self.Query(**kwargs)
        InitialisationStrategy.__init__(self, *args, **kwargs)

    def Query(self, **kwargs):
        """
        Method for querying the database and processing the data.
        """
        result = query(kwargs["Collection"])  # placeholder for the database query
        return result  # process the data here before returning it

    def newPoints(self):
        return self["initialData"]
Hello Sigve,
yes, this is practically the idea in meta-code. I would implement this feature (and other init strategies) in a class derived from one that does the fitting. The fitting data should be passed using a generator function. This will allow you to operate on minimal data regardless of its representation.
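A minimal sketch of passing fitting data through generator functions, so the fitting code sees one point at a time whatever the storage format. Both function names are hypothetical; the first handles the column-style `initialData` dict used in this thread, the second assumes record-style results such as documents returned by a database query.

```python
# Hypothetical generators; the consumer only ever sees {name: value} points.
def iter_columns(columns):
    """Yield points from a dict of equally long lists, {name: [values]}."""
    names = list(columns)
    for values in zip(*(columns[n] for n in names)):
        yield dict(zip(names, values))


def iter_records(records, names):
    """Yield points from an iterable of records, keeping only `names`."""
    for record in records:
        yield {n: record[n] for n in names}


columns = {'p0': [1, 2], 'T0': [296, 297]}
records = [
    {'_id': 'a', 'p0': 1, 'T0': 296},
    {'_id': 'b', 'p0': 2, 'T0': 297},
]

from_columns = list(iter_columns(columns))
from_records = list(iter_records(records, ['p0', 'T0']))
```

Both sources produce the same sequence of points, so a class that does the fitting can stay unaware of where the data came from.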
My branch contains the following strategy (in Strategy.py):
@explicit_serialize
class InitialData(InitialisationStrategy):
    """
    Class initialising a SurrogateModel given a dataset of input-output
    relations.
    """
    def __init__(self, *args, **kwargs):
        InitialisationStrategy.__init__(self, *args, **kwargs)

    def newPoints(self):
        return self['initialData']

    def workflow(self, model):
        et = load_object({'_fw_name': '{{modena.Strategy.InitialDataPoints}}'})
        points = self.newPoints()
        e = six.next(six.itervalues(points))
        p = { k:[0]*len(points[k]) for k in points }
        for i in xrange(len(e)):
            for k in points:
                p[k][i] = points[k][i]
            for m in model.substituteModels:
                p.update(m.callModel(p))
        t = et
        t['point'] = p
        fw = Firework(t)
        wf = Workflow2( [fw], name='initialising to dataset')
        # pf = load_object({'_fw_name': '{{modena.Strategy.NonLinFitToPointWithSmallestError}}'})
        # NonLinFitToPointWithSmallestError
        wf.addAfterAll(model.parameterFittingStrategy().workflow(model))
        return wf
I also have an example, idealGas, demonstrating the principle. I believe this code will run ctrl + c, ctrl + v style:
from modena import SurrogateModel, ForwardMappingModel, CFunction, InitialData
from modena.Strategy import Workflow2
from fireworks import LaunchPad
from fireworks.core.rocket_launcher import rapidfire
from fireworks.utilities.fw_serializers import load_object
f2 = CFunction(
Ccode= '''
#include "modena.h"
#include "math.h"
void idealGas
(
const double* parameters,
const double* inherited_inputs,
const double* inputs,
double *outputs
)
{
const double p0 = inputs[0];
const double T0 = inputs[1];
const double R = parameters[0];
outputs[0] = p0/R/T0;
}
''',
# These are global bounds for the function
inputs={
'p0': { 'min': 0, 'max': 9e99, 'argPos': 0 },
'T0': { 'min': 0, 'max': 9e99, 'argPos': 1 },
},
outputs={
'rho0': { 'min': 9e99, 'max': -9e99, 'argPos': 0 },
},
parameters={
'R': { 'min': 0.0, 'max': 9e99, 'argPos': 0 }
},
)
m1 = ForwardMappingModel(
_id= 'idealGas',
surrogateFunction= f2,
substituteModels= [ ],
parameters= [ 287.0 ],
inputs={
'p0': { 'min': 0, 'max': 9e99 },
'T0': { 'min': 0, 'max': 9e99 },
},
outputs={
'rho0': {'min': 0, 'max': 9e99 },
},
initialisationStrategy= InitialData(
initialData={
'p0' : [1,2,3,4,5],
'T0' : [296,297,298,299,300],
'rho0' : [0.000011771353234, 0.00002346343809758, 0.00003507705259219, 0.00004661298404, 0.0000580720092915],
},
),
)
# set up the LaunchPad and reset it
launchpad = LaunchPad()
launchpad.reset('', require_password=False)
initWfs = Workflow2([])
for m in SurrogateModel.get_instances():
    initWfs.addNoLink(m.initialisationStrategy().workflow(m))
# store workflow and launch it locally
launchpad.add_wf(initWfs)
rapidfire(launchpad)
It is a couple of months since I wrote it.
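As a sanity check on the data in the example above, the gas constant can be recovered with plain least squares. This is only a sketch in pure Python and not the NonLinFit strategy; `fit_R` is a hypothetical helper. It relies on the model rho0 = p0/(R*T0) being linear in 1/R.

```python
# Data copied from the idealGas example.
p0 = [1, 2, 3, 4, 5]
T0 = [296, 297, 298, 299, 300]
rho0 = [0.000011771353234, 0.00002346343809758, 0.00003507705259219,
        0.00004661298404, 0.0000580720092915]


def fit_R(p, T, rho):
    """Least-squares estimate of R in rho = p / (R * T)."""
    # The model is linear in a = 1/R: rho = a * x with x = p / T,
    # so the closed-form solution is a = sum(x * rho) / sum(x * x).
    x = [pi / Ti for pi, Ti in zip(p, T)]
    a = sum(xi * ri for xi, ri in zip(x, rho)) / sum(xi * xi for xi in x)
    return 1.0 / a


R = fit_R(p0, T0, rho0)
print(R)
```

The estimate should come out very close to the 287.0 given as the starting parameter, which suggests the data set was generated with that value.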
Hi Sigve,
what do you mean by ctrl-c/ctrl-v style? Can you please prepare a merge request? Thanks.
Henrik
Haha, that was not a descriptive message. I meant you should be able to copy and paste the code and run it.
Will do!
Sigve
(In reply to Henrik Rusche's message of Friday, September 11, 2015.)