Implement distributed GP
Closed this issue · 2 comments
mclaughlin6464 commented
A few weeks ago I presented a paper that shows how to distribute GPs in a mathematically principled way. In a new branch I should implement this in place of EC and remove SB entirely. EC should be able to distribute itself to arbitrary size, which will subsume the SB case. I think OR will still have a place, especially w.r.t. non-GP models that scale with the full, undiluted dataset.
mclaughlin6464 commented
Rereading the paper (available here). I'd like to take some notes on the obvious things I'll have to implement and how they'll affect pearce.
- Change the likelihood optimization. I can continue to assume that the metric is the same across all GPs, as they do. Their approach is basically the same as mine, but formalizing it may be good.
- Dataspace partitioning. This will require the most work, I feel. What options do I want to have? I could offer random and kd-tree partitioning, and I'll need a number-of-experts parameter as well as an overlap parameter.
- Emulation. I'll need to develop a notion of "mixing" the experts. The formulae are easy, but they depend on the predicted uncertainties, which can be unreliable. If push comes to shove I can just do an unweighted average of the means instead of a weighted average with the errors. The errors may cancel out, though!
- Following from the above, I think I'll want to revisit emulate_wrt_*. Is it actually helpful? In this case I don't have to do any weird interpolation; prediction should just go through the basic emulator. I guess the convenience functions will still be useful; they'll just behave essentially the same way as OR.
- Distribution. I think that's not worth considering for now, but it could be useful in the future. Building would be much faster even just running on multiple cores. Hypothetically, if I can support one 1000^2 matrix, I can support four 500^2 matrices, which take ~5 min each to build. Distributing could be helpful there, and I'd be using twice as much data (though with a less detailed covariance).
- (related to 2). Do I split on r & z? I like the idea of treating them like any other parameter. However, intuitively I'd believe they're more correlated than other random parameters, though that thinking could be flawed. The code would be a lot simpler if I don't split on them. I guess I can try both, but I don't know if I'll actually do that.
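To make the second bullet concrete, here is a minimal sketch of shared-hyperparameter training, assuming a plain squared-exponential kernel and numpy (function names are hypothetical, not pearce's actual API). Since the experts are treated as independent, the joint objective is just the sum of the per-expert log marginal likelihoods, each evaluated with the same shared hyperparameters:

```python
import numpy as np

def sq_exp_kernel(X1, X2, length_scale, amp):
    # Squared-exponential kernel; the hyperparameters are shared by all experts.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return amp * np.exp(-0.5 * d2 / length_scale ** 2)

def expert_log_likelihood(X, y, length_scale, amp, noise=1e-4):
    # Standard GP log marginal likelihood for one expert's data block.
    K = sq_exp_kernel(X, X, length_scale, amp) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * len(X) * np.log(2 * np.pi))

def joint_log_likelihood(partitions, length_scale, amp):
    # Factorized training: experts are independent, so the joint
    # objective is the sum over experts. This is what the optimizer
    # would maximize with one shared hyperparameter vector.
    return sum(expert_log_likelihood(X, y, length_scale, amp)
               for X, y in partitions)
```

Each expert only ever factorizes an n_i x n_i matrix, which is where the scaling win over a single full-data GP comes from.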
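The partitioning bullet can also be sketched. Below is one rough way to offer both a random and a kd-tree option with a number-of-experts parameter and an overlap parameter; all names are hypothetical, and the kd-tree version assumes the number of experts is a power of two for simplicity:

```python
import numpy as np

def partition_random(X, n_experts, rng=None):
    # Random partition: shuffle indices and split into near-equal chunks.
    rng = np.random.default_rng(rng)
    return np.array_split(rng.permutation(len(X)), n_experts)

def partition_kdtree(X, n_experts):
    # kd-tree-style partition: recursively split each cell at the median
    # of its widest dimension until there are n_experts leaves.
    # Assumes n_experts is a power of two.
    cells = [np.arange(len(X))]
    while len(cells) < n_experts:
        new_cells = []
        for idx in cells:
            dim = np.argmax(X[idx].max(0) - X[idx].min(0))
            order = idx[np.argsort(X[idx, dim])]
            half = len(order) // 2
            new_cells.extend([order[:half], order[half:]])
        cells = new_cells
    return cells

def add_overlap(cells, X, overlap_frac, rng=None):
    # Overlap: each expert also receives a random fraction of points
    # drawn from outside its own cell.
    rng = np.random.default_rng(rng)
    all_idx = np.arange(len(X))
    out = []
    for idx in cells:
        others = np.setdiff1d(all_idx, idx)
        n_extra = int(overlap_frac * len(idx))
        extra = rng.choice(others, size=n_extra, replace=False)
        out.append(np.concatenate([idx, extra]))
    return out
```

Working with index arrays rather than copies of the data keeps the partitioner cheap and makes it easy to check that the cells cover the whole training set.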
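And for the emulation bullet, a sketch of the two mixing modes mentioned there: an uncertainty-weighted combination in the product-of-experts style common in the distributed-GP literature (each expert's mean weighted by its predictive precision), and the plain-average fallback for when the predicted uncertainties are unreliable. This is an illustrative combination rule, not necessarily the exact one from the paper:

```python
import numpy as np

def mix_experts(means, variances, weighted=True):
    # Combine per-expert predictions at a single test point.
    # weighted=True:  precision-weighted (product-of-experts style) mean.
    # weighted=False: plain average fallback when the predicted
    #                 uncertainties can't be trusted.
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    if not weighted:
        return means.mean(), variances.mean()
    prec = 1.0 / variances            # per-expert predictive precision
    total_prec = prec.sum()
    mean = (prec * means).sum() / total_prec
    return mean, 1.0 / total_prec
```

With the weighted rule, an expert that is confident (small variance) near the test point dominates the mixture, which is the behavior you want when the test point falls inside one expert's cell.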
mclaughlin6464 commented
This has been implemented and pulled into master.