8/30
- Add new feature: estimation of a lower value.
8/17
- Try randomly bisecting the domain.
- Using the same network, can we generalize f(x) = 0 to f(x) = n?
- Sample from the neighborhood of every point.
- Compare with branch and prune, even if the network doesn't give a fair reward.
a. train f(x) = 0
b. compare B&P f(x) = 0 and B&P+NN f(x) = 0
c. compare B&P f(x) = n and B&P+NN f(x) = n
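A minimal sketch of the plain B&P baseline for steps b and c, in 1D and without pyibex. The test function f(x) = x*x - 2, the hand-written interval extension `f_range`, and the names `target`, `ratio`, and `rng` are all assumptions made for illustration. Passing `rng` switches to a randomly placed cut, and `visited` is one possible number to compare against B&P+NN.

```python
import random


def f_range(lo, hi):
    """Interval extension of the hypothetical f(x) = x*x - 2 on [lo, hi]."""
    squares = (lo * lo, hi * hi)
    sq_lo = 0.0 if lo <= 0.0 <= hi else min(squares)
    sq_hi = max(squares)
    return sq_lo - 2.0, sq_hi - 2.0


def branch_and_prune(lo, hi, target=0.0, eps=1e-6, ratio=0.5, rng=None):
    """Return (boxes, visited): small boxes that may contain f(x) = target,
    and the number of boxes examined along the way."""
    stack = [(lo, hi)]
    boxes = []
    visited = 0
    while stack:
        a, b = stack.pop()
        visited += 1
        f_lo, f_hi = f_range(a, b)
        if target < f_lo or target > f_hi:
            continue  # prune: f cannot reach target anywhere in [a, b]
        if b - a < eps:
            boxes.append((a, b))  # small enough, keep as a candidate solution
            continue
        # Branch: fixed ratio by default, random cut position if rng is given.
        r = rng.uniform(0.1, 0.9) if rng is not None else ratio
        cut = a + r * (b - a)
        stack.append((a, cut))
        stack.append((cut, b))
    return boxes, visited
```

Usage: `branch_and_prune(0.0, 3.0)` covers f(x) = 0, `branch_and_prune(0.0, 3.0, target=2.0)` covers f(x) = n with n = 2, and `branch_and_prune(0.0, 3.0, rng=random.Random(0))` uses random cuts.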
8/10
- Converges after finding a good-enough solution.
- NN fails to find an answer if the domain is changed after training.
8/9
Can't use the tilted value because the weights for policy and value are shared. When the interval is small, branching produces intervals that lead to the same action again, until that action is masked (because the NN inputs are very similar).
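A rough sketch of what masking an action could look like on the policy output. The function name `masked_policy` and the list-of-logits layout are assumptions; the actual network code is not shown in these notes.

```python
import math


def masked_policy(logits, mask):
    """Softmax over logits; actions whose mask entry is False get probability 0."""
    allowed = [l for l, m in zip(logits, mask) if m]
    if not allowed:
        raise ValueError("all actions are masked")
    top = max(allowed)  # subtract the max for numerical stability
    exps = [math.exp(l - top) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]
```

With `masked_policy([2.0, 1.0, 0.5], [False, True, True])` the first action is excluded, so the search has to pick a different branch even if the NN keeps ranking the first one highest.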
8/3/2018
- Try eliminating value head.
- Try pure MCTS.
- Try difficult functions.
7/24/2018
- Multi-dimension issue in the sampled data representation
- Need unified naming among BB files
- Need benchmarks
- Passing messages among nodes in a graph: https://arxiv.org/pdf/1704.01212.pdf
- Reward calculation: currently 1 - |f(x)| (the function value at the terminal point), with all terminal rewards collected as training examples
- backtrack
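A small sketch of the current reward rule, assuming a scalar point x and a plain Python callable f (both hypothetical here). Note that 1 - |f(x)| goes negative whenever |f(x)| > 1.

```python
def terminal_reward(f, x):
    """Reward 1 - |f(x)|: equals 1 at an exact root, lower the further f(x) is from 0."""
    return 1.0 - abs(f(x))


def collect_examples(f, terminal_points):
    """Pair every terminal point with its reward, to be used as training examples."""
    return [(x, terminal_reward(f, x)) for x in terminal_points]
```

For example, with `f = lambda x: x * x - 2`, a terminal point near sqrt(2) gets a reward close to 1, while x = 3 gets 1 - 7 = -6.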
7/13/2018
- We need a systematic guide/introduction to branch and bound.
- How to choose the middle value to cut? Why 0.4?
- Does relaxation mean making problem easier?
- What does it mean for a subproblem to be infeasible?
- Documentation for pyibex.
- Representation of the value, can it be all negative?
- Representation of the state.
- When it reaches terminal, how to calculate its reward? Possible approach: calculate the mean given the lower and upper values.
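Two of the questions above, sketched under assumptions. `bisect` shows a cut placed at a fixed ratio of the interval width (0.4 is the value asked about). `mean_value_reward` is one possible reading of the "mean" idea: take the mean of the lower and upper bounds of f over the terminal interval and plug it into the 1 - |value| rule. Both function names are hypothetical.

```python
def bisect(lo, hi, ratio=0.4):
    """Split [lo, hi] at lo + ratio * (hi - lo); ratio=0.5 would be the midpoint."""
    cut = lo + ratio * (hi - lo)
    return (lo, cut), (cut, hi)


def mean_value_reward(f_lo, f_hi):
    """Reward 1 - |mean|, where mean is the average of the bounds of f on the interval."""
    mean = 0.5 * (f_lo + f_hi)
    return 1.0 - abs(mean)
```

One caveat with this reading: a wide but symmetric range such as [-5, 5] has mean 0 and so gets the maximum reward 1, which may argue for also penalizing the width.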