ae-foster/pyro

Syntax

Closed this issue · 10 comments

I believe pyro syntax has changed (e.g. plate) since I took my fork.

@null-a this and #18 are good tasks to get started with. There are lots of calls to iarange, .independent, etc. that have been deprecated in Pyro 0.3, and a few subtle changes from PyTorch 0.4 to 1.0. I've pushed a branch that updates some of the Pyro syntax and merges Pyro's dev branch into oed-master here, but many of the OED examples are probably broken. You could start by making sure they run and produce the same results with Pyro 0.3/PyTorch 1.0 as on the oed-master branch with PyTorch 0.4.

We'll also need to make oed-master comply with Pyro's rather draconian linter, which just means running flake8 and fixing anything it complains about.

You could start by making sure they run and produce the same results with Pyro 0.3/PyTorch 1.0 as on the oed-master branch with PyTorch 0.4.

Thanks for the suggestion. I had an initial look at ab_test.py, location.py and item_response.py and they all either error out, don't produce deterministic results, or produce outputs that disagree across the two branches, or some combination thereof. I've started trying to understand what's going on but it's slow going at this stage while I'm unfamiliar with the code. If you have any ideas about any of the following I'd be happy to hear them!

  • Other than calling pyro.set_rng_seed() is there anything that typically needs to be done to make pyro programs deterministic?
  • Do you know (off the top of your head) of any changes in PyTorch that might cause a given program to generate different outputs across PyTorch versions, even when fixing the seed, thus thwarting this plan? (The normal rng seems to produce the same results between PyTorch versions at least.)

Also, here are a couple more specific observations about the examples I looked at. (I'm recording here partly for my own benefit, but also in case anyone has any input.)

  • location.py seems to be failing on branch pyro-syntax-fixes because CensoredSigmoidNormal doesn't implement the expand method.
  • ab_test.py isn't deterministic with the seed fixed, but the change in output across runs in relatively small. Larger changes are seen between the oed-master and pyro-syntax-fixes branches. One thing I noticed -- the most recent code dropped a call to gpmodel.optimize (diff below) that's present in oed-master. Is this change OK?
  • item_response.py fails because it's missing import math. With that fixed, the example is deterministic, but produces different outputs across oed-master and pyro-syntax-fixes. I don't yet know why.
index 51aef815..71a1f3fb 100644
--- a/examples/contrib/oed/ab_test.py
+++ b/examples/contrib/oed/ab_test.py

@@ -102,7 +102,6 @@ def main(num_vi_steps, num_bo_steps):
         gpmodel = gp.models.GPRegression(
             X, y, gp.kernels.Matern52(input_dim=1, lengthscale=torch.tensor(10.)),
             noise=torch.tensor(noise), jitter=1e-6)
-        gpmodel.optimize(loss=TraceEnum_ELBO(strict_enumeration_warning=False).differentiable_loss)
         gpbo = GPBayesOptimizer(constraints.interval(0, 100), gpmodel,
                                 num_acquisitions=num_acquisitions)
         pyro.clear_param_store()

they all either error out, don't produce deterministic results, or produce outputs that disagree across the two branches, or some combination thereof.

Yeah, that sounds about right :) Let's prioritize ab_test.py and item_response.py.

Other than calling pyro.set_rng_seed() is there anything that typically needs to be done to make pyro programs deterministic?

Hmm, there shouldn't be - maybe the merge introduced some extra randomness somewhere, or a seed is being set somewhere else. I'll take a closer look. One reference that might be informative but hard to grok is this Pyro PR that updated Pyro's dev branch to PyTorch 1.0 from PyTorch 0.4.

Also, are you sure you're running the oed-master code under PyTorch 0.4 and the update-oed-pytorch/pyro-syntax-changes code under PyTorch 1.0? I know I've been bitten by this repeatedly while working with the OED code, it took me a while to get parallel dev environments set up with the right dependencies with virtualenv.

the most recent code dropped a call to gpmodel.optimize (diff below) that's present in oed-master. Is this change OK?

Edit: It looks like that API was deprecated in favor of pyro.contrib.gp.util.train, but also that gp.util.train is now being called within GPBayesOptimizer, so that fix shouldn't be necessary.

@null-a I'm not seeing nondeterminism in ab_test.py using the branch in #23 and Python 3.7 and torch==1.0.1. Can you post the output of two executions of python ab_test.py --seed 1 --num-bo-steps 3, and of pip freeze?

I'm not seeing nondeterminism in ab_test.py

@eb8680 It looks like it might only show up when doing >3 BO steps. It also seems to be exaggerated by doing fewer VI steps. In particular, I was running python ab_test.py --seed 42 -n 10. I'll include a diff of the output from two runs of that below, as well as the output of pip freeze.

42,43c42,43
< tensor([ 36.2574,  36.0797,  36.2237,  99.9996,  36.0790,  99.9994,  99.9999,
<          36.4414, 100.0000,  36.2980])
---
> tensor([ 36.2576,  36.0796,  36.2238,  99.9974,  35.8862,  36.3416,  36.3387,
>          99.9991, 100.0000,  36.2980])
45c45
< tensor([1.7671, 2.5061, 2.0731, 1.8292, 2.7736, 2.0237, 2.6928, 2.3183, 2.5682,
---
> tensor([1.7671, 2.5061, 2.0731, 1.8292, 2.7735, 1.8794, 2.7151, 2.2988, 2.5688,
48,49c48,49
< tensor([ 36.3083,  36.5323,  36.4291,  99.9996, 100.0000,  99.9995,  99.9990,
<          36.7426,  36.6043,  36.3180])
---
> tensor([ 36.6276,  36.6059, 100.0000,  36.6527,  99.9994,  36.5322,  99.9993,
>          99.9997,  46.9640,  99.9997])
51,52c51,52
< tensor([2.1777, 2.0788, 2.5141, 2.3014, 2.9897, 2.4307, 2.3014, 2.6234, 2.1361,
<         2.1130], grad_fn=<DivBackward0>)
---
> tensor([2.1540, 2.1493, 2.3661, 2.2400, 3.2692, 2.3038, 2.3041, 2.6913, 2.1216,
>         2.0655], grad_fn=<DivBackward0>)
54c54
< (tensor([35.9303]), tensor(2.3194))
---
> (tensor([38.3554]), tensor(2.3084))
(env_latest) oed$ git rev-parse --short HEAD
bb542ade
(env_latest) oed$ python --version
Python 3.6.2
(env_latest) oed$ pip freeze
alabaster==0.7.12
apipkg==1.5
appnope==0.1.0
atomicwrites==1.3.0
attrs==19.1.0
Babel==2.6.0
backcall==0.1.0
bleach==3.1.0
certifi==2019.3.9
chardet==3.0.4
contextlib2==0.5.5
coverage==4.5.3
cycler==0.10.0
decorator==4.4.0
defusedxml==0.5.0
docutils==0.14
entrypoints==0.3
execnet==1.5.0
flake8==3.7.7
graphviz==0.10.1
idna==2.8
imagesize==1.1.0
ipykernel==5.1.0
ipython==7.3.0
ipython-genutils==0.2.0
ipywidgets==7.4.2
isort==4.3.15
jedi==0.13.3
Jinja2==2.10
jsonschema==3.0.1
jupyter==1.0.0
jupyter-client==5.2.4
jupyter-console==6.0.0
jupyter-core==4.4.0
kiwisolver==1.0.1
MarkupSafe==1.1.1
matplotlib==3.0.3
mccabe==0.6.1
mistune==0.8.4
more-itertools==6.0.0
nbconvert==5.4.1
nbformat==4.4.0
nbsphinx==0.4.2
nbstripout==0.3.3
nbval==0.9.1
notebook==5.7.6
numpy==1.16.2
observations==0.1.4
opt-einsum==2.3.2
packaging==19.0
pandas==0.24.2
pandocfilters==1.4.2
parso==0.3.4
pexpect==4.6.0
pickleshare==0.7.5
Pillow==5.4.1
pluggy==0.9.0
prometheus-client==0.6.0
prompt-toolkit==2.0.9
ptyprocess==0.6.0
py==1.8.0
pycodestyle==2.5.0
pyflakes==2.1.1
Pygments==2.3.1
pypandoc==1.4
pyparsing==2.3.1
-e git+https://github.com/uber/pyro.git@bb542ade6c0e8131464d7ef035d6b0c1fd92485b#egg=pyro_ppl
pyrsistent==0.14.11
pytest==4.3.1
pytest-forked==1.0.2
pytest-xdist==1.26.1
python-dateutil==2.8.0
pytz==2018.9
pyzmq==18.0.1
qtconsole==4.4.3
requests==2.21.0
scipy==1.2.1
seaborn==0.9.0
Send2Trash==1.5.0
six==1.12.0
snowballstemmer==1.2.1
Sphinx==1.8.5
sphinx-rtd-theme==0.4.3
sphinxcontrib-websupport==1.1.0
terminado==0.8.1
testpath==0.4.2
torch==1.0.1.post2
torchfile==0.1.0
torchvision==0.2.1
tornado==6.0.1
tqdm==4.31.1
traitlets==4.3.2
urllib3==1.24.1
visdom==0.1.8.8
wcwidth==0.1.7
webencodings==0.5.1
websocket-client==0.55.0
wget==3.2
widgetsnbextension==3.4.2
yapf==0.26.0

Cf #14 , perhaps let's not continue to maintain ab_test as an example. It uses the BO toolbox and was always a little flakey because of the issues mentioned in pyro-ppl#1258

item_response.py ... produces different outputs across oed-master and pyro-syntax-fixes

I ran a git bisect and the output changes at the commit (835aa31) that bumped the pytorch version. However, it looks like both branches converge to the same value (see below), though perhaps someone more familiar with the semantics ought to confirm correctness was maintained.

Diff of the outputs of python examples/contrib/oed/item_response.py -N 10000 -M 10000:

9c9
< EIG tensor([ 0.9517,  1.0228,  1.1054])
---
> EIG tensor([0.9540, 1.0216, 1.1076])

(Requires #24.)

location.py seems to be failing on branch pyro-syntax-fixes because CensoredSigmoidNormal doesn't implement the expand method.

turk_experiment.py fails for the same reason.

@null-a , eyeballing those EIG values I do believe they are converging to the same value. Presumably some random number simulator changed as the pytorch version changed.

Re: implementing expand for CensoredSigmoidNormal: this should be a case of expanding the underlying distribution (a transformed Normal), as well as the upper_lim and lower_lim attributes. I can jump in and take a look if that simple fix doesn't work.

I do believe they are converging to the same value

Great.

this should be a case of expanding the underlying distribution

Yeah, I'll take a look.