ioanabica/Counterfactual-Recurrent-Network

Questions about how to stably reproduce the reported results of CRN(lambda=0) and CRN on the tumor growth simulation data

mengcz13 opened this issue · 0 comments

I am writing to inquire about your ICLR 2020 paper, "Estimating counterfactual treatment outcomes over time through adversarially balanced representations", which I found very interesting. I have been attempting to reproduce the results reported for CRN(lambda=0) and CRN in Figure 1 and Table 8, but have run into some difficulties. I was wondering if you could kindly provide some guidance or clarification on the methodology or data used in the paper.

With gamma=10, the paper reports that "CRN improves by 48.1% on the same model architecture without domain adversarial training CRN (λ = 0)" for one-step counterfactual prediction, which corresponds to 2.41% vs. 3.57% in Table 8.
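
(To spell out how I read the 48.1% figure, this is my own arithmetic, assuming the improvement is measured relative to the CRN error:)

```python
# My reading of the 48.1% figure: relative improvement measured against the CRN error.
error_crn, error_lambda0 = 2.41, 3.57    # one-step normalized RMSE from Table 8, in %
rel_improvement = (error_lambda0 - error_crn) / error_crn
print(f"{rel_improvement:.1%}")          # -> 48.1%
```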

I tried reproducing these numbers with my own fork here: https://github.com/mengcz13/Counterfactual-Recurrent-Network (the main changes are adding a config option to enable CRN(lambda=0) and using the best hyperparameters reported in Table 6). I used tensorflow-gpu==1.15.0 and python==3.6.15.
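
To be explicit about what I mean by "a config option to enable CRN(lambda=0)": my understanding is that the ablation simply keeps the adversarial (gradient-reversal) weight at zero for the whole of training. A rough sketch of the idea, where the names and the annealing schedule are my own illustrations rather than the exact code in this repository:

```python
import numpy as np

def adversarial_lambda(epoch, num_epochs, lambda_max=1.0, disable_balancing=False):
    """Weight on the treatment-classifier (domain-adversarial) loss.

    Illustrative only: CRN(lambda=0) corresponds to disable_balancing=True, so the
    adversarial gradient never reaches the balancing representation and the model
    reduces to the same architecture trained purely on outcome prediction.
    """
    if disable_balancing:              # CRN(lambda=0) ablation
        return 0.0
    p = epoch / float(num_epochs)      # training progress in [0, 1]
    return lambda_max * (2.0 / (1.0 + np.exp(-10.0 * p)) - 1.0)  # DANN-style ramp-up
```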

For both CRN(lambda=0) and CRN, I repeated the experiment 5 times with different random seeds (see the commands in https://github.com/mengcz13/Counterfactual-Recurrent-Network/blob/master/reproduce_script.sh) and computed the average (stdev) of the normalized RMSE for one-step prediction.
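
For clarity, this is roughly how the numbers in the table below are aggregated (`summarize` is just an illustrative helper of mine, not code from the repo):

```python
import numpy as np

def summarize(nrmse_percent_per_seed):
    """Mean and sample stdev of the per-seed normalized RMSE values (already in %)."""
    runs = np.asarray(nrmse_percent_per_seed, dtype=float)
    return runs.mean(), runs.std(ddof=1)
```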

The reported results and my reproduced results are listed below:

| Model | Reported Results in Table 8 | Reproduced Results [average (stdev)] |
| --- | --- | --- |
| CRN(lambda=0) | 3.57% | 3.93% (0.34%) |
| CRN | 2.41% | 4.22% (0.81%) |

My reproduced normalized RMSE for one-step counterfactual prediction with CRN is significantly higher than the reported number, and it does not demonstrate the benefit of balanced representations for treatment effect estimation. I would appreciate any hints or insights you could offer on this matter. Thank you very much.