floswald/SMM.jl

Example not producing correct results

Closed this issue · 16 comments

@floswald I'm running the example Jupyter notebook, but it's taking almost 80 minutes and I'm getting the following as output:

SMM.summary(MA) = 3×5 DataFrame
│ Row │ id    │ acc_rate │ perc_exchanged │ exchanged_most_with │ best_val │
│     │ Int64 │ Float64  │ Float64        │ Int64               │ Float64  │
├─────┼───────┼──────────┼────────────────┼─────────────────────┼──────────┤
│ 1   │ 1     │ 0.001    │ 0.0            │ 0                   │ -1.0     │
│ 2   │ 2     │ 0.001    │ 0.0            │ 0                   │ -1.0     │
│ 3   │ 3     │ 0.001    │ 0.0            │ 0                   │ -1.0     │
Quasi posterior mean for p1 = 0.2
Quasi posterior median for p1 = 0.2
True value = -1.0
abs(True value - Quasi posterior median)  = 1.2

Quasi posterior mean for p2 = -0.2
Quasi posterior median for p2 = -0.2
True value = 1.0
abs(True value - Quasi posterior median)  = 1.2

Do you happen to know what might be going on? I'm using Julia 1.5.0. I would like to use your package to teach my PhD students (and myself!) how to do SMM. If you know of other "simple" examples (e.g. binary logit estimation via SMM or linear regression via SMM) I would love to use those as well!

The exact code I ran is below. The main thing I had to change was to use Random.seed!() rather than srand(), which no longer works.

# add three workers
addprocs(3)

@everywhere using SMM
using DataStructures
using DataFrames
using Plots
using Random

Random.seed!(1234);

# Define true values of parameters
#---------------------------------
trueValues = OrderedDict("mu1" => [-1.0], "mu2" => [1.0])
#------------------------------------------------
# Options
#-------------------------------------------------
# Boolean: do you want to save the plots to disk?
savePlots = false

#------------------------
# initialize the problem:
#------------------------

# Specify the initial values for the parameters, and their support:
pb = OrderedDict("p1" => [0.2,-3,3] , "p2" => [-0.2,-2,2])

# Specify moments to be matched + subjective weights:
moms = DataFrame(name=["mu1","mu2"],value=[trueValues["mu1"][], trueValues["mu2"][]], weight=ones(2))

# GMM objective function to be minimized.
# It returns a weighted distance between empirical and simulated moments.
@everywhere function objfunc_normal(ev::Eval; verbose = false)

    start(ev)


    # when running in parallel, display worker's id:
    #-----------------------------------------------
    if verbose == true
        if nprocs() > 1
          println(myid())
        end
    end

    # extract parameters from ev:
    #----------------------------
    mu  = collect(values(ev.params))

    # compute simulated moments
    #--------------------------
    # Monte-Carlo:
    #-------------
    ns = 100000 #number of i.i.d draws from N([mu], sigma)
    # initialize a multivariate normal N(mu, Sigma)
    # Sigma is the identity matrix; sigma below holds its diagonal, wrapped by PDiagMat
    sigma = [1.0; 1.0]
    # draw ns observations from N([mu], sigma):
    randMultiNormal = SMM.MvNormal(mu,SMM.PDiagMat(sigma))
    # calculate the mean of the simulated data
    simM            = mean(rand(randMultiNormal,ns),2)
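    # NOTE: mean(A, 2) is pre-1.0 Julia syntax; on Julia 1.x the reduction dimension
    # is a keyword, i.e. mean(rand(randMultiNormal, ns), dims = 2), and the positional
    # form throws a MethodError, so every evaluation would fail here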
    # store simulated moments in a dictionary
    simMoments = Dict(:mu1 => simM[1], :mu2 => simM[2])

    # Calculate the weighted distance between empirical moments
    # and simulated ones:
    #-----------------------------------------------------------
    v = Dict{Symbol,Float64}()
    for (k, mom) in dataMomentd(ev)
        # If weight for moment k exists:
        #-------------------------------
        if haskey(SMM.dataMomentWd(ev), k)
            # divide by weight associated to moment k:
            #----------------------------------------
            v[k] = ((simMoments[k] .- mom) ./ SMM.dataMomentW(ev,k)) .^2
        else
            v[k] = ((simMoments[k] .- mom) ) .^2
        end
    end

    # Set value of the objective function:
    #------------------------------------
    setValue(ev, mean(collect(values(v))))

    # also return the moments
    #-----------------------
    setMoment(ev, simMoments)

    # flag for success:
    #-------------------
    ev.status = 1

    # finish and return
    finish(ev)

    return ev
end



# Initialize an empty MProb() object:
#------------------------------------
mprob = MProb()

# Add structural parameters to MProb():
# specify starting values and support
#--------------------------------------
addSampledParam!(mprob,pb)

# Add moments to be matched to MProb():
#--------------------------------------
addMoment!(mprob,moms)

# Attach an objective function to MProb():
#----------------------------------------
addEvalFunc!(mprob, objfunc_normal)
# estimation options:
#--------------------
# number of iterations for each chain
niter = 1000
# number of chains
nchains = 3

opts = Dict("N"=>nchains,
        "maxiter"=>niter,
        "maxtemp"=> 5,
        "coverage"=>0.025,
        "sigma_update_steps"=>10,
        "sigma_adjust_by"=>0.01,
        "smpl_iters"=>1000,
        "parallel"=>true,
        "maxdists"=>[0.05 for i in 1:nchains],
        "mixprob"=>0.3,
        "acc_tuner"=>12.0,
        "animate"=>false)


#---------------------------------------
# Let's set up and run the optimization
#---------------------------------------
# set up the BGP algorithm:
MA = MAlgoBGP(mprob,opts)

# run the estimation:
@time SMM.run!(MA);

# show a summary of the optimization:
@show SMM.summary(MA)
# # Plot histograms for the first chain, the one with which inference should be done.
# # Other chains are used to explore the space and avoid local minima
# #-------------------------------------------------------------------------------
# p1 = histogram(MA.chains[1])
# display(p1)
# 
# if savePlots
#     savefig(p1, joinpath(pwd(),"histogram.svg"))
# end
# 
# # Plot the realization of the first chain
# #----------------------------------------
# p2 = plot(MA.chains[1])
# if savePlots
#     savefig(p2, joinpath(pwd(),"lines.svg"))
# end
# display(p2)


# Realization of the first chain:
#-------------------------------
dat_chain1 = SMM.history(MA.chains[1])

# discard the first 10th of the observations ("burn-in" phase):
#--------------------------------------------------------------
dat_chain1 = dat_chain1[round(Int, niter/10):niter, :]

# keep only accepted draws:
#--------------------------
dat_chain1 = dat_chain1[dat_chain1[:accepted] .== true, :]

# list of the true parameter names (the moments to be matched):
parameters = [Symbol("mu$(i)") for i=1:2]
# list of the corresponding sampled (estimated) parameters:
#----------------------------------------------------------
estimatedParameters = [Symbol("p$(i)") for i=1:2]


# Quasi Posterior mean and quasi posterior median for each parameter:
#-------------------------------------------------------------------
for (estimatedParameter, param) in zip(estimatedParameters, parameters)

  println("Quasi posterior mean for $(String(estimatedParameter)) = $(mean(dat_chain1[estimatedParameter]))")
  println("Quasi posterior median for $(String(estimatedParameter)) = $(median(dat_chain1[estimatedParameter]))")
  println("True value = $(trueValues[String(param)][])")
  println("abs(True value - Quasi posterior median)  = $(abs(median(dat_chain1[estimatedParameter]) - trueValues[String(param)][])) \n")

end

yikes, 80 minutes! that doesn't look good. I'll try to take a look!

I should clarify that it would be ~80 minutes on a single core, but I used 3, so it was more like 25 minutes. I'm more concerned about the answer being wrong than about the computational time, so I'd be much obliged for any guidance you could provide!

So, no idea why that does not work. I did not write that notebook example myself, so I can't really tell you; I'll remove those. I confirmed that on this branch https://github.com/floswald/SMM.jl/tree/fix all tests pass. In particular, I would recommend sticking to the examples in src/Examples.jl, because those are easier to test:

julia> SMM.parallelNormal()
[ Info: Starting estimation loop.
Progress: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:02:02
┌ Warning: could not find 'filename' in algo.opts
└ @ Main.SMM ~/.julia/dev/SMM/src/mopt/AlgoAbstract.jl:67
[ Info: Done with estimation after 2.0 minutes
summary(MA) = 3×5 DataFrame
│ Row │ id    │ acc_rate │ perc_exchanged │ exchanged_most_with │ best_val    │
│     │ Int64 │ Float64  │ Float64        │ Int64               │ Float64     │
├─────┼───────┼──────────┼────────────────┼─────────────────────┼─────────────┤
│ 1   │ 1     │ 0.5245   │ 98.5           │ 2                   │ 0.000719613 │
│ 2   │ 2     │ 0.204678 │ 15.0           │ 1                   │ 0.00132377  │
│ 3   │ 3     │ 0.143646 │ 10.0           │ 2                   │ 0.00677593  │

BGP Algorithm with 3 BGPChains
============================

Algorithm
---------
Current iteration: 200
Number of params to estimate: 2
Number of moments to match: 2



julia> ma = ans

BGP Algorithm with 3 BGPChains
============================

Algorithm
---------
Current iteration: 200
Number of params to estimate: 2
Number of moments to match: 2
julia> using Statistics

julia> median(ma.chains[1])
Dict{Symbol,Float64} with 2 entries:
  :p2 => 1.00019
  :p1 => -0.981635

So that example works. You can also look at test/test_algoBGP.jl for a parallel implementation of that example.

speed

I noticed this runs slowly as well. One simple thing to do might be to decrease the number of Monte Carlo draws; 100_000 seems like a lot.
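For instance, in objfunc_normal above, something like this (a sketch of the suggested change; the exact count is illustrative):

ns = 10_000 # instead of 100_000: fewer draws per evaluation, at the cost of noisier simulated moments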

OK, great! Thanks so much for the quick response. For some reason I had thought you had written the Jupyter notebook (and that it was the same contents as the .jl files you referenced). I'll make sure to consult the examples that you mentioned!

You can close this, unless you would like to keep it open for your own purposes.

Just wanted to update this to say that I've definitely got the examples working, so thanks for that.

Hi all, do you mind sharing how you fixed the issue in the example? I tried the above example and got the same results. Also, I got the following warning in each evaluation:

┌ Warning: caught exception MethodError([a long vector of numbers which might be random draws]
, (2,), 0x0000000000006e0b) at param 
OrderedCollections.OrderedDict(:p1 => 0.15909334696624722,:p2 => -0.19460764398201102)
└ @ SMM ~/.julia/packages/SMM/MDWS3/src/mopt/mprob.jl:167

It comes from the function evaluateObjective:

function evaluateObjective(m::MProb,p::Union{Dict,OrderedDict};noseed=false)

It seems that there is an error when doing ev = m.objfunc(ev).
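Judging from the warning format, that call seems to be wrapped in a try/catch, roughly along these lines (a paraphrase of the implied logic, not the actual source; the Eval construction and the failure flag are guesses):

function evaluateObjective(m::MProb,p::Union{Dict,OrderedDict};noseed=false)
    ev = Eval(m, p)          # guess: build an evaluation object from the candidate parameters
    try
        ev = m.objfunc(ev)   # run the user-supplied objective
    catch err
        # produces the "caught exception ... at param ..." warning quoted above
        @warn "caught exception $err at param $(p)"
        ev.status = -1       # guess: flag the evaluation as failed
    end
    return ev
end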

Thank you very much.

Hi @zxjroger, is this happening in the tests folder, or with the code that I originally pointed to in this issue (in the notebook)?

Hi @tyleransom,
I have tried one example at
https://github.com/floswald/SMM.jl/blob/master/src/example.ipynb
and another example at
https://github.com/floswald/SMM.jl/blob/fix/src/example.ipynb
Neither of them works, and I don't see any big differences between the two. Here is my code, based on the second example.

using Distributed
addprocs(Sys.CPU_THREADS-1)
@everywhere using SMM
using DataStructures
using DataFrames
using Plots
using Random

Random.seed!(1234)

trueValues = OrderedDict("mu1" => [-1.0], "mu2" => [1.0])
savePlots = false

pb = OrderedDict("p1" => [0.2,-3,3] , "p2" => [-0.2,-2,2])
moms = DataFrame(name=["mu1","mu2"],value=[trueValues["mu1"][], trueValues["mu2"][]], weight=ones(2))

@everywhere function objfunc_normal(ev::Eval; verbose = false)

    start(ev)

    # when running in parallel, display worker's id:
    #-----------------------------------------------
    if verbose == true
        if nprocs() > 1
          println(myid())
        end
    end

    # extract parameters from ev:
    #----------------------------
    mu  = collect(values(ev.params))

    # compute simulated moments
    #--------------------------
    # Monte-Carlo:
    #-------------
    ns = 1000 #number of i.i.d draws from N([mu], sigma)
    # initialize a multivariate normal N(mu, Sigma)
    # Sigma is the identity matrix; sigma below holds its diagonal, wrapped by PDiagMat
    sigma = [1.0; 1.0]
    # draw ns observations from N([mu], sigma):
    randMultiNormal = SMM.MvNormal(mu,SMM.PDiagMat(sigma))
    # calculate the mean of the simulated data
    simM            = mean(rand(randMultiNormal,ns),2)
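    # NOTE: as in the original example, mean(A, 2) would need to be
    # mean(A, dims = 2) on Julia 1.x; the MethodError warning quoted above,
    # with (2,) as the failing argument, points at exactly this call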
    # store simulated moments in a dictionary
    simMoments = Dict(:mu1 => simM[1], :mu2 => simM[2])

    # Calculate the weighted distance between empirical moments
    # and simulated ones:
    #-----------------------------------------------------------
    v = Dict{Symbol,Float64}()
    for (k, mom) in dataMomentd(ev)
        # If weight for moment k exists:
        #-------------------------------
        if haskey(SMM.dataMomentWd(ev), k)
            # divide by weight associated to moment k:
            #----------------------------------------
            v[k] = ((simMoments[k] .- mom) ./ SMM.dataMomentW(ev,k)) .^2
        else
            v[k] = ((simMoments[k] .- mom) ) .^2
        end
    end

    # Set value of the objective function:
    #------------------------------------
    setValue(ev, mean(collect(values(v))))

    # also return the moments
    #-----------------------
    setMoment(ev, simMoments)

    # flag for success:
    #-------------------
    ev.status = 1

    # finish and return
    finish(ev)

    return ev
end

mprob = MProb()
addSampledParam!(mprob,pb)
addMoment!(mprob,moms)
addEvalFunc!(mprob, objfunc_normal)

niter = 100
nchains = 3

opts = Dict("N"=>nchains,
        "maxiter"=>niter,
        "maxtemp"=> 5,
        "coverage"=>0.025,
        "sigma_update_steps"=>10,
        "sigma_adjust_by"=>0.01,
        "smpl_iters"=>1000,
        "parallel"=>true,
        "maxdists"=>[0.05 for i in 1:nchains],
        "mixprob"=>0.3,
        "acc_tuner"=>12.0,
        "animate"=>false)

MA = MAlgoBGP(mprob,opts)
@time SMM.run!(MA)
@show SMM.summary(MA)

It gives me the aforementioned warning in each evaluation and the final result is

 61.954719 seconds (55.40 M allocations: 2.734 GiB, 2.96% gc time)
SMM.summary(MA) = 3×5 DataFrame
│ Row │ id    │ acc_rate │ perc_exchanged │ exchanged_most_with │ best_val │
│     │ Int64 │ Float64  │ Float64        │ Int64               │ Float64  │
├─────┼───────┼──────────┼────────────────┼─────────────────────┼──────────┤
│ 1   │ 1     │ 0.01     │ 0.0            │ 0                   │ -1.0     │
│ 2   │ 2     │ 0.01     │ 0.0            │ 0                   │ -1.0     │
│ 3   │ 3     │ 0.01     │ 0.0            │ 0                   │ -1.0     │

Haven't worked on this in a while (you can tell).

Does the serial example work at least? Just remove the addprocs call to check.

Hi @zxjroger and @tyleransom, this was automatically closed - reopening now. I just merged a new version into the master branch; can you check out the master branch and try again?

@tyleransom: the package was so slow because I was calling GC.gc() after each evaluation. That used to be a good idea to deal with memory usage getting too large (v0.6?) but is terrible now. Gone.

@zxjroger: the examples notebook is gone. It's out of date and I don't know what's going on with this example. Please run the examples in src/Examples.jl instead; all the unit tests run those examples.

Both: expect the unit tests to fail on GitHub Actions, because I'm testing whether parallel also works, and I am not sure how that is going to turn out on their machines. I'll fix it (turn off that test) when I next have time.

Docs are slightly out of date at this point, but not too much. Sorry.

(you get the master branch by doing

]
add SMM#master

)

Hi @zxjroger and @floswald, here is the code that I typically use to test the package. (I ran it on a single processor, so I believe that the parallelNormal() is actually done serially.)

Everything ran fine both before and after the switch to master.

using SMM, DataFrames
MA = SMM.parallelNormal() # Note: this line may take up to 5 minutes to execute
dc = SMM.history(MA.chains[1])
dc = dc[dc[:accepted].==true, :]
println(describe(dc))

Before updating the master branch, this gave

9×8 DataFrame
│ Row │ variable  │ mean      │ min         │ median      │ max     │ nunique │ nmissing │ eltype   │
│     │ Symbol    │ Float64   │ Real        │ Float64     │ Real    │ Nothing │ Nothing  │ DataType │
├─────┼───────────┼───────────┼─────────────┼─────────────┼─────────┼─────────┼──────────┼──────────┤
│ 1   │ iter      │ 100.5     │ 1           │ 100.5       │ 200     │         │          │ Int64    │
│ 2   │ value     │ 0.108517  │ 0.000495493 │ 0.0514623   │ 1.41041 │         │          │ Float64  │
│ 3   │ accepted  │ 0.595     │ 0           │ 1.0         │ 1       │         │          │ Bool     │
│ 4   │ curr_val  │ 0.0473051 │ 0.000495493 │ 0.0298602   │ 1.41041 │         │          │ Float64  │
│ 5   │ best_val  │ 0.0162544 │ 0.000495493 │ 0.000495493 │ 1.41041 │         │          │ Float64  │
│ 6   │ prob      │ 0.650695  │ 0.000360667 │ 0.784369    │ 1.0     │         │          │ Float64  │
│ 7   │ exchanged │ 0.205     │ 0           │ 0.0         │ 3       │         │          │ Int64    │
│ 8   │ p1        │ -0.938373 │ -1.7664     │ -0.968398   │ 0.2     │         │          │ Float64  │
│ 9   │ p2        │ 0.964953  │ -0.206271   │ 0.98483     │ 1.67074 │         │          │ Float64  │

and after updating the branch (and a bunch of other packages), I had to modify the dataframe subset code to

using SMM, DataFrames
MA = SMM.parallelNormal() # Note: this line may take up to 5 minutes to execute
dc = SMM.history(MA.chains[1])
dc = dc[dc[!,:accepted].==true, :]
println(describe(dc))

and it gave

9×7 DataFrame
 Row │ variable   mean        min          median        max      nmissing  eltype
     │ Symbol     Float64     Real         Float64       Real     Int64     DataType
─────┼───────────────────────────────────────────────────────────────────────────────
   1 │ iter       90.8051               1  84.5              199         0  Int64
   2 │ value       0.0591056  0.000352792   0.0331604    1.41041         0  Float64
   3 │ accepted    1.0               true   1.0             true         0  Bool
   4 │ curr_val    0.0591056  0.000352792   0.0331604    1.41041         0  Float64
   5 │ best_val    0.0243325  0.000352792   0.000434421  1.41041         0  Float64
   6 │ prob        0.897093      0.184345   1.0              1.0         0  Float64
   7 │ exchanged   0.372881             0   0.0                2         0  Int64
   8 │ p1         -0.96461       -1.48955  -0.962146         0.2         0  Float64
   9 │ p2          0.9827            -0.2   1.00905      1.55239         0  Float64

As I explain in the readme, you shouldn't use that example any more; there is too much overhead. I provide a slow keyword argument to that test function now that makes it ... slow. Try to run that example instead!
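Presumably the usage is along these lines (the keyword placement is inferred from the description above, not shown in this thread):

MA = SMM.parallelNormal(slow = true) # hypothetical call: reproduces the old, high-overhead behaviour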

Also, the parallel option to the algorithm has no effect now, because I just left a pmap in there. If you have worker processes attached, it runs in parallel; otherwise it doesn't.
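Concretely (a minimal sketch, mirroring the setup used earlier in this thread):

using Distributed
addprocs(3)               # attach workers; the internal pmap then fans out over them
@everywhere using SMM     # workers need the package loaded too
MA = SMM.parallelNormal() # runs in parallel iff workers are attached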

Thank you so much for the update. I am able to run the example now.

I ran the following code and it worked fine.

using SMM, DataFrames
MA = SMM.serialNormal(2,2_000)

However, I can't seem to figure out how to return the estimates of p1 and p2. If I type MA[1] I simply get

BGP Algorithm with 3 BGPChains
============================

Algorithm
---------
Current iteration: 2000
Number of params to estimate: 2
Number of moments to match: 2

I didn't see in the example how to return the parameter estimates. Nevertheless, the code runs fine and you can probably close this issue now.

That example function returns a tuple where entries 2 and 3 are plots. You get the parameters from the first chain with

params(MA[1].chains[1])

in this case.
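So a full round trip would look like this (destructuring the returned tuple; the call and the params accessor are as quoted above):

MA, plt1, plt2 = SMM.serialNormal(2, 2_000) # entry 1: the algorithm; entries 2 and 3: plots
params(MA.chains[1])                        # parameter estimates from the first chain, the one used for inference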