stan-dev/math

Eigen assertion failure in `hcubature`

WardBrian opened this issue · 10 comments

Description

Reported on the forums: https://discourse.mc-stan.org/t/using-the-wiener-lpdf-function-in-the-cmdstanr-package-to-implement-a-diffusion-model-the-construction-of-a-markov-chain-was-unsuccessful/35213/3

Example

Running the above model leads to an assert with the following backtrace

Details

level1: stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/DenseCoeffsBase.h:410: Eigen::DenseCoeffsBase<Derived, 1>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::operator[](Eigen::Index) [with Derived = Eigen::Matrix<double, -1, 1>; Eigen::DenseCoeffsBase<Derived, 1>::Scalar = double; Eigen::Index = long int]: Assertion `index >= 0 && index < size()' failed.

Program received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737352369984) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737352369984) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140737352369984) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140737352369984, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7842476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff78287f3 in __GI_abort () at ./stdlib/abort.c:79
^[[B^[[A#5  0x00007ffff782871b in __assert_fail_base (fmt=0x7ffff79dd130 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x555555713be4 "index >= 0 && index < size()", file=0x555555713110 "stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/DenseCoeffsBase.h", line=410, function=<optimized out>) at ./assert/assert.c:92
#6  0x00007ffff7839e96 in __GI___assert_fail (assertion=0x555555713be4 "index >= 0 && index < size()", file=0x555555713110 "stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/DenseCoeffsBase.h", line=410, function=0x555555714348 "Eigen::DenseCoeffsBase<Derived, 1>::Scalar& Eigen::DenseCoeffsBase<Derived, 1>::operator[](Eigen::Index) [with Derived = Eigen::Matrix<double, -1, 1>; Eigen::DenseCoeffsBase<Derived, 1>::Scalar = doub"...) at ./assert/assert.c:101
#7  0x00005555555b6ee4 in Eigen::DenseCoeffsBase<Eigen::Matrix<double, -1, 1, 0, -1, 1>, 1>::operator[] (index=<optimized out>, this=0x55555584eed0) at stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/DenseCoeffsBase.h:410
#8  _ZN4stan4math9hcubatureIZZNS0_8internal17wiener7_integrateILNS2_12GradientCalcE0ELS4_0EZNS0_11wiener_lpdfILb0EdddddNS0_9var_valueIdvEES7_S7_EEDaRKT0_RKT1_RKT2_RKT3_RKT4_RKT5_RKT6_RKT7_RKdEUlDpOT_E11_RdJRSt5tupleIJdddddddddEEiRN5Eigen6MatrixIdLin1ELi1ELi0ELin1ELi1EEES19_RKiSX_dEEEDaSD_OSE_DpOT3_ENKUlS10_E_clIJS15_RiS19_S19_S1B_SX_S12_EEEDaS10_EUlOT_OS8_OSB_S1C_OSH_OSK_OSN_OSQ_OST_OT8_E_ddS14_ddEEDaRKS1J_SG_iRKNS17_IS8_Lin1ELi1ELi0ELin1ELi1EEERKNS17_ISB_Lin1ELi1ELi0ELin1ELi1EEEiSH_SK_ (integrand=..., pars=std::tuple containing = {...}, dim=dim@entry=1, a=..., b=..., max_eval=max_eval@entry=6000, reqAbsError=reqAbsError@entry=0, reqRelError=reqRelError@entry=4.5000000000000003e-05) at stan/lib/stan_math/stan/math/prim/functor/hcubature.hpp:460
#9  0x00005555555ba715 in _ZZN4stan4math8internal17wiener7_integrateILNS1_12GradientCalcE0ELS3_0EZNS0_11wiener_lpdfILb0EdddddNS0_9var_valueIdvEES6_S6_EEDaRKT0_RKT1_RKT2_RKT3_RKT4_RKT5_RKT6_RKT7_RKdEUlDpOT_E11_RdJRSt5tupleIJdddddddddEEiRN5Eigen6MatrixIdLin1ELi1ELi0ELin1ELi1EEES18_RKiSW_dEEEDaSC_OSD_DpOT3_ENKUlSZ_E_clIJS14_RiS18_S18_S1A_SW_S11_EEEDaSZ_ (__closure=<synthetic pointer>) at stan/lib/stan_math/stan/math/prim/prob/wiener_full_lpdf.hpp:141
#10 _ZN4stan4math8internal23estimate_with_err_checkILm0ELm8ELNS1_12GradientCalcE0ELb1ERKZNS1_17wiener7_integrateILS3_0ELS3_0EZNS0_11wiener_lpdfILb0EdddddNS0_9var_valueIdvEES7_S7_EEDaRKT0_RKT1_RKT2_RKT3_RKT4_RKT5_RKT6_RKT7_RKdEUlDpOT_E11_RdJRSt5tupleIJdddddddddEEiRN5Eigen6MatrixIdLin1ELi1ELi0ELin1ELi1EEES19_RKiSX_dEEEDaSD_OSE_DpOT3_EUlS10_E_S12_JS15_RiS19_S19_S1B_SX_S12_EEEDaOSH_OSK_DpOT5_ (err=@0x7fffffffb638: -15.431956022972571, functor=<synthetic pointer>...) at stan/lib/stan_math/stan/math/prim/prob/wiener5_lpdf.hpp:633
#11 _ZN4stan4math8internal17wiener7_integrateILNS1_12GradientCalcE0ELS3_0EZNS0_11wiener_lpdfILb0EdddddNS0_9var_valueIdvEES6_S6_EEDaRKT0_RKT1_RKT2_RKT3_RKT4_RKT5_RKT6_RKT7_RKdEUlDpOT_E11_RdJRSt5tupleIJdddddddddEEiRN5Eigen6MatrixIdLin1ELi1ELi0ELin1ELi1EEES18_RKiSW_dEEEDaSC_OSD_DpOT3_ (hcubature_err=@0x7fffffffb638: -15.431956022972571, wiener7_functor=...) at stan/lib/stan_math/stan/math/prim/prob/wiener_full_lpdf.hpp:162
#12 stan::math::wiener_lpdf<false, double, double, double, double, double, stan::math::var_value<double, void>, stan::math::var_value<double, void>, stan::math::var_value<double, void> > (y=<optimized out>, a=@0x5555558281b0: 1, t0=@0x5555558281d8: 0.29999999999999999, w=@0x5555558281b8: 0.45000000000000001, v=@0x555555831550: 3.5, sv=..., sw=..., st0=..., precision_derivatives=@0x7fffffffba70: 0.0001) at stan/lib/stan_math/stan/math/prim/prob/wiener_full_lpdf.hpp:600
#13 0x00005555555cfe5b in level1_model_namespace::level1_model::log_prob_impl<true, true, Eigen::Matrix<stan::math::var_value<double, void>, -1, 1, 0, -1, 1>, Eigen::Matrix<int, -1, 1, 0, -1, 1>, (void*)0, (void*)0, (void*)0> (this=0x555555828120, params_r__=..., params_i__=..., pstream__=<optimized out>) at level1.hpp:413
#14 0x00005555555d06d3 in level1_model_namespace::level1_model::log_prob<true, true, stan::math::var_value<double, void> > (pstream=<optimized out>, params_r=..., this=<optimized out>) at level1.hpp:678
#15 stan::model::model_base_crtp<level1_model_namespace::level1_model>::log_prob_propto_jacobian (this=<optimized out>, theta=..., msgs=<optimized out>) at stan/src/stan/model/model_base_crtp.hpp:132
#16 0x000055555566d792 in stan::model::model_base::log_prob<true, true, stan::math::var_value<double, void> > (msgs=<optimized out>, params_r=..., this=<optimized out>) at stan/src/stan/model/model_base.hpp:329
#17 stan::model::model_functional<stan::model::model_base>::operator()<stan::math::var_value<double, void> > (x=..., this=0x7fffffffbc50) at stan/src/stan/model/model_functional.hpp:21
#18 stan::math::gradient<stan::model::model_functional<stan::model::model_base> > (f=..., x=..., fx=@0x7fffffffcb40: -882.67991549259148, grad_fx=...) at stan/lib/stan_math/stan/math/rev/functor/gradient.hpp:51
#19 0x000055555566d99a in stan::model::gradient<stan::model::model_base> (model=..., x=..., f=@0x7fffffffcb40: -882.67991549259148, grad_f=..., logger=...) at stan/src/stan/model/gradient.hpp:27
#20 0x000055555566e012 in stan::mcmc::base_hamiltonian<stan::model::model_base, stan::mcmc::diag_e_point, boost::random::mixmax_engine<17, 36u, 0l> >::update_potential_gradient (this=0x7fffffffcb60, z=..., logger=...) at stan/src/stan/mcmc/hmc/hamiltonians/base_hamiltonian.hpp:63
#21 0x000055555566e381 in stan::mcmc::expl_leapfrog<stan::mcmc::diag_e_metric<stan::model::model_base, boost::random::mixmax_engine<17, 36u, 0l> > >::update_q (this=<optimized out>, z=..., hamiltonian=warning: RTTI symbol not found for class 'stan::mcmc::diag_e_metric<stan::model::model_base, boost::random::mixmax_engine<17, 36u, 0l> >'
..., epsilon=-0.018627259759257088, logger=...) at stan/src/stan/mcmc/hmc/integrators/expl_leapfrog.hpp:25
#22 0x00005555555e4c89 in stan::mcmc::base_leapfrog<stan::mcmc::diag_e_metric<stan::model::model_base, boost::random::mixmax_engine<17, 36u, 0l> > >::evolve (this=0x7fffffffcb58, z=..., hamiltonian=warning: RTTI symbol not found for class 'stan::mcmc::diag_e_metric<stan::model::model_base, boost::random::mixmax_engine<17, 36u, 0l> >'
..., epsilon=-0.018627259759257088, logger=...) at stan/src/stan/mcmc/hmc/integrators/base_leapfrog.hpp:20
#23 0x000055555570fc55 in stan::mcmc::base_nuts<stan::model::model_base, stan::mcmc::diag_e_metric, stan::mcmc::expl_leapfrog, boost::random::mixmax_engine<17, 36u, 0l> >::build_tree (this=this@entry=0x7fffffffcb00, depth=0, z_propose=..., p_sharp_beg=..., p_sharp_end=..., rho=..., p_beg=..., p_end=..., H0=H0@entry=-881.53588851820155, sign=sign@entry=-1, n_leapfrog=@0x7fffffffc1f4: 0, log_sum_weight=@0x7fffffffc208: -inf, sum_metro_prob=@0x7fffffffc200: 0, logger=...) at stan/src/stan/mcmc/hmc/nuts/base_nuts.hpp:254
#24 0x0000555555710bb4 in stan::mcmc::base_nuts<stan::model::model_base, stan::mcmc::diag_e_metric, stan::mcmc::expl_leapfrog, boost::random::mixmax_engine<17, 36u, 0l> >::transition (this=this@entry=0x7fffffffcb00, init_sample=..., logger=...) at stan/src/stan/mcmc/hmc/nuts/base_nuts.hpp:150
#25 0x0000555555711740 in stan::mcmc::adapt_diag_e_nuts<stan::model::model_base, boost::random::mixmax_engine<17, 36u, 0l> >::transition (this=0x7fffffffcb00, init_sample=..., logger=...) at stan/src/stan/mcmc/hmc/nuts/adapt_diag_e_nuts.hpp:26
#26 0x00005555556605fb in stan::services::util::generate_transitions<stan::model::model_base, boost::random::mixmax_engine<17, 36u, 0l> > (sampler=warning: RTTI symbol not found for class 'stan::mcmc::adapt_diag_e_nuts<stan::model::model_base, boost::random::mixmax_engine<17, 36u, 0l> >'
..., num_iterations=num_iterations@entry=250, start=start@entry=0, finish=finish@entry=500, num_thin=num_thin@entry=1, refresh=refresh@entry=1, save=true, warmup=true, mcmc_writer=..., init_s=..., model=..., base_rng=..., callback=..., logger=..., chain_id=1, num_chains=1) at stan/src/stan/services/util/generate_transitions.hpp:70
#27 0x00005555556795ab in stan::services::util::run_adaptive_sampler<stan::mcmc::adapt_diag_e_nuts<stan::model::model_base, boost::random::mixmax_engine<17, 36u, 0l> >, stan::model::model_base, boost::random::mixmax_engine<17, 36u, 0l> > (sampler=warning: RTTI symbol not found for class 'stan::mcmc::adapt_diag_e_nuts<stan::model::model_base, boost::random::mixmax_engine<17, 36u, 0l> >'
..., model=..., cont_vector=std::vector of length 3, capacity 3 = {...}, num_warmup=num_warmup@entry=250, num_samples=num_samples@entry=250, num_thin=num_thin@entry=1, refresh=1, save_warmup=true, rng=..., interrupt=..., logger=..., sample_writer=..., diagnostic_writer=..., metric_writer=..., chain_id=1, num_chains=1) at stan/src/stan/services/util/run_adaptive_sampler.hpp:78
#28 0x0000555555679c65 in stan::services::sample::hmc_nuts_diag_e_adapt<stan::model::model_base> (model=..., init=..., init_inv_metric=..., random_seed=random_seed@entry=2741745723, chain=chain@entry=1, init_radius=init_radius@entry=2, num_warmup=num_warmup@entry=250, num_samples=250, num_thin=1, save_warmup=true, refresh=1, stepsize=stepsize@entry=1, stepsize_jitter=stepsize_jitter@entry=0, max_depth=10, delta=delta@entry=0.80000000000000004, gamma=gamma@entry=0.050000000000000003, kappa=kappa@entry=0.75, t0=t0@entry=10, init_buffer=75, term_buffer=50, window=25, interrupt=..., logger=..., init_writer=..., sample_writer=..., diagnostic_writer=..., metric_writer=...) at stan/src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp:103
#29 0x000055555567c1ed in stan::services::sample::hmc_nuts_diag_e_adapt<stan::model::model_base, std::shared_ptr<stan::io::var_context>, stan::callbacks::writer, stan::callbacks::unique_stream_writer<std::basic_ofstream<char, std::char_traits<char> >, std::default_delete<std::basic_ofstream<char, std::char_traits<char> > > >, stan::callbacks::unique_stream_writer<std::basic_ofstream<char, std::char_traits<char> >, std::default_delete<std::basic_ofstream<char, std::char_traits<char> > > >, stan::callbacks::json_writer<std::basic_ofstream<char, std::char_traits<char> >, std::default_delete<std::basic_ofstream<char, std::char_traits<char> > > > > (model=..., num_chains=num_chains@entry=1, init=std::vector of length 1, capacity 1 = {...}, random_seed=random_seed@entry=2741745723, init_chain_id=init_chain_id@entry=1, init_radius=init_radius@entry=2, num_warmup=num_warmup@entry=250, num_samples=250, num_thin=1, save_warmup=true, refresh=1, stepsize=stepsize@entry=1, stepsize_jitter=stepsize_jitter@entry=0, max_depth=10, delta=delta@entry=0.80000000000000004, gamma=gamma@entry=0.050000000000000003, kappa=kappa@entry=0.75, t0=t0@entry=10, init_buffer=75, term_buffer=50, window=25, interrupt=..., logger=..., init_writer=std::vector of length 1, capacity 1 = {...}, sample_writer=std::vector of length 1, capacity 1 = {...}, diagnostic_writer=std::vector of length 1, capacity 1 = {...}, metric_writer=std::vector of length 1, capacity 1 = {...}) at stan/src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp:560
#30 0x00005555555e20ce in cmdstan::command (argc=<optimized out>, argv=<optimized out>) at src/cmdstan/command.hpp:684
#31 0x00005555555e4239 in main (argc=<optimized out>, argv=<optimized out>) at src/cmdstan/main.cpp:6

It identifies the failure as coming from this line:

auto w = (box.b_[box.kdiv_] - box.a_[box.kdiv_]) / 2;

Expected Output

Current Version:

v4.9.0

The matrices in box.b_ and box.a_ have data that appears to point to nullptr

Here is the reproducible example in a slightly easier to use format, independent of the R code:
https://gist.github.com/WardBrian/c6d37114248be76948552fa2ed9c7770

This will fail shortly after iteration 5 of warmup.

So, what does it mean? Does the iterator kdiv_ run out of range? Or does it mean something else? This error did never appear in all the tests and with simulated data.

@Franzi2114 yes, it means that line is trying to index outside the matrix. In particular, it looked like the matrix that was failing had size 0 and no data

When doing some local testing, I can reliably reproduce when the loop:

for (i in 1:N) {

is only looping from 50 -> 52:

for (i in 50:52) {

Hopefully that helps simplify/speedup debugging somewhat

@andrjohns thanks, that does speed it up considerably.

If it matters, the specific line that is failing is the "left stimulus, right response" line
target += wiener_lpdf(rt[i] | a, t0, 1 - w, -v[cnd[i]], sv, sw, st0);

Logged the values until it failed, here's a c++ mre:

#include <stan/math.hpp>
#include <iostream>

int main() {
  using stan::math::internal::GradientCalc;
  using stan::math::internal::wiener7_integrate;
  using stan::math::internal::wiener5_density;

  auto params_st = std::make_tuple(0.553273, 1, -3.5, 0.55, 0.553161,
                                    1.06423, 0.0941929, 0.0, -28.3242);
  Eigen::VectorXd xmin = Eigen::VectorXd::Zero(2);
  Eigen::VectorXd xmax = Eigen::VectorXd::Ones(2);

  auto f = wiener7_integrate<GradientCalc::OFF, GradientCalc::OFF>(
              [](auto&&... args) { return wiener5_density<GradientCalc::ON>(args...); },
              -18.3029, params_st, 1, xmin, xmax, 6000, 0, 4.5e-05);
}

I think it is because of the use of std::move on box.a_ and box.b_ on lines 482 and 483. My understanding is this leaves behind null data in the current box, which if it ever gets re-selected at the top of the loop leads to this error.

@Franzi2114 does that sound like the kind of thing that would happen? I'm not sure I understand the criteria by which the current box is selected enough to know if it can happen more than once

These moves were introduced in a68fafb

Changing 482 and 483 to

    box_t box1(std::move(ma), box.b_, result_1, kdivide_1);
    box_t box2(box.a_, std::move(mb), result_2, kdivide_2);

seems to resolve the issue - at least for that MRE and the 50:52 simplification of the original model. Running the original model will be slow but probably worth doing at this point

Thanks! Yes, this sounds reasonable.