ConFooBio/gmse

gmse_apply: Error in eval(placing_vals[[i]]) : object 'sim_paras' not found

jejoenje opened this issue · 11 comments

I have come across this before, but thought it was me doing something odd... I am now thinking it might be a wider issue...?

Basically, when looping gmse_apply() through a number of (nested) iterations, everything works as expected with "unwrapped" loops. But when I wrap the same loops and code in a convenience function (to allow easy re-runs of simulation sets with different parameters etc.), gmse_apply() fails with the error:

Error in eval(placing_vals[[i]]) : object 'sim_paras' not found

where sim_paras is a named list with parameter values.

Might this be related to Issue #53?

Code (with comments) to reproduce the error below:

library(GMSE)

# Save a list of parameter values for GMSE runs; I want this for convenience so I can store/use common paras
# safely. Note I just arbitrarily choose two here (rest kept as default).
gmse_paras = list("observe_type" = 2, "get_res" = "Full")

# Set number of replicates (sims) and time steps (years).
sims = 5
years = 5

# Create an empty list to store output in.
res = list()

# Loop through sims and years:

for(sim in 1:sims) {
  
  # For a given sim run, create an empty list with NA values, equal to each year.
  res_year = as.list(rep(NA, years))
  
  # Set starting point (sim_old) for a given simulation run.
  # Note this extracts parameter values from the list stored above.
  sim_old = gmse_apply(get_res = gmse_paras$get_res, 
                       observe_type = gmse_paras$observe_type)
  
  # Now step through the years (time steps) within year:
  for(year in 1:years) {
    
    # Single GMSE run using the previous sim_old list as starting point.
    # Note I want full output, hence specifying "get_res" explicitly.    
    sim_new = gmse_apply(get_res="Full", old_list = sim_old)
  
    # Save this time steps' results as a list element
    res_year[[year]] = sim_new
    # Reset "sim_old" for the next time step
    sim_old <- sim_new
    # Print some output to monitor progress
    print(sprintf("Sim %d, year %d", sim, year))
  }
  
  # Once all years are finished, store the sims' list (all years) as the 
  # next element in the "full" output
  res[[sim]] = res_year
  
}

# The above loop works fine, and seems to get the expected results.
# However, when wrapping the above into a function (so we can conveniently 
# specify and re-run simulations given numbers of years, sims and parameter 
# sets), this throws the error 
# "Error in eval(placing_vals[[i]]) : object 'sim_paras' not found".
# Otherwise, the code is the same.

gmse_sims = function(s, y, sim_paras) {
  res = list()
  
  for(sim in 1:s) {
    
    res_year = as.list(rep(NA, y))
    
    # Note because the function is passed the parameter list as 'sim_paras', 
    # we here refer to this list instead of 'gmse_paras' in the above.
    sim_old = gmse_apply(get_res = sim_paras$get_res, 
                         observe_type = sim_paras$observe_type)
    
    for(year in 1:y) {
      sim_new = gmse_apply(get_res="Full", old_list = sim_old)
      res_year[[year]] = sim_new
      sim_old <- sim_new
      print(sprintf("Sim %d, year %d", sim, year))
    }
    res[[sim]] = res_year
    
  }
  
  return(res)

}

gmse_sims(s = 5, y = 5, sim_paras = gmse_paras)

Pretty stuck with this - debugging seems to point to place_args() in gmse_apply(), but I'm still unsure whether this is a bug or my own stupidity...

@jejoenje I have reproduced the error message. I'm not entirely sure what is going on, but I suspect it is something to do with the details of how arguments are passed within gmse_apply. It might be best to get place_args to print out some lines for when it works, then when it doesn't?

place_args <- function(all_names, placing_vals, arg_list){
    # ==================== Check placing_vals?
    print(placing_vals); # What does this look like now?
    # ========================
    placing_names <- names(placing_vals);
    empty         <- identical(placing_names, NULL);
    if(empty == TRUE){
        return(arg_list);
    }
    for(i in 1:length(placing_vals)){
        place_name <- placing_names[i];
        if(place_name %in% all_names){
            place_pos <- which(all_names == place_name);
            # ============================ <- Here seems like the problem?
            print(eval(placing_vals[[i]]));                        # What's the value?
            print(deparse(substitute(placing_vals[[i]]))); # What's the name?
            # ==============================
            arg_eval  <- eval(placing_vals[[i]]);
            if(is.null(arg_eval) == FALSE){
                arg_list[[place_pos]] <- eval(placing_vals[[i]]);
            }
        }
    }
    return(arg_list);
}

I'm happy to debug a bit too here, if I can help. My fear is that there is something not being passed correctly in the PARAS vector within gmse_apply. This has always been the bane of coding gmse_apply for me. Essentially, the vector needs to sometimes be rewritten, but doing so can obviously cause problems. If bad information is put into PARAS (due to user specification), then it will, I believe, opt to rewrite PARAS using everything it can rather than crash due to memory management issues (which bring all of R down with it). It isn't obvious to me why this would happen just because you wrapped things into a function -- might be a name that gmse_apply is trying to handle that differs because the call is within the function?

It would be good to nail this down, as I think we really want gmse_apply to be wrappable, as you have done.

I should add @jejoenje -- a lot of the reason PARAS is so finicky is that it is a sort of master vector that holds the necessary values for things like array dimensions in C. If those array dimensions are contradictory in some way (e.g., PARAS says 50 resources, but the array size is wrong for this), then C will often just crash due to bad memory allocation and bring all of R down with it. Let me know how I can help, or if you have any insights.

Thanks v much for looking at this @bradduthie. Much appreciated.

Yes, I'd been trying to step through place_args() in much the same way - I need to take a closer look at it again, but is it possible that the value and the name of the value are confused, somehow? As you say though, it seems odd that it works when directly looping, but not when wrapped. At first I thought it will probably be my wrapper function (possibly the parameter list not being passed correctly), but I can get gmse_sims() to print either the full sim_paras list or its elements, before gmse_apply() is actually called...

I'll keep at it and see if I can post some of the debugs.
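On the question of the value versus the name of the value: a minimal sketch of what a function sees when it reads its own call via sys.call(). The functions capture_args() and show_capture() are hypothetical stand-ins, not GMSE code, and this assumes gmse_apply() collects its arguments in a broadly similar way, which the thread does not confirm. The captured element is an unevaluated expression, and it only turns into a value if it is evaluated somewhere that can see sim_paras.

```r
# Hypothetical stand-in for a function that reads its own call.
capture_args <- function(...) {
    as.list(sys.call())[-1]   # drop the function name, keep the arguments
}

# Hypothetical wrapper that passes an element of a local parameter list.
show_capture <- function(sim_paras) {
    capture_args(get_res = sim_paras$get_res)
}

captured <- show_capture(list(get_res = "Full"))

# The captured argument is language, not the string "Full".
arg_class <- class(captured$get_res)
arg_text  <- deparse(captured$get_res)

# Evaluating it outside show_capture() fails: sim_paras only exists there.
err <- tryCatch(eval(captured$get_res),
                error = function(e) conditionMessage(e))
print(arg_class)
print(arg_text)
print(err)
```

If that is roughly what happens inside gmse_apply(), the wrapper itself would be passing things correctly, and the failure would come from where the expression is later evaluated.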

I'm now wondering if there is some weird scope issue going on here...
Did some more debugging and made some notes inside place_args() in my last commit. Essentially, it does appear as if eval(placing_vals[[i]]) returns the expected value (i.e. "Full" for get_res, for example) when not executed inside a wrapper function, but that the same statement returns the value "get_res" when gmse_apply() is called from inside a wrapper function...

More to come.

Thanks for narrowing this down @jejoenje -- yeah, that's kind of what I feared. Off hand, I have no idea why the full value is returned when not inside the wrapper, but the "get_res" is returned when wrapped. I suspect that there is a very ugly way to solve this (e.g., have the function first check if it is in a wrapper, then adjust accordingly if so), but it would probably be better if we could understand why this comes up in the first place and refactor the code a bit with a better understanding in mind. Many thanks for digging into the issue!

Some further progress, and not, with this.
I've managed to get my proposed wrapper function working by doing two things -

  1. Ensure to include optional parameters (...) in the wrapper function call as well as both gmse_apply() calls inside the wrapper function (see wrapper code).

  2. Make a two-line change to the place_args() function in gmse_apply(). The specific change is in this commit in branch jeroen.

I've uploaded the code with the proposed wrapper and test runs to the notebook folder in the repo.
As per the comments in the code, the good news is, with two specific custom parameters (specifically observe_type and res_consume), the wrapper seems to run as expected. However, when land_dim_1 is added, the wrapper crashes R.

When I repeat this using the adapted gmse_apply() but using unwrapped loops (code to follow), R does not crash but produces error messages:

Error in -1 * res_consume : non-numeric argument to binary operator
In addition: Warning message:
In is.na(arg_list[[ot_pos]][1]) :
  is.na() applied to non-(list or vector) of type 'language'
Called from: set_interaction_array(arg_list)

It appears that this "solution" has only shifted the issue to set_interaction_array()??
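One reading consistent with both messages above is that an argument reached set_interaction_array() still unevaluated. This is an inference from the messages, not something confirmed in the thread. The sketch below uses a quoted expression as a stand-in for such an argument; the name gmse_paras$observe_type is only illustrative.

```r
# A call object standing in for an argument that was never evaluated.
unevaluated <- quote(gmse_paras$observe_type)

# Subsetting a call with [1] gives another language object, and is.na()
# warns when applied to it.
na_warning <- tryCatch(is.na(unevaluated[1]),
                       warning = function(w) conditionMessage(w))

# Arithmetic on a language object is an error.
arith_error <- tryCatch(-1 * unevaluated,
                        error = function(e) conditionMessage(e))

print(typeof(unevaluated))
print(na_warning)
print(arith_error)
```

If this is what is happening, checking is.language(arg_list[[ot_pos]]) just before the failing line would confirm it.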

@jejoenje Can't think of anything off hand, except that maybe there is an issue on L1994 and the collect_itb_ini function in trying to make the landscape?

Then again, there is only one line where is.na(arg_list[[ot_pos]][1]) appears, and it's where the observation defaults are added. If gmse_apply cannot find an observation type (or several other arguments), it just fills in the defaults. I'm wondering if it could be doing this for observe_type and res_consume. But the landscape change is somehow causing a crash because there is a contradiction between the 120 land dimension and the 100 to which gmse_apply defaults. I'll check this out, but it might be worth doing some debugging to see if the correct observation and resource consumption are truly being modelled in the wrapper code.

I've unfortunately run into this issue again, so coming back to this now.

Broadly speaking, the overall issue is that it appears gmse_apply cannot be "wrapped" in a function that passes external parameter values.

To make returning to this issue easier, here is the shortest reproducible code I can manage to show the problem:

rm(list=ls())

global_ld1 = 111 # "global" parameter

# This obviously works fine:
m = gmse_apply(get_res = "Full", land_dim_1 = global_ld1)
dim(m$LAND[,,1]) # Check of para used

### Now "wrapping" GMSE in trivial function.
### Note we can reference the global para just fine (as expected) and gmse_apply works as expected:
wrapper1 = function() {
    print(global_ld1)
    gmse_apply(get_res = "Full", land_dim_1 = global_ld1)
}
m2 = wrapper1()
dim(m2$LAND[,,1])

### Now we try to use a "local" para defined inside the wrapper:
wrapper2 = function() {
    local_ld1 = 222
    print(local_ld1)
    gmse_apply(get_res = "Full", land_dim_1 = local_ld1)
}
wrapper2()
### Now the gmse_apply call fails even though we __can__ reference `local_ld1` fine inside the wrapper.

### The problem remains even when we try to include a parameter to pass with wrapper2:
wrapper3 = function(ld1) {
    print(ld1)
    gmse_apply(get_res = "Full", land_dim_1 = ld1)
}
wrapper3(ld1 = 123)
### And __even__ when we try to pass the (previously working!) global var to the new function parameter.
wrapper3(ld1 = global_ld1)

I'm now pretty sure this is a scope issue: the line arg_eval <- eval(placing_vals[[i]]); in place_args() seems to work OK if the attempted evaluation is of something in the GLOBAL environment, but not if it is of something in the CALLING environment. I will do some further testing to see if I can suggest something.
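A minimal sketch of the suspected scope problem, using toy stand-ins rather than GMSE code. toy_apply() and toy_place() are hypothetical and only mimic the shape of gmse_apply() calling place_args(); the `fix` flag is an assumption added for illustration and is not necessarily how the issue was eventually resolved. With no environment given, eval() looks in the helper's own frame and then in the environment where the helper was defined, so it never sees the wrapper's local variables.

```r
# Hypothetical stand-in for place_args(): evaluates captured expressions.
toy_place <- function(placing_vals, envir = NULL) {
    out <- list()
    for (i in seq_along(placing_vals)) {
        out[[i]] <- if (is.null(envir)) {
            eval(placing_vals[[i]])          # default: this helper's frame
        } else {
            eval(placing_vals[[i]], envir)   # explicit: the user's frame
        }
    }
    names(out) <- names(placing_vals)
    out
}

# Hypothetical stand-in for gmse_apply(): captures its call unevaluated.
toy_apply <- function(..., fix = FALSE) {
    placing_vals     <- as.list(sys.call())[-1]
    placing_vals$fix <- NULL                       # not a model argument
    caller <- if (fix) parent.frame() else NULL    # frame that called toy_apply
    toy_place(placing_vals, envir = caller)
}

# Same shape as wrapper2 above: the parameter is local to the wrapper.
wrapper <- function(fix) {
    local_ld1 <- 222
    toy_apply(land_dim_1 = local_ld1, fix = fix)
}

broken <- tryCatch(wrapper(fix = FALSE),
                   error = function(e) conditionMessage(e))
fixed  <- wrapper(fix = TRUE)
print(broken)
print(fixed$land_dim_1)
```

Note that parent.frame() is taken in toy_apply() and passed down. Calling parent.frame() inside the helper would give toy_apply()'s frame, one level short of the wrapper.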

Thanks @jejoenje -- I'll try to take a closer look and see if anything comes to mind. There might be something clever that we can do with the memory allocation to make it work? I'm really, really hoping to dig into the code again toward the end of the month with a new version.

@bradduthie in response to your query here, yes this issue has been resolved by 99cad1d. Thank you!
I'll close this one.