JuliaPOMDP/DESPOT.jl

Random Numbers

Closed this issue · 15 comments

If we want DESPOT to be able to be used out of the box with a variety of problems, we need to have a more sophisticated random number generation system.

For all POMDPs to work with DESPOT, the rng we feed to the interface functions should support all of the random number generation funcitons, e.g rand(rng, UInt32), randn(rng), rand!(rng, A), etc.

There are four ways that I can think of to do this:

  1. write all of the rand() functions that we can think of for DESPOTRandomNumber
  2. try to identify some minimal internal interface in julia/base/random.jl that we can implement for DESPOTRandomNumber to get all of the versions of the rand function automagically
  3. make RandomStreems a list of MersenneTwisters instead of floats and feed copies of those to the POMDP model
  4. make problem writers who want compatibility with DESPOT implement versions of their model functions that use a DESPOTRandomNumber.

@ebalaban did you have something in mind for this? What do you think we should do?

Here's my thoughts:
4 is just pushing the challenge to the problem-writer's side, which will make DESPOT essentially unusable for a lot of problems.
It would take a great deal of thinking to efficiently map a single float to a large set of uncorrelated random numbers, so I think 1 and 2 are not feasible.
Thus, I think 3 is the only viable option.

Hopefully this won't cause a prohibitive performance hit. I think a copy! of a MersenneTwister takes about a microsecond (not sure how that compares to other operations), but it looks like it might take hundreds of megabytes to store 10000 MersenneTwisters, so that is suboptimal. We should also provide the option for the user to use a DESPOTRandomNumber if they only need that and want higher performance.

@mykelk @rejuvyesh @etotheipluspi any advice? It might be hard to make this both efficient and flexible.

I'm okay with 3.

@zsunberg, the way random numbers are generated in this implementation of DESPOT is indeed a bit of a limitation. The main reason I decided to go that route is to maintain full compatibility with the C++ implementation, at least on POSIX.1-2001 compatible systems (where rand_r() function can be used). In a newer solver that I am working on, I junked all of that and implemented something close to your option 3. I don't think there is a need to store 10000s of MersenneTwisters. One can just reuse the same rng object to resume a particular stream.

################### Finite Streams ##################
type FiniteRandomStreams
    number::Int64           # number of streams
    length::Int64           # *fixed* length for finite streams
    width::Int64            # when multiple random numbers are needed per scenario
    master_seed::Int64
    seeds::Vector{Int64}
    rngs::Vector{MersenneTwister}
    streams::Array{Float64}
    
    # default constructor
    function FiniteRandomStreams(
                                n::Int64,
                                l::Int64,
                                w::Int64,
                                seed::UInt32)
                           
        this = new()
        this.number = n
        this.length = l
        this.width  = w
        this.master_seed = seed        
        this.rngs = [MersenneTwister(this.master_seed $ i) for i=1:n]
        this.streams = Array(Float64,this.number,this.length,this.width)
        
        for number in 1:this.number
            for depth in 1:this.length
                this.streams[number,depth,:]=rand(this.rngs[number], this.width)
            end
        end
        return this                
    end                           
end

################### Infinite Streams ##################

type InfiniteRandomStreams
    number::Int64           # number of streams
    length::Vector{Int64}   # current length of each stream (needed for debugging and verification of infinite streams)
    width::Int64            # when multiple random numbers are needed per scenario
    rngs::Vector{MersenneTwister}
    master_seed::Int64
    seeds::Vector{Int64}
    
    # default constructor
    function InfiniteRandomStreams(n::Int64,
                           w::Int64,
                           seed::UInt32)
                           
           this = new()
           this.number = n
           this.width  = w
           this.master_seed = seed
           init_streams!(this)                
           return this                
     end                           
end

# this function is separate from the constructor to allow re-initialization without
# InfiniteRandomStreams object reallocation
function init_streams!(rs::InfiniteRandomStreams)
    rs.length = zeros(rs.number)
    rs.rngs = [MersenneTwister(rs.master_seed $ i) for i=1:rs.number]
    
    return nothing
end

function get_rand_vector(rs::InfiniteRandomStreams, stream_id::Int64)::Vector{Float64}
    rs.length[stream_id] += 1 #TODO: decide whether to keep this tracking
    return rand(rs.rngs[stream_id], rs.width)
end

Oh, I like the suggestion from @ebalaban.

I am happy to put the above into DESPOT, we can perhaps just use a flag to switch between the original RNG system and the new one (although that will make the code a bit more messy). The only reason to keep the original one is to V&V and benchmark against the C++ implementation.

I think it will be fairly straightforward to keep both, so it is worth doing. BTW right now, I am doing a slight refactor making DESPOTRandomNumber immutable and adding a few other things that should make it possible to use DESPOT with any belief updater. Will let you review those changes when I'm done.

In your above solution, is there a way for a problem writer to call randn(rng) a variable number of times within generate_s? That would be required for, for instance, a multilane driving scenario. One could, of course generate numbers from any distribution with a vector from unif(0,1), but it would probably be easier to give them a MersenneTwister that already has implementations to generate whatever they want. That's why I think we would need to store one for every step of every scenario.

I think we should create multiple types of streams and allow people to pass in whichever one they want (or even write their own) - we just need to lay down a small interface so that the right rngs get passed to generate_ at the right times.

I think maybe the only function we would need is get_rng(::RNGStream, scenario::Int, step::Int)::AbstractRNG and maybe init_stream!(::RNGStream, ::AbstractRNG) to be called at the beginning of creating a tree

(I was interpreting that your above code would only be able to give the user a vector of unif(0,1) numbers - maybe I'm wrong about that)

@zsunberg, maybe I need to understand better what you mean by "call randn(rng) a variable number of times within generate_s" and "...give them a MersenneTwister that already has implementations to generate whatever they want". The key idea in DESPOT (the "D", for determinized) is to use a fixed set of sampled scenarios (defined by preallocated streams of random numbers) to build the tree.

In the original implementation only one random number per step per scenario is provided (i.e. the streams are "2D"). I decided to extend that for more complex models and add the "width" parameter (i.e. make the streams "3D") to allow for a vector of random numbers to be passed to generate_. Each element of the vector is always used in the same place of the model. In some of my use cases I found that having preallocated streams of fixed length is inconvenient, so I added InfiniteRandomStreams and get_rand_vector(), but the idea is still the same and the streams will be the same each time.

If that's what you basically mean by your get_rng() function, then I think we are in agreement. All I was trying to ensure is that the streams are repeatable.

By the way, regarding randn(), in some of my models I then use the supplied unif(0,1) number from the streams to sample from a normal distribution.

Yeah, I think that we are in agreement. The key is for them to be repeatable/"Determinized". I think it is useful in this case to think of the random variables in the problem as functions mapping from a probability space, \Omega, to real numbers (https://en.wikipedia.org/wiki/Random_variable#Definition). Then (\phi_1, \phi_2, ...) is just a particular outcome, \omega, from \Omega. I suspect this is what the authors had in mind when they wrote the DESPOT paper, but they chose to talk about each \phi as a random number to avoid talking about random variables as functions. If we represent each \phi in \omega as a MersenneTwister state instead of a single float or fixed-length vector of floats, then it is easy to generate as many numbers as needed from any distribution. Specifically randn(rng) etc. are already implemented, and you can use as many as you need i.e. generate_s(pomdp, [3,1], :left, rng) might require 2 random numbers and generate_s(pomdp, [3,1], :right, rng) might require 60 for some problem.

(there is, of course, the danger of problem-writers inadvertently introducing correlation between the action and noise, but that's not DESPOT's job to prevent)

Out of curiosity, what method did you use to sample from a normal distribution starting with a float, just inverse transform sampling?

I guess I'll go ahead and start implementing a stream that returns MersenneTwisters, and then we can iron out the details when we have some actual code to look at. Sound good?

@zsunberg, I thought about getting rid of explicitly-generated random number streams and doing what you proposed, i.e. just calling rand or randn whenever I needed, but then decided to not do that just yet. Having to deal with pre-generated random streams can at times be cumbersome, but I think it still has some advantages:

  • Debugging: it's easier for me to look at the sequences numbers that are causing certain things to happen (or not) than to look at MersenneTwister states

  • Peace of mind: Most of my models use fewer than 10 random variables. I define enums mapping each element of the random vector to the part of the model where it is used. This way I don't need to worry about ensuring repeatability in case some piece of code does not always get executed or gets executed a variable number of times. I just always use a full-length random vector per algorithm step.

  • Ease of saving, replaying, and sharing scenarios. Can probably do the same using your approach, but might be harder to implement.

Yeah, I think we should definitely keep both. We can have streams that return a single DESPOTRandomNumber or a fixed length vector, or a MersenneTwister. For wide compatibility, we definitely need the option to pass an rng that has a wide variety of rand functions like a MersenneTwister. Maybe this will only get implemented in the feature branch discussed in #10

This is fixed from a capabilities standpoint now that #12 is merged in. I think that we may want to make the MersenneTwister streams the default eventually, but that is just a convenience question; the capability is in place.