Need method to randomly sample SimpleCombination
GoogleCodeExporter commented
I would email but don't know your email address.
I am the main developer of MDR (Multifactor Dimensionality Reduction), which is
a fairly simple classifier used in bioinformatics. See
http://sourceforge.net/projects/mdr/
Normally, users do an exhaustive search over attribute combinations, but we also
have a random timed mode. Currently I make no effort to prevent the random mode
from testing the same combination more than once. I would like a tool that
would allow me to get combinations without replacement. The issue is that it
must be extremely efficient and fast -- it is not worthwhile to just maintain a
list of tested combinations. It would be okay, and perhaps even preferable, if
the progression were 'pseudo-random' in such a way that all attributes were
tested fairly equally.
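
One way to get this kind of storage-free sampling without replacement can be sketched as follows. This is only an illustration under stated assumptions, not something the library provides: the class `CombinationSampler` and its methods are hypothetical names. The idea is to number all C(n, k) combinations in lexicographic order, walk those numbers with a random starting point and a random stride that is coprime to the total (so every number is hit exactly once), and convert each number back into a combination. It assumes C(n, k) fits comfortably in a `long`, and it does not by itself guarantee that attributes are covered evenly early in the run.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Random;

// Hypothetical sketch: visits every k-combination of {0..n-1} exactly once,
// in a scrambled order, keeping only a few longs of state.
final class CombinationSampler implements Iterator<int[]> {
    private final int n;
    private final int k;
    private final long total;   // C(n, k)
    private final long stride;  // coprime to total, so the walk is a full cycle
    private long current;       // rank of the next combination to emit
    private long emitted;

    CombinationSampler(int n, int k, Random rng) {
        this.n = n;
        this.k = k;
        this.total = binomial(n, k);
        if (total < 1) throw new IllegalArgumentException("need 0 <= k <= n");
        if (total > Long.MAX_VALUE / 2) throw new IllegalArgumentException("too many combinations");
        this.current = Math.floorMod(rng.nextLong(), total);
        long s = 0;
        if (total > 1) {
            do {
                s = 1 + Math.floorMod(rng.nextLong(), total - 1);
            } while (gcd(s, total) != 1);
        }
        this.stride = s;
    }

    @Override
    public boolean hasNext() {
        return emitted < total;
    }

    @Override
    public int[] next() {
        if (!hasNext()) throw new NoSuchElementException();
        int[] combo = unrank(current, n, k);
        current = (current + stride) % total; // no overflow: both terms < total
        emitted++;
        return combo;
    }

    // Returns the combination with the given lexicographic rank (0-based).
    static int[] unrank(long rank, int n, int k) {
        if (rank < 0 || rank >= binomial(n, k)) throw new IllegalArgumentException("rank out of range");
        int[] combo = new int[k];
        int x = 0; // smallest element still available
        for (int i = 0; i < k; i++) {
            while (true) {
                // number of combinations that place x at position i
                long c = binomial(n - x - 1, k - i - 1);
                if (rank < c) break;
                rank -= c;
                x++;
            }
            combo[i] = x++;
        }
        return combo;
    }

    // Exact binomial coefficient; throws ArithmeticException on long overflow.
    static long binomial(int n, int k) {
        if (k < 0 || k > n) return 0;
        k = Math.min(k, n - k);
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = Math.multiplyExact(result, (long) (n - k + i)) / i;
        }
        return result;
    }

    private static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }
}
```

A timed run could simply call `next()` until the time budget is used up; stopping early still leaves every emitted combination distinct. A fixed stride is only weakly random, so a stronger index permutation could be swapped in if the pattern matters.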
Another related, but harder, problem is using evolutionary algorithms to test
combinations. I have found that this tends to test the same combinations many
times. Since I already use elitism so that I don't lose track of winners, this
is a big waste of processing time. If I had a good way to know what has been
sampled previously I could prevent waste -- this would also act as a form of
'novelty' seeking, which has been shown in some evolutionary algorithm contexts
to be helpful.
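
For the "know what has been sampled" part, one compact option is a bit set indexed by each combination's lexicographic rank, which costs one bit per possible combination rather than a stored list. The sketch below is an assumption-laden illustration, not an existing API: `VisitedCombinations` and `markIfNew` are hypothetical names, combinations are assumed to be sorted arrays of distinct attribute indices in 0..n-1, and C(n, k) is assumed to fit in an `int` because `java.util.BitSet` is int-indexed.

```java
import java.util.BitSet;

// Hypothetical sketch: remembers which k-combinations of {0..n-1} were seen,
// using one bit per possible combination.
final class VisitedCombinations {
    private final int n;
    private final int k;
    private final BitSet seen;

    VisitedCombinations(int n, int k) {
        long total = binomial(n, k);
        if (total < 1 || total > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("C(n, k) must be between 1 and Integer.MAX_VALUE");
        }
        this.n = n;
        this.k = k;
        this.seen = new BitSet((int) total);
    }

    // Returns true the first time a combination is offered, false afterwards.
    // The array must be sorted ascending with distinct values in 0..n-1.
    boolean markIfNew(int[] sortedCombo) {
        if (sortedCombo.length != k) throw new IllegalArgumentException("wrong combination size");
        int index = (int) rank(sortedCombo, n);
        if (seen.get(index)) return false;
        seen.set(index);
        return true;
    }

    // Lexicographic rank (0-based) of a sorted combination.
    static long rank(int[] sortedCombo, int n) {
        int k = sortedCombo.length;
        long rank = 0;
        int x = 0; // smallest element still available
        for (int i = 0; i < k; i++) {
            // skip over every combination that would put a smaller element here
            for (; x < sortedCombo[i]; x++) {
                rank += binomial(n - x - 1, k - i - 1);
            }
            x++;
        }
        return rank;
    }

    // Exact binomial coefficient; throws ArithmeticException on long overflow.
    static long binomial(int n, int k) {
        if (k < 0 || k > n) return 0;
        k = Math.min(k, n - k);
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = Math.multiplyExact(result, (long) (n - k + i)) / i;
        }
        return result;
    }
}
```

An evolutionary loop could call `markIfNew` before evaluating a candidate and skip or mutate it again when the call returns false. For combination counts beyond the int range, a probabilistic structure such as a Bloom filter might be a more realistic fit.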
Thanks,
Peter Andrews
Norwich, Vermont USA
Original issue reported on code.google.com by PeterVermont
on 21 Dec 2012 at 4:00
GoogleCodeExporter commented
Original comment by d.pau...@gmail.com
on 31 Jan 2013 at 1:49
- Added labels: Type-Enhancement
- Removed labels: Type-Defect
GoogleCodeExporter commented
You can randomly select which results from the Generator to return. The
Generator is an Iterable, so you could wrap it and return a RandomIterator
which randomly returns results from the underlying Iterator. Get as fancy as
you want in the RandomIterator.
The weakness here is that you are still dependent upon the underlying sequence
of the Generator, which is not necessarily random between runs. Results that
come early in that sequence are more likely to be returned than later ones.
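
A minimal sketch of such a wrapper, assuming nothing about the library beyond the fact that the Generator hands out a plain `java.util.Iterator`: the `RandomIterator` class below is a hypothetical illustration that keeps a small buffer of pending results and returns a randomly chosen one each time. Every result from the underlying iterator is returned exactly once, but with a small buffer the output order still roughly follows the underlying sequence, which is the weakness described above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;

// Hypothetical sketch: returns the elements of an underlying iterator
// in a locally shuffled order, each exactly once.
final class RandomIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final List<T> buffer;
    private final Random rng;

    RandomIterator(Iterator<T> source, int bufferSize, Random rng) {
        if (bufferSize < 1) throw new IllegalArgumentException("bufferSize must be >= 1");
        this.source = source;
        this.rng = rng;
        this.buffer = new ArrayList<>(bufferSize);
        // Pre-fill the buffer with the first results of the underlying sequence.
        while (buffer.size() < bufferSize && source.hasNext()) {
            buffer.add(source.next());
        }
    }

    @Override
    public boolean hasNext() {
        return !buffer.isEmpty();
    }

    @Override
    public T next() {
        if (buffer.isEmpty()) throw new NoSuchElementException();
        int i = rng.nextInt(buffer.size());
        T picked = buffer.get(i);
        if (source.hasNext()) {
            // Refill the freed slot from the underlying sequence.
            buffer.set(i, source.next());
        } else {
            // Source exhausted: shrink the buffer by moving the last entry down.
            int last = buffer.size() - 1;
            buffer.set(i, buffer.get(last));
            buffer.remove(last);
        }
        return picked;
    }
}
```

A larger `bufferSize` gives an order closer to a true shuffle at the cost of memory; a buffer as large as the whole sequence would amount to a full shuffle.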
Original comment by ryan.gus...@gmail.com
on 14 Mar 2014 at 4:39