n-length unorderd unique operations
jakobhuss opened this issue · 3 comments
Hi!
I am trying to understand how one would represent a unordered operation with n length where also each element is unique. Perhaps there is an example of this but i cant find it.
For instance where SUM( 3 + 5 + 7 )
is the same as SUM( 5 + 3 + 7 )
and SUM can take 2 - n elements. No element should be able to occur more than once.
Is this possible with GeneticEngine? I'm pretty new to genetic programming so i hope that question makes sense.
Thank you for your interest!
Short answer: The grammar you are describing is not Context-Free (you can probably write an equivalent CFG, but that would take dozens of lines). A few common solutions:
- Allow any order and duplicates in the grammar. Give a really bad fitness if there are duplicates, and before you evaluate your tree, you can lexicographically order the elements.
- If you identify duplicates, you can remove/replace them. This is usually called individual fixing (Note that this is usually only possible using a tree-based representation, and not grammatical evolution or similar approaches)
Having said that, we are introducing dependent types in Genetic Engine, which would allow you to do this. Support for dependent types is not complete, and does not support your use case right now (I don't have a timeline for that, but probably no soon than 3 months). The solution would be based on this feature:
@dataclass
class List3StrictlyOrderedElements:
a: int
b: Annotated[int, SMT("b > a")]
c: Annotated[int, SMT("c > b")]
You can find in the smt_test file an example of this. There are a few issues right now: It relies on SMT solving (i.e., do not expect random values) and it does not allow arbitrary lengths, like you desire. We plan to improve support for both in the coming months.
(If anyone reading is interested in helping with this topic, please contact me).
Thank you, i was thinking of injecting sorted(set(my_elements))
somewhere. Which i still might try. I just wanted to know if i was missing something obvious.
Since your elements are unordered, another possibility is to represent them as a Base element plus a list of non zero offsets. The set {2,5,10} would be represented as (2, [3, 5]). You do lose quite a bit of locality with this approach.