Keeping a Substructure Preserved During Mols Generation
Closed this issue · 3 comments
Hi,
Is it possible to keep a certain substructure (or set of atom_ids) of the starting mol intact (unchanged) during the generation process? If not, could you please give me some guidelines on how this might be implemented?
Thanks
Hi @OmarAlAttraqDev, thank you for using mol_ga!
There is no option for this in mol_ga. If you want this, I think you would need to do two things:
- Ensure that the desired substructure is present in at least one molecule of the starting population (via this argument)
- Create an "offspring generation function" which preserves the substructure (via this argument)
This is the function used to generate offspring by default:
mol_ga/mol_ga/graph_ga/gen_candidates.py, line 64 in cf92b96
You could either replace this with your own function (accepting the same arguments), or wrap this function with another function which rejects molecules that do not contain your desired substructure. Does that make sense?
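The wrapping option could look roughly like the sketch below. It is not part of mol_ga: `make_smarts_check` and `wrap_offspring_function` are hypothetical names, and the sketch assumes the offspring function returns an iterable of SMILES strings. The wrapper forwards whatever arguments it receives, so it does not depend on the exact signature of the default function. The substructure test uses RDKit's `Chem.MolFromSmarts` and `Mol.HasSubstructMatch`.

```python
from typing import Callable, Iterable, List


def make_smarts_check(smarts: str) -> Callable[[str], bool]:
    """Build a predicate: True if a SMILES string contains the SMARTS pattern."""
    # Imported lazily so the wrapper below has no hard RDKit dependency.
    from rdkit import Chem

    pattern = Chem.MolFromSmarts(smarts)
    if pattern is None:
        raise ValueError(f"Invalid SMARTS: {smarts!r}")

    def has_substructure(smiles: str) -> bool:
        mol = Chem.MolFromSmiles(smiles)
        return mol is not None and mol.HasSubstructMatch(pattern)

    return has_substructure


def wrap_offspring_function(
    base_fn: Callable[..., Iterable[str]],
    keep: Callable[[str], bool],
) -> Callable[..., List[str]]:
    """Wrap an offspring generator so that it drops candidates failing `keep`.

    Assumption: `base_fn` returns an iterable of SMILES strings. All
    positional and keyword arguments are passed through unchanged.
    """

    def wrapped(*args, **kwargs) -> List[str]:
        return [smiles for smiles in base_fn(*args, **kwargs) if keep(smiles)]

    return wrapped
```

Note that a wrapper like this can return fewer candidates than were requested, because rejected molecules are simply dropped. If that matters, the wrapper could call the base function again until enough candidates pass.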
@AustinT Thank you very much for your help, it was really useful.
I added a substruct argument and passed it to the mutate / crossover funcs. In both, I added a simple substructure check at the point where the new mols generated by running the reaction are checked, such as https://github.com/AustinT/mol_ga/blob/cf92b96f6a5252b2efa7e1982740628d26135666/mol_ga/graph_ga/mutate.py#L147C1-L148C1: when checking that the mol is ok, I also check that the substructure is present.
I did it this way to avoid the wasted cycles I would get if the check were only performed after a mol is selected from the reaction results. So far it seems to work as expected and the substructure is preserved.
Thanks
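For readers who want to try the same approach, the placement of the check described above might be sketched as follows. This is not the actual mol_ga code: `pick_valid_product` is a hypothetical helper, and the checks are passed in as plain callables. The point is only that the substructure test is applied together with the validity test, before a product is selected at random, so a product lacking the substructure is never picked.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def pick_valid_product(
    products: Sequence[T],
    checks: Sequence[Callable[[T], bool]],
    rng: random.Random,
) -> Optional[T]:
    """Return a random product that passes every check, or None if none do.

    Hypothetical helper: in a mutate / crossover function, `checks` would
    hold the existing validity check plus the added substructure check.
    """
    valid = [p for p in products if all(check(p) for check in checks)]
    return rng.choice(valid) if valid else None
```

With RDKit mols, the extra check could be something like `lambda m: m.HasSubstructMatch(pattern)`, where `pattern` is the mol built from the substruct argument.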
Great! I'll close this issue for now. If you want to see this feature supported out of the box by this package feel free to open a PR at some point 🙂