UC-Davis-molecular-computing/nuad

generalized dependent domains


Allow one to specify a general way in which one domain "depends" on another.

Currently there is a field named dependent, but it refers specifically to subdomains: for example, a dependent parent domain depends on its children. We can either rename that field, or come up with a different word than "dependent" to describe the concept in this issue.

This issue is not necessarily about generalizing that field, but about supporting the many other ways that domains can depend on each other, even without a parent/subdomain relationship.

The general idea would be something like the following, which shows how we would specify mismatches (as in #250): domains d and d2 should be the same, except that d2 should have a different base at position 5:

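from typing import List

def allowed_seqs(seq: str) -> List[str]:
    # Return every sequence equal to seq except for a different base at index 5.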
    if seq[5] == 'A':
        return [
            seq[:5] + 'T' + seq[6:],
            seq[:5] + 'C' + seq[6:],
            seq[:5] + 'G' + seq[6:],
        ]
    elif seq[5] == 'C':
        return [
            seq[:5] + 'T' + seq[6:],
            seq[:5] + 'A' + seq[6:],
            seq[:5] + 'G' + seq[6:],
        ]
    elif seq[5] == 'G':
        return [
            seq[:5] + 'T' + seq[6:],
            seq[:5] + 'A' + seq[6:],
            seq[:5] + 'C' + seq[6:],
        ]
    elif seq[5] == 'T':
        return [
            seq[:5] + 'C' + seq[6:],
            seq[:5] + 'A' + seq[6:],
            seq[:5] + 'G' + seq[6:],
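        ]
    else:
        # Without this branch the function would silently return None.
        raise ValueError(f'unexpected base {seq[5]!r} at position 5')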

d = Domain('d')
d2 = d.create_dependent('d2', allowed_seqs)

Then, when the search replaces the sequence of d with seq, it also replaces the sequence of d2, by calling allowed_seqs(seq) and choosing at random one of the strings in the list returned by allowed_seqs to assign to d2.
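
The propagation step described above can be sketched as follows. This is only an illustration of the proposed behavior, not nuad code: the propagate helper, the dict of dependents, and the use of the standard library's random.Random are all assumptions made for the example.

```python
import random
from typing import Callable, Dict, List


def propagate(
    seq: str,
    dependents: Dict[str, Callable[[str], List[str]]],
    rng: random.Random,
) -> Dict[str, str]:
    """Hypothetical helper: pick one allowed sequence for each dependent domain.

    `dependents` maps a dependent domain's name to its allowed_seqs-style function.
    """
    assigned = {}
    for name, allowed_seqs in dependents.items():
        candidates = allowed_seqs(seq)
        if not candidates:
            raise ValueError(f"no allowed sequence for dependent domain {name!r}")
        # Choose uniformly at random among the candidates, as described above.
        assigned[name] = rng.choice(candidates)
    return assigned


def mismatch_at_5(seq: str) -> List[str]:
    # Same as the parent sequence, except for a different base at index 5.
    return [seq[:5] + b + seq[6:] for b in "ACGT" if b != seq[5]]


# The search has just assigned "ACGTACGT" to d; now pick a sequence for d2.
new_seqs = propagate("ACGTACGT", {"d2": mismatch_at_5}, random.Random(0))
```

Here new_seqs["d2"] agrees with the parent sequence everywhere except index 5.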

Another situation this could handle (besides mismatches) is requiring that two toehold domains be reverses of each other (say, to balance forward and reverse strand displacement rates in a toehold exchange reaction), e.g., if t is AAGA, then t2 is AGAA.
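
Under the list-returning form of the proposal, the reversed-toehold case might look like the sketch below. The function name reversed_seq is made up for this example, and the commented-out create_dependent call is the API proposed in this issue, not something that exists yet.

```python
from typing import List


def reversed_seq(seq: str) -> List[str]:
    # Plain reversal (not the reverse complement); there is exactly one
    # allowed sequence, so the list has a single element.
    return [seq[::-1]]


candidates = reversed_seq("AAGA")

# With the API proposed in this issue, this would be used as:
# t = Domain('t')
# t2 = t.create_dependent('t2', reversed_seq)
```

Since the list always has one element, the random choice made by the search is deterministic for this kind of dependency.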

UPDATE: It may be better to pass the function an RNG and have it use that to return one single string to assign to the dependent Domain. That way, if there are a huge number of potential sequences, we don't pay the cost of enumerating all of them only to pick one at random and throw the rest away. This would replace the above code with:

import numpy

# For each base, the bases that may replace it to create a mismatch.
replace = {
  'A': ['C', 'G', 'T'],
  'C': ['A', 'G', 'T'],
  'G': ['A', 'C', 'T'],
  'T': ['A', 'C', 'G'],
}

def pick_dependent_seq(seq: str, rng: numpy.random.Generator) -> str:
    base_to_replace = seq[5]
    available_bases = replace[base_to_replace]
    new_base = rng.choice(available_bases)
    return seq[:5] + new_base + seq[6:]

d = Domain('d')
d2 = d.create_dependent('d2', pick_dependent_seq)
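
To illustrate why sampling one string can beat enumerating all of them, here is a sketch of a dependency with k mismatches at random positions. The name pick_with_mismatches and the parameter k are hypothetical, and the standard library's random.Random stands in for numpy.random.Generator so the example has no third-party imports.

```python
import random

BASES = "ACGT"


def pick_with_mismatches(seq: str, rng: random.Random, k: int = 3) -> str:
    """Return a copy of seq with exactly k positions changed to a different base.

    Enumerating every such sequence would produce C(len(seq), k) * 3**k strings;
    sampling one directly only touches k positions.
    """
    positions = rng.sample(range(len(seq)), k)  # k distinct indices
    bases = list(seq)
    for pos in positions:
        # Choose among the three bases that differ from the original one.
        bases[pos] = rng.choice([b for b in BASES if b != seq[pos]])
    return "".join(bases)


parent = "ACGTACGTACGTACGT"
child = pick_with_mismatches(parent, random.Random(0))
```

For a length-16 domain with k = 3 there are 560 * 27 = 15120 candidates, none of which needs to be built except the one returned.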