Difficulty with msms result when using reduced subpopulation size
Closed this issue · 3 comments
Dear Dr. Ewing,
Please allow me the following question. I’m working on sequence variation of a gene involved in a dispersal adaptation. To better understand the evolutionary history of this locus, we are using your program MSMS. However, we have come across a result that we find difficult to explain and understand.
Basically, we assume two populations in which an allele evolves and is selected in opposing directions as follows:
./msms 10 1 -N 100000 -t 15 -I 2 5 5 -n 1 1 -ma x 100 100 x -Sc 0 1 1000 -500 -1000 -Sc 0 2 -1000 -500 1000 -Smu 0.01 -SI 1 2 0 0 -Smark
(./msms 10 1 -N 100000 -t 15 -I 2 5 5 -n 1 0.01 -ma x 100 100 x -Sc 0 1 1000 -500 -1000 -Sc 0 2 -1000 -500 1000 -Smu 0.01 -SI 1 2 0 0 -Smark)
In this command we vary both the selection time (-SI) and the subpopulation size (-n). I wrote a script that parses the ancestral and derived (selected) sequences and calculates, among other things, the pairwise differences (Pd) between the sequences. Some results are included in the attached graph. As expected, Pd in the derived (selected) allele increases as the selection time increases, unless the size of the subpopulation in which the derived allele is selected is reduced. What I do not understand, however, is that a small size for the subpopulation in which the allele is selected also affects the Pd of the ancestral allele. When I set -n 1 0.01, variation in the ancestral allele is strongly reduced. This is also visible simply from the length of the output sequences and the short coalescence time. Do you have any idea what the reason for this could be? We would be very grateful for any help.
[image: pd_sim] https://f.cloud.github.com/assets/6349171/1868198/121fe9e6-7864-11e3-9c46-960511a3f125.jpg
Best regards,
Steven
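For readers following along, the Pd calculation described above can be sketched roughly as follows. This is not Steven's actual script. It assumes ms-style output (a `//` line starting each replicate, followed by `segsites:` and `positions:` lines and then one 0/1 haplotype string per sample), and the helper names and the `site_index` parameter are hypothetical: which column holds the selected site has to be determined from your own output.

```python
from itertools import combinations


def parse_ms_replicates(text):
    """Return one list of haplotype strings per replicate.

    Assumes ms-style output: every replicate starts with a line beginning
    with "//" and haplotypes are lines made up only of 0 and 1.
    """
    replicates = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("//"):
            current = []
            replicates.append(current)
        elif current is not None and line and set(line) <= {"0", "1"}:
            current.append(line)
    return replicates


def split_by_site(haplotypes, site_index):
    """Split haplotypes into (ancestral, derived) by the state at one column.

    site_index is a hypothetical parameter: the column of the selected site.
    """
    ancestral = [h for h in haplotypes if h[site_index] == "0"]
    derived = [h for h in haplotypes if h[site_index] == "1"]
    return ancestral, derived


def mean_pairwise_differences(haplotypes):
    """Average number of differing sites over all pairs of haplotypes.

    Returns 0.0 when fewer than two haplotypes are given.
    """
    pairs = list(combinations(haplotypes, 2))
    if not pairs:
        return 0.0
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)
```

Note that a replicate in which only one sequence (or none) carries a given allele contributes a Pd of 0.0 here, which matters when averaging over replicates.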
Have been away. Will look at this sometime this week. Sorry.
I was hoping a new version I am working on would make this all easier to deal
with. However, it's taking a little long. Sorry. But you are not forgotten.
Sorry, I did indeed miss this for a while.
I don't see any issue here. I would expect tree height to be reduced, and hence Pd to be reduced. Any pair of lineages in the smaller deme will coalesce much more rapidly, even more so with selection.
Also note that some simulations may not have any selected allele at all, while others may fix in the selected deme very soon after the allele is introduced.
Please reopen if I am missing the point.
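The point about faster coalescence in the smaller deme can be illustrated with a toy two-deme structured coalescent for a single pair of lineages, ignoring selection entirely. This is a sketch, not msms's internal model. The function name and its parameters are made up for illustration, and the mapping used in the example values (pair coalescence rate 2/x for a deme of relative size x, backward migration rate 100 per lineage, time in units of 4N0 generations) is an assumption about how the ms-style options translate.

```python
def expected_pair_coalescence_times(c1, c2, m12, m21):
    """Expected time until two lineages coalesce in a two-deme model.

    c1, c2   : pair coalescence rates inside deme 1 and deme 2
    m12, m21 : backward migration rates per lineage (1 -> 2 and 2 -> 1)
    All rates must be positive. Returns (t11, t12, t22): the expected
    times when both lineages start in deme 1, one in each deme, or both
    in deme 2.
    """
    # First-step analysis: t11 = a1 + b1 * t12 and t22 = a2 + b2 * t12.
    a1 = 1.0 / (c1 + 2.0 * m12)
    b1 = 2.0 * m12 / (c1 + 2.0 * m12)
    a2 = 1.0 / (c2 + 2.0 * m21)
    b2 = 2.0 * m21 / (c2 + 2.0 * m21)
    # Substitute both into the equation for the mixed state and solve for t12.
    t12 = (1.0 + m12 * a2 + m21 * a1) / (m12 * (1.0 - b2) + m21 * (1.0 - b1))
    t11 = a1 + b1 * t12
    t22 = a2 + b2 * t12
    return t11, t12, t22


# Both demes at relative size 1 (assumed rates: coalescence 2/1, migration 100).
equal_sizes = expected_pair_coalescence_times(2.0, 2.0, 100.0, 100.0)
# One deme shrunk to relative size 0.01 (coalescence rate 2/0.01 = 200).
one_small = expected_pair_coalescence_times(2.0, 200.0, 100.0, 100.0)
print("pair starting in the large deme:", equal_sizes[0], "vs", one_small[0])
```

Under these assumed rates, the expected coalescence time for a pair that starts in the large deme also drops sharply once the other deme is small, because with migration this high the lineages spend part of their history in the small deme, where they coalesce quickly. That is consistent with reduced Pd among ancestral-allele sequences as well.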