When snapshot sda2, does sda2 need to change its mrf if it's already changed?
nickchen-cpu opened this issue · 3 comments
Hi
Disk conf: sda1
, sda2
Steps:
- First snapshot
sda1
; - In
__tracer_transition_tracing
,sda1->sda->queue->make_request
is replaced withtracing_mrf
, and so issda2->sda->queue->make_request
; - Snapshot the
sda2
, here's the point: We still run__tracer_transition_tracing
for sda2, which will again:
(1) Freeze thesda
;
(2) Changesda2->sda->queue->make_request
totracing_mrf
. but it already is in step 2.
Could we do some precheck to prevent this?
Best Regards,
Nick
The double assignment isn't a problem. However, when you'd go to destroy those devices we'd certainly have a problem. When we snapshot sda2
, we record the original mrf so that we can restore it later; this causes the problem since at this point sda2
original will have been tracing_mrf
. If the destroys come in the same sequence as the snapshots, then sda
will end up with tracing_mrf
as its callback after there are no snap devices associated with it. In summary, the following would happen:
- snapshot sda1 -
orig_mrf
is recorded normally,tracing_mrf
is swapped in - snapshot sda2 -
orig_mrf
is set totracing_mrf
,tracing_mrf
is swapped in - destroy sda1 -
orig_mrf
(which is the actual fn pointer) is swapped back intosda->queue->make_request
- destroy sda2 -
orig_mrf
(which istracing_mrf
is swapped back intosda->queue->make_request
And this small problem gets larger if the module is unloaded!
Good catch! Initial idea for resolving would be a kind of reference count for mrf swaps. We could create a couple of functions gendisk_mrf_get
and gendisk_mrf_put
and a map that associate gendisk*
and orig_mrf
s along with how many times they've been accessed.
snapshot sda1 - orig_mrf is recorded normally, tracing_mrfis swapped in
snapshot sda2 - orig_mrf is set to tracing_mrf, tracing_mrf is swapped in
destroy sda1 - orig_mrf (which is the actual fn pointer) is swapped back into sda->queue->make_request
After sda1 is out of tracing(because make_requset is assigned back to orig_mrf), the tracing function of sda2 is also disabled, because they share the same device queue, right?
Indeed. Thank you @nickchen-cpu