Deadlock between physmem_lock and pv_list_lock
There is no strict order of lock acquisition between physmem_lock and pv_list_lock. This results in a deadlock.
Consider the scenario with two threads:
Thread 1:
#5 vm_page_alloc (npages=npages@entry=0x1) at /root/mimiker/sys/kern/vm_physmem.c:200
#6 pmap_pagealloc () at /root/mimiker/sys/mips/pmap.c:175
#7 pmap_add_pde (pmap=pmap@entry=0xc0019230, vaddr=vaddr@entry=0x400000) at /root/mimiker/sys/mips/pmap.c:184
#8 pmap_pte_write (pmap=pmap@entry=0xc0019230, vaddr=vaddr@entry=0x400000, pte=0xf8d8, pte@entry=0xf8c0, flags=flags@entry=0x0) at /root/mimiker/sys/mips/pmap.c:223
#9 pmap_enter (pmap=0xc0019230, va=va@entry=0x400000, pg=0xc01b18cc, prot=<optimized out>, flags=flags@entry=0x0) at /root/mimiker/sys/mips/pmap.c:336
#10 vm_page_fault (map=0xc00181c8, fault_addr=fault_addr@entry=0x40038c, fault_type=<optimized out>) at /root/mimiker/sys/kern/vm_map.c:433
pv_list_lock is acquired in pmap_enter, and then thread 1 tries to lock physmem_lock in vm_page_alloc.
Thread 2:
#5 pmap_page_remove (pg=pg@entry=0xc01ace44) at /root/mimiker/sys/mips/pmap.c:387
#6 pm_free_from_seg (seg=<optimized out>, page=page@entry=0xc01ace44) at /root/mimiker/sys/kern/vm_physmem.c:242
#7 vm_page_free (page=0xc01ace44) at /root/mimiker/sys/kern/vm_physmem.c:257
#8 vm_object_remove_page_nolock (obj=obj@entry=0xc00325e8, page=<optimized out>) at /root/mimiker/sys/kern/vm_object.c:63
#9 vm_object_free (obj=0xc00325e8) at /root/mimiker/sys/kern/vm_object.c:92
#10 vm_segment_free (seg=seg@entry=0xc00335c8) at /root/mimiker/sys/kern/vm_map.c:132
#11 vm_segment_destroy (map=map@entry=0xc0018148, seg=0xc00335c8) at /root/mimiker/sys/kern/vm_map.c:161
#12 vm_map_delete (map=0xc0018148) at /root/mimiker/sys/kern/vm_map.c:197
physmem_lock is acquired in vm_page_free, and then thread 2 tries to lock pv_list_lock in pmap_page_remove.
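The inversion shown by the two backtraces can be illustrated with a small order checker. This is only a sketch: the enum values stand in for the two kernel locks, and lock_acquire, lock_release and thread_locks_t are hypothetical names that do not exist in mimiker. It records which lock was taken while another was held and reports when a later acquisition uses the opposite order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical identifiers standing in for the two kernel locks. */
enum lock_id { PV_LIST_LOCK = 0, PHYSMEM_LOCK = 1, NLOCKS };

/* order_seen[a][b] is true once some thread took b while holding a. */
static bool order_seen[NLOCKS][NLOCKS];

/* Per-thread set of currently held locks (hypothetical helper type). */
typedef struct {
  bool held[NLOCKS];
} thread_locks_t;

/* Record an acquisition. Returns false if the opposite order was
 * already observed, i.e. the acquisition could deadlock. */
static bool lock_acquire(thread_locks_t *t, enum lock_id l) {
  bool ok = true;
  for (int h = 0; h < NLOCKS; h++) {
    if (h == (int)l || !t->held[h])
      continue;
    if (order_seen[l][h])
      ok = false; /* someone took h while holding l: inversion */
    order_seen[h][l] = true;
  }
  t->held[l] = true;
  return ok;
}

static void lock_release(thread_locks_t *t, enum lock_id l) {
  t->held[l] = false;
}
```

Replaying thread 1 (pv_list_lock, then physmem_lock) followed by thread 2 (physmem_lock, then pv_list_lock) makes the second acquisition in thread 2 return false.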
I couldn't find an obvious solution, so for now I'm leaving the problem here.
Here's another instance of this deadlock: https://github.com/cahirwpz/mimiker/pull/974/checks?check_run_id=1861738645
A first code inspection reveals that pm_free_from_seg is overloaded with responsibilities.
In the same module we use pmap in vm_boot_alloc, and that seems OK. However, pm_free_from_seg tries to do too much. We would like it to be a simple, well-defined tool that we use to build more complex actions. Instead, it breaks the abstraction by calling pmap_page_remove, and that is the source of our problem.
EDIT: In other words, we would like pm_free_from_seg to be a leaf call.
We have about a dozen uses of vm_page_free. Each of them should be called only when the vm_page_t reference counter (which we do not maintain) drops to zero.
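A minimal sketch of what such a counter could look like. Everything here is hypothetical: sketch_page_t and the sketch_page_* helpers are illustrative names, not the existing vm_page_t API, and a real kernel version would need atomic operations or a lock around the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page descriptor with a reference counter. */
typedef struct {
  unsigned refcnt; /* number of holders of this page */
  bool freed;      /* set once the page went back to the allocator */
} sketch_page_t;

/* Stand-in for vm_page_free: just marks the page as freed. */
static void sketch_page_free(sketch_page_t *pg) {
  pg->freed = true;
}

/* Take a reference to the page. */
static void sketch_page_hold(sketch_page_t *pg) {
  pg->refcnt++;
}

/* Drop a reference; free the page only when the last one goes away.
 * Returns true if this call freed the page. */
static bool sketch_page_drop(sketch_page_t *pg) {
  assert(pg->refcnt > 0);
  if (--pg->refcnt > 0)
    return false;
  sketch_page_free(pg);
  return true;
}
```

With this shape, callers never call the free routine directly; they only drop their reference.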
Most likely pmap_page_remove should be renamed to pmap_remove_all, as in FreeBSD. We should add a plain pmap_remove and maybe pmap_remove_pages. Perhaps introducing these functions will help us remove the call to pmap_page_remove from pm_free_from_seg.
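One possible shape of that refactoring, as a toy model rather than actual mimiker code: the caller removes all mappings first (taking only pv_list_lock), and only then frees the page (taking only physmem_lock), so pm_free_from_seg becomes a leaf call and the two locks are never held together. All toy_* names, the toy lock type and the counters are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy locks that only track whether they are held. */
typedef struct {
  bool held;
} toy_lock_t;

static toy_lock_t toy_physmem_lock, toy_pv_list_lock;
static int toy_cur_held; /* locks held right now */
static int toy_max_held; /* most locks ever held at once */

static void toy_lock(toy_lock_t *l) {
  assert(!l->held);
  l->held = true;
  if (++toy_cur_held > toy_max_held)
    toy_max_held = toy_cur_held;
}

static void toy_unlock(toy_lock_t *l) {
  assert(l->held);
  l->held = false;
  toy_cur_held--;
}

typedef struct {
  int mappings; /* number of pmap entries pointing at the page */
  bool free;    /* page returned to the physical allocator */
} toy_page_t;

/* Models a pmap_remove_all-style call: touches only pv_list_lock. */
static void toy_pmap_remove_all(toy_page_t *pg) {
  toy_lock(&toy_pv_list_lock);
  pg->mappings = 0;
  toy_unlock(&toy_pv_list_lock);
}

/* Leaf call: no pmap involvement, expects an unmapped page. */
static void toy_pm_free_from_seg(toy_page_t *pg) {
  assert(pg->mappings == 0);
  pg->free = true;
}

/* Models vm_page_free: touches only physmem_lock. */
static void toy_vm_page_free(toy_page_t *pg) {
  toy_lock(&toy_physmem_lock);
  toy_pm_free_from_seg(pg);
  toy_unlock(&toy_physmem_lock);
}

/* The caller sequences the two steps; the locks never nest. */
static void toy_object_remove_page(toy_page_t *pg) {
  toy_pmap_remove_all(pg);
  toy_vm_page_free(pg);
}
```

In this model the unmapping responsibility moves to the callers of the free routine, which is why each of them would have to be reviewed.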