Use MONITOR/MWAIT when available rather than HLT

Question

Use MONITOR/MWAIT when available rather than HLT

Opened this issue 4 years ago · 2 comments

At the moment, the x86 executor puts the CPU to a halt when they have nothing to do, and waking them up requires an interrupt, which unfortunately comes at an expensive cost.

See chapter 8.10.4 of volume 3.

Answer 1 · 2020-07-25T18:19:51.000Z

Here is a very basic patch that doesn't take into account processor support nor the size of the watched memory lines:

diff --git a/kernel/standalone/src/arch/x86_64/executor.rs b/kernel/standalone/src/arch/x86_64/executor.rs
index f9c0b10..d58bd6a 100644
--- a/kernel/standalone/src/arch/x86_64/executor.rs
+++ b/kernel/standalone/src/arch/x86_64/executor.rs
@@ -73,13 +73,6 @@ impl Executor {
                 // `Future`s that were waiting for an interrupt to happen.
                 interrupts::process_wakers();
 
-                debug_assert!(x86_64::instructions::interrupts::are_enabled());
-                x86_64::instructions::interrupts::disable();
-
-                // We store `true` in `need_ipi` before checking `woken_up`, otherwise there could
-                // be a state where `need_ipi` is `false` but we've already checked `woken_up`.
-                local_wake.need_ipi.store(true, atomic::Ordering::SeqCst);
-
                 if local_wake
                     .woken_up
                     .compare_and_swap(true, false, atomic::Ordering::SeqCst)
@@ -90,10 +83,21 @@ impl Executor {
                     break;
                 }
 
-                // An `sti` opcode only takes effect after the *next* opcode, which is `hlt` here.
-                // It is not possible for an interrupt to happen between `sti` and `hlt`.
-                x86_64::instructions::interrupts::enable();
-                x86_64::instructions::hlt();
+                // TODO: must be write-back cache
+                unsafe {
+                    asm!(
+                        "monitor",
+                        in("eax") local_wake.woken_up.as_mut_ptr(),
+                        in("ecx") 0, in("edx") 0,
+                        options(readonly, nostack, preserves_flags)
+                    );
+                }
+
+                if !local_wake.woken_up.load(atomic::Ordering::SeqCst) {
+                    unsafe {
+                        asm!("mwait", in("eax") 0, in("ecx") 0, options(readonly, nostack, preserves_flags));
+                    }
+                }
             }
         }
     }
@@ -123,17 +127,5 @@ struct Waker {
 impl ArcWake for Waker {
     fn wake_by_ref(arc_self: &Arc<Self>) {
         arc_self.woken_up.store(true, atomic::Ordering::SeqCst);
-
-        if arc_self
-            .need_ipi
-            .compare_and_swap(true, false, atomic::Ordering::SeqCst)
-        {
-            if arc_self.processor_to_wake != arc_self.apic.current_apic_id() {
-                arc_self.apic.send_interprocessor_interrupt(
-                    arc_self.processor_to_wake,
-                    arc_self.interrupt_vector,
-                );
-            }
-        }
     }
 }
diff --git a/kernel/standalone/src/main.rs b/kernel/standalone/src/main.rs
index 34c3cb1..cb9efbf 100644
--- a/kernel/standalone/src/main.rs
+++ b/kernel/standalone/src/main.rs
@@ -19,6 +19,7 @@
 #![no_main]
 #![feature(allocator_api)] // TODO: https://github.com/rust-lang/rust/issues/32838
 #![feature(alloc_error_handler)] // TODO: https://github.com/rust-lang/rust/issues/66741
+#![feature(atomic_mut_ptr)] // TODO: https://github.com/rust-lang/rust/issues/66893
 #![feature(asm)] // TODO: https://github.com/rust-lang/rust/issues/72016
 #![feature(const_if_match)] // TODO: https://github.com/rust-lang/rust/issues/49146
 #![feature(llvm_asm)] // TODO: replace all occurrences of `llvm_asm!` with `asm!`

Answer 2 · 2020-08-02T13:09:44.000Z

The most difficult requirement is:

The address range provided to the MONITOR instruction must be of write-back caching type. Only write-back
memory type stores to the monitored address range will trigger the monitor hardware. If the address range is not
in memory of write-back type, the address monitor hardware may not be set up properly or the monitor hardware
may not be armed.

At the moment, this isn't possible.
Depends on #400