04_safe_globals unicode support
mumblingdrunkard opened this issue · 2 comments
I went a bit off track while going through this after I noticed a `c as u8` where `c: char`. This didn't sit right with me, so I decided to give it proper Unicode support (diff below).
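To illustrate why the cast is lossy, here is a minimal host-side sketch. The helper name `truncate_to_byte` is hypothetical and only mirrors the `c as u8` expression from the tutorial code; in Rust, casting a `char` to `u8` keeps only the low 8 bits of the code point.

```rust
/// Hypothetical helper mirroring the original `c as u8` cast:
/// only the low 8 bits of the code point survive.
fn truncate_to_byte(c: char) -> u8 {
    c as u8
}

fn main() {
    // ASCII survives the cast unchanged.
    println!("{:#04x}", truncate_to_byte('A')); // prints 0x41
    // U+00E9 becomes the lone byte 0xE9, which is not valid UTF-8 by itself.
    println!("{:#04x}", truncate_to_byte('\u{00E9}')); // prints 0xe9
    // U+1F980 (the crab) loses its upper bits entirely.
    println!("{:#04x}", truncate_to_byte('\u{1F980}')); // prints 0x80
}
```

So anything outside ASCII reaches the serial line as a single byte that no longer identifies the original character.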
I don't know if you want to use this exact code, but it worked great for me. `.encode_utf8()` gives a string slice that can be iterated over with `.bytes()`. Since I'm no longer counting chars, I renamed `chars_written` to `bytes_written`.
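The same idea can be tried on a host machine without the MMIO write. This is only a sketch: `ByteSink` is a hypothetical stand-in for the UART data register that collects bytes in a `Vec`, so it needs `std` and is not the kernel code itself.

```rust
/// Hypothetical stand-in for the UART data register: collects bytes in memory.
struct ByteSink {
    bytes: Vec<u8>,
    bytes_written: usize,
}

impl ByteSink {
    fn new() -> Self {
        ByteSink { bytes: Vec::new(), bytes_written: 0 }
    }

    /// Encode `c` as UTF-8 and emit it one byte at a time.
    fn write_char(&mut self, c: char) {
        let mut buffer = [0u8; 4]; // a char encodes to at most 4 UTF-8 bytes
        let sequence = c.encode_utf8(&mut buffer);
        for b in sequence.bytes() {
            // The kernel would do a volatile write to the UART here instead.
            self.bytes.push(b);
        }
        self.bytes_written += sequence.len();
    }
}

fn main() {
    let mut sink = ByteSink::new();
    for c in "Hi \u{1F980}".chars() {
        sink.write_char(c);
    }
    // Three ASCII bytes plus the four-byte encoding of U+1F980.
    println!("{} bytes: {:02x?}", sink.bytes_written, sink.bytes);
}
```

The counter advances by `sequence.len()`, so it tracks bytes on the wire rather than characters.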
```diff
diff --git a/04_safe_globals/src/bsp/raspberrypi/console.rs b/04_safe_globals/src/bsp/raspberrypi/console.rs
index f340d94..38561ba 100644
--- a/04_safe_globals/src/bsp/raspberrypi/console.rs
+++ b/04_safe_globals/src/bsp/raspberrypi/console.rs
@@ -15,7 +15,7 @@ use core::fmt;
 ///
 /// The mutex protected part.
 struct QEMUOutputInner {
-    chars_written: usize,
+    bytes_written: usize,
 }
 //--------------------------------------------------------------------------------------------------
@@ -39,16 +39,20 @@ static QEMU_OUTPUT: QEMUOutput = QEMUOutput::new();
 impl QEMUOutputInner {
     const fn new() -> QEMUOutputInner {
-        QEMUOutputInner { chars_written: 0 }
+        QEMUOutputInner { bytes_written: 0 }
     }
     /// Send a character.
     fn write_char(&mut self, c: char) {
-        unsafe {
-            core::ptr::write_volatile(0x3F20_1000 as *mut u8, c as u8);
+        let mut buffer = [0u8; 4]; // char can be up to 4 bytes
+        let sequence = c.encode_utf8(&mut buffer);
+        for b in sequence.bytes() {
+            unsafe {
+                core::ptr::write_volatile(0x3F20_1000 as *mut u8, b);
+            }
         }
-        self.chars_written += 1;
+        self.bytes_written += sequence.len();
     }
 }
@@ -110,7 +114,7 @@ impl console::interface::Write for QEMUOutput {
 }
 impl console::interface::Statistics for QEMUOutput {
-    fn chars_written(&self) -> usize {
-        self.inner.lock(|inner| inner.chars_written)
+    fn bytes_written(&self) -> usize {
+        self.inner.lock(|inner| inner.bytes_written)
     }
 }
diff --git a/04_safe_globals/src/console.rs b/04_safe_globals/src/console.rs
index 658cf66..32d7b1b 100644
--- a/04_safe_globals/src/console.rs
+++ b/04_safe_globals/src/console.rs
@@ -21,7 +21,7 @@ pub mod interface {
     /// Console statistics.
     pub trait Statistics {
         /// Return the number of characters written.
-        fn chars_written(&self) -> usize {
+        fn bytes_written(&self) -> usize {
             0
         }
     }
diff --git a/04_safe_globals/src/main.rs b/04_safe_globals/src/main.rs
index 82262ea..9056770 100644
--- a/04_safe_globals/src/main.rs
+++ b/04_safe_globals/src/main.rs
@@ -126,11 +126,11 @@ mod synchronization;
 unsafe fn kernel_init() -> ! {
     use console::interface::Statistics;
-    println!("[0] Hello from Rust!");
+    println!("[0] Hello from Rust! 🦀");
     println!(
         "[1] Chars written: {}",
-        bsp::console::console().chars_written()
+        bsp::console::console().bytes_written()
     );
     println!("[2] Stopping here.");
```
Hi, thanks for sharing this.
To be honest, I am not planning to switch the serial output to UTF-8 for now. A debug serial is supposed to be a very low-overhead vehicle for transporting debug information, so potentially having to transmit multiple bytes for a single character is a bit problematic.
Also, I don't think the receiving side expects UTF-8 from these ancient interfaces.
I hope you understand.
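For a rough sense of the overhead in question, the sketch below compares character and byte counts for the two greeting strings from the diff. The helper `utf8_bytes` is a hypothetical name; it only sums `char::len_utf8`, which is what a UTF-8 console would put on the wire.

```rust
/// Hypothetical helper: number of bytes a UTF-8 console would transmit for `s`.
fn utf8_bytes(s: &str) -> usize {
    s.chars().map(char::len_utf8).sum()
}

fn report(s: &str) {
    println!("{} chars -> {} bytes", s.chars().count(), utf8_bytes(s));
}

fn main() {
    // Pure ASCII: one byte per character.
    report("[0] Hello from Rust!"); // prints "20 chars -> 20 bytes"
    // The crab (U+1F980) alone accounts for four bytes.
    report("[0] Hello from Rust! \u{1F980}"); // prints "22 chars -> 25 bytes"
}
```

ASCII-only output costs the same either way; the extra bytes appear only when non-ASCII characters are actually printed.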