rust-embedded/cortex-m

Run program from SRAM

Closed this issue · 5 comments

Hello,

Is it at all possible to run programs built with cortex-m-rt from anything other than FLASH? More specifically from SRAM of the STM32H750IB? Asking to see if it is worth going down the linker script rabbit hole, or if it is simply not supported by this crate. Thanks!

...

I'm using the Daisy Seed development platform. Their boards use the small 128 kB STM32H7 chips. However, they compensate for it by providing a bootloader. You would first copy the firmware to on an-board QSPI flash storage. The bootloader then copies the firmware to SRAM on boot and runs the program from there. Effectively increasing the maximum firmware size from 128 kB to 512 kB.

The linker script their C++ based SDK uses is available here https://github.com/electro-smith/libDaisy/blob/faddc7402a0e91729cf3dc49e43623a3553159d8/core/STM32H750IB_sram.lds. It places .text and .isr_vector to SRAM.

I tried using it with my Rust program based on cortex-m and stm32h7xx-hal. I added an alias REGION_ALIAS(RAM, DTCMRAM); to overcome the first compilation issue, but sadly it even then fails:

          ERROR(cortex-m-rt): The .text section must be placed inside the FLASH memory.
          Set _stext to an address smaller than 'ORIGIN(FLASH) + LENGTH(FLASH)'

          rust-lld: error:
          ERROR(cortex-m-rt): The .text section must be placed inside the FLASH memory.
          Set _stext to an address smaller than 'ORIGIN(FLASH) + LENGTH(FLASH)'

          rust-lld: error:
          ERROR(cortex-m-rt): The .text section must be placed inside the FLASH memory.
          Set _stext to an address smaller than 'ORIGIN(FLASH) + LENGTH(FLASH)'

          rust-lld: error: section .text virtual address range overlaps with .text
          >>> .text range is [0x24000000, 0x240190AF]
          >>> .text range is [0x24000000, 0x24000073]

          rust-lld: error: section .text load address range overlaps with .text
          >>> .text range is [0x24000000, 0x240190AF]
          >>> .text range is [0x24000000, 0x24000073]

Best regards,
Petr

I'm not familiar with you particular use case, but this is what you need to do normally (using probe-rs or gdb to flash) to do everything in RAM.

First, you need to enable the set-vtor feature flag so the VTOR is initialized correctly in the Reset handler.
Then You need to provide a custom linker script which use only the RAM section.
This is the link.x generated by the cortex-m-rt crate, where simply all the FLASH occurrences are replaced with RAM. (I just copy pasted it, so maybe some cleanup is necessary)
(the last assertion is arch dependent, so check it)

/* # Developer notes

- Symbols that start with a double underscore (__) are considered "private"

- Symbols that start with a single underscore (_) are considered "semi-public"; they can be
  overridden in a user linker script, but should not be referred from user code (e.g. `extern "C" {
  static mut __sbss }`).

- `EXTERN` forces the linker to keep a symbol in the final binary. We use this to make sure a
  symbol is not dropped if it appears in or near the front of the linker arguments and "it's not
  needed" by any of the preceding objects (linker arguments)

- `PROVIDE` is used to provide default values that can be overridden by a user linker script

- On alignment: it's important for correctness that the VMA boundaries of both .bss and .data *and*
  the LMA of .data are all 4-byte aligned. These alignments are assumed by the RAM initialization
  routine. There's also a second benefit: 4-byte aligned boundaries means that you won't see
  "Address (..) is out of bounds" in the disassembly produced by `objdump`.
*/

/* Provides information about the memory layout of the device */
/* This will be provided by the user (see `memory.x`) or by a Board Support Crate */
INCLUDE memory.x

/* # Entry point = reset vector */
EXTERN(__RESET_VECTOR);
EXTERN(Reset);
ENTRY(Reset);

/* # Exception vectors */
/* This is effectively weak aliasing at the linker level */
/* The user can override any of these aliases by defining the corresponding symbol themselves (cf.
   the `exception!` macro) */
EXTERN(__EXCEPTIONS); /* depends on all the these PROVIDED symbols */

EXTERN(DefaultHandler);

PROVIDE(NonMaskableInt = DefaultHandler);
EXTERN(HardFaultTrampoline);
PROVIDE(MemoryManagement = DefaultHandler);
PROVIDE(BusFault = DefaultHandler);
PROVIDE(UsageFault = DefaultHandler);
PROVIDE(SecureFault = DefaultHandler);
PROVIDE(SVCall = DefaultHandler);
PROVIDE(DebugMonitor = DefaultHandler);
PROVIDE(PendSV = DefaultHandler);
PROVIDE(SysTick = DefaultHandler);

PROVIDE(DefaultHandler = DefaultHandler_);
PROVIDE(HardFault = HardFault_);

/* # Interrupt vectors */
EXTERN(__INTERRUPTS); /* `static` variable similar to `__EXCEPTIONS` */

/* # Pre-initialization function */
/* If the user overrides this using the `pre_init!` macro or by creating a `__pre_init` function,
   then the function this points to will be called before the RAM is initialized. */
PROVIDE(__pre_init = DefaultPreInit);

/* # Sections */
SECTIONS
{
  PROVIDE(_ram_start = ORIGIN(RAM));
  PROVIDE(_ram_end = ORIGIN(RAM) + LENGTH(RAM));
  PROVIDE(_stack_start = _ram_end);

  /* ## Sections in FLASH but now in RAM */
  /* ### Vector table */
  .vector_table ORIGIN(RAM) :
  {
    __vector_table = .;

    /* Initial Stack Pointer (SP) value.
     * We mask the bottom three bits to force 8-byte alignment.
     * Despite having an assert for this later, it's possible that a separate
     * linker script could override _stack_start after the assert is checked.
     */
    LONG(_stack_start & 0xFFFFFFF8);

    /* Reset vector */
    KEEP(*(.vector_table.reset_vector)); /* this is the `__RESET_VECTOR` symbol */

    /* Exceptions */
    __exceptions = .; /* start of exceptions */
    KEEP(*(.vector_table.exceptions)); /* this is the `__EXCEPTIONS` symbol */
    __eexceptions = .; /* end of exceptions */

    /* Device specific interrupts */
    KEEP(*(.vector_table.interrupts)); /* this is the `__INTERRUPTS` symbol */
  } > RAM

  PROVIDE(_stext = ADDR(.vector_table) + SIZEOF(.vector_table));

  /* ### .text */
  .text _stext :
  {
    __stext = .;
    *(.Reset);

    *(.text .text.*);

    /* The HardFaultTrampoline uses the `b` instruction to enter `HardFault`,
       so must be placed close to it. */
    *(.HardFaultTrampoline);
    *(.HardFault.*);

    . = ALIGN(4); /* Pad .text to the alignment to workaround overlapping load section bug in old lld */
    __etext = .;
  } > RAM

  /* ### .rodata */
  .rodata : ALIGN(4)
  {
    . = ALIGN(4);
    __srodata = .;
    *(.rodata .rodata.*);

    /* 4-byte align the end (VMA) of this section.
       This is required by LLD to ensure the LMA of the following .data
       section will have the correct alignment. */
    . = ALIGN(4);
    __erodata = .;
  } > RAM

  /* ## Sections in RAM */
  /* ### .data */
  .data : ALIGN(4)
  {
    . = ALIGN(4);
    __sdata = .;
    *(.data .data.*);
    . = ALIGN(4); /* 4-byte align the end (VMA) of this section */
  } > RAM AT>RAM
  /* Allow sections from user `memory.x` injected using `INSERT AFTER .data` to
   * use the .data loading mechanism by pushing __edata. Note: do not change
   * output region or load region in those user sections! */
  . = ALIGN(4);
  __edata = .;

  /* LMA of .data */
  __sidata = LOADADDR(.data);

  /* ### .gnu.sgstubs
     This section contains the TrustZone-M veneers put there by the Arm GNU linker. */
  /* Security Attribution Unit blocks must be 32 bytes aligned. */
  /* Note that this pads the FLASH usage to 32 byte alignment. */
  .gnu.sgstubs : ALIGN(32)
  {
    . = ALIGN(32);
    __veneer_base = .;
    *(.gnu.sgstubs*)
    . = ALIGN(32);
  } > RAM
  /* Place `__veneer_limit` outside the `.gnu.sgstubs` section because veneers are
   * always inserted last in the section, which would otherwise be _after_ the `__veneer_limit` symbol.
   */
  . = ALIGN(32);
  __veneer_limit = .;

  /* ### .bss */
  .bss (NOLOAD) : ALIGN(4)
  {
    . = ALIGN(4);
    __sbss = .;
    *(.bss .bss.*);
    *(COMMON); /* Uninitialized C statics */
    . = ALIGN(4); /* 4-byte align the end (VMA) of this section */
  } > RAM
  /* Allow sections from user `memory.x` injected using `INSERT AFTER .bss` to
   * use the .bss zeroing mechanism by pushing __ebss. Note: do not change
   * output region or load region in those user sections! */
  . = ALIGN(4);
  __ebss = .;

  /* ### .uninit */
  .uninit (NOLOAD) : ALIGN(4)
  {
    . = ALIGN(4);
    __suninit = .;
    *(.uninit .uninit.*);
    . = ALIGN(4);
    __euninit = .;
  } > RAM

  /* Place the heap right after `.uninit` in RAM */
  PROVIDE(__sheap = __euninit);

  /* Place stack end at the end of allocated RAM */
  PROVIDE(_stack_end = __euninit);

  /* ## .got */
  /* Dynamic relocations are unsupported. This section is only used to detect relocatable code in
     the input files and raise an error if relocatable code is found */
  .got (NOLOAD) :
  {
    KEEP(*(.got .got.*));
  }

  /* ## Discarded sections */
  /DISCARD/ :
  {
    /* Unused exception related info that only wastes space */
    *(.ARM.exidx);
    *(.ARM.exidx.*);
    *(.ARM.extab.*);
  }
}

/* Do not exceed this mark in the error messages below                                    | */
/* # Alignment checks */
ASSERT(ORIGIN(RAM) % 4 == 0, "
ERROR(cortex-m-rt): the start of the RAM region must be 4-byte aligned");

ASSERT(__sdata % 4 == 0 && __edata % 4 == 0, "
BUG(cortex-m-rt): .data is not 4-byte aligned");

ASSERT(__sidata % 4 == 0, "
BUG(cortex-m-rt): the LMA of .data is not 4-byte aligned");

ASSERT(__sbss % 4 == 0 && __ebss % 4 == 0, "
BUG(cortex-m-rt): .bss is not 4-byte aligned");

ASSERT(__sheap % 4 == 0, "
BUG(cortex-m-rt): start of .heap is not 4-byte aligned");

ASSERT(_stack_start % 8 == 0, "
ERROR(cortex-m-rt): stack start address is not 8-byte aligned.
If you have set _stack_start, check it's set to an address which is a multiple of 8 bytes.
If you haven't, stack starts at the end of RAM by default. Check that both RAM
origin and length are set to multiples of 8 in the `memory.x` file.");

ASSERT(_stack_end % 4 == 0, "
ERROR(cortex-m-rt): end of stack is not 4-byte aligned");

ASSERT(_stack_start >= _stack_end, "
ERROR(cortex-m-rt): stack end address is not below stack start.");

/* # Position checks */

/* ## .vector_table
 *
 * If the *start* of exception vectors is not 8 bytes past the start of the
 * vector table, then we somehow did not place the reset vector, which should
 * live 4 bytes past the start of the vector table.
 */
ASSERT(__exceptions == ADDR(.vector_table) + 0x8, "
BUG(cortex-m-rt): the reset vector is missing");

ASSERT(__eexceptions == ADDR(.vector_table) + 0x40, "
BUG(cortex-m-rt): the exception vectors are missing");

ASSERT(SIZEOF(.vector_table) > 0x40, "
ERROR(cortex-m-rt): The interrupt vectors are missing.
Possible solutions, from most likely to less likely:
- Link to a svd2rust generated device crate
- Check that you actually use the device/hal/bsp crate in your code
- Disable the 'device' feature of cortex-m-rt to build a generic application (a dependency
may be enabling it)
- Supply the interrupt handlers yourself. Check the documentation for details.");

/* ## .text */
ASSERT(ADDR(.vector_table) + SIZEOF(.vector_table) <= _stext, "
ERROR(cortex-m-rt): The .text section can't be placed inside the .vector_table section
Set _stext to an address greater than the end of .vector_table (See output of `nm`)");

ASSERT(_stext > ORIGIN(RAM) && _stext < ORIGIN(RAM) + LENGTH(RAM), "
ERROR(cortex-m-rt): The .text section must be placed inside the RAM memory.
Set _stext to an address within the RAM region.");

/* # Other checks */
ASSERT(SIZEOF(.got) == 0, "
ERROR(cortex-m-rt): .got section detected in the input object files
Dynamic relocations are not supported. If you are linking to C code compiled using
the 'cc' crate then modify your build script to compile the C code _without_
the -fPIC flag. See the documentation of the `cc::Build.pic` method for details.");
/* Do not exceed this mark in the error messages above                                    | */

ASSERT(SIZEOF(.vector_table) <= 0x400, "
There can't be more than 240 interrupt handlers. This may be a bug in
your device crate, or you may have registered more than 240 interrupt
handlers.");

/* Provides weak aliases (cf. PROVIDED) for device specific interrupt handlers */
/* This will usually be provided by a device crate generated using svd2rust (see `device.x`) */
INCLUDE device.x

You can leave the memory.x file as is, or you can remove the FLASH section entirely:

MEMORY
{
  RAM   : ORIGIN = 0x20000000, LENGTH = 64K
}

Then in build.rs, you need to specify the custom linker script as follow (assuming the custom link_ram file is in the same dir a build.rs):

 ...
 File::create(out.join("link_ram.x"))
        .unwrap()
        .write_all(include_bytes!("link_ram.x"))
        .unwrap();
 println!("cargo:rustc-link-search={}", out.display());
 ...
 println!("cargo:rustc-link-arg=-Tlink_ram.x");
 ...

That's brilliant, thanks a lot @eulerdisk for taking the time to write that down. I'll try it out and report back

Incredible, it worked! I took the generated link.x from /target/thumbv7em-none-eabihf/release/build/cortex-m-rt-675d4734194ffbbb/out/link.x, replaced FLASH with SRAM, updated build.rs and now I'm able to us the bootloader. Thanks again.

@phoracek Actually, if you don't need further customization and the standard link.x provided by cortex-m-rt works for you, then you can probably get by with just modifying the memory.x like this:

MEMORY
{
  RAM : ORIGIN = 0x20000000, LENGTH = 64K
}

REGION_ALIAS("FLASH", RAM);

@eulerdisk that's much neater, thanks so much! In case any Daisy users stumble upon this - I summarized the process together with an example specific to the Daisy platform here: https://github.com/zlosynth/daisy/tree/main/examples/bootloader