ARM-software/CMSIS-FreeRTOS

CMSIS::RTOS2::FreeRTOS doesn't work if MicroLIB is not used

escherstair opened this issue ยท 19 comments

I found a problem while playing a little bit with FTP Server example from Oryx-Embedded on STM32H743I-EVAL board.
This example works out-of-the-box using RTOS2 interface, implemented through RTX5.
Since the example needs File System from Keil.MDK-Middleware, MicroLIB cannot be used.

If I change RTOS2 implementation from RTX5 to FreeRTOS (downloaded through its own CMSIS pack), the example doesn't work anymore.
The most visible issue is that void HAL_IncTick(void) is never called and the interrupt stays in pending state forever.
Has someone an idea on what could be the reason?

For this reason I investigated deeper inside the sources to find the differences between RTX5 and FreeRTOS implementations behind RTOS2 interface.

I found some strange things inside clib_arm.c (used by CMSIS-FreeRTOS) compared to rtx_lib.c (used by RTX5).
I summarize what I see:

Point 1
This comes from rtx_lib.c
https://github.com/ARM-software/CMSIS_5/blob/b0c7f1933926c805292b0368eb19e55b6646d18b/CMSIS/RTOS2/RTX/Source/rtx_lib.c#L623-L630
where I see osKernelInitialize() is called.
This comes from clib_arm.c

#ifndef __MICROLIB
__WEAK
void _platform_post_stackheap_init (void);
void _platform_post_stackheap_init (void) {
/* Initialize OS, memory, etc. */
#if defined(RTE_Compiler_EventRecorder)
EvrFreeRTOSSetup(0);
#endif
}
#endif /* __MICROLIB */

where I see that nothing is node (apart from Event Recording, which is NOT defined in the FTP Server example).

Point 2
This comes from rtx_lib.c
https://github.com/ARM-software/CMSIS_5/blob/b0c7f1933926c805292b0368eb19e55b6646d18b/CMSIS/RTOS2/RTX/Source/rtx_lib.c#L690-L693
and the define is different from what I see from clib_arm.c

#if (( defined(__CC_ARM) || (defined(__ARMCC_VERSION) && (__ARMCC_VERSION >= 6010050))) && !defined(__MICROLIB))

since RTX_NO_MULTITHREAD_CLIB is missing from this one.

Point 3
This comes from rtx_lib.c
https://github.com/ARM-software/CMSIS_5/blob/b0c7f1933926c805292b0368eb19e55b6646d18b/CMSIS/RTOS2/RTX/Source/rtx_lib.c#L701-L707
where I can see that the variables are placed in a bss.os.libspace region.
This comes from clib_arm.c

/* Libspace memory pool */
static uint32_t os_libspace[OS_THREAD_LIBSPACE_NUM+1][LIBSPACE_SIZE/sizeof(uint32_t)];
/* Array of Threads (IDs) using libspace */
static TaskHandle_t os_libspace_id[OS_THREAD_LIBSPACE_NUM];

where no explicit placement is done.

Point 4
Functions from rtx_lib.c
https://github.com/ARM-software/CMSIS_5/blob/b0c7f1933926c805292b0368eb19e55b6646d18b/CMSIS/RTOS2/RTX/Source/rtx_lib.c#L710-L794
call RTOS2 functions (osKernelGetState(), osThreadGetId(), ...)
On the other side, functions from clib_arm.c

/* OS Kernel state checking */
static uint32_t os_kernel_is_active (void) {
if (xTaskGetSchedulerState() == taskSCHEDULER_RUNNING) {
return 1U;
} else {
return 0U;
}
}
/* Provide libspace for current thread */
void *__user_perthread_libspace (void);
void *__user_perthread_libspace (void) {
TaskHandle_t id;
uint32_t n;
if (!os_kernel_is_active()) {
return (void *)&os_libspace[OS_THREAD_LIBSPACE_NUM][0];
}
id = xTaskGetCurrentTaskHandle();
for (n = 0U; n < OS_THREAD_LIBSPACE_NUM; n++) {
if (os_libspace_id[n] == NULL) {
os_libspace_id[n] = id;
return (void *)&os_libspace[n][0];
}
if (os_libspace_id[n] == id) {
return (void *)&os_libspace[n][0];
}
}
return (void *)&os_libspace[n][0];
}
/*----------------------------------------------------------------------------*/
#if (OS_MUTEX_CLIB_NUM > 0)
static StaticSemaphore_t clib_mutex_cb[OS_MUTEX_CLIB_NUM];
static SemaphoreHandle_t clib_mutex_id[OS_MUTEX_CLIB_NUM];
#endif
/* Define mutex object and function prototypes */
typedef void *mutex;
__USED int _mutex_initialize(mutex *m);
__USED void _mutex_acquire (mutex *m);
__USED void _mutex_release (mutex *m);
__USED void _mutex_free (mutex *m);
/* Initialize mutex */
int _mutex_initialize(mutex *m) {
#if (OS_MUTEX_CLIB_NUM > 0)
uint32_t i;
#endif
*m = NULL;
#if (OS_MUTEX_CLIB_NUM > 0)
for (i = 0U; i < OS_MUTEX_CLIB_NUM; i++) {
if (clib_mutex_id[i] == NULL) {
/* Create mutex using static memory */
clib_mutex_id[i] = xSemaphoreCreateMutexStatic(&clib_mutex_cb[i]);
/* Return mutex id */
*m = clib_mutex_id[i];
return 1;
}
}
#endif
if (os_kernel_is_active()) {
/* Create mutex using dynamic memory */
*m = xSemaphoreCreateMutex();
}
if (*m == NULL) {
return 0;
}
return 1;
}
/* Acquire mutex */
void _mutex_acquire(mutex *m) {
if (os_kernel_is_active()) {
xSemaphoreTake(*m, portMAX_DELAY);
}
}
/* Release mutex */
void _mutex_release(mutex *m) {
if (os_kernel_is_active()) {
xSemaphoreGive(*m);
}
}
/* Free mutex */
void _mutex_free(mutex *m) {
#if (OS_MUTEX_CLIB_NUM > 0)
uint32_t i;
#endif
vSemaphoreDelete(*m);
#if (OS_MUTEX_CLIB_NUM > 0)
/* Check if mutex was using static memory */
for (i = 0U; i < OS_MUTEX_CLIB_NUM; i++) {
if (*m == clib_mutex_id[i]) {
clib_mutex_id[i] = NULL;
break;
}
}
#endif
}

call a lot of native FreeRTOS functions (xTaskGetSchedulerState(), xTaskGetCurrentTaskHandle(), ...) and I don't understand if the do exactly what is expected from them.

Does clib_arm.c needs an upgrade to be aligned with rtx_lib.c?

Hi,

I briefly looked at the example structure and HAL_IncTick is called in startup_stm32h743xx.s where default weak reference to SysTick_Handler was modified and renamed to _SysTick_Handler.
If you updated startup files then your updated file does not contain this modification anymore and this could be a reason that HAL_IncTick is never called.

Hi @VladimirUmek
I'm not sure I understood what you explained.
I see that HAL_IncTick is called in startup_stm32h743xx.s inside _SysTick_Handler.
But _SysTick_Handler is never called when I enabled FreeRTOS, and the SysTick interrupt stays in pending state (never serviced).

In my previous response I was just searching for a reason that HAL_IncTick is never called. And if you would update startup_stm32h743xx.s then this could be one option - because with the update, modification in startup that calls HAL_IncTick would be gone.
But as you said, you still have this code and SysTick interrupt is apparently never serviced.
Is SysTick interrupt enabled? What is its priority?
Another option would be that interrupts are masked, check BASEPRI value.

Unfortunately I cannot test the example because I don't have any appropriate board, but I'll investigate further and check what could go wrong.

Thanks to your suggestion I get some additional info.
SysTick is enabled and has priority set to 15
image
But BASEPRI is not what is expected.
It's set to 0x50 (which is the value used for the OS critical regions) somewhere in the scatterload process, before FreeRTOS scheduler has been configured.
I'm going to investigate, but feel free to share your thoughts.

Hi @VladimirUmek
I was able to see the reason why this happens, and I think this confirmed my doubts on clib_arm.c implementation.
BASEPRI is set to 0x50 during __rt_entry() which calls _init_alloc that calls _mutex_initialize (see below call stack)
image
It calls explicitly xSemaphoreCreateMutexStatic() but FreeRTOS has not been created yet.
Following the call stack, xQueueGenericReset() is called

BaseType_t xQueueGenericReset( QueueHandle_t xQueue,
BaseType_t xNewQueue )

and it calls
taskENTER_CRITICAL();

setting BASEPRI to 0x50
The problem is that when
taskEXIT_CRITICAL();

is called, this function is mapped to vPortExitCritical()
void vPortExitCritical( void )
{
configASSERT( uxCriticalNesting );
uxCriticalNesting--;
if( uxCriticalNesting == 0 )
{
portENABLE_INTERRUPTS();
}
}

But variable uxCriticalNesting has not been set yet and it has the default value of
static UBaseType_t uxCriticalNesting = 0xaaaaaaaa;

The reason is because it's initilized by

when FreeRTOS scheduler is started.

Do you agree to my analysis?
Do you think that a fix to clib_arm.c is necessary?
Maybe the implementation for RTOS2::RTX5 can be used as a guideline (rtx_lib.c)?

Many thanks again for the detailed analysis! This helps a lot and I understand the problem now.

I don't think this is the problem of implementation in clib_arm.c but I would say that it is the problem of port (initialization) implementation in FreeRTOS:
Before you start the kernel you can create threads and other objects like mutexes, semaphores etc. If you would have an application that only contains

int main(void) {
  xSemaphoreCreateMutexStatic(...);

  xTaskCreate(...);
  
  vTaskStartScheduler();
}

the situation after the call of xSemaphoreCreateMutexStatic would be exactly the same as in your application:
vPortEnterCritical disables interrupts and uxCriticalNesting prevents vPortExitCritical to re-enable them. And function xPortStartScheduler is the only point where uxCriticalNesting is set to zero.

This means that any application that creates RTOS objects before the kernel is started has its interrupts blocked (up to BASEPRI).

In case of our application, pre-main calls xSemaphoreCreateMutexStatic thus disables interrupts and later on hangs in a call to HAL_GetTick because SysTick handler cannot execute.

You can take a look at examples (Blinky is fine) in Keil.STM32H7xx_DFP for one possible solution to this problem (re-implement HAL_GetTick and enable it for the case when the kernel is not yet started).

Thanks for your help.
Let me summarize with different words what I understood, so that you can confirm if I understood...

  • creating threads, mutexes, semaphores, ... before the kernel is started is a supported scenario (i.e. a scenario that should work)
  • the port (initialization) of FreeRTOS is buggy, because of the behavior I described in my previous post
  • for this reason, every function relying over HAL_Delay called before the kernel is started, would hang forever (becasue SysTick handler cannot execute)
  • a "quick-and-dirty" workaround would be reimplementing HAL_GetTick so that id "emulates" the job of SysTick handler while the kernel has not been started yet

Am I right?
If this si the case, I think that the best long-term solution would be fixing the FreeRTOS porting.
Do you agree?
Who is in charge for this porting? CMSIS or FreeRTOS?
So that I know where to open an issue

  • Yes
  • Such port initialization of FreeRTOS is most likely a design choice in order to prevent interrupt handlers from calling FreeRTOS API function and therefore keeping FreeRTOS kernel variables and state machine consistent (because the kernel is not running yet)
  • Yes. It is possible to execute interrupt if its priority is above BASEPRI (see https://www.freertos.org/RTOS-Cortex-M3-M4.html), but I would not recommend doing this in this case (if HAL_IncTick would be the only function called in interrupt handler this would be ok, but in our case also FreeRTOS tick handler is called which changes kernel state machine)
  • Reimplementing HAL_GetTick as in Blinky from Keil.STM32H7xx_DFP also work after the kernel is started.

As I said, I think that such behavior is a design choice to keep the kernel state machine consistent until the kernel is started. FreeRTOS maintains the ports but I don't think this is a bug although in some cases it may represent a problem for application designer...

You are correct that the FreeRTOS behaviour is intended. Every port behaves this way, and has done from very early versions of FreeRTOS where a primary cause of support requests were interrupts starting to use the FreeRTOS API before the kernel had been started - a particularly difficult issue to debug. See point 4 "Interrupts are not executing" on this page. A simple work around is to manually clear the primask register when necessary - but then you may need additional code to prevent the FreeRTOS API trying to context switch in the systick before the kernel is running - which will be inefficient as the additional code will run every time the systick executes.

Hello @RichardBarry,
first of all thank you very much for the detailed information.
The behavior is quite odd, and so it's not easy understanding what happens.

I'm not sure I got your point:
as far as I understand, you say that FreeRTOS needs that interrupts don't use the FreeRTOS API before the kernel has been started.
Am I right?

If this is the case, I understand that a "quick-and-dirty" workaround has been taken from very early versions of FreeRTOS: disabling interrupts at all until the kernel is started.
It seems to me quite a "brutal" wrkaround (even if it's stated at point 4 here):

If a FreeRTOS API function is called before the scheduler has been started then interrupts will deliberately be left disabled, and
not re-enable again until the first task starts to execute. This is done to protect the system from crashes caused by interrupts
attempting to use FreeRTOS API functions during system initialisation
, before the scheduler has been started, and while the
scheduler may be in an inconsistent state.

It seems that a normal function (not an interrupt) could use FreeRTOS API functions during system initialisation.
And this is what happens inside __scatterload
Can you confirm that this usage is ok for FreeRTOS?

Then you write:

but then you may need additional code to prevent the FreeRTOS API trying to context switch in the systick before the kernel is running - which will be inefficient as the additional code will run every time the systick executes.

This seems to me exactly what CMSIS-FreeRTOS adoption layer does here

if (xTaskGetSchedulerState() != taskSCHEDULER_NOT_STARTED) {
/* Call tick handler */
xPortSysTickHandler();
}
}

It seems that CMSIS-FreeRTOS protects from the risk of "interrupts attempting to use FreeRTOS API functions during system initialisation" inside SysTick_Handler() and FreeRTOS itself protects with a heavy decision of disabling the interrupts.
And so I get both the downsides:

  • interrupts disabled (HAL_Delay() not working)
  • "inefficient" code in SysTick_Handler()

Are you sure is not possible handling this risk in a different way?

It seems to me quite a "brutal" wrkaround

Maybe - but very effective. It has probably saved many weeks of frustration for FreeRTOS users that otherwise would experience crashes caused outside of their view of the program execution - which can be tricky to track down.

This is rarely a problem for users who create all their RTOS objects then start the scheduler. Another common usage model is to create a task that performs the initialisation, then spawns the other tasks, so perform initialisation with the scheduler already running.

It seems that a normal function (not an interrupt) could use FreeRTOS API functions during system initialisation.
And this is what happens inside __scatterload
Can you confirm that this usage is ok for FreeRTOS?

That is not a problem - although I would question why interrupts are used in __scatterload().

This seems to me exactly what CMSIS-FreeRTOS adoption layer does here

Yes it looks like that is the case. I also wrote:

which will be inefficient as the additional code will run every time the systick executes

So in the code you posted every tick interrupt has the overhead of an extra function call and an extra test. Looks like fixing the symptom rather than the cause.

Are you sure is not possible handling this risk in a different way?

In FreeRTOS we try and avoid additional code that checks the system state every time (for example, see https://freertos.org/FAQ_API.html#IQRAPI - there are trade offs made for every decision) - we would welcome any ideas for preventing interrupts using the API before the scheduler is running without needing any additional code that will then also be present for the lifetime of the application.

Thank you a lot @RichardBarry
Now the whole situation is much more clear.

I fully agree to the fact that every decision requires some trade offs.

Based on what I learned from this lesson, I've been thinking on if/how something could be improved.
What happens with actual implementation of __scatterload in the case when microlib is not used (it leaves interrupts globally disabled after having created mutextes to protect libc streams) is that FreeRTOS users experience failures caused outside of their view of the program execution - and this is tricky to track down.
Based on your experience, do you (@VladimirUmek and @RichardBarry) think that __scatterload could/should re-enable interrupts as its last instruction?
In this way the FreeRTOS user doesn't find an unexpected situation as the starting point, and he's responsible to write code using FreeRTOS as designed.
I'm not sure who is responsible to evaluate this decision. I think someone in ARM.

The only idea to improve FreeRTOS side (if possible) is to inform the caller about the "side-effect" of the function call. I think this is a general best-practise.
I mean, if a FreeRTOS function doesn't re-enable interrupts because the scheduler has not been started yet, I think this is a side-effect of the call. And as every side effect, the caller should be somehow informed about it.
Is there a "standard" way to inform FreeRTOS callers about side-effects? A way already used, I mean.
@RichardBarry feel free to share your ideas on thsi suggestion.

I don't think there is right or wrong answer in this case.

One scenario that could happen in mentioned FTP demo application:

  1. Assume that after mutex creation with xSemaphoreCreateMutexStatic interrupts are re-enabled.
  2. Application arrives to main and HAL_Init configures SysTick and starts it.
  3. _SysTick_Handler executes, calls HAL_IncTick and calls SysTick_Handler.
  4. Assume that if (xTaskGetSchedulerState() != taskSCHEDULER_NOT_STARTED) check is removed from line 160, cmsis_os2.c.
  5. xPortSysTickHandler is executed

Now, application didn't start the kernel, but the kernel tick is already started and is incrementing. And we can start discussing why the kernel tick is incrementing if the kernel is not even started. Who started SysTick? Who should start SysTick? Why is SysTick used by both STM32Cube HAL and FreeRTOS? Etc, etc...

There is A LOT of existing code that is wrong. That's why interrupts are disabled after RTOS objects get created. That's why there is a check if (xTaskGetSchedulerState() != taskSCHEDULER_NOT_STARTED) in cmsis_os2.c.
We improve documentation but many developers don't read it. We add additional checking in the code and developers complain about overhead.

And, in mentioned application, which function hangs because HAL_IncTick is never called? Is it MX_SDMMC1_SD_Init? If it is, well, this function is not even necessary to be called in order for application to work...

Trade offs...

It is an ugly situation, I agree and can document it, but from my perspective, this is integration issue in application.

Not directly related: I understand why C startup code enables interrupts before calling main(), but I don't understand why that happens inside __scatterload(). As far as I know __scatterload() is responsible for initialising memory.

Yes and the only thing that happens inside __scatterload is memory initialization.
After memory regions are initialized, control is passed to __rt_entry where standard C library is initialized. C library uses mutexes to protect its internal variables and streams (default heap management + stdio streams). And this is where xSemaphoreCreateMutexStatic is called. That's it. And soon after that we end up in main.

Thank you @VladimirUmek.
You explained much better than me what happens during C-lib "initialization".

Now everything is clear, but what confuses me at the beginning was that:

  • if I use MicroLib, main starts with interrupts enabled
  • if I don't use MicroLib, main starts with interrupts disabled
    FreeRTOS is used in both cases.
    Now I understand why and I know how to write my code so that it works in both cases (and even if I switch to another CMSIS::RTOs2 implementation, I hope)

I agree that reaching main once with interrupts disabled and once with interrupts enabled is not a good solution. Therefore, I will look if clib_arm.c could re-enable interrupts as Richard suggested. If there is a clean way to add this and there will be no side effects, interrupt will be re-enabled.

We will document this behavior also in CMSIS-FreeRTOS documentation.

I apologize for asking as not part of the original conversation but I have had the same issue, in general. I have no problem with the reasoning of the start up procedure. It was just annoying having to trace down where there might be a an unstable irq or mutex state, especially when running into stack overflows/corruption. This is an old suggestion but I was reminded of it when I was going though some ddk function calls back in the Window 98 days where most functions had blank define annotations for function arguments. As a simple example

#define __AQUIRED
#define __RELEASED
__USED void _mutex_acquire   (__RELEASED mutex *m);
__USED void _mutex_release   (__AQUIRED mutex *m);

I hate to say, having the tags _HEAP_ONLY and _MUST_FREE has saved me many times in smaller projects where I had to do simple one off changes. With the almost mandatory need for IDE's to support IntelliSense, doing something like a full Doxygen implementation seems overkill. The point of the mark is not to notify somone who knows the deeper implementation, rather, somone having to do a quick stack trace with nothing other than a FUNCTION string.

This is just an unsolicited comment, its just nice to come across an interesting conversation. Especially after 4 hours of debugging a stack corruption error to find the pointer to the heap was swaped.

Sources and documentation were updated, hence closing the issue. Please reopen in case anything needs further attention.