FreeRTOS/FreeRTOS-Kernel

[BUG] Premature changing of pxCurrentTCB BEFORE actual context switch in Win32/64 Port

Vitaly1983 opened this issue · 1 comments

pxCurrentTCB is currently changed BEFORE newly selected user FreeRTOS thread (task) is resumed. This leads to sporadic errors with working of system objects, e.g., mutexes.

Target
Win32/64 FreeRTOS simulator port. FreeRTOS Kernel V10.5.1, 'port.c' source file

Host
Any

To Reproduce
This will lead to sporadic errors, in program on multicore CPUs. The more threads will be in program, and more often they will
be switched, the higher probability of errors occuring. Errors will be related to system objects (e.g., mutexes), which access pxCurrentTCB pointer.

The problem is that inside prvProcessSimulatedInterrupts()->vTaskSwitchContext()->taskSELECT_HIGHEST_PRIORITY_TASK()
->listGET_OWNER_OF_NEXT_ENTRY() there is a line:

( pxTCB ) = ( pxConstList )->pxIndex->pvOwner; there pxTCB == pxCurrentTCB

BUT, the thread which was assigned to the previously selected pxCurrentTCB, is NOT suspended during execution of vTaskSwitchContext(). If inside listGET_OWNER_OF_NEXT_ENTRY() a new thread will be selected for execution, then the previously executed will be suspended only a few lines below:

			vTaskSwitchContext();

			/* If the task selected to enter the running state is not the task
			that is already in the running state. */
			if( pvOldCurrentTCB != pxCurrentTCB )
			{
				/* Suspend the old thread.  In the cases where the (simulated)
				interrupt is asynchronous (tick event swapping a task out rather
				than a task blocking or yielding) it doesn't matter if the
				'suspend' operation doesn't take effect immediately - if it
				doesn't it would just be like the interrupt occurring slightly
				later.  In cases where the yield was caused by a task blocking
				or yielding then the task will block on a yield event after the
				yield operation in case the 'suspend' operation doesn't take
				effect immediately.  */
				pxThreadState = ( ThreadState_t *) *( ( size_t * ) pvOldCurrentTCB );
                -->>> SuspendThread( pxThreadState->pvThread );

Since there is a time gap between re-assigning pxCurrentTCB pointer and the actual suspension of previously active thread, if that thread checks for pxCurrentTCB variable in this same moment, the returned value will be incorrect.

I added the following quick fix to prvProcessSimulatedInterrupts() function:

            if (pvOldCurrentTCB)
            {
                pxThreadState = ( ThreadState_t *) *( ( size_t * ) pvOldCurrentTCB ); // <-- 1
                if (pxThreadState && pxThreadState->pvThread)
                    SuspendThread( pxThreadState->pvThread ); // <-- 2
            }

			/* Select the next task to run. */
			vTaskSwitchContext();

			/* If the task selected to enter the running state is not the task
			that is already in the running state. */
			if( pvOldCurrentTCB != pxCurrentTCB )
			{
				/* Suspend the old thread.  In the cases where the (simulated)
				interrupt is asynchronous (tick event swapping a task out rather
				than a task blocking or yielding) it doesn't matter if the
				'suspend' operation doesn't take effect immediately - if it
				doesn't it would just be like the interrupt occurring slightly
				later.  In cases where the yield was caused by a task blocking
				or yielding then the task will block on a yield event after the
				yield operation in case the 'suspend' operation doesn't take
				effect immediately.  */
				pxThreadState = ( ThreadState_t *) *( ( size_t * ) pvOldCurrentTCB );
                // SuspendThread( pxThreadState->pvThread );

				/* Ensure the thread is actually suspended by performing a
				synchronous operation that can only complete when the thread is
				actually suspended.  The below code asks for dummy register
				data.  Experimentation shows that these two lines don't appear
				to do anything now, but according to
				https://devblogs.microsoft.com/oldnewthing/20150205-00/?p=44743
				they do - so as they do not harm (slight run-time hit). */
				xContext.ContextFlags = CONTEXT_INTEGER;
				( void ) GetThreadContext( pxThreadState->pvThread, &xContext );

				/* Obtain the state of the task now selected to enter the
				Running state. */
				pxThreadState = ( ThreadState_t * ) ( *( size_t *) pxCurrentTCB );

				/* pxThreadState->pvThread can be NULL if the task deleted
				itself - but a deleted task should never be resumed here. */
				configASSERT( pxThreadState->pvThread != NULL );
				ResumeThread( pxThreadState->pvThread );
            }
            else
            {
                if (pvOldCurrentTCB && pxThreadState && pxThreadState->pvThread)
                    ResumeThread( pxThreadState->pvThread ); // <-- 2
            }

Some checks for non-NULL pointers may be redundant, but the idea should be clear. It may be even more clean not to use Resume/Suspend API in case if pxCurrentTCB was not changed. Maybe, by making vTaskSwitchContext() not to update pxCurrentTCB variable inside, but to return pointer to newly selected thread to run.

After this change, strange sporadic errors that occured before in my program, seems to disappear.

Would you please try with the following snippet and see that it addresses the issue you mentioned:

if( ulSwitchRequired != pdFALSE )
            {
                /* Suspend the old thread. */
                pxThreadState = ( ThreadState_t * ) *( ( size_t * ) pxCurrentTCB );
                SuspendThread( pxThreadState->pvThread );

                /* Ensure the thread is actually suspended by performing a
                 * synchronous operation that can only complete when the thread
                 * is actually suspended. The below code asks for dummy register
                 * data. Experimentation shows that these two lines don't appear
                 * to do anything now, but according to
                 * https://devblogs.microsoft.com/oldnewthing/20150205-00/?p=44743
                 * they do - so as they do not harm (slight run-time hit). */
                xContext.ContextFlags = CONTEXT_INTEGER;
                ( void ) GetThreadContext( pxThreadState->pvThread, &xContext );

                /* Select the next task to run. */
                vTaskSwitchContext();

                /* Obtain the state of the task now selected to enter the
                 * Running state. */
                pxThreadState = ( ThreadState_t * ) ( *( size_t * ) pxCurrentTCB );

                /* pxThreadState->pvThread can be NULL if the task deleted
                 * itself - but a deleted task should never be resumed here. */
                configASSERT( pxThreadState->pvThread != NULL );
                ResumeThread( pxThreadState->pvThread );
            }