It seems that this fork does not support STEP_PULSE_DELAY

Question

It seems that this fork does not support STEP_PULSE_DELAY

mstrens opened this issue 6 years ago · 20 comments

Normal (AVR) GRBL code supports #define STEP_PULSE_DELAY (in stepper.c)
This parameter allows to add a delay between setting the DIRECTION pins and the start of the STEP pulse in the stepper.c file.
This can be important because many stepper drivers ask for such a delay.
I would like to use some cheap clone of TB6600 drivers and I noticed that the used optocoupler on the DIR signal is much lower than the optocoupler on the STEP signal. I expect I need a delay of about 10usec to let the driver get the right direction when the pulse start.

Do you plan to support this parameter in the future?

Note: I think it could be done without using a new timer, just by using an interrupt on a compare of an existing timer.

Thanks in advance for your reply.

Answer 1 · 2018-07-13T23:11:19.000Z

Does this option not supported in the current version?
Or maybe it just working incorrectly?
config.h -- > #define STEP_PULSE_DELAY 10

Answer 2 · 2018-07-14T07:21:26.000Z

It seems me clear that the file stepper.c has not been adapted in this STM32 version to support this functionality. Current vesion would not take care of the parameter.
It would require adding some code.

Answer 3 · 2018-07-15T17:15:46.000Z

Ok. the problem is obvious.

Answer 4 · 2018-07-16T16:29:53.000Z

Ok, I hope I've done it (in native arduino GRBL style (idea) ).
I've tested it, and it's work.
Could you test it, please?
10us STEP_PULSE_DELAY

3us STEP_PULSE_DELAY

without STEP_PULSE_DELAY

steper.c
stepper.zip

Answer 5 · 2018-07-21T06:50:49.000Z

I did not test your code (no board currently available) but I checked the code.
It seems ok. I still can make some minor comments:
a )
in line 501, you have
st.step_bits = (GPIO_ReadOutputData(STEP_PORT) & ~STEP_MASK) | st.step_outbits; // Store out_bits to prevent overwriting.
and in line 759, you have
GPIO_Write(STEP_PORT, (st.step_bits & ~STEP_MASK) | st.step_outbits); // Begin step pulse.
Between the 2 lines there is some delay. So some bits from STEP_PORT could have changed.
It seems me safier to just save st.step_outbits in 501 and in 759 to read STEP_PORT again before applying some mask.
So
501 would be : st.step_bits = st.step_outbits; // Store out_bits to prevent overwriting.
759 would be : GPIO_Write(STEP_PORT, (GPIO_ReadOutputData(STEP_PORT) & ~STEP_MASK) | st.step_bits);

b)
in line 340 you have:
TIM3->CCR1 = (STEP_PULSE_DELAY - 1)*TICKS_PER_MICROSECOND;
I do not understand why you take 1 off STEP_PULSE_DELAY.
I can understand that you could take into account the time required by the STM32 to save the registers and enter the interrupt when TIM3 fires but I expect that this would require less then TICKS_PER_MICROSECOND cycles.
It seems me better to put " - 1" at the end (like in line 356)
So code would be TIM3->CCR1 = (STEP_PULSE_DELAY *TICKS_PER_MICROSECOND ) - 1;

Answer 6 · 2018-07-21T07:19:20.000Z

a) I will think about it.
b) there is 890ns delay witout using STEP_PULSE_DELAY. As we can see if we use STEP_PULSE_DELAY=3, we have 4us delay (with "my" -1 !). It would be better to use -2, but if user configure STEP_PULSE_DELAY=1, this will lead to an unpredictable result (and I didn't test with 1, perhaps if ccr1=0 it won't be work).
I agree that this is a bit tricky way. Because if we want to use optimization, this can change delays. But I'm sure there will not be a radical change.
All of this manipulation are very closer to microcontroller computational (and operations) speed limit.
A similar situation with the DEFAULT_STEP_PULSE_MICROSECONDS parameter. It takes about 1-1.5 us longer than set.

Answer 7 · 2018-07-21T19:13:15.000Z

It is very strange to get 890ns without using STEP_PULSE_DELAY because there is only one instruction between the instruction to set the direction bits at line 484 and the instruction to set the step bits at line 503.
Could it be that you tested using the "assert" option. This would explain that there are more lines of code being executed e.g. by
void TIM_ClearITPendingBit(TIM_TypeDef* TIMx, uint16_t TIM_IT)
{
/* Check the parameters /
assert_param(IS_TIM_ALL_PERIPH(TIMx));
assert_param(IS_TIM_IT(TIM_IT));
/ Clear the IT pending Bit */
TIMx->SR = (uint16_t)~TIM_IT;
}

Answer 8 · 2018-07-21T23:50:51.000Z

in my case:
#define SYSCLK_FREQ_72MHz 72000000
/* #define USE_FULL_ASSERT 1 */
Just in case I commented assert_param in TIM_ClearITPendingBit, GPIO_ReadOutputData and in GPIO_Write, but there is no difference.
Simple test
while(1){
GPIOB->ODR = 0b111;
GPIOB->ODR = 0b000;
}
gives 3+ MHz (it's about 150ns to set current GPIO bits).
I tried -Os and -Ofast - no difference.
So, I just used
TIM3->SR = (uint16_t)~TIM_IT_Update; instead of TIM_ClearITPendingBit(TIM3, TIM_IT_Update);
and
GPIOA->ODR = ((GPIOA->ODR & ~STEP_MASK) | st.step_outbits); instead of GPIO_Write(STEP_PORT, (GPIO_ReadOutputData(STEP_PORT) & ~STEP_MASK) | st.step_outbits);
And as expected, it works faster. 340ns (including 150nsec to switch port bits).
So it just 190ns for computation (is about 13-14 tacts at 72MHz).
It should be noted that in any case there are at least several instructions:
1 (=) TIM3->SR = (uint16_t)~TIM_IT_Update;
2 (&) GPIOA->ODR & ~STEP_MASK
3 (|) | st.step_outbits
4 (=) GPIOA->ODR =
So I have no doubt about these 890ns.

Answer 9 · 2018-07-22T12:58:20.000Z

About "a)":
Only 2 functions can change STEP_PORT outputs: "void TIM3_IRQHandler(void)" and "void st_reset()".
If st_reset() will be called during this delay (free time between tim2 and tim3 interrupts), then all STEP_PORT outputs will be set to default state.
Then in TIM3_IRQHandler() they will be overwritten (but should not!). Ok, this is not what we want.
In non STEP_PULSE_DELAY mode this also takes place, but STEP_PORT outputs default state is identical to the overwritten (in TIM3_IRQHandler()) bits, so it's not cause the error.

Answer 10 · 2018-07-22T18:55:02.000Z

about "a)":
STEP_PORT means e.g. GPIOA or GPIOB depending on the parameters in the config (or default) file.
GPIOA (and GPIOB) contains 16 bits.
Some of those bits could (at least in theory) be used as output for other purposes than step and dir (e.g. for water cooling, spindle, ...). In theory those bits could be changed by other functions than "void TIM3_IRQHandler(void)" and "void st_reset()".
It is safe that when a function (like TIM3_IRQHandler) wants to modifiy only some bits of a port, it should read the full port (16 bits) just before applying some mask and restoring the port. Othewise there is a risk of overwriting some bits.

The other way (even safier) would be to write in the SET and RESET registers provided by the STM.

Answer 11 · 2018-07-22T18:58:42.000Z

reasonably

Answer 12 · 2018-07-22T19:08:49.000Z

about 890 ns.
Thanks for your tests and explanations.
I understand.
I am disapointed by the way the compiler optimises the code. I had at least expected that basic functions like TIM_ClearITPendingBit() would have been automatically inlined by the compiler (to save time and memory).
I think it can be forced by adding "static inline" to the prototype of the function and moving the definition of the function into the .h file
Anyway 340 nsec (with the optimization that you did) seems me still quite long. I had expected that compiler could generate a more efficient assembler code.
It would be nice to look at the assembler code being generated but I do not know how to find it.

Answer 13 · 2018-07-22T19:13:40.000Z

thanks for the comments! New iteration in the archive. If there is an opportunity to see it, I will be very grateful.
stepper.zip

Answer 14 · 2018-07-23T06:34:19.000Z

about the new archive, I think it should work but I have a few comments:

at line 352, for safety, you could insert this line
TIM3->SR = ~TIM_SR_CC1IF; // clear CC1IF flag
at line 491, is it not more "normal" to replace
TIM2->SR = ~TIM_SR_UIF; // clear UIF flag
by
TIM2->SR &= ~TIM_SR_UIF; // clear UIF flag
at line 508, idem; so replacing
TIM3->SR = ~TIM_SR_UIF;
by
TIM3->SR &= ~TIM_SR_UIF; // clear UIF flag
or even
TIM3->SR &= ~( TIM_SR_UIF | TIM_SR_CC1IF) ; // clear UIF flag and CC1IF flag
at line 768, for consistency, replace
if ((TIM3->SR & 0x0001) != 0) // check interrupt source
by
if ((TIM3->SR & TIM_SR_UIF) != 0) // check interrupt source is UIF
lines 851 up to 853 are in a #ifdef STEP_PULSE_DELAY but change some TIM3 registers.
If seems me better to remove the #ifdef and to always perform the code.
before line 935, I think you should insert
TIM3->SR &= ~TIM_SR_CC1IF;

Answer 15 · 2018-07-23T15:31:22.000Z

Ok! Some critical changes done, and now it's work fine. Now we use interrupt priority.
look at some screens (D1 - step, D3 - dir, D7 - debug output, which indicates tim2 handler start/end)

STEP_PULSE_DELAY=1, DEFAULT_STEP_PULSE_MICROSECONDS=1

unfortunatly we can't get real step pulse microsecond less then 2.6us with STEP_PULSE_DELAY feature, and 2.4us without.
STEP_PULSE_DELAY=3, DEFAULT_STEP_PULSE_MICROSECONDS=3

Now I have to work with G10, reset button, B and C axis, then I'll share whole project.
stepper.zip
Important note! As we can see in the worst case, processing takes 9μs, so it's useless and dangerous try to make a frequency above 110kHz (for example 100kHz is about DEFAULT_n_STEPS_PER_MM*DEFAULT_n_MAX_RATE=6500000).
So, at best, we can get about 150 kHz but with an unstable signal frequency (between 150 and 110).

Answer 16 · 2018-07-23T20:23:25.000Z

in st_reset you begin with
#ifdef STM32F103C8
#ifdef STEP_PULSE_DELAY
TIM3->DIER &= ~TIM_DIER_CC1IE; //compare interrupt disable
#endif
#endif
I am not sure that this code is at the rigth place.
Should it not be better to move it after the while here after (to let tim3 always performs his task)
while(TIM3->CR1 & TIM_CR1_CEN); // wait for end of tim3 work to prevent cutoff last step pulse

Still I notice that you enable the interrupt on CC1 a few lines after the while with
#ifdef STEP_PULSE_DELAY
TIM3->DIER |= TIM_DIER_CC1IE; //compare interrupt enable
#endif

Does it really make sense?

Answer 17 · 2018-07-23T20:44:11.000Z

When STEP_PULSE_DELAY is not defined, I expect that you could reduce the width of the step pulse changing line
st.step_pulse_time = (settings.pulse_microseconds)*TICKS_PER_MICROSECOND;
by
st.step_pulse_time = (settings.pulse_microseconds - 1 )*TICKS_PER_MICROSECOND + 1;

If you want to get very short step delay and step width, you could add a new parameter and some lines of code in order to totally avoid TIM3 interrupt and so let stm32 generates the pulse totally in TIM2
interrupt.
Still I am not sure it will be ok because in some cases, TIM2 interrupt still performs a lot of computation and I am not sure that those computation can be performed in less than 10 usec (in order to get reliable 100khz)

Answer 18 · 2018-07-23T20:58:18.000Z

st.step_pulse_time = (settings.pulse_microseconds - 1 )*TICKS_PER_MICROSECOND + 1;
I've tried it today))), it does not work (in direct (may be some tricks needed?)). But it is not necessary in most cases, I suppose.

TIM2 interrupt takes 7us, in worst case 9us, may be in very rare cases it will take 12 or 15us, but this will not have a significant impact on the program execution time and will not lead to failure.

Answer 19 · 2018-07-23T21:20:16.000Z

Does it really make sense?

I have to think. May be it used to need for debug.

Answer 20 · 2019-02-03T09:04:01.000Z

If STEP_PULSE_DELAY is activated, each SETP pulse width will be + STEP_PULSE_DELAY time, so the maximum pulse frequency will be limited?

Can I add the judgment that DIR PIN starts STEP_PULSE_DELAY only when the direction changes?

This means that DIR PIN is a continuous SETP pulse with the original pulse width after switching.

Namely:
Judging st.dir_outbits
DIR change + STEP_PULSE_DELAY executes SETP pulse, DIR does not change that no STEP_PULSE_DELAY executes SETP pulse.

It doesn't really matter if the delay time increases after adding judgement statements, because STEP_PORT-> ODR output pulses are executed directly when st.dir_outbits do not change.

The delay time between DIR and SETP must be available. The DM556 driver manual I used shows that DIR must advance SETP by 5 microseconds to ensure that it does not lose its step.

This is what I think, but I'm not very good at programming code. I hope I can make some improvements in providing this advice. In addition, can I add STEP_PULSE_DELAY setting interface, 6-axis start-stop setting interface, synchronous axis setting, configuration interface, electronic handwheel, etc. instead of pre-compiling to open and configure?

Because performance and many functions can be added under STM32, unlike AVR328 resource constraints, these functional interfaces cannot be implemented.