[Q] Bad serial connection: with advanced_ok the printer reports checksum errors, without it the printer freezes. Next steps?
Norvat opened this issue · 26 comments
I am running a TFT35 E3.0 using firmware BIGTREE_GD_TFT35_V3.0_E3.27.x (Pre-Compiled)
Mainboard is an SKR 1.3 with TMC2209 stepper drivers in UART mode, running Marlin 2.1.2.2
I cannot get a stable connection between the TFT and the mainboard. See pictures below for errors.
What I have tried so far:
- Replaced the TX and RX wires going to the TFT with a shielded and grounded cable
- Updated both Marlin and the TFT to the latest releases
- Tested all the different baudrates from 9600 to 1 000 000
- Changed the serial ports in firmware
- Tried different SD cards
- Used an external 5V regulator in case the onboard one was too small
- Tried turning on advanced ok and checksum. The printer does work, but it reports errors constantly, and sometimes lines of gcode are lost so the printer skips a line. Without these features the print freezes, but the printer can still be controlled from the TFT. While frozen, "pending gcodes" is 1, but nothing is transmitted.
It seems to be losing a lot of characters.
All dependencies in Marlin are included, and all features in the TFT work as they should.
Zip of Marlin if that is helpful:
Marlin-2.1.2.2.zip
Running the printer from pronterface works and does not give errors.
I am currently at a loss about what I should try next to mitigate the problem. Any help is greatly appreciated.
We have the same problem with the MKS TFT35 and MKS ROBIN Nano V3 on 25 machines.
I have tested all the baudrates and ports, same issue.
The print always stops mid job when printing from the TFT SD or the TFT USB.
We are using Marlin 2.1.x bugfix with advanced_ok active.
@ChihebMadiouni Hi, it's good to know I'm not alone with these issues. What troubleshooting steps have you tried?
@Norvat @ChihebMadiouni
Guys, I highly recommend trying the firmware from my repository.
@kisslorand running a test print now, will report back
if you don't have a reliable connection, simply disable advanced_ok and command_checksum.
@digant73 Disabling those features causes the printer to freeze. Do you have any ideas how I could troubleshoot?
@kisslorand Your firmware returned an "unknown command" error 30 min in, but it ran a lot better than the stock firmware. Any tips on how I could troubleshoot the serial connection? I have made my own shielded cable, so the cable should be good.
Baudrate is currently set at 57600
I tested a 1 hour print with @kisslorand's FW version at 250k baudrate, and the print finished with no errors and no stops.
However, I will continue testing to be sure.
@ChihebMadiouni Do you think running a lower baudrate is causing my problems? I set it lower during testing to make sure I wasn't overloading the TFT or the mainboard.
Edit: Printer froze again, now with a "busy processing" error. It is still possible to control the printer after stopping the print job, so not a hard crash.
It might be an issue with my hardware then? Any easy way to test?
Edit 2:
I have also moved the cable away from the rest of the wires going to the toolhead, but that did not help.
@Norvat
I think you have a high level of EMI at your printer. I see serial communication problems in both directions (TFT->MB, MB->TFT).
I would check the pull-up resistors on the RX & TX lines of both the TFT and the motherboard. If they are 10k or higher, I would change them to 4.7k.
Later I will check some boards I have around, I remember something about BTT usually using fairly high values for pull-up resistors.
Later edit: I just checked a few boards and TFT, it seems BTT uses internal pull-up both on the MB and TFT, only MKS TFT has an external pull-up resistor on the TX pin.
@Norvat
I recommend you switch to Marlin 2.1.x bugfix; the 2.1.2.2 release may have some bugs.
Also, here are some features I have enabled in Marlin, maybe they will help:
#define ADVANCED_OK
#define TX_BUFFER_SIZE 32
#define RX_BUFFER_SIZE 1024
#define MAX_CMD_SIZE 96
#define BUFSIZE 4
#define SERIAL_OVERRUN_PROTECTION
#define SERIAL_DMA
#define EMERGENCY_PARSER
Thank you so much for suggestions
I will investigate further later this week
@digant73 Disabling those features causes the printer to freeze. Do you have any ideas how I could troubleshoot?
The printer freezes when the TFT doesn't receive an ACK from the mainboard (e.g. due to EMI). Once the TFT has no available TX slot left (there is only 1 when advanced_ok is disabled), it is holding pending commands that will never be acknowledged, so it will not send any further commands.
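To illustrate the stall described above, here is a minimal C sketch of slot-based flow control. It is a simplified model, not the TFT firmware's code: the `tx_window` type and all function names are hypothetical, and it assumes each `ok` frees exactly one slot and a lost `ok` is never recovered.

```c
#include <stdbool.h>

/* Hypothetical sender-side model; names are illustrative only. */
typedef struct {
    int free_slots;  /* commands the sender may still transmit */
    int pending;     /* commands sent but not yet acknowledged */
    int sent_total;  /* commands transmitted so far */
} tx_window;

void tx_init(tx_window *w, int slots)
{
    w->free_slots = slots;
    w->pending = 0;
    w->sent_total = 0;
}

/* Transmit one command if a slot is free; returns false when blocked. */
bool tx_try_send(tx_window *w)
{
    if (w->free_slots == 0)
        return false;
    w->free_slots--;
    w->pending++;
    w->sent_total++;
    return true;
}

/* An "ok" arrived: one pending command is acknowledged, one slot freed. */
void tx_on_ok(tx_window *w)
{
    if (w->pending > 0) {
        w->pending--;
        w->free_slots++;
    }
}

/* The state described in the thread: no free slot, commands still pending. */
bool tx_is_stalled(const tx_window *w)
{
    return w->free_slots == 0 && w->pending > 0;
}

/* Try to send `commands` commands; the ack for command number `lost_ack`
 * (1-based, 0 = none) is dropped. Returns how many were actually sent. */
int simulate(int slots, int commands, int lost_ack)
{
    tx_window w;
    tx_init(&w, slots);
    for (int i = 1; i <= commands; ++i) {
        if (!tx_try_send(&w))
            break;              /* stalled: nothing more goes out */
        if (i != lost_ack)
            tx_on_ok(&w);       /* ack received for this command */
    }
    return w.sent_total;
}
```

In this model, `simulate(1, 10, 3)` stops after 3 commands because the single slot is never freed, which corresponds to a stats page showing 0 free TX slots with 1 pending gcode. `simulate(4, 10, 3)` sends all 10 commands, but one slot stays occupied for the rest of the run, so enough lost acks would still stall it eventually.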
@Norvat I had the same errors on the B3.
Here are my settings. Everything works correctly.
#2910 (comment)
Did some more testing.
Tried to print without the heated bed, and it printed for an hour without errors. I then set the bed temperature, and the printer either stopped right away or made some weird movements that aren't in the gcode. It seems to be especially bad while heating up.
It's a 24V 500x500mm bed (probably pulling 15A), and the controller is placed about 5cm away from it. The original controller board was in a metal enclosure, but I have made a plastic one to fit the SKR board, so I think the issue is EMI from the heated bed. I tried wrapping the electronics enclosure in grounded aluminum foil, but it did not work. My wrapping might have been a bit shoddy.
I am going to try to move the electronics away from the printer next.
@Norvat Please be careful with Kisslorand's closed source firmware. It's known to be buggy, unreliable and slow. His FW is far behind what you can find here. There is a reason why his "contributions" are ignored https://github.com/bigtreetech/BIGTREETECH-TouchScreenFirmware/pulls. Also note that Kisslorand has claimed to have tested lots of things, but that turned out to be complete nonsense.
From your screenshots I can see that you have serial data corruption. Using checksum will detect and report this, disabling checksum does not solve the issue, it just doesn't tell you when a serial communication error occurred.
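For reference, the checksum Marlin expects on a numbered line is the XOR of every byte before the `*`, appended as a decimal number. The C sketch below shows why a single corrupted character is detected. The function names are illustrative and do not come from Marlin or the TFT firmware.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* XOR of every byte in the text that precedes '*'. */
uint8_t gcode_checksum(const char *payload)
{
    uint8_t cs = 0;
    for (const char *p = payload; *p != '\0'; ++p)
        cs ^= (uint8_t)*p;
    return cs;
}

/* Build "N<line> <cmd>*<checksum>" into out.
 * Returns the frame length, or -1 if the payload does not fit. */
int gcode_frame(char *out, size_t out_size, long line, const char *cmd)
{
    char payload[96];
    int n = snprintf(payload, sizeof payload, "N%ld %s", line, cmd);
    if (n < 0 || (size_t)n >= sizeof payload)
        return -1;
    return snprintf(out, out_size, "%s*%u", payload,
                    (unsigned)gcode_checksum(payload));
}

/* Check a received line: 1 if the checksum matches, 0 if it does not
 * match or if the line carries no checksum at all. */
int gcode_verify(const char *line)
{
    const char *star = strrchr(line, '*');
    if (star == NULL)
        return 0;
    uint8_t cs = 0;
    for (const char *p = line; p < star; ++p)
        cs ^= (uint8_t)*p;
    return strtoul(star + 1, NULL, 10) == cs;
}
```

For example, `gcode_frame(buf, sizeof buf, 1, "G28")` produces `N1 G28*18`, while `gcode_verify("N1 G29*18")` returns 0 because the changed character no longer matches the checksum. With checksums disabled, that corrupted line would simply be executed or rejected as an unknown command.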
You already did a lot of useful tests. Make sure your serial wires are not close to (running in parallel with) your stepper cables.
Could your power supply be faulty or overloaded? If possible measure the board voltages with a scope during printing.
BTW: I'm running a similar setup (BTT TFT35 GD 3.0 + LPC1768 but not SKR board) without issues at 1M baud.
I also have a bed that draws 12A at 36V, I keep these wires separated from all low signal wires like serial communication wires.
@rondlh Thank you for the helpful comment. I will do some measurements after I get access to my oscilloscope again in a week or two.
The PSU is the original that came from Tronxy (this is a Tronxy X5SA 500 Pro), so it might not be the best. I have a smaller spare PSU, so I will try to run power to the bed separately from the controller.
If the PSU is the issue, would it be due to a varying voltage or some frequency?
That's difficult to predict. To me it's clear that something is going seriously wrong. The PSU could be noisy, and/or inject high frequency noise, or even dip down to a low voltage for a short time when overloaded. If the voltage dipped, the MCU on the motherboard and the TFT would probably freeze, so that's not very likely, but a scope will show you.
You could actually do some tests without heating the bed, that might help to point you in the right direction. I recommend leaving the TFT checksum feature on, so you will be informed if any serial corruption takes place.
Another thing you could try is to add an external serial connection to listen to the serial data flow. Make sure to only listen (connect the RX of your added port, don't connect TX). On your computer you can then use Putty to see the data flow. You connect your external RX to either the RX or the TX on your motherboard. If you connect it to RX, you see the data the motherboard receives from the TFT. If you connect it to TX, you see the data the motherboard sends to the TFT.
digant73 has recently added the "info screen" that can help to diagnose the problem. He left a few debug lines at the end of Monitoring.c. You can uncomment the "if" and one of the first 3 "mustStoreCmd" lines (FOR TESTING ONLY!!!).
The code makes the TFT send commands to the motherboard as fast as possible, to which the motherboard responds. If everything is going well, you should see lots of RX/TX commands and data in the info screen, and the numbers should be relatively stable. If they go to 0, the serial communication has broken down.
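Once you have a capture from the listening port, one quick thing to look for is bytes that cannot be part of normal G-code traffic. A small sketch in C, assuming the traffic is plain ASCII text terminated by CR/LF; `count_suspect_bytes` is a hypothetical helper, not part of any firmware:

```c
#include <stddef.h>

/* Count bytes that are neither printable ASCII nor CR, LF or TAB.
 * Assumption: healthy G-code traffic contains only those bytes. */
size_t count_suspect_bytes(const unsigned char *buf, size_t len)
{
    size_t suspects = 0;
    for (size_t i = 0; i < len; ++i) {
        unsigned char c = buf[i];
        int printable = (c >= 0x20 && c <= 0x7E);
        int whitespace = (c == '\r' || c == '\n' || c == '\t');
        if (!printable && !whitespace)
            ++suspects;
    }
    return suspects;
}
```

A nonzero count points to corruption on the line you tapped. A zero count does not prove the line is clean, since noise can also turn one printable character into another, which is what the checksum feature is for.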
Small update:
I have a separate PSU to the controller, and that seems to have done the trick with the errors, and the printer runs happily.
BUT!
If I ground the shield of the cable from the controller to the TFT, I get errors. I can start a print without it connected, and the second I connect it I get errors.
I scoped the ground on the original PSU and I am getting ripples of ±3V. I see ±0.5V on the shielding of the TFT cable while it is not grounded. There is almost no ripple from the new PSU. I tried getting pictures of the readings, but the oscilloscope is an old analog one, so that is not too easy.
So I think the issue is PSU related, as rondlh says.
Will test a bit more scientifically later this week, and try the tips above
MKS Eagle with Marlin 2.0.9.3 and 2.1.2.2, plus an MKS TFT35 with the latest BTT dev firmware. I compiled it myself.
The printer stops suddenly. It is not hanging. This happens with or without advanced_ok / crc.
Tried everything.
Terminal is on for monitoring. No garbage or missing ok. They did show up in some configurations.
It seems the TFT simply stops sending the next gcode command.
Finally tried kisslorand's firmware: success, a 14 hour print.
But speeds above ~60mm/sec cause pauses of ~1sec.
I flashed Marlin 2.1.2.3 yesterday plus the latest BTT fw. Shortened the cable to 20cm. Nothing changed. Sudden stop.
Tried to move to 2.0.7.3 but cannot compile it for MKS EAGLE.
As far as I can see, the TFT continues to receive messages from the board but stops sending anything. Pressing the abort button fixes it.
No garbage on the TFT terminal. It stops after receiving ok.
@karabas2011 Please be careful with Kisslorand's closed source firmware. It's known to be buggy, unreliable and slow. His FW is far behind what you can find here. There is a reason why his "contributions" are ignored https://github.com/bigtreetech/BIGTREETECH-TouchScreenFirmware/pulls. Kisslorand's closed source FW might damage your motherboard and/or your TFT display, so better stay clear.
Also, the topic you raise here seems to be unrelated to the issue raised by Norvat, please start a new issue if you need support.
As far as I can see, the TFT continues to receive messages from the board but stops sending anything. Pressing the abort button fixes it. No garbage on the TFT terminal. It stops after receiving ok.
On the TFT side, try to disable as many features as possible, such as advanced_ok, command_checksum, event_led, file_comment_parsing.
Also, when the TFT stops sending gcodes to the mainboard, what do you see in the TFT's stats page? Do you see Free TX slots at 0 and Pending gcodes different from 0?