Too many "Transmission Error - EventCode = 4" and scheduling a different message size

Question

Too many "Transmission Error - EventCode = 4" and scheduling a different message size

Closed this issue 6 years ago · 5 comments

Hi.
I am working with the board DISCO-L072CZ-LRWAN1.
Recently, I tried to send bigger messages than the simple lorawan example. I changed the tx and rx buffers accordingly, but when the device is going to send the message, it throws many times "Transmission Error - EventCode = 4" and other times it schedules a different message size. Sometimes it sends the correct size and message.

Console:

Sending: {m_c=0.05,m_p=5.49,mx_p=5.49,t_r=92.92,t_f=94.04,d_st=1}

11 bytes scheduled for transmission

Message Sent to Network Server

Sending: {m_c=0.03,m_p=3.63,mx_p=3.63,t_r=33.20,t_f=77.77,d_st=0}

56 bytes scheduled for transmission

Transmission Error - EventCode = 4

Sending: {m_c=0.05,m_p=6.22,mx_p=6.22,t_r=102.42,t_f=98.56,d_st=1}

57 bytes scheduled for transmission

Message Sent to Network Server

Sending: {m_c=0.05,m_p=5.89,mx_p=5.89,t_r=96.54,t_f=96.86,d_st=1}

56 bytes scheduled for transmission

Message Sent to Network Server

Sending: {m_c=0.05,m_p=6.08,mx_p=6.08,t_r=94.04,t_f=95.90,d_st=1}

56 bytes scheduled for transmission

Transmission Error - EventCode = 4

Sending: {m_c=0.05,m_p=5.51,mx_p=5.51,t_r=114.67,t_f=102.67,d_st=1}

58 bytes scheduled for transmission

Transmission Error - EventCode = 4

Sending: {m_c=0.05,m_p=5.66,mx_p=5.66,t_r=91.63,t_f=97.43,d_st=1}

53 bytes scheduled for transmission

Transmission Error - EventCode = 4

Sending: {m_c=0.05,m_p=5.40,mx_p=5.40,t_r=118.70,t_f=104.60,d_st=1}

11 bytes scheduled for transmission

Message Sent to Network Server

Sending: {m_c=0.05,m_p=5.62,mx_p=5.62,t_r=128.05,t_f=108.39,d_st=1}

58 bytes scheduled for transmission

Message Sent to Network Server

Sending: {m_c=0.05,m_p=6.05,mx_p=6.05,t_r=126.60,t_f=107.26,d_st=1}

58 bytes scheduled for transmission

Message Sent to Network Server

Sending: {m_c=0.05,m_p=5.64,mx_p=5.64,t_r=96.38,t_f=95.98,d_st=1}

56 bytes scheduled for transmission

Transmission Error - EventCode = 4

Sending: {m_c=0.05,m_p=5.46,mx_p=5.46,t_r=121.68,t_f=106.53,d_st=1}

58 bytes scheduled for transmission

Transmission Error - EventCode = 4

Sending: {m_c=0.05,m_p=5.79,mx_p=5.79,t_r=91.06,t_f=0.32,d_st=1}

53 bytes scheduled for transmission

Transmission Error - EventCode = 4

Sending: {m_c=0.05,m_p=5.46,mx_p=5.46,t_r=104.68,t_f=60.28,d_st=1}

11 bytes scheduled for transmission

Transmission Error - EventCode = 4

Note that even if the device schedules a smaller message size (11 bytes) the transmission error still occurs, so, the size may not be the problem.

Thanks in advance for your help and have happy holidays.

Answer 1 · 2018-12-28T22:14:00.000Z

ARM Internal Ref: IOTCELL-1620

Answer 2 · 2018-12-31T09:21:12.000Z

@sagilar Event Code 4 means TX_ERROR which would mean (in case of MSG_CONFIRMED) that you did not receive an ack for your uplink. It's harder to pin-point the exact error with the trace you have provided. Please turn on stack traces https://github.com/ARMmbed/mbed-os-example-lorawan#optional-adding-trace-library and share your trace.

About buffers: The tx_buffer[30] and rx_buffer[30] in the application are just for example application. If you wish to change the stack level outgoing buffer size, you can configure this parameter https://github.com/ARMmbed/mbed-os/blob/master/features/lorawan/mbed_lib.json#L44 in your mbed_app.json. Stack level RX buffer is set to the maximum LoRaWAN MTU (255 bytes) and is not configurable.

In addition to that please keep in mind that the actual size of data that you can send depend upon the data rate being used and the region of operation, please check LoRaWAN Specification 1.0,2 for more details.

Let's have a detailed dissection of the issue when you have a detailed trace and please list your setup, like Mbed OS version, Region of operation, data rate, network server, intended message type, message length etc.

Thanks & Regards

Answer 3 · 2019-01-02T15:43:13.000Z

@hasnainvirk Thank you for your quick reply.
I used the mbed trace to acquire more information as you requested. The log file is too big, so I attached it to the issue. It looks to have 2 kind of problems: 1 for scheduling, and 2 for not receiving ACK.

Scheduling:

> 53 bytes scheduled for transmission 
> 
> �[2K�[90m[DBG ][LSTK]: Transmission completed�[0m
> 
> �[2K�[90m[DBG ][LSTK]: Awaiting ACK�[0m
> 
> �[2K�[90m[DBG ][LMAC]: RX1 slot open, Freq = 904900000�[0m
> 
> �[2K�[90m[DBG ][LMAC]: RX2 slot open, Freq = 923300000�[0m
> 
> �[2K�[90m[DBG ][LMAC]: ACK_TIMEOUT Elapses, Retrying ...�[0m
> 
> �[2K�[90m[DBG ][LMAC]: Trading datarate for range�[0m
> 
> �[2K�[90m[DBG ][LMAC]: TX: Channel=61, TX DR=0, RX1 DR=10�[0m
> 
> �[2K�[31m[ERR ][LMAC]: Allowed FRMPayload = 11, FRMPayload = 53, MAC commands pending = 0�[0m
> 
> �[2K�[31m[ERR ][LSTK]: Failed to schedule transmission�[0m
> 
>  Transmission Error - EventCode = 4 
> 
>  Sending: {m_c=0.00,m_p=0.52,mx_p=0.52,t_r=30.46,t_f=21.03,d_st=0} 
> 
> �[2K�[39m[INFO][LMAC]: Cannot transmit 56 bytes. Possible TX Size is 11 bytes�[0m
> 
> �[2K�[39m[INFO][LMAC]: RTS = 11 bytes, PEND = 45, Port: 15�[0m
> 
> �[2K�[90m[DBG ][LMAC]: Frame prepared to send at port 15�[0m
> 
> �[2K�[90m[DBG ][LMAC]: TX: Channel=28, TX DR=0, RX1 DR=10�[0m

ACK:


>  11 bytes scheduled for transmission 
> 
> �[2K�[90m[DBG ][LSTK]: Transmission completed�[0m
> 
> �[2K�[90m[DBG ][LSTK]: Awaiting ACK�[0m
> 
> �[2K�[90m[DBG ][LMAC]: RX1 slot open, Freq = 914300000�[0m
> 
> �[2K�[90m[DBG ][LMAC]: RX2 slot open, Freq = 923300000�[0m
> 
> �[2K�[90m[DBG ][LMAC]: ACK_TIMEOUT Elapses, Retrying ...�[0m
> 
> �[2K�[90m[DBG ][LMAC]: Trading datarate for range�[0m
> 
> �[2K�[90m[DBG ][LMAC]: TX: Channel=7, TX DR=0, RX1 DR=10�[0m
> 
> �[2K�[90m[DBG ][LSTK]: Transmission completed�[0m
> 
> �[2K�[90m[DBG ][LMAC]: RX1 slot open, Freq = 912300000�[0m
> 
> �[2K�[90m[DBG ][LMAC]: RX2 slot open, Freq = 923300000�[0m
> 
> �[2K�[90m[DBG ][LMAC]: ACK_TIMEOUT Elapses, Retrying ...�[0m
> 
> �[2K�[90m[DBG ][LMAC]: TX: Channel=52, TX DR=0, RX1 DR=10�[0m
> 
> �[2K�[90m[DBG ][LSTK]: Transmission completed�[0m
> 
> �[2K�[90m[DBG ][LMAC]: RX1 slot open, Freq = 903700000�[0m
> 
> �[2K�[90m[DBG ][LMAC]: RX2 slot open, Freq = 923300000�[0m
> 
> �[2K�[90m[DBG ][LMAC]: ACK_TIMEOUT Elapses, Retrying ...�[0m
> 
> �[2K�[90m[DBG ][LMAC]: Trading datarate for range�[0m
> 
> �[2K�[90m[DBG ][LMAC]: TX: Channel=55, TX DR=0, RX1 DR=10�[0m
> 
> �[2K�[90m[DBG ][LSTK]: Transmission completed�[0m
> 
> �[2K�[90m[DBG ][LMAC]: RX1 slot open, Freq = 912700000�[0m
> 
> �[2K�[90m[DBG ][LMAC]: RX2 slot open, Freq = 923300000�[0m
> 
> �[2K�[31m[ERR ][LSTK]: Retries exhausted for Class A device�[0m
> 
>  Transmission Error - EventCode = 4

As you suggested, I changed the tx buffer size in the mbed_app.json, but it still throws the errors.
The changes in the config file are:

"target_overrides": {
        "*": {
            "platform.stdio-convert-newlines": true,
            "platform.stdio-baud-rate": 115200,
            "platform.default-serial-baud-rate": 115200,
            "lora.over-the-air-activation": true,
            "lora.duty-cycle-on": true,
            "lora.phy": "US915",
            "lora.tx-max-size": 255,
            "lora.device-eui": "{ my device eui }",
            "lora.application-eui": "{ my app eui }",
            "lora.application-key": "{ my app key }",
            "mbed-trace.enable": true
        },

I also set "main_stack_size": 2048, due to the ST Board RAM (and I have a thread working too).

Regarding to the additional specifications:

The Mbed OS I'm not sure the specific version but I forked it from master Mbed OS just 12 days ago. I did the max_sys_rx_error hack posted in the board problem, because the targets were presenting this error.
Region of operation: US 915. I am located in Colombia, LA.
Network Server: Private LoRaServer. (It works fine)
Message type: Common uplink.
Message length: around 55-60 bytes.
Data rate and other parameters: not yet defined, I have only 1-4 nodes connected to my Laird gateway, so, no restrictions set yet.

Thank you in advance and happy new year.

lorawan-log-255.log

Answer 4 · 2019-01-03T19:08:58.000Z

@sagilar There are a few things that I wish to highlight here.

Problem 1 As you are in US, you must check how many channels your Base Station (Laird Gateway) supports.

I can see from the trace that you are hopping over all 72 channels. I doubt that there is a Base Station that supports all 72 channels at the moment. A quick google search tells me that most of the Laird models support 8 channels. So the first thing you need to do is to check what FSB (Frequency sub-band) Base Station is configured with. Then you need to set that same FSB to the device. Check this text to learn more about setting an FSB to Mbed LoRaWAN stack https://github.com/ARMmbed/mbed-os/blob/master/features/lorawan/FSB_Usage.txt

Problem 2 I would not trust setting max-sys-error to 100ms or something. This value is too large and doesn't make any sense. Currently we use 5ms because that's the timing error what we expect
due to the use of event queue. This value is used in an algorithm that calculates the times to open RX windows. If you give too large values, the radio starts to receive way too early and remains in receive mode for a very long period of time defeating all the purposes of a low powered WAN. Plus this also means that we essentially don't really know what exactly the timing drift is for our system and may ruin all time sensitive operations. The sole purpose of this value being configurable is that cheaper crystals cannot be precise so we can handle a ms or two in software that's it. We can't rely on it to mask out faulty/un-calibrated crystals in their entirety.

I would recommend use the default value and sort your crystal issue rather than masking it with too large max-sys-error. You got to get your hands dirty here :), pick up an oscilloscope and measure how much timing drift your crystal is causing and adjust it accordingly. I suppose you are using DISCO_LRWAN1 board. So follow the same link you gave me for correction.

Problem 3 Data rate has gone too low sometimes, i.e., DR0 and your packet cannot be sent as at DR0 you can send only 11 bytes.

Let me explain what's happening here. Most of the times, the device did not hear from the Network Server, no ack (I suppose you sent at a wrong channel, nobody was listening). And you have ADR on. So the device tries to get more range as it thinks that it is out of range. It acquires range by lowering data rate but then it cannot send your message out because the current data rate doesn't allow. Unfortunately, at the moment there is no Fragmentation support in the stack (it's coming :), spoiler alert ) .
You can disable adaptive data rate and set a fixed data rate that you are sure that your message will go out. The algorithm that trades data rate for range, doesn't apply if you have ADR disabled.
Disable ADR and set a custom data rate after Joining.

If you have anymore questions, please ping me freely and Happy new year to you too :)

Answer 5 · 2019-01-03T19:28:33.000Z

@hasnainvirk Thank you for your help.
I am going to do the things you suggest me (and get my hands dirty with an oscilloscope. I was trying to avoid it at max).
If any extra doubt emerges, I will put in contact again. Meanwhile, I will close the issue.

Best Regards,
Santiago.

Edited:
I have corrected the Problem 1 and Problem 3, and the messages are being sent right always!
Obviously, the chrystal fix is required and I will do it on the next days.

Many many thanks @hasnainvirk .

What a good support that ARM has! (y)