obgm/libcoap

[BUG]: Missing block number (Blockwise Transfer)

Closed this issue · 9 comments

Environment

  • Build System: [Make]
  • Operating System: [Linux]
  • Hosted Environment: [None)]

libcoap Configuration Summary

libcoap package version        : "4.3.4"
    libcoap package source         : "v4.3.4-dirty"
    libcoap API version            : "3"
    libcoap ABI version            : "3.1.1"
    libcoap libtool SO version     : "4.1.1"
    libcoap DTLS lib extn          : "-notls"
    host system                    : "x86_64-pc-linux-gnu"
      build with server support      : "yes"
      build with client support      : "yes"
      build with IPv4 support        : "yes"
      build with IPv6 support        : "yes"
      build with Unix socket support : "yes"
      build with TCP support         : "yes"
      build DTLS support             : "no"
      add default names              : "yes"
      build Observe Persist          : "yes"
      build using epoll              : "yes"
      enable small stack size        : "no"
      enable separate responses      : "yes"
      enable OSCORE support          : "no"
      enable Q-Block support         : "yes"
      enable max logging level       : "none"
      build doxygen pages            : "yes"
          --> Doxygen around         : "yes" (/usr/bin/doxygen 1.9.1)
             -->  dot around         : "yes" (/usr/bin/dot 2.43.0)
      build man pages                : "no"
      build unit test binary         : "no"
      build examples                 : "yes"
      install examples source        : "yes"
      build with gcov support        : "no"
      build shared library           : "yes"
      build static library           : "yes"

Problem Description

The server fails to responds with the appropriate return code in case of inconsistent block numbers. The problem is already reported previously here.

Expected Behavior

According to RFC7959:

'''If not all previous blocks are available at the server at the time of processing the final block, the transfer fails and error code 4.08 (Request Entity Incomplete, Section 2.9.2) MUST be returned. '''

Actual Behavior

The resource is created on the server instead of returning the expected error code.

Steps to reproduce

if 4 blocks of data with the following block numbers are sent to the server, the server fails to return the expected error code.

Block number 1: 0
Block number 2: 13
Block number 3: 2
Block number 4: 15

You can find the wireshark log attached here.
missing-blocks.zip

Debug Logs

Apr 12 17:38:00.690 DEBG created UDP  endpoint 0.0.0.0:5683
Apr 12 17:38:17.548 DEBG ***0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : session 0x449060: new incoming session
Apr 12 17:38:17.548 DEBG ***EVENT: COAP_EVENT_SERVER_SESSION_NEW
v:1 t:CON c:PUT i:4b2d {01} [ Uri-Path:example_data, Content-Format:application/link-format, Block1:0/M/32, Size1:111, Request-Tag:0xa9fa532b ] :: 'helllohellohellollllooooollloooo'
Apr 12 17:38:17.549 DEBG ** 0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : lg_srcv 0x44adb0 initialized
Apr 12 17:38:17.549 DEBG call custom handler for resource 'example_data' (3)
created cache_entry
Apr 12 17:38:39.589 DEBG *  0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : netif: sent    8 bytes
v:1 t:ACK c:2.31 i:4b2d {01} [ Block1:0/M/32 ]
v:1 t:CON c:PUT i:4b2e {02000000000003} [ Uri-Path:example_data, Content-Format:application/link-format, Block1:13/M/32, Size1:111, Request-Tag:0xa9fa532b ] :: 'oeeeeeeedasfdsafdsafdsfdsdsadsad'
Apr 12 17:38:54.154 DEBG call custom handler for resource 'example_data' (3)
Apr 12 17:38:54.155 DEBG *  0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : netif: sent   14 bytes
v:1 t:ACK c:2.31 i:4b2e {02000000000003} [ Block1:13/M/32 ]
v:1 t:CON c:PUT i:4b2f {03000000000003} [ Uri-Path:example_data, Content-Format:application/link-format, Block1:2/M/32, Size1:111, Request-Tag:0xa9fa532b ] :: 'sadsdasdsadsadsadsadsadsadsadsad'
Apr 12 17:39:03.297 DEBG call custom handler for resource 'example_data' (3)
Apr 12 17:39:03.297 DEBG *  0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : netif: sent   14 bytes
v:1 t:ACK c:2.31 i:4b2f {03000000000003} [ Block1:2/M/32 ]
v:1 t:CON c:PUT i:4b30 {04000000000003} [ Uri-Path:example_data, Content-Format:application/link-format, Block1:15/_/32, Size1:111, Request-Tag:0xa9fa532b ] :: 'sadsadsadsadsad'
Apr 12 17:39:12.970 DEBG call custom handler for resource 'example_data' (3)
Apr 12 17:39:12.970 DEBG *  0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : netif: sent   11 bytes
v:1 t:ACK c:2.01 i:4b30 {04000000000003} [ ]
Apr 12 17:39:14.356 DEBG ** 0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : lg_srcv 0x44adb0 released
Apr 12 17:39:14.356 DEBG ***0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : session 0x449060: closed

Are you running release code for 4.3.4, or code in the latest develop branch? It appears that you are just running 4.3.4 with no patches.

Fix #1288 is post release 4.3.4

Thanks for the prompt reply @mrdeep1.

Checking against the develop branch (c88c783), I have the following exchange:

if 4 blocks of data with the following block numbers are sent to the server, the server fails to return the expected error code and return 2.01 code.

Block number 1: 10
Block number 2: 0
Block number 3: 10
Block number 4: 10

Please also check the wireshark log from server.

missing-blocks.zip

Debug Logs

Apr 15 11:13:11.685 DEBG created UDP  endpoint 0.0.0.0:5683
Apr 15 11:13:30.528 DEBG ***0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : session 0x44a060: new incoming session
Apr 15 11:13:30.528 DEBG ***EVENT: COAP_EVENT_SERVER_SESSION_NEW
v:1 t:CON c:PUT i:4b2d {01} [ Uri-Path:example_data, Content-Format:application/link-format, Block1:10/M/32, Size1:111, Request-Tag:0xa9fa532b ] :: 'helllohellohellollllooooollloooo'
Apr 15 11:13:30.528 DEBG call custom handler for resource 'example_data' (3)
Apr 15 11:13:30.528 WARN No cache entry available for the non-first BLOCK
Apr 15 11:13:30.529 DEBG *  0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : netif: sent    5 bytes
v:1 t:ACK c:5.00 i:4b2d {01} [ ]
v:1 t:CON c:PUT i:4b2e {02000000000003} [ Uri-Path:example_data, Content-Format:application/link-format, Block1:0/M/32, Size1:111, Request-Tag:0xa9fa532b ] :: 'oeeeeeeedasfdsafdsafdsfdsdsadsad'
Apr 15 11:13:57.113 DEBG ** 0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : lg_srcv 0x44d020 initialized
Apr 15 11:13:57.113 DEBG call custom handler for resource 'example_data' (3)
created cache_entry
Apr 15 11:13:57.113 DEBG *  0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : netif: sent   14 bytes
v:1 t:ACK c:2.31 i:4b2e {02000000000003} [ Block1:0/M/32 ]
v:1 t:CON c:PUT i:4b2f {03000000000003} [ Uri-Path:example_data, Content-Format:application/link-format, Block1:10/M/32, Size1:111, Request-Tag:0xa9fa532b ] :: 'sadsdasdsadsadsadsadsadsadsadsad'
Apr 15 11:15:15.175 DEBG ** 0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : lg_srcv 0x44d020 released
Apr 15 11:15:15.175 DEBG *  0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : netif: sent   25 bytes
v:1 t:ACK c:4.08 i:4b2f {03000000000003} [ ] :: 'Missing block'
v:1 t:CON c:PUT i:4b30 {04000000000003} [ Uri-Path:example_data, Content-Format:application/link-format, Block1:10/_/32, Size1:111, Request-Tag:0xa9fa532b ] :: 'sadsadsadsadsad'
Apr 15 11:15:26.286 DEBG call custom handler for resource 'example_data' (3)
Apr 15 11:15:26.286 DEBG *  0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : netif: sent   11 bytes
v:1 t:ACK c:2.01 i:4b30 {04000000000003} [ ]
Apr 15 11:15:28.434 DEBG ***0.0.0.0:5683 <-> 0.0.0.0:9090 UDP : session 0x44a060: closed

Unfortunately, the pcap trace is only for packets coming from the server, not packets from the client to the server so am not easily able to reproduce your situation without crafting my own packets which takes time. A pcap of the client to server packets would help a lot here.

This does look like an application level programming issue (within the example coap-server code) possible confused by what is happening at the libcoap layer.

It would be interesting to see what effect using the -L3 option is for the coap-server. This enables the libcoap layer to do all of the block handling before it is passed up to the application layer for processing the entire body of the blocked transfer in 1 go.

Hello,

Here are the pcap trace for the client generated packets.

client-data.zip

I am not using coap-server. I have my own test harness. The harness reads the packet from file and calls coap_handle_dgram() for each packet.

Thanks for the client files.

In your test harness, you can choose whether libcoap is going to be doing the Block1 handling to assemble the body of data, or your test harness (which is the application layer) to do the assembly of the body of data, sending back the appropriate response codes. To get libcoap to handle everything, then you need to set COAP_BLOCK_SINGLE_BODY.

If you add in the line

   coap_context_set_block_mode(COAP_BLOCK_USE_LIBCOAP | COAP_BLOCK_SINGLE_BODY);

into your test harness, this will get libcoap to do the assembly of the body of data for you, returning error codes as appropriate.

That said, with COAP_BLOCK_SINGLE_BODY enabled, response 5.00 is returned for packet 1 and 4 which is returned by the application layer of my coap-server which needs to be looked at.

Please see #1362 where the libcoap library coap_handle_request_put_block() has been re-written / updated to handle the corner cases that your packets were testing.

Please note that if coap_context_set_block_mode() is not called at all in your server test harness, or just coap_context_set_block_mode(COAP_BLOCK_USE_LIBCOAP) is called, all request packets will get passed to the application layer where it is the responsibility of the application layer to decide what to do with missing or re-ordered packets.

With coap_context_set_block_mode(COAP_BLOCK_USE_LIBCOAP | COAP_BLOCK_SINGLE_BODY), the libcoap layer is doing all he necessary checking.

Please update us with any issues found with #1362.

Thank you for the quick fix. I will look into this and let you know as soon as possible.

@bathooman Can this be closed now?

Yes. Let's close this for now. Thanks.