EK80 Raw files - "Something went wrong"
This file, and the others in the directory linked below, fail during processing.
https://oceaninsightscience.file.core.windows.net/hidata/cruise_data/2021/D20210811-T093822.raw
Error message: "ERROR: Something went wrong when reading the RAW file: /datain/D20210813-T093822.raw (<class 'ValueError'>)
None"
The command was run on a Windows machine like this:
docker run -it --name test_pyechopreprocess -v d:\WRK:/datain -v d:\WRK\OUT:/dataout --security-opt label=disable --env OUTPUT_TYPE=zarr --env MAIN_FREQ=38000 --env MAX_RANGE_SRC=500 --env OUTPUT_NAME=S2020842 --env WRITE_PNG=0 crimac/preprocessor
The 200kHz on Statsraad Lehmkuhl is a single beam echosounder, whereas the 38kHz is a split beam. The split beam echosounder provides the split beam angles in the data structures. These will be missing for the single beam. Could that be a possible explanation?
The 200kHz also has sequential pinging between FM and CW, and since the preprocessor does not handle FM data yet, this may also cause the crash. A possible patch is to make a version where we can opt out of a transducer during the conversion, and rerun for all channels when we have code that can handle FM.
It seems that the raw file contains a RAW4 datagram format that is not (yet?) defined anywhere (cf. https://www.simrad.online/ek80/interface_en/default.htm).
pyEcholab refuses to process this file because of the unknown format, and that leads to the error in the preprocessor.
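To check which datagram types a file actually contains, something like the following could be used. This is a minimal sketch, not pyEcholab code: it assumes the usual Simrad framing (a 4-byte little-endian length, a body that starts with a 4-character type code and an 8-byte timestamp, then the length repeated), and the names `count_datagram_types` and `make_datagram` are made up for illustration.

```python
import io
import struct
from collections import Counter


def count_datagram_types(stream):
    """Count datagram type codes in a binary stream.

    Assumed framing: <int32 length><body of `length` bytes><int32 length>,
    where the body begins with a 4-character ASCII type code.
    """
    counts = Counter()
    while True:
        head = stream.read(4)
        if len(head) < 4:
            break  # clean end of stream
        (length,) = struct.unpack("<l", head)
        if length < 12:
            break  # too short for type + timestamp: stop rather than guess
        body = stream.read(length)
        if len(body) < length:
            break  # truncated datagram
        counts[body[:4].decode("ascii", errors="replace")] += 1
        stream.read(4)  # skip the trailing copy of the length
    return counts


def make_datagram(dgram_type, payload=b""):
    """Build a synthetic datagram (zeroed timestamp) for trying the scanner."""
    body = dgram_type.encode("ascii") + bytes(8) + payload
    size = struct.pack("<l", len(body))
    return size + body + size


if __name__ == "__main__":
    # Synthetic data only; replace with open(path, "rb") for a real file.
    sample = b"".join(
        make_datagram(t, b"\x00" * 16) for t in ("XML0", "RAW3", "RAW4", "RAW3")
    )
    print(count_datagram_types(io.BytesIO(sample)))
```

Running this on a real file would show whether RAW4 is the only unexpected type and how many such datagrams there are, without going through the full parser.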
Alright, did some trial and error runs with this.
Firstly, I tried to ignore the RAW4 datagram format completely (iambaim/pyEcholab@af2f278). However, this leads to the 200kHz CW channel containing only a single ping, while the 38kHz CW and 200kHz FM channels had 7436 and 3600 pings, respectively.
Secondly, I tried to treat the RAW4 datagram as another RAW3 datagram with a different header (iambaim/pyEcholab@98d4d98). Now we have 7436, 3836, and 3600 pings for the 38kHz CW, 200kHz CW, and 200kHz FM channels, respectively.
The latter fits with Nils Olav's explanation in #20, where we should have the time-interleaved 200kHz CW and 200kHz FM channel data in a single raw file produced by the OneOcean ship.
I'll re-open the issue for the time being so that others can give comments.
Ibrahim's second fix is the way to go. Since the unpacking seems to work without issue, the RAW4 datagram must be identical through the count field. New fields, if any, would have to come after count. I've pulled those changes into my branch.
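As a rough sketch of what "identical through the count field" means in practice, the shared prefix can be unpacked with a fixed struct and everything after it left untouched. The field layout here is an assumption based on my reading of the published RAW3 description (type, timestamp, 128-byte channel id, data type, spare, offset, count) and should be checked against the interface specification; `PREFIX`, `FIELDS`, and `unpack_prefix` are illustrative names, not pyEcholab's.

```python
import struct

# Assumed common prefix of RAW3 and RAW4, little-endian, no padding:
# type, low_date, high_date, channel_id, data_type, spare, offset, count
PREFIX = struct.Struct("<4sLL128sh2sll")
FIELDS = (
    "type", "low_date", "high_date", "channel_id",
    "data_type", "spare", "offset", "count",
)


def unpack_prefix(body):
    """Unpack the fields up to and including `count`.

    Returns (header, rest). For RAW3, `rest` would be the sample data.
    For RAW4, `rest` may begin with extra, still unknown header fields,
    so it is returned as opaque bytes.
    """
    header = dict(zip(FIELDS, PREFIX.unpack_from(body)))
    header["type"] = header["type"].decode("ascii", errors="replace")
    header["channel_id"] = (
        header["channel_id"].split(b"\x00", 1)[0].decode("ascii", errors="replace")
    )
    return header, body[PREFIX.size:]


if __name__ == "__main__":
    # Synthetic body: prefix followed by 20 bytes standing in for the remainder.
    body = PREFIX.pack(b"RAW4", 0, 0, b"WBT 200 CW", 1, b"\x00\x00", 0, 5) + b"\xaa" * 20
    header, rest = unpack_prefix(body)
    print(header["type"], header["channel_id"], header["count"], len(rest))
```

If the RAW4 format turns out to add fields after `count`, only the handling of `rest` would need to change; the prefix definition could stay as it is.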
When we get more information on the format of the RAW4 datagram, we can extend the header definition if required and determine what changes would need to be made to the parser.