pathfinder-for-autonomous-navigation/FlightSoftware

Downlink Shift Crashes FSW on HITL and x86 Linux

shihaocao opened this issue · 5 comments

Using the specific uplink below (chosen because #863 is a confounding issue), we observe that certain downlink flow shifts cause a std::move assignment to segfault FSW.

Note that shifting flow 36 with flow 2 does not cause a segfault!! Therefore, some parts of flow shifting work...
Note further that this issue affects both HITL and x86 Linux.

Linux Version:

(venv) shihao@shihao-T490V2:~/Code/PAN/FlightSoftware$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.3 LTS
Release:	20.04
Codename:	focal
(venv) shihao@shihao-T490V2:~/Code/PAN/FlightSoftware$ uname -r
5.10.0-1057-oem
(venv) shihao@shihao-T490V2:~/Code/PAN/FlightSoftware$ 

Observed on EDU running fsw_teensy36_hitl_leader

THIS ISSUE DOES NOT AFFECT x86 MacOS???

Example uplink that crashes FSW:

[
    {
    "field": "radio.max_wait",
    "value": 10420
    },
    {
    "field": "radio.max_transceive",
    "value": 10
    },
    {
    "field": "adcs_cmd.havt_disable18",
    "value": true
    },
    {
    "field": "gomspace.piksi_off",
    "value": true
    },
    {
    "field": "dcdc.disable_cmd",
    "value": true
    },
    {
    "field": "gomspace.power_cycle_output4_cmd",
    "value": true
    },
    {
    "field": "downlink.shift_id1",
    "value": 19
    },
    {
    "field": "downlink.shift_id2",
    "value": 2
    },
    {
    "field": "pan.state",
    "value": 1  
    },
    {
    "field": "adcs.state",
    "value": 5
    },
    {
    "field": "adcs_monitor.wheel1_fault.suppress",
    "value": true
    },
    {
    "field": "adcs_monitor.wheel2_fault.suppress",
    "value": true
    },
    {
    "field": "adcs_monitor.wheel3_fault.suppress",
    "value": true
    },
    {
    "field": "adcs_monitor.wheel_pot_fault.suppress",
    "value": true
    },
    {
    "field": "gomspace.low_batt.suppress",
    "value": true
    },
    {
    "field": "attitude_estimator.fault.suppress",
    "value": true      
    }
]

After applying the uplink above, we see the following log output:

[2022-02-09 22:36:23.518867] (ERROR) Post update fields

[2022-02-09 22:36:23.580867] (ERROR) DP pre execute
[2022-02-09 22:36:23.580867] (ERROR) DP pre flows
[2022-02-09 22:36:23.580867] (ERROR) Pre shift flows
[2022-02-09 22:36:23.580867] (ERROR) B1
[2022-02-09 22:36:23.580867] (ERROR) B2
[2022-02-09 22:36:23.580867] (ERROR) idx1 18, idx2 1
[2022-02-09 22:36:23.580867] (ERROR) loop1
[2022-02-09 22:36:23.580867] (ERROR) i: 18, i-1: 17
[2022-02-09 22:36:23.580867] (ERROR) pre_swap sr i: 19, i-1 18
[2022-02-09 22:36:23.580867] (ERROR) Copy operator called.
[2022-02-09 22:36:23.580867] (ERROR) Attempting deserialize
[2022-02-09 22:36:23.580867] (ERROR) Assign is_active
[2022-02-09 22:36:23.580867] (ERROR) Assign with std::move
[2022-02-09 22:36:23.580867] (ERROR) Assign field_list
[2022-02-09 22:36:23.580867] (ERROR) Copy operator finished.
[2022-02-09 22:36:23.580867] (ERROR) post_swap sr i: 18, i-1 19
[2022-02-09 22:36:23.580867] (ERROR) i: 17, i-1: 16
[2022-02-09 22:36:23.580867] (ERROR) pre_swap sr i: 19, i-1 17
[2022-02-09 22:36:23.580867] (ERROR) Copy operator called.
[2022-02-09 22:36:23.580867] (ERROR) Attempting deserialize
[2022-02-09 22:36:23.580867] (ERROR) Assign is_active
[2022-02-09 22:36:23.580867] (ERROR) Assign with std::move
Device FlightController exited with status -6.
Device FlightController exited with status -6.
Device FlightController exited with status -6.
Device FlightController exited with status -6.
Device FlightController exited with status -6.
Device FlightController exited with status -6.
Device FlightController exited with status -6.
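
A note on the exit status: if the test harness reports signals the way Python's subprocess module does (a return code of -N means the child was terminated by signal N), then status -6 would be SIGABRT rather than SIGSEGV, which is signal 11 on Linux. That is an assumption about how ptest reports the status, not something confirmed here. The helper below is hypothetical and only illustrates the mapping:

```cpp
#include <cassert>
#include <csignal>
#include <string>

// Hypothetical helper (not part of FSW or ptest): interprets an exit status
// under the "negative value = terminated by that signal" convention.
inline std::string signal_name_from_status(int status) {
    if (status >= 0) {
        return "not a signal";  // ordinary exit code
    }
    switch (-status) {
        case SIGABRT: return "SIGABRT";  // abort(), failed assert, uncaught exception
        case SIGSEGV: return "SIGSEGV";  // invalid memory access
        default:      return "other signal";
    }
}
```

If that convention holds, the process is being aborted (for example by a failed assertion, an uncaught exception, or a runtime consistency check) rather than faulting directly on a bad pointer, which may be worth keeping in mind while debugging.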

Corresponding code in DownlinkProducer:

Flow& operator=(const Flow& rhs) {
    printf(debug_severity::error, "Copy operator called.");
    unsigned char flow_id;
    printf(debug_severity::error, "Attempting deserialize");
    rhs.id_sr.deserialize(&flow_id);

    printf(debug_severity::error, "Assign is_active");
    is_active = rhs.is_active;

    // Last message logged before the crash.
    // rhs is const, so std::move(rhs.id_sr) yields a const rvalue here.
    printf(debug_severity::error, "Assign with std::move");
    id_sr = std::move(rhs.id_sr);

    printf(debug_severity::error, "Assign field_list ");
    field_list = rhs.field_list;

    printf(debug_severity::error, "Copy operator finished.");
    return *this;
}

The crash occurs in the std::move assignment: "Assign with std::move" is the last message logged, during the second swap (i = 17).
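
One detail that may help narrow this down: because rhs is a const reference, std::move(rhs.id_sr) produces a const rvalue, which cannot bind to a move assignment operator and selects copy assignment instead. The sketch below demonstrates this with a hypothetical Tracker type standing in for the real type of id_sr (which is not shown in this issue). It does not explain the crash by itself; it only shows which operator that line would invoke.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the type of id_sr. It only counts which
// assignment operator was selected on the destination object.
struct Tracker {
    int copies = 0;
    int moves = 0;
    Tracker() = default;
    Tracker(const Tracker&) = default;
    Tracker& operator=(const Tracker&) { ++copies; return *this; }
    Tracker& operator=(Tracker&&) noexcept { ++moves; return *this; }
};

// Same shape as the assignment in Flow::operator=: the source is const.
inline void assign_from_const(Tracker& dst, const Tracker& src) {
    dst = std::move(src);  // const Tracker&& binds to the copy assignment
}

// For comparison: a non-const source lets the move assignment be chosen.
inline void assign_from_mutable(Tracker& dst, Tracker& src) {
    dst = std::move(src);
}
```

Calling assign_from_const increments copies on the destination, while assign_from_mutable increments moves. So the failing line is, in effect, a plain copy assignment of id_sr, and the copy path of that type is the place to look.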

Shift loop in DownlinkProducer:

if (idx1 > idx2) {
    printf(debug_severity::error, "loop1");

    for (size_t i = idx1; i > idx2; i--) {
        printf(debug_severity::error, "i: %d, i-1: %d", i, i - 1);

        unsigned char i_id, i2_id;
        flows[i].id_sr.deserialize(&i_id);
        flows[i-1].id_sr.deserialize(&i2_id);

        printf(debug_severity::error, "pre_swap sr i: %d, i-1 %d", i_id, i2_id);
        std::swap(flows[i], flows[i-1]);

        flows[i].id_sr.deserialize(&i_id);
        flows[i-1].id_sr.deserialize(&i2_id);
        printf(debug_severity::error, "post_swap sr i: %d, i-1 %d", i_id, i2_id);
    }
}
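
To make the intended behavior of the loop concrete, here is a minimal sketch of the same adjacent-swap shift on a plain std::vector<int>. The function name shift_element is hypothetical, and the idx1 < idx2 branch is an assumption, since only the idx1 > idx2 direction is shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Moves the element at idx1 to position idx2 by swapping adjacent elements.
// The idx1 > idx2 branch mirrors the loop above; the other branch is an
// assumed mirror image for the opposite direction.
inline void shift_element(std::vector<int>& v, std::size_t idx1, std::size_t idx2) {
    assert(idx1 < v.size() && idx2 < v.size());
    if (idx1 > idx2) {
        for (std::size_t i = idx1; i > idx2; i--) {
            std::swap(v[i], v[i - 1]);
        }
    } else {
        for (std::size_t i = idx1; i < idx2; i++) {
            std::swap(v[i], v[i + 1]);
        }
    }
}
```

With idx1 = 18 and idx2 = 1, as in the log, this performs 17 swaps. The log shows the first swap completing and the second one failing, so the loop structure itself appears sound and the problem is more likely in what each swap does to a Flow.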

Debugging tooling pushed to bugfix/uplink

Testing with: python -m ptest runsim -c ptest/configs/hootl.json -t SingleSatDetumbleCase --clean
With the branch in its current state, uplinks are sent through the Swagger UI at localhost:8001/swagger.