mtbsteve/redtail

Unable to establish MAVLink communication Apsync.

rbachtell opened this issue · 28 comments

This is in response to the comment I left in the wrong place. I thought that comment was adequately specific and detailed, but the short story is this: I have refreshed and then reloaded Apsync six times, following the wiki accurately, and I am unable to establish communication through the serial port on the development board. All components are known to work properly and the connections are correct. My earlier comment also explained that I have been using Apsync successfully since the beginning of the Apsync project and that all of the equipment has been tested before.

Randy, it’s not possible for me to identify the issue without further information about what you have installed and which wiki you are following. The Redtail wiki instructions assume you already have a working APsync layer installed; it is not included in this repo. Redtail neither changes nor overwrites anything related to APsync.

Also, it doesn’t matter which version of APsync you use: my version, the original version by peterbarker, or whichever version you had up and running before and are familiar with. As a bare minimum, the AP must work and you must be able to communicate with the FC through mavproxy:

ssh into device
ssh apsync@10.0.1.128

Check "screen -list" includes mavlink-router, DataFlashLogger, cherrypy:
screen -list
There are screens on:
1754.cherrypy (12/08/2017 03:00:54 AM) (Detached)
1733.DataFlashLogger (12/08/2017 03:00:54 AM) (Detached)
1704.mavlink-router (12/08/2017 03:00:54 AM) (Detached)
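
The check above can also be done mechanically. The following is only a sketch: `missing_screens` is a made-up helper name, and it assumes the session lines look like the sample output above (`<pid>.<name>` followed by whitespace). It reads `screen -list` output on stdin and prints every expected session name that is absent.

```shell
#!/bin/sh
# Hypothetical helper: reads `screen -list` output on stdin and prints each
# expected session name that does not appear in it.
# The three names are the ones listed in the check above.
missing_screens() {
    listing=$(cat)
    for name in mavlink-router DataFlashLogger cherrypy; do
        # Session lines are assumed to look like "<pid>.<name> (<date>) (Detached)".
        if ! printf '%s\n' "$listing" | grep -Eq "^[[:space:]]*[0-9]+\.${name}[[:space:]]"; then
            printf '%s\n' "$name"
        fi
    done
}
```

Run it as `screen -list | missing_screens`. No output means all three sessions were found.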

Check APWeb is running:
sudo screen -list
There is a screen on:
1765.apweb (12/08/2017 03:00:54 AM) (Detached)

Ensure mavproxy starts on the apsync image:
mavproxy.py --master :14550 --source-system=56
Make sure you can fetch parameters:
param fetch
param status

If this doesn’t work, check your env, in particular the setting for the telemetry serial port. (I assume you have already checked the wiring between the serial port and the Pixhawk.) If you are using the J17 connector on the Nvidia TX2 development board, edit the config.env file and set
export TELEM_SERIAL_PORT=/dev/ttyTHS2
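
If you prefer not to edit the file by hand, something along these lines should work. It is a sketch only: `set_telem_port` is a made-up helper, and it assumes the setting sits on its own line in exactly the `export TELEM_SERIAL_PORT=...` form shown above. The location of config.env is not given here, so the path is passed in as an argument.

```shell
#!/bin/sh
# Hypothetical helper: sets (or appends) the TELEM_SERIAL_PORT export in a
# config.env-style file.
# Usage: set_telem_port <path-to-config.env> <device>
set_telem_port() {
    file=$1
    port=$2
    if grep -q '^export TELEM_SERIAL_PORT=' "$file"; then
        # Rewrite the existing line. '|' is the sed delimiter because
        # device paths contain '/'.
        tmp=$(mktemp) || return 1
        sed "s|^export TELEM_SERIAL_PORT=.*|export TELEM_SERIAL_PORT=$port|" "$file" > "$tmp" \
            && cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        # No existing setting: append one.
        printf 'export TELEM_SERIAL_PORT=%s\n' "$port" >> "$file"
    fi
}
```

For the development board that would be `set_telem_port <path-to-config.env> /dev/ttyTHS2`. Keep a copy of the original file before trying it.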

If it still doesn’t work, either your hardware has an issue or something in your installation of mavlink got screwed.

If it is working, THEN you can continue with installing the Redtail components, ROS, the DNNs, etc., as explained in the wiki of this repo.

OK - if you can't even ssh into the TX2, your problem is within the very first installation steps.
What is the error message when you run ssh apsync@10.0.1.128 from a connected client PC?
In order to help, can you provide the following information:

  1. L4T release:
    Run: head -n 1 /etc/nv_tegra_release and paste the output.

  2. User setup: The TX2 should boot and automatically log on as apsync. If this is not the case, provide the following information:

  • Run: cat /etc/hosts and paste the output.
  • Run: groups apsync and paste the output.
  • Run: cat /etc/gdm3/custom.conf and paste the output.

  3. Network setup: The TX2 should start a Wi-Fi access point with the SSID ardupilot and the password ardupilot. If you can't see the AP from another computer or cannot connect to the ardupilot network, provide the following information:

  • Run: nmcli device wifi and paste the output.
  • Run: cat /etc/modprobe.d/bcmdhd.conf and paste the output.
  • Run: cat /home/apsync/.profile | grep nmcli and paste the output.

  4. ssh:

  • Connect to the ardupilot AP from another computer and run ssh apsync@10.0.1.128
    When you connect for the first time, you will get an authenticity warning. Just confirm with yes to continue.
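
To save some copy and paste, the commands from items 1 to 3 can be collected into one report. This is a rough sketch, not part of APsync or Redtail: `run_and_label` and `collect_info` are made-up names. Each command is printed under a header so it is clear which output belongs to which command, and a failing command is noted instead of aborting the rest.

```shell
#!/bin/sh
# Hypothetical helper: runs one command line and prints its output under a
# header. A non-zero exit status is reported rather than stopping the report.
run_and_label() {
    printf '===== %s =====\n' "$1"
    sh -c "$1" 2>&1 || printf '(command exited with status %s)\n' "$?"
}

# Hypothetical wrapper around the commands requested in items 1 to 3.
collect_info() {
    run_and_label 'head -n 1 /etc/nv_tegra_release'
    run_and_label 'cat /etc/hosts'
    run_and_label 'groups apsync'
    run_and_label 'cat /etc/gdm3/custom.conf'
    run_and_label 'nmcli device wifi'
    run_and_label 'cat /etc/modprobe.d/bcmdhd.conf'
    run_and_label 'grep nmcli /home/apsync/.profile'
}
```

Run `collect_info > report.txt` on the TX2 and paste the contents of report.txt (the file name is arbitrary).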

Ha that’s an easy one. That’s basic Linux networking and has nothing to do with APsync. Google knows that problem very well 😊
Here is a solution:
https://stackoverflow.com/questions/20840012/ssh-remote-host-identification-has-changed

Reason: you connected your Mac to an AP “ardupilot” before. Now you’re trying it again, but since it’s a different system image, the host key is different and doesn’t match the one stored on your Mac. ssh sees a potential security threat and blocks access.

Just to add: thanks for sharing the other information. That all looks good. So once you get the ssh stuff working, we can look at mavlink.

OK Randy, let's take a step-by-step approach.

  1. Can you ssh in now from any other machine, including your Mac?
    If not, you need to run ssh-keygen -R 10.0.1.128 on the respective machine, as mentioned above, to remove the previous host key stored there, e.g. on your Mac.

  2. On the TX2, please run: head -n 1 /etc/nv_tegra_release and paste the output.

  3. On the TX2, what does screen -list display? You should get:

apsync@apsync:~$ screen -list
There are screens on:
	5959.cherrypy	(12.03.2020 08:40:23)	(Detached)
	5870.DataFlashLogger	(12.03.2020 08:40:23)	(Detached)
	5759.mavlink-router	(12.03.2020 08:40:23)	(Detached)
3 Sockets in /run/screen/S-apsync.

If you cannot see those 3 screens, please run cat /etc/rc.local and paste the output.

  4. Check for the apweb process: on the TX2, please run ps -ef | grep [w]eb_server and paste the output.

  5. mavlink:
    Can you please run:
    ps -ef | grep [r]outerd and paste the result, then run:
    head -n 20 ~/start_mavlink-router/screenlog.0 and paste the output.
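
A note on the [w]eb_server and [r]outerd patterns used above: the brackets keep grep from matching its own entry in the ps output. The regex [r]outerd matches the text "routerd", but grep's own command line contains the literal text "[r]outerd", which that regex does not match. A small sketch of the idea follows, where `count_procs` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: counts the lines on stdin that match a pattern.
# With the pattern '[r]outerd' the regex matches "routerd", while the grep
# command line itself shows up in ps as "[r]outerd", which the regex does
# not match, so grep's own entry is not counted.
count_procs() {
    # grep -c exits non-zero when the count is 0; `|| true` hides that.
    grep -c -- "$1" || true
}
```

Use it as `ps -ef | count_procs '[r]outerd'`. A result of 0 means the process is not running. Quoting the pattern is the safer habit, because an unquoted [r]outerd could be expanded by the shell if a file named routerd exists in the current directory.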

OK, thanks. None of the mandatory APsync processes, including mavlink-router, are running after startup on your TX2. That explains why mavproxy is not working: it can't establish the connection (link 1 down). /etc/rc.local, which should start all of these processes at boot, is empty, so something fundamental must have gone wrong during the basic installation steps. However, the fact that /etc/rc.local exists indicates that you completed the install script up to line 51.
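
For anyone who wants to check this on their own system, here is a minimal sketch that treats rc.local as "empty" when it contains nothing but blank lines, comments and a bare exit 0. `rc_local_commands` is a made-up helper name and that definition of "empty" is my assumption.

```shell
#!/bin/sh
# Hypothetical helper: prints the lines of an rc.local-style file that would
# actually run something. Blank lines, comment lines (including the shebang)
# and a bare "exit 0" are dropped.
rc_local_commands() {
    grep -Ev '^[[:space:]]*(#.*)?$|^[[:space:]]*exit[[:space:]]+0[[:space:]]*$' "$1" || true
}
```

`rc_local_commands /etc/rc.local` prints nothing if no services are launched from that file.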

To resolve this, please log on to the TX2 as the apsync user and follow the steps listed in the install instructions EXACTLY, line by line, from line 55 onwards:
https://github.com/mtbsteve/companion/blob/7e150e238f933a2699ec20f1d53f998a40f4e371/Nvidia_JTX2_JP42/Ubuntu/1_create_base_image_tx2_JP42.txt#L55

Execute each line manually, one by one, and check for any error messages. As written in the instructions at line 68, set TELEM_SERIAL_PORT=/dev/ttyTHS2 for the development board. (By default it's set to ttyTHS1 for the Auvidea carrier board.)

Run through all the listed steps and then retest. If any errors appear during the install, please stop and post the error here.

Thanks. That's the content of your ~/GitHub/companion/Nvidia_JTX2_JP42/Ubuntu directory. There was a typo in your ls -l command. I wanted to check the content of the directory /home/apsync/start_mavlink-router to see if at least something got installed on your machine.

Much better. It looks like you now have a screenlog.0 file, which indicates some activity. Coming back to what I asked before, could you run the following on the TX2:
screen -list and paste the output
cat /etc/rc.local and paste the output.
ps -ef | grep [r]outerd and paste the result, then run:
head -n 20 ~/start_mavlink-router/screenlog.0 and paste the output.

Good. That's what I said: without the services launched from /etc/rc.local it simply cannot work. I am closing this issue.