scottmuc/infrastructure

Rebuild Raspberry PI - Post git.scottmuc.com and Pippin rename

Closed this issue · 8 comments

Yay for Repaving!

As much as possible is documented inline in this issue template. In case of problems you may find help by viewing
all the previous repave issues. Have fun!

Things to do with the existing build

  • Enable DHCP on the router, remove the port mapping, and statically assign network settings to the PC

    Instructions

    Insert screenshots here ;-)

  • Shutdown PI

    Instructions

    Make sure the USB drive has spun down before doing any work.

    sudo shutdown -h now

  • Create SD card with the latest Raspberry Pi OS

    Instructions

    Use the SD card from the now powered-down PI.

    The new installer has options to enable SSH and create a user.

    installer download

    Note: check whether the underlying Debian distribution is changing, as this might
    cause issues in the playbook execution.

    The Bookworm 64-bit lite image seems to work for now. Note: as of v1.8.4 of
    the Imager software, make sure not to select "No filtering" in the Raspberry Pi Device
    filter.

Post OS install steps on desktop

  • Ensure a working ansible environment

    Instructions

    This will exercise the asdf setup.

  • Turn on the PI and note the IP obtained from the Router

  • Clean up old host keys

    Instructions

    The new instance will have new host keys so to ensure host key warning messages don't
    distract us from the repaving, run the following:

    ssh-keygen -R 192.168.2.10
    ssh-keygen -R pi
    ssh-keygen -R pi.home.scottmuc.com
    
  • Transfer local public ssh key to PI

    Instructions

    To avoid the use of sshpass, copy the current session's public SSH key to
    ~/.ssh/authorized_keys of the pi user on the PI. This user is only needed to
    run the bootstrap playbook (which creates an admin ansible user) and will be
    cleaned up afterwards.

    ssh-copy-id pi@<pi ip>

  • Bootstrap with Ansible

    Instructions

    ./ansible.sh and select the bootstrap-playbook.yml

  • Add the PI port forwarding

    Instructions

    Needed for the certbot ACME challenge in the next step.

  • Complete full configuration

    Instructions

    ./ansible.sh and select the main-playbook.yml

  • Reboot PI

  • Re-add port mapping to the static IP

  • Disable DHCP on the router

  • Deploy goodenoughmoney.com

  • Clean up host key for ephemeral IP

    Instructions

    Remove host key reference to the temporary IP that was used to bootstrap the
    device. This cleanup will ensure that an error won't occur in the next refresh
    if the same IP is used again.

    ssh-keygen -R <ephemeral IP>
    
  • Make this template slightly better
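
The two host key clean-up steps in the checklist above could be folded into one small helper. This is only a sketch: the `run` and `forget_hosts` functions and the `DRY_RUN` switch are hypothetical and not part of this repo; the only real command used is `ssh-keygen -R`.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repo: forget the host keys for every
# name or IP the PI has been reachable under.

# Set DRY_RUN=1 to print the commands instead of running them (an assumption
# added here so the helper can be tried out safely).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

forget_hosts() {
    for host in "$@"; do
        # ssh-keygen -R removes all keys for the given host from known_hosts.
        run ssh-keygen -R "$host"
    done
}
```

Before the repave it would be called as `forget_hosts 192.168.2.10 pi pi.home.scottmuc.com`, and at the end once more with the ephemeral IP as the only argument.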

How Do I Know I Am Done?

Preliminary Context

This is the first repave after #68, #69, and #71. Verifying that the self-hosted Git server is working will be something to watch for. Also, as part of #72, the device will now have the hostname of pippin, so I'll be on the lookout for anything that trips me up on that front.

Attached is the output from ssh ansible@192.168.2.10 -- "cat /etc/os-release; uname -a; dpkg -l" > state.txt

state.txt
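
To make the before and after states easier to compare, the package list can be pulled out of a state file like the one attached. A minimal sketch, assuming the file contains plain `dpkg -l` output as captured above; the `installed_packages` function name is hypothetical.

```shell
# Hypothetical helper: print "package version" for every installed package
# recorded in a state file captured with the ssh command above.
installed_packages() {
    # In `dpkg -l` output, rows for installed packages start with "ii";
    # field 2 is the package name and field 3 is the version.
    # The os-release and uname lines never start with "ii", so they are skipped.
    awk '$1 == "ii" { print $2, $3 }' "$1" | sort
}
```

Running it against the old and new state files and passing both results to `diff` should show which packages changed across the repave.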

First Error

TASK [Mount music share from Windows PC] *************************************************************************
fatal: [192.168.2.102]: FAILED! => {"changed": false, "msg": "Error mounting /mnt/music: mount error(113): could not connect to 192.168.2.12Unable to find suitable address.\n"}

This is because I ran ipconfig /release and ipconfig /renew on the machine from which I'm invoking ./ansible.sh to repave the PI. Since the device I'm repaving was responsible for giving this PC the IP 192.168.2.12, my PC is no longer using that IP. I also forgot to note this as a significant change since the last repave. That change created a pretty strong dependency on my PC keeping this IP.

This SMB mount replaces the previous exFAT USB drive that was attached to the PI.

I'm not going to think about fixing this elegantly and will just skip this mount for now. This will cause navidrome to fail to run, but shouldn't block the repave script from going further.

Second Error

root@pippin:~# systemctl status nginx
× nginx.service - A high performance web server and a reverse proxy server
     Loaded: loaded (/lib/systemd/system/nginx.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Sat 2024-06-08 19:16:38 BST; 30s ago
       Docs: man:nginx(8)
    Process: 2506 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=1/FAIL>
        CPU: 22ms

Jun 08 19:16:38 pippin systemd[1]: Starting nginx.service - A high performance web server and a reverse proxy ser>
Jun 08 19:16:38 pippin nginx[2506]: 2024/06/08 19:16:38 [emerg] 2506#2506: host not found in upstream "pi.home.sc>
Jun 08 19:16:38 pippin nginx[2506]: nginx: configuration file /etc/nginx/nginx.conf test failed
Jun 08 19:16:38 pippin systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
Jun 08 19:16:38 pippin systemd[1]: nginx.service: Failed with result 'exit-code'.
Jun 08 19:16:38 pippin systemd[1]: Failed to start nginx.service - A high performance web server and a reverse pr>

This occurred while I was trying to bring back the smb mount.

The first run of the automation worked because nginx started before an invalid configuration was placed in /etc/nginx/sites-enabled/home.scottmuc.com. After the reboot, nginx was not running due to the invalid configuration so this resource was attempting to start the service. But due to the imperative ordering in the playbook, the fix would come AFTER (the fix is in vhost tasks).

I swapped the order to get the play to run to completion, then swapped them back and reran to ensure the automation still functions.

Self hosted Git survives the repave

The above commit was pushed to git.scottmuc.com before syncing with GitHub! I think the self-hosted git repaved without a hitch. There's the usual clearing of ~/.ssh/known_hosts locally, but that's pretty routine.

Third Set of Errors

The urls:

no longer resolve since I've removed public resolution of private IPs (2778c99). Since I haven't really finished the whole hostname migration thing, I'm swapping FQDNs to IPs for the time being. I had to update the configuration in grafana (not in ansible, via the admin UI) as well. The next commit will have an update to the repave validation.
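
A check along these lines could flag names that have stopped resolving before they break a service. It is a sketch only and not the actual repave validation in this repo: the `resolves` and `unresolved_hosts` functions are hypothetical, and it assumes `getent` is available on the machine running the check.

```shell
# Hypothetical check: succeed if the name resolves via the system resolver.
# Assumes getent is installed on the machine running the check.
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Print each given hostname that does not resolve, and return non-zero
# if at least one of them failed.
unresolved_hosts() {
    failed=0
    for host in "$@"; do
        if ! resolves "$host"; then
            echo "$host"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `unresolved_hosts git.scottmuc.com pi.home.scottmuc.com` prints whichever of the two names fails to resolve from the machine it runs on.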

Repave Complete

A bit bumpier than previous ones, but I had made significant changes to the system in the last 3 months. All the fixes were straightforward, for me at least. These issues were less about the outside world changing and more about my own changes opening up opportunities for repave issues.

Once again, this reminds me that repaving drives out a lot more automation bugs than re-running automation after a patch.

Attached is the new machine details:

repave.txt

Calling this done... will probably follow up with some potential improvements tomorrow, but as of now, this is a functional pi... I mean pippin.

New Machine Info

PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Linux pippin 6.6.31+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.31-1+rpt1 (2024-05-29) aarch64 GNU/Linux

state.txt