
Ansible Connection Plugin for systemd-nspawn


systemd-nspawn Connection Plugin for Ansible


Ansible connection plugin for remote systemd-nspawn containers. Execute tasks inside containers without SSH daemons.

Why

  • Your containers don't need SSH daemons
  • You already manage containers with machinectl on remote servers
  • You want the security isolation of nspawn
  • You need to manage containers across your infrastructure from a central location

How It Works

Connection Overview

graph TB
    A[Ansible Controller] --> B[nspawn Plugin]
    B --> C{SSH Connection}
    C --> D[Remote Host]
    D --> E[nsenter -t PID]
    E --> F[nspawn Machine]
    
    B -.-> G[PID Cache<br/>per task]
    C -.-> H[SSH ControlMaster<br/>~/.ansible/cp/]
    
    style A fill:#e3f2fd
    style B fill:#f3e5f5
    style F fill:#e8f5e9

The plugin connects via SSH to your host, then uses nsenter to execute inside nspawn containers. No SSH daemon needed in containers.
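
The two-hop path described above can be sketched in Python as a pair of argv builders. This is a minimal illustration, not the plugin's actual code: the function names `build_nsenter_argv` and `build_ssh_argv` are hypothetical, and the namespace flags are the ones listed under Implementation Details.

```python
import shlex
from typing import List, Optional


def build_nsenter_argv(leader_pid: int, command: str) -> List[str]:
    """Return the argv that runs `command` inside the container's namespaces."""
    return [
        "nsenter", "-t", str(leader_pid),
        "-m", "-u", "-i", "-n", "-p",  # mount, UTS, IPC, network, PID namespaces
        "--", "/bin/sh", "-c", command,
    ]


def build_ssh_argv(host: str, remote_argv: List[str],
                   user: Optional[str] = None) -> List[str]:
    """Wrap a remote argv in an ssh call, quoting it for the remote shell."""
    target = f"{user}@{host}" if user else host
    # ssh passes the command to the remote shell as a single string,
    # so each argument is shell-quoted before joining.
    return ["ssh", target, shlex.join(remote_argv)]


if __name__ == "__main__":
    inner = build_nsenter_argv(12345, "systemctl status nginx")
    print(build_ssh_argv("prod-server.example.com", inner, user="deploy"))
```

Quoting with `shlex.join` matters here because the command crosses two shells: the remote login shell and the `/bin/sh -c` inside the container.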

Quick Start

# inventory.yml
containers:
  hosts:
    web:
      ansible_host: prod-server.example.com
      ansible_connection: nspawn
      ansible_nspawn_container: web
      ansible_pipelining: true # if not already set under [defaults] in ansible.cfg

# playbook.yml
- hosts: web
  tasks:
    - name: Reality check
      command: systemctl status nginx

Installation

# In your project
mkdir -p ./plugins/connection
git clone https://github.com/Gurpartap/ansible-nspawn ./plugins/connection/nspawn

# In ansible.cfg
[defaults]
connection_plugins = ./plugins/connection
pipelining = True

Test the connection

# One-liner test without inventory file
ansible all -i 'web,' -m ping -e ansible_host=prod-server.example.com -e ansible_connection=nspawn -e ansible_nspawn_container=web

# With inventory file
ansible web -i inventory.yml -m ping

# Debug what's happening
ansible web -i inventory.yml -m command -a "hostname" -vvv

# Run your playbook
ansible-playbook -i inventory.yml playbook.yml

Requirements

  • Remote host machine with systemd-nspawn containers
  • SSH access to the remote host
  • nsenter (from util-linux) and machinectl (from systemd-container) on the remote host
  • Ansible controller (your local machine) with SSH client

Configuration

| Variable | Description | Required | Default |
| --- | --- | --- | --- |
| ansible_connection | Must be nspawn | Yes | - |
| ansible_nspawn_container | Container name as shown in machinectl list | Yes | - |
| ansible_host | Remote host machine running the container | Yes | - |
| ansible_user | SSH user for remote host | No | Current user |
| ansible_port | SSH port for remote host | No | 22 |
| ansible_ssh_common_args | Additional SSH arguments (e.g., -o ProxyJump=bastion) | No | '' |
| ansible_pipelining | Enable pipelining for 3x performance | No | false |
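
As a rough illustration of how the required variables and defaults in this table fit together, the following Python sketch validates a host's variables and fills in the documented defaults. It is an assumption-laden example rather than the plugin's real option handling: `resolve_host_vars`, `REQUIRED_VARS`, and `DEFAULTS` are hypothetical names, and `ansible_user` is left unset so that SSH falls back to the current user.

```python
from typing import Any, Dict

# Variables marked "Yes" in the Required column.
REQUIRED_VARS = ("ansible_connection", "ansible_nspawn_container", "ansible_host")

# Defaults from the table; ansible_user is omitted (SSH uses the current user).
DEFAULTS: Dict[str, Any] = {
    "ansible_port": 22,
    "ansible_ssh_common_args": "",
    "ansible_pipelining": False,
}


def resolve_host_vars(hostvars: Dict[str, Any]) -> Dict[str, Any]:
    """Check required variables and merge in the documented defaults."""
    missing = [name for name in REQUIRED_VARS if not hostvars.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    if hostvars["ansible_connection"] != "nspawn":
        raise ValueError("ansible_connection must be 'nspawn'")
    return {**DEFAULTS, **hostvars}  # explicit host values override defaults


if __name__ == "__main__":
    print(resolve_host_vars({
        "ansible_connection": "nspawn",
        "ansible_nspawn_container": "web",
        "ansible_host": "prod-server.example.com",
    }))
```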

Architecture

Command Execution Flow

sequenceDiagram
    participant A as Ansible Controller
    participant P as nspawn Plugin
    participant H as Remote Host
    participant C as Container
    
    Note over A,H: First connection (PID lookup)
    A->>P: Execute task
    P->>H: ssh user@host "machinectl show container"
    H-->>P: Leader=12345
    P->>P: Cache PID for this task
    
    Note over A,C: Actual task execution
    P->>H: ssh user@host "nsenter -t 12345 -m -u -i -n -p -- /bin/sh -c 'command'"
    H->>C: Enter namespaces & run
    C-->>A: Result
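
The PID lookup step in the diagram returns a line such as `Leader=12345`. A minimal Python sketch of parsing that output follows; `parse_leader_pid` is a hypothetical helper, and rejecting non-positive PIDs is an assumption made for safety rather than documented plugin behavior.

```python
def parse_leader_pid(output: str) -> int:
    """Extract the leader PID from `machinectl show NAME --property=Leader` output."""
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if key == "Leader" and sep:
            pid = int(value)  # raises ValueError if the value is not numeric
            if pid <= 0:
                # Assumption: a non-positive PID means there is nothing to enter.
                raise ValueError(f"unusable leader PID: {value}")
            return pid
    raise ValueError("no Leader= line found; is the container running?")


if __name__ == "__main__":
    print(parse_leader_pid("Leader=12345\n"))
```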

File Transfer Flow

sequenceDiagram
    participant A as Ansible
    participant P as Plugin
    participant S as SSH
    participant C as Container
    
    Note over A,C: put_file operation
    A->>P: Transfer file.conf
    P->>P: Read local file
    P->>S: ssh "nsenter -t PID"
    
    alt Directory exists
        S->>C: dd of=/etc/app/file.conf
    else Directory missing
        S->>C: mkdir -p /etc/app
        S->>C: dd of=/etc/app/file.conf
    end
    
    C-->>A: Success
    
    Note over A,C: fetch_file operation
    A->>P: Fetch /var/log/app.log
    P->>S: ssh "nsenter -t PID"
    S->>C: dd if=/var/log/app.log
    C-->>P: File contents
    P->>P: Write local file
    P-->>A: Success

Multi-Container Architecture

graph TB
    A[Ansible Controller] --> B{nspawn Plugin}
    B --> C[Remote: server1]
    B --> D[Remote: server2]
    
    C --> E[Container: web]
    C --> F[Container: db]
    C --> G[Container: cache]
    
    D --> H[Container: worker1]
    D --> I[Container: worker2]
    
    style E fill:#e3f2fd
    style F fill:#fff3e0
    style G fill:#f3e5f5
    style H fill:#e8f5e9
    style I fill:#e8f5e9

Advanced Usage

Multiple Containers Example

# Basic example with group vars
containers:
  hosts:
    web:
      ansible_nspawn_container: web-prod
    db:
      ansible_nspawn_container: postgres-prod
    cache:
      ansible_nspawn_container: redis-prod
  vars:
    ansible_connection: nspawn
    ansible_host: prod-server.example.com
    ansible_user: deploy
    ansible_pipelining: true

# Production setup with multiple servers
production:
  children:
    web_servers:
      hosts:
        web1:
          ansible_host: server1.example.com
          ansible_nspawn_container: web
        web2:
          ansible_host: server2.example.com
          ansible_nspawn_container: web
    databases:
      hosts:
        db_primary:
          ansible_host: server1.example.com
          ansible_nspawn_container: postgres-primary
        db_replica:
          ansible_host: server2.example.com
          ansible_nspawn_container: postgres-replica
  vars:
    ansible_connection: nspawn
    ansible_ssh_common_args: "-o ProxyJump=bastion.example.com"
    ansible_pipelining: true

File Transfers

File transfers stream through dd, so binary files arrive unmodified:

- name: Deploy binary
  copy:
    src: myapp
    dest: /usr/local/bin/myapp
    mode: '0755'
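
To show how directory creation and the write can share one SSH call, here is a hedged Python sketch that composes the shell snippet run inside the container. `build_put_command` is a hypothetical name and the exact command the plugin generates may differ. The file contents are assumed to arrive on the command's standard input.

```python
import posixpath
import shlex


def build_put_command(dest: str) -> str:
    """Return a shell snippet that creates dest's directory, then writes stdin to dest."""
    write = f"dd of={shlex.quote(dest)}"
    parent = posixpath.dirname(dest)
    if not parent or parent == "/":
        return write  # nothing to create for top-level or relative paths
    return f"mkdir -p {shlex.quote(parent)} && {write}"


if __name__ == "__main__":
    print(build_put_command("/usr/local/bin/myapp"))
```

Because dd copies bytes without interpreting them, no encoding or newline translation is applied to the payload.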

Debugging

# See what's happening
ansible-playbook -vvv playbook.yml

# Check container status
ssh prod-server machinectl list

# Test manually (single quotes make $(...) expand on the remote host, not locally)
ssh prod-server 'nsenter -t "$(machinectl show web --property=Leader --value)" -m -u -i -n -p -- hostname'

Common Issues

Container not found

  • Check machinectl list on host
  • Ensure container is running
  • Verify container name matches exactly

Permission denied

  • Check SSH access to host
  • Verify user can run machinectl and nsenter
  • Run commands as the root user or with appropriate permissions

Slow execution

  • Enable pipelining: ansible_pipelining: true
  • Check network latency to host
  • Reuse connections with ControlMaster

Implementation Details

  • Uses machinectl show --property=Leader to get container PID
  • Executes via nsenter -t PID -m -u -i -n -p
  • SSH ControlMaster with 60m persistence
  • File transfers use dd for binary safety
  • Combines mkdir + transfer in single SSH call when possible
  • PID cached per task execution

Limitations

No Privilege Escalation

This plugin does not support Ansible's become functionality. Commands run as the SSH user inside the container.

# ❌ This won't work:
- hosts: web
  become: yes  # Ignored by nspawn plugin
  tasks:
    - name: Install package
      package:
        name: nginx

# ✅ Do this instead:
# Option 1: SSH as root
web:
  ansible_host: server.example.com
  ansible_user: root  # SSH as root
  ansible_nspawn_container: web

# Option 2: Create separate inventory entries
web_deploy:
  ansible_host: server.example.com
  ansible_user: deploy
  ansible_nspawn_container: web

web_admin:
  ansible_host: server.example.com
  ansible_user: root
  ansible_nspawn_container: web

Workarounds:

  • Use ansible_user: root for tasks requiring privileges
  • Create multiple inventory entries for different permission levels
  • Use sudo inside tasks: command: sudo systemctl restart nginx
  • Configure passwordless sudo in containers if needed

Design Decisions

Why these implementation choices?

1. Direct nsenter Instead of machinectl shell

Decision: Use nsenter -t PID directly rather than machinectl shell
Rationale:

  • No D-Bus dependency or overhead
  • No intermediate shell process
  • Direct namespace access is faster
  • More control over execution environment

Trade-off: Requires a PID lookup, but avoids D-Bus and intermediate-shell overhead on every command

2. PID Caching per Task

Decision: Cache container PID for task duration, refresh between tasks
Rationale:

  • Containers rarely restart mid-task
  • Fresh lookup per task catches container restarts
  • Avoids stale PID issues

Trade-off: One extra lookup per task vs risk of stale PID
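
The per-task caching policy can be sketched as a small Python class. This is illustrative only: `LeaderPidCache` is a hypothetical name, and the `lookup` callable stands in for whatever actually runs machinectl over SSH.

```python
from typing import Callable, Dict


class LeaderPidCache:
    """Remember container leader PIDs until reset() is called (once per task here)."""

    def __init__(self, lookup: Callable[[str], int]) -> None:
        self._lookup = lookup  # stand-in for the machinectl lookup over SSH
        self._pids: Dict[str, int] = {}

    def get(self, container: str) -> int:
        """Return the cached PID, performing the lookup only on a cache miss."""
        if container not in self._pids:
            self._pids[container] = self._lookup(container)
        return self._pids[container]

    def reset(self) -> None:
        """Drop cached PIDs so the next task performs a fresh lookup."""
        self._pids.clear()


if __name__ == "__main__":
    cache = LeaderPidCache(lambda name: 12345)
    print(cache.get("web"), cache.get("web"))  # second call is served from the cache
    cache.reset()  # called between tasks
```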

3. SSH ControlMaster with 60m Persistence

Decision: Reuse SSH connections with long timeout
Rationale:

  • Eliminates SSH handshake overhead
  • 60 minutes covers most playbook runs
  • Standard SSH feature, well-tested

Trade-off: Lingering connections vs performance gain

4. Separate Control Socket Namespace

Decision: Use ansible-nspawn-* instead of ansible-ssh-* for sockets
Rationale:

  • Avoids conflicts with regular SSH plugin
  • Clear separation of concerns
  • Easier debugging

Trade-off: Separate socket management
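
Decisions 3 and 4 together amount to a handful of OpenSSH options. The Python sketch below builds them under stated assumptions: `control_master_args` is a hypothetical helper, the `~/.ansible/cp` directory comes from the connection overview, and the `%C` suffix after the `ansible-nspawn-` prefix is a guess at the socket naming, not the plugin's confirmed pattern.

```python
import os
from typing import List


def control_master_args(control_dir: str = "~/.ansible/cp",
                        persist: str = "60m") -> List[str]:
    """Return ssh options that reuse one master connection per host."""
    # %C is an OpenSSH token that expands to a hash of the connection
    # parameters, keeping the socket path short and unique per target.
    socket = os.path.join(os.path.expanduser(control_dir), "ansible-nspawn-%C")
    return [
        "-o", "ControlMaster=auto",         # create the master on first use
        "-o", f"ControlPersist={persist}",  # keep it alive after the last session
        "-o", f"ControlPath={socket}",      # distinct prefix from ansible-ssh-*
    ]


if __name__ == "__main__":
    print(" ".join(control_master_args()))
```

A lingering master can be closed by hand with `ssh -O exit` pointed at the same ControlPath if the 60 minute persistence is not wanted.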

5. No Built-in Privilege Escalation

Decision: Execute as SSH user, no become support
Rationale:

  • Simplicity and predictability
  • Matches container security model
  • SSH as root when needed

Trade-off: Less flexible than full become support

Authors

Created by Gurpartap Singh

Licensed under GPL-3.0, the same license as Ansible.