NetKAN Infrastructure
The NetKAN Infrastructure Project is a re-write of the original monolithic NetKAN-bot, with the goal of easier maintenance, a faster code to deployment pipeline, infrastructure as code.
Theory of operation
We use a modern microservice architecture to continuously index new mods and mod updates as the upstream mod files change. The microservices communicate via the AWS SQS system.
Microservices
Scheduler
Every once in a while, the Scheduler kicks off and submits NetKAN metadata for the Inflator to work with.
Inflator
The core of the system, it has the job to inflate NetKAN metadata to CKAN metadata as described in the Schema and Spec.
Indexer
The Indexer takes inflated metadata and, if the metadata is different to currently existing one, or entirely new, pushes it to the metadata repository. It also updates the status database.
Mirrorer
Uploads freshly indexed mods to the Internet Archive, if allowed by their licenses.
Webhooks
The webhooks provide a web server that responds to messages from other hosts and services, usually by queueing a module to be inflated by the Inflator.
Download Counter
Collects download counts for indexed mods and commits them to the metadata repository.
Status Page
The status page shows the current status of the inflation for each mod: Time of last inflation, time of last indexing, or errors that occurred during inflation. It accesses a DynamoDB database in the background.
SpaceDock Adder
Generates pull requests in the NetKAN repo when SpaceDock users request a mod to be added to CKAN.
Ticket Closer
Closes stale issues on GitHub.
Auto-Freezer
Submits pull requests to the NetKAN repo to freeze mods that haven't been indexed in a long time.
Containers
Container | Image | Repo | Code |
---|---|---|---|
Scheduler | kspckan/netkan | NetKAN-Infra | netkan/netkan/scheduler.py |
Inflator | kspckan/inflator | CKAN | Netkan/Processors/QueueHandler.cs |
Indexer | kspckan/netkan | NetKAN-Infra | netkan/netkan/indexer.py |
Mirrorer | kspckan/netkan | NetKAN-Infra | netkan/netkan/mirrorer.py |
Download Counter | kspckan/netkan | NetKAN-Infra | netkan/netkan/download_counter.py |
Adder | kspckan/netkan | NetKAN-Infra | netkan/netkan/spacedock_adder.py |
TicketCloser | kspckan/netkan | NetKAN-Infra | netkan/netkan/ticket_closer.py |
AutoFreezer | kspckan/netkan | NetKAN-Infra | netkan/netkan/auto_freezer.py |
Status Dumper | kspckan/netkan | NetKAN-Infra | netkan/netkan/cli/utilities.py |
Webhooks | kspckan/netkan | NetKAN-Infra | netkan/netkan/webhooks |
Webhooks Proxy | kspckan/webhooks-proxy | NetKAN-Infra | |
Cert Bot | certbot/dns-route53 | certbot/certbot |
Queues
The individual services communicate via Amazon's SQS (Simple Queue Service) system where needed. The Scheduler sends the netkans as SQS message to an SQS queue, where the Inflator picks them up and inflates them to ckan metadata. The Inflator again sends the resulting metadata via another queue to the Indexer.
Inbound.fifo
Message Attribute | Usage |
---|---|
Releases | Number of releases to inflate, for modules with backports |
HighestVersion | Version number of the latest release already indexed |
Payload: JSON contents of .netkan to inflate
Outbound.fifo
Message Attribute | Usage |
---|---|
ModIdentifier | The identifier of the module |
Staged | "true" to commit to new branch and submit a pull request |
Success | "true" if inflation succeeded, "false" if there was an error |
CheckTime | The datetime when the inflation happened, in ISO 8601 format |
FileName | Name of file to create or update, usually identifier-version.ckan |
ErrorMessage | Explanation of why inflation failed if Success is "false", omitted otherwise |
WarningMessages | Non-fatal potential problems noted during inflation, separated by newlines |
StagingReason | Body for pull request if Staged is "true", omitted otherwise |
Payload: JSON contents of .ckan file to index
Adding.fifo
Message Attribute | Usage |
---|---|
N/A | N/A |
Payload: JSON encoding of request received from SpaceDock to index a new mod
Mirroring.fifo
Message Attribute | Usage |
---|---|
N/A | N/A |
Payload: Path of .ckan file just added to CKAN-meta
Webhooks
The webhooks run on https://netkan.ksp-ckan.space/ and are firewalled to only a few servers that we know need to access them (so they're not going to work if you try them in your browser).
Route | Parameters | Usage |
---|---|---|
/inflate | POST body: {"identifiers": [ "Id1", "Id2", ... ]} |
Inflate the given modules (used by SpaceDock-Notify) |
/sd/add | POST form: See Adder code comments | Index a new mod from SpaceDock |
/sd/inflate | POST form: mod_id=1234&event_type=update |
Inflate modules with the given SpaceDock ID |
/gh/inflate | See the push API | Inflate modules after commits to NetKAN |
/gh/release?identifier=Id | See the release API | Inflate a module when a new release is uploaded to GitHub |
/gh/mirror | See the push API | Archive a mod that was just indexed |
Developing
How to set up a local development environment
References
Performing Updates
Updating the Stack:
aws cloudformation update-stack --stack-name DevQueues --template-body "`python dev-stack.py`" --capabilities CAPABILITY_IAM --profile ckan --region us-west-2