Deploy a cluster of heterogeneous Infernet nodes on Amazon Web Services (AWS) and/or Google Cloud Platform (GCP), using Terraform for infrastructure procurement and Docker Compose for deployment.
- Install Terraform.
- Configure nodes: a node configuration file for each node being deployed.
  - See example configuration.
  - Configuration files must have unique names. A straightforward approach is `0.json`, `1.json`, etc.
  - They must be placed under the top-level `configs/` directory.
  - The number and names of `.json` files must match the number and names of keys in the `nodes` variable in `terraform.tfvars`. See terraform.tfvars.example.
    - Each key should correspond to the name of a `.json` file, excluding the `.json` postfix.
  - Each node strictly requires its own configuration `.json` file, even if those files are identical.
  - For instructions on configuring individual nodes, refer to the Infernet Node.
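Because the filenames under `configs/` must line up one-to-one with the keys of the `nodes` variable, a quick pre-flight check can catch mismatches before running Terraform. The snippet below is an illustrative sketch only — the temporary directory layout and the naive line-based `tfvars` parse are assumptions for demonstration, not part of this repo:

```shell
set -eu

# Build an illustrative layout in a temp dir (stand-in for the real
# configs/ directory and terraform.tfvars).
tmp=$(mktemp -d)
mkdir -p "$tmp/configs"
touch "$tmp/configs/0.json" "$tmp/configs/1.json"
cat > "$tmp/terraform.tfvars" <<'EOF'
nodes = {
  "0" = {}
  "1" = {}
}
EOF

# Naively extract the node keys: one quoted key per line inside the block.
keys=$(grep -oE '^ *"[^"]+"' "$tmp/terraform.tfvars" | tr -d ' "')

# Every key must have a matching configs/<key>.json file.
for k in $keys; do
  [ -f "$tmp/configs/$k.json" ] || { echo "missing configs/$k.json"; exit 1; }
done
echo "all node keys have matching config files"
```

A real HCL-aware check would parse `terraform.tfvars` properly (e.g. with `hcl2json`), but even this rough version surfaces the most common mistake: a renamed config file without a matching key.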
The Infernet Router REST server is configured automatically by Terraform. However, if you plan to use it, you need to understand its implications:

IMPORTANT: When configuring a heterogeneous node cluster (i.e. `0.json`, `1.json`, etc. are not identical), container IDs should be reserved for a unique container setup at the cluster level, i.e. across nodes (and thus across `.json` files). That is because the router uses container IDs to make routing decisions between services running across the cluster.

Example: Consider nodes A and B, each running a single LLM inference container; node A runs `image1`, and node B runs `image2`. If we set `id: "llm-inference"` in both containers (the `containers[0].id` attribute in `0.json` and `1.json`), the router will be unable to disambiguate between the two services and will consider them interchangeable, which they are not. Any request for `"llm-inference"` will be routed to either container, which is an error.

Therefore, re-using an ID across configuration files must imply an identical container configuration, including image, environment variables, command, etc. This explicitly tells the router which containers are interchangeable, and allows it to distribute requests for those containers across all nodes running that container.
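To make the example above concrete, the two nodes could give each service its own ID. The fragments below are a hypothetical sketch showing only the fields discussed here (`id` and `image`); a real node configuration contains more attributes.

`0.json` (node A):

```json
{
  "containers": [
    { "id": "llm-inference-a", "image": "image1" }
  ]
}
```

`1.json` (node B):

```json
{
  "containers": [
    { "id": "llm-inference-b", "image": "image2" }
  ]
}
```

With distinct IDs, the router treats the two services as different and routes requests for each ID only to the node(s) running that exact container.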
1. Create an AWS service account for deployment:

   ```bash
   cd procure/aws
   chmod 700 create_service_account.sh
   ./create_service_account.sh
   ```

   This will require local authentication with the AWS CLI. Add `access_key_id` and `secret_access_key` to your Terraform variables (see step 3).

2. Make a copy of the example configuration file terraform.tfvars.example:

   ```bash
   cd procure/aws
   cp terraform.tfvars.example terraform.tfvars
   ```

3. Configure your `terraform.tfvars` file. See variables.tf for config descriptions.

4. Run Terraform:

   ```bash
   # Initialize
   cd procure
   make init provider=aws

   # Print deployment plan
   make plan provider=aws

   # Deploy
   make apply provider=aws

   # WARNING: Destructive
   # Destroy deployment
   make destroy provider=aws
   ```
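As a rough illustration of step 3, a filled-in `terraform.tfvars` for AWS might look like the sketch below. Only `nodes`, `access_key_id`, and `secret_access_key` are named in this guide; everything else (including the shape of each `nodes` entry) is defined in variables.tf, so treat the values here as placeholders:

```hcl
# Hypothetical sketch -- see terraform.tfvars.example and variables.tf
# for the authoritative variable names and structure.
access_key_id     = "<access-key-id-from-create_service_account.sh>"
secret_access_key = "<secret-access-key-from-create_service_account.sh>"

# One key per node: "0" maps to configs/0.json, "1" to configs/1.json.
# The attributes of each entry are described in variables.tf.
nodes = {
  "0" = {}
  "1" = {}
}
```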
1. Create a GCP service account for deployment:

   ```bash
   cd procure/gcp
   chmod 700 create_service_account.sh
   ./create_service_account.sh
   ```

   This will require local authentication with the GCP CLI, and will create a local credentials file. Add the path to the credentials file (`gcp_credentials_file_path`) to your Terraform variables (see step 3).

2. Make a copy of the example configuration file terraform.tfvars.example:

   ```bash
   cd procure/gcp
   cp terraform.tfvars.example terraform.tfvars
   ```

3. Configure your `terraform.tfvars` file. See variables.tf for config descriptions.

4. Run Terraform:

   ```bash
   # Initialize
   cd procure
   make init provider=gcp

   # Print deployment plan
   make plan provider=gcp

   # Deploy
   make apply provider=gcp

   # WARNING: Destructive
   # Destroy deployment
   make destroy provider=gcp
   ```
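The GCP variant of `terraform.tfvars` differs mainly in its credentials: instead of key strings, it points at the credentials file created by `create_service_account.sh`. The sketch below is hypothetical; only `gcp_credentials_file_path` and `nodes` are named in this guide, so check variables.tf for the full set:

```hcl
# Hypothetical sketch -- see terraform.tfvars.example and variables.tf.
gcp_credentials_file_path = "<path-to-credentials-file-created-above>"

# One key per node config file, exactly as in the AWS setup.
nodes = {
  "0" = {}
  "1" = {}
}
```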
```bash
# Install tflint
brew install tflint

# Install plugins
tflint --init

# Run on all directories
tflint --recursive
```

```bash
# Format AWS files
cd procure/aws
terraform fmt

# Format GCP files
cd procure/gcp
terraform fmt
```