Azure/azurehpc

question: deploy cycleserver into defined vnet

tbugfinder opened this issue · 12 comments

Hello,

I'm wondering how to skip creating the vnet and subnet when pre-defined subnets are already available for the CycleServer deployment.
https://github.com/Azure/azurehpc/blob/master/examples/cycleserver/readme.md

Thanks for your guidance.

If the network section of your config specifies a resource_group that is different from the one you deploy your resources into, then that vnet will be reused. I'm in the process of rewriting the doc that explains it; it should be pushed soon.

Thank you, in my case Cycleserver has to be deployed to a pre-created resource group and make use of a vnet and subnet within a different resource group.

So you should be all good then. Can you please confirm that your deployment worked?

Is this how it is meant to look in 02-cycleserver.json?

{
    "location": "variables.location",
    "resource_group": "variables.resource_group",
    "install_from": "cycleserver",
    "admin_user": "variables.admin_user",
    "vnet": {
        "resource_group" : "myazrhpcnetrg",
        "name": "hpcvnet",
        "address_prefix": "10.2.0.0/20",
        "subnets": {
            "admin": "10.2.1.0/24",
            "storage": "10.2.3.0/24",
            "compute": "10.2.4.0/22"
        }
    },

Yes. If the value of variables.resource_group is not myazrhpcnetrg, that VNET will not be created; in that case you can remove address_prefix and subnets from the vnet definition.
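
The rule described above can be sketched in Python. This is only an illustration of the comparison, not azhpc's actual code: the function name `vnet_is_reused`, the resource group name `mycyclerg`, and the plain-dict config are hypothetical, and references such as `variables.resource_group` are assumed to be resolved to their values already.

```python
def vnet_is_reused(config: dict) -> bool:
    """Return True when the vnet sits in a different resource group than
    the deployment, i.e. when it is expected to be reused, not created."""
    deploy_rg = config["resource_group"]
    # Assumption: a vnet without its own resource_group belongs to the
    # deployment's resource group and would therefore be created.
    vnet_rg = config.get("vnet", {}).get("resource_group", deploy_rg)
    return vnet_rg != deploy_rg


# Trimmed-down config for an existing vnet: address_prefix and subnets
# are omitted, as suggested above. "mycyclerg" is a made-up name.
config = {
    "resource_group": "mycyclerg",
    "vnet": {"resource_group": "myazrhpcnetrg", "name": "hpcvnet"},
}
print(vnet_is_reused(config))  # True
```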

I managed to get the cycleserver deployed into the given vnet/rg.

I've disabled the public IP and ran into an issue with the initial rsync:

[2020-04-22 06:53:59] building install scripts
[2020-04-22 06:54:00] error: invalid returncode
    args=['rsync', '-a', '-e', 'ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i cycleadmin_id_rsa', 'azhpc_install_02-cycleserver', 'cycleadmin@cycleserver:.']
    return code=255
    stdout=
    stderr=ssh: Could not resolve hostname cycleserver: Temporary failure in name resolution
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.2]


IMHO, the ip address should be used for initial connection instead of the name.

The example for deploying Cycle Cloud was built assuming the Cycle Cloud VM has a public IP. If you want a private-only solution, make sure the cycleserver name can be resolved from the machine running the azhpc commands, because when there is no public IP the install_from value is used as is.
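
A quick way to check this before running the build is to ask the local resolver directly. The following is a minimal sketch using only the Python standard library; the helper name `can_resolve` is hypothetical and not part of azhpc.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the local resolver can turn hostname into an address."""
    try:
        # Port 22 is SSH, which the rsync step in the log above runs over.
        socket.getaddrinfo(hostname, 22)
        return True
    except socket.gaierror:
        # Raised for failures such as "Temporary failure in name resolution".
        return False
```

Calling `can_resolve("cycleserver")` on the machine that runs the azhpc commands should return True before the build is started. If it returns False, a DNS record or an /etc/hosts entry for the VM's private IP is one way to make the name resolvable.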

The cyclecloud installation script keeps creating an invalid configuration in the file "/opt/cycle_server/config/cycle_server.properties":

webServerMaxHeapSize=4096MwebServerJvmOptions=
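
The line above looks like two properties fused together by a missing newline. As a minimal sketch, assuming the intended content is one property per line, the text could be repaired like this. The helper name `split_fused_property` is hypothetical, and reading and writing the properties file itself is left out.

```python
import re


def split_fused_property(text: str, key: str = "webServerJvmOptions") -> str:
    """Insert a newline before `key=` wherever it is glued to a previous value."""
    # (?<=[^\n]) matches only when a non-newline character precedes the key,
    # so a key that already starts a line is left alone.
    return re.sub(rf"(?<=[^\n]){re.escape(key)}=", f"\n{key}=", text)


broken = "webServerMaxHeapSize=4096MwebServerJvmOptions=\n"
print(split_fused_property(broken), end="")
# webServerMaxHeapSize=4096M
# webServerJvmOptions=
```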

Can the Cycle portal be opened? If not, try deleting that VM and rerunning azhpc-build.

Well, portal opened after fixing the config. I've rebuilt the VM, CycleCloud is up & running. The NSG couldn't be created, I have to check permissions.

Great to hear.

Thank you very much.