High-scale parallel processing architecture in Azure, based on Redis RQ and written in Python.
- Runs in Microsoft Azure
- Written in Python
- Queues provided by Redis and the Python Redis RQ library
- Simple Linux VMs used for scheduling and processing
- No reliance on Azure VM extensions
- Minimal cloud provider lock-in
- Encryption of all data and scripts
brew install pyenv
echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n eval "$(pyenv init -)"\nfi' >> ~/.bash_profile
exec "$SHELL"
xcode-select --install
CFLAGS="-I$(brew --prefix openssl)/include" \
LDFLAGS="-L$(brew --prefix openssl)/lib" \
pyenv install -v 2.7.5
pyenv rehash
cd <yourlocalpath>/azure-python-redis-queue-processor
pyenv local 2.7.5
python --version
All PyPI package dependencies are included in requirements.txt.
- Git clone repo
- Install dependencies
- Open bash shell
- Pip install package dependencies
- Clone config/config.example.json and rename it to config.json
- Supply config.json with values for your Azure subscription and app parameters
- Run sudo pip install -r requirements.txt
- Run bash build.sh
At a high level, this repo provides a reference solution for queue-based parallel job processing: jobs to be processed are queued, and N workers pull jobs off the queue in parallel and process them.
- Encrypted scripts and data to be processed uploaded to Azure Blob Storage
- Scheduler VM role deployed and executes scheduler_bootstrap.sh
- Scheduler python script decrypted by schedulerconfiguration.py
- Data to be processed decrypted by schedulerconfiguration.py
- Data is parsed into jobs by scheduler.py
- Jobs to be executed are queued to Redis RQ by scheduler.py
- Job status records are created by scheduler.py
- Processor VM role deployed and executes processor_bootstrap.sh
- AES key for job result encryption downloaded by processorconfiguration.py
- Processing job executed by processor.py
- Job status record updated with processing state
- Job result encrypted and written to Azure Blob Storage
- Completed or failed job status record written out to Azure Queue for additional processing
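The scheduler/processor flow above can be sketched with the rq library. The queue name, job shape, and `process_job` body below are illustrative assumptions, not the repo's actual code:

```python
def process_job(job_data):
    """Stand-in for the work processor.py performs on a single job
    (decrypt, transform, and return a result record)."""
    return {"id": job_data["id"], "status": "completed"}

def enqueue_jobs(job_datas, redis_host="localhost"):
    """Scheduler side (scheduler.py): push parsed jobs onto a Redis RQ queue."""
    # Imported here so process_job stays importable without a Redis client.
    from redis import Redis
    from rq import Queue
    queue = Queue("jobs", connection=Redis(host=redis_host))
    return [queue.enqueue(process_job, data) for data in job_datas]

# Processor side: each worker VM runs `rq worker jobs`, which pulls jobs
# off the queue and calls process_job; N workers process in parallel.
```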
This solution does not require any Microsoft Azure VM extensions to be installed. Many high-security institutions do not want to run third-party extensions unless absolutely necessary. Basic VM metrics, such as CPU and disk, can be captured using the Azure Metrics REST APIs, removing the need to install an extension on each VM.
metricslogger.py captures basic metrics for all VMs in an Azure Resource Group using the Azure Metrics REST API. This Python script can be executed as a scheduled job via a cron job or similar mechanism.
Pre-req: Service Principal credentials with read access to all VMs
- Retrieves a list of all VMs in an Azure Resource Group
- Retrieves basic metrics (CPU, disk, network) for each VM in the Resource Group
- Stores metrics in Azure Storage for analysis
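As a rough sketch of what such a call looks like, the Azure Monitor metrics REST URL for a single VM can be built as below. The subscription, resource group, VM name, metric names, and API version are placeholders, and authentication (a bearer token from the service principal) is elided:

```python
import urllib.parse

def metrics_url(subscription_id, resource_group, vm_name,
                metrics="Percentage CPU,Disk Read Bytes,Network In",
                api_version="2018-01-01"):
    """Build the Azure Monitor metrics REST URL for one VM."""
    resource_id = (
        "/subscriptions/{}/resourceGroups/{}"
        "/providers/Microsoft.Compute/virtualMachines/{}"
    ).format(subscription_id, resource_group, vm_name)
    query = urllib.parse.urlencode({
        "metricnames": metrics,
        "api-version": api_version,
    })
    return "https://management.azure.com{}/providers/microsoft.insights/metrics?{}".format(
        resource_id, query)

# The script would GET this URL with an "Authorization: Bearer <token>"
# header obtained via the service principal, then write the JSON
# response out to Azure Storage for analysis.
```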
TODO: Add instructions for the following:
- Create RSA Key
- Generate AES Key
- Encrypt data
- Upload to Blob
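Until those instructions land, the general shape of the flow with openssl looks roughly like this. All file, account, and container names are placeholders, and in this solution the RSA key would actually live in Key Vault:

```shell
# Sample plaintext standing in for the real data set
echo '{"records": [1, 2, 3]}' > data.json

# 1. Create an RSA key pair
openssl genrsa -out private.pem 2048
openssl rsa -in private.pem -pubout -out public.pem

# 2. Generate a random 256-bit AES key (hex-encoded)
openssl rand -hex 32 > aes.key

# 3. Encrypt the data with the AES key, then wrap the AES key with RSA
openssl enc -aes-256-cbc -pbkdf2 -in data.json -out data.json.enc -pass file:aes.key
openssl pkeyutl -encrypt -pubin -inkey public.pem -in aes.key -out aes.key.enc

# 4. Upload the encrypted artifacts to Blob Storage (placeholder names)
# az storage blob upload --account-name <account> --container-name data \
#     --name data.json.enc --file data.json.enc
```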
We will be using ARM to deploy the scheduler and processing topology to Azure.
- Pre-reqs:
  - Storage account to upload scripts, logs, and results
  - Azure Key Vault to generate the RSA key and store SSH secrets
  - Local AES key to encrypt the data, uploaded to blob storage
- Make a copy of config/config.example.json and update the values appropriately
cp config/config.example.json config/config.json
- Run the build script to package, encrypt, and upload the code to the Storage Account
sh build.sh
- Update arm/azuredeploy.parameters.json with the storage account information, custom data, and SSH keys
  - For the custom data, use the value inside config/config.json.encoded
- Create a resource group to deploy the topology
az group create -n test1 -l westus2
- Deploy the topology using the ARM template
az group deployment create --template-file arm/azuredeploy.json --parameters arm/azuredeploy.parameters.json -g test1
- Get the jumpbox's public address from the Azure Portal or by executing the command:
az network public-ip list -g MyResourceGroup | grep fqdn
- SSH into the jumpbox via password or SSH key
ssh adminRQ@mydomain.westus2.cloudapp.azure.com
- Find the IP address of a scheduler or worker node. You can either look at the VNET in the Azure portal or use the CLI to get it for a specific instance, e.g. instance "3" of the worker VMSS:
az vmss nic list-vm-nics -g MyResourceGroup --vmss-name workerVmss --instance-id 3 | grep privateIpAddress
- SSH into one of those nodes
ssh adminBR@10.0.2.7
- You need admin access to navigate to the custom scripts folder
sudo -s
- Navigate to where the custom script jobs are executed, e.g. execution "1":
cd /var/lib/waagent/custom-script/download/1/
- Look at the "stderr" and "stdout" log files to help debug what happened during deployment.
- SSH into the desired box
- Find the process identifier (PID)
ps aux | grep python
- View the logs for the process by doing the following:
cat /proc/<PID>/fd/2
- Copy the SSH private key to the jumpbox
scp -i ~/.ssh/mykey_id_rsa ~/.ssh/mykey_id_rsa admin@public-ip.westus2.cloudapp.azure.com:~/.ssh/id_rsa
- Remote into the jumpbox
ssh admin@public-ip.westus2.cloudapp.azure.com -i ~/.ssh/mykey_id_rsa
- Now you can remote to other nodes without needing to provide a password.
- Make sure your container is up; then, in another shell, you can exec into it and attach to the redis-cli.
docker exec -it redis redis-cli
- Once attached, you can perform actions such as listing all keys:
127.0.0.1:6379> KEYS *
- Get the applicationId of the service principal.
- Execute the following command, passing the role you want to assign to the service principal
az role assignment create --assignee <SP Application Id> --role <Reader, Contributor, ...>