/azure-storage-loadtest

Load testing Azure Storage data moves with azcopy

Primary language: Python. License: MIT.

Azure Storage Load Testing

Dan Grecoe - a Microsoft employee

As an Azure customer, you may want to move your data into the Azure platform for any number of reasons: storage capacity, access to a different service, and so on.

Some customers start with data collection in Azure and others will start with data in their own data centers.

This project assumes that the data already exists in Azure, although that will not necessarily be the case for all users. There are, however, options for moving data into Azure:

  1. Data Box, in which the customer requests the service and creates an Azure Storage account to load.
    • The customer has an Azure subscription and a storage account.
    • The customer receives the hardware and loads the data onto it.
    • The customer returns the hardware, and the data is loaded into the storage account.
  2. Manual upload using any of the various APIs or tools such as azcopy.

This repository does not currently cover either of the above scenarios; instead, it focuses on moving large amounts of data around within the Azure service itself. Why, you might ask?

Consider the OSDU platform as it exists today. You use REST APIs to request an upload URL for a new file, upload the file to that URL, and add metadata; your OSDU platform can then work with that file (making it searchable, etc.).

In an OSDU deployment in Azure, the upload URL is actually a signed URL pointing into an Azure Blob Storage account. This makes moving data with the azcopy tool relatively easy, without much additional overhead.
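
To make the signed-URL upload concrete, here is a minimal Python sketch of handing a local file and a signed URL to azcopy. The function names (build_azcopy_upload, upload) and the URL check are illustrative assumptions and do not come from this repository; the only azcopy feature relied on is the basic `azcopy copy <source> <destination>` form.

```python
import subprocess
from typing import List
from urllib.parse import urlsplit


def build_azcopy_upload(local_path: str, signed_url: str) -> List[str]:
    """Return the argument list for `azcopy copy <local_path> <signed_url>`.

    Hypothetical helper for illustration; not part of this repository.
    """
    parts = urlsplit(signed_url)
    # A signed URL carries its authorization in the query string, so a URL
    # without one (or without https) is rejected before azcopy is invoked.
    if parts.scheme != "https" or not parts.query:
        raise ValueError("expected an https URL with a signature query string")
    return ["azcopy", "copy", local_path, signed_url]


def upload(local_path: str, signed_url: str) -> int:
    """Run azcopy and return its exit code (requires azcopy on PATH)."""
    return subprocess.run(build_azcopy_upload(local_path, signed_url)).returncode
```

Keeping command construction separate from execution means the command can be inspected or logged without azcopy installed; only upload() actually needs the tool.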

This repository can be used to run just such a load test, validating the parallelization of data movement using azcopy alongside other Azure services.
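
As a rough illustration of what parallelizing data movement can look like, the sketch below runs several azcopy copy jobs at once from local threads and reports the elapsed time. This is an assumption-laden stand-in, not the repository's method: the prerequisites suggest the real test spreads work across Azure Container Instances, and the names Job, run_azcopy, and run_parallel are hypothetical.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Job = Tuple[str, str]  # (source, destination); either may be a signed URL


def run_azcopy(job: Job) -> int:
    """Run `azcopy copy <source> <destination>`; requires azcopy on PATH."""
    source, destination = job
    return subprocess.run(["azcopy", "copy", source, destination]).returncode


def run_parallel(
    jobs: Sequence[Job],
    workers: int,
    runner: Callable[[Job], int] = run_azcopy,
) -> Tuple[List[int], float]:
    """Run all jobs on `workers` threads; return exit codes and elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as `jobs`.
        codes = list(pool.map(runner, jobs))
    return codes, time.perf_counter() - start
```

Running the same job list with different values of workers and comparing the elapsed times gives a first-order view of how well the copies parallelize. The runner parameter allows a stub to be substituted when azcopy is not available.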

Prerequisites

  • An Azure Subscription in which you can create
    • Azure Storage Accounts
    • Azure Container Instances
  • Docker Desktop to build Docker images.
  • A Docker Hub account for hosting the Docker images used to seed the Azure Container Instances.
  • Large data files, such as those found in this open source project for the energy sector.
  • Anaconda installed on your machine to test the application before generating Docker images.
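
A quick way to confirm the local tooling before starting is to check that the relevant executables are on PATH. The sketch below is an assumption rather than part of this repository: the executable names docker, conda, and azcopy are guesses at how the tools are installed on your machine, and find_tools is a hypothetical helper.

```python
import shutil
from typing import Dict, Iterable, Optional


def find_tools(
    names: Iterable[str] = ("docker", "conda", "azcopy"),
) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

This only checks local executables; the Azure subscription and Docker Hub account still need to be verified separately.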