Analyzing Multimodal Health Data on AWS

Introduction

This repository contains code samples for the two-part blog series "Building Scalable Machine Learning Pipelines for Multimodal Health Data on AWS" and "Training Machine Learning Models on Multimodal Health Data with Amazon SageMaker".

Machine Learning Pipelines for Multimodal Health Data

You can use these artifacts to recreate the pipelines and analysis presented in the blog posts, as shown below.

Architecture on AWS

Project Structure

Artifacts for processing each data modality, and for model training and testing, are located in the corresponding subdirectories of this repository.

./
./genomics/          <-- Artifacts for genomics pipeline
./clinical/          <-- Artifacts for clinical pipeline *
./imaging/           <-- Artifacts for medical imaging pipeline
./model-train-test/  <-- Artifacts for performing model training and testing

* The clinical data can also be preprocessed with Amazon SageMaker Data Wrangler, as discussed in the blog series.
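
If you want to confirm that a local clone matches the layout above before running any pipeline, a minimal sketch along these lines may help. The directory names come from the listing above; the helper name `missing_dirs` and the dictionary `EXPECTED_DIRS` are hypothetical and are not part of this repository.

```python
from pathlib import Path

# Directory names are taken from the project layout above.
# EXPECTED_DIRS and missing_dirs are illustrative names, not part of the repo.
EXPECTED_DIRS = {
    "genomics": "genomics pipeline",
    "clinical": "clinical pipeline",
    "imaging": "medical imaging pipeline",
    "model-train-test": "model training and testing",
}


def missing_dirs(repo_root="."):
    """Return the expected subdirectories that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing subdirectories:", ", ".join(missing))
    else:
        print("All expected subdirectories are present.")
```

Run it from the repository root, or pass the path to your clone as `repo_root`.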

Security

See CONTRIBUTING for more information on reporting security issues.

License

This library is licensed under the MIT-0 License. See the LICENSE file.