A comprehensive guide and presentation on building a Healthcare Data Repository, focusing on the technical choices involved in constructing robust data repositories.
This repository contains the research, insights, and presentation materials related to the construction of Healthcare Data Repositories. It dives into the intricacies of choosing the right technologies and tools based on varying scenarios and requirements.
- Presentation: A detailed slide deck (`data_management_storage_and_architecture_5th_october_Vaibhav_Kulkarni.pdf`) that provides insights into the various facets of building a Healthcare Data Repository.
- Raw Data: Data collected from various sources, including Stack Overflow, Hacker News, and several data engineering blogs.
- Scripts: Code snippets and scripts used to scrape the data.
- Importance of Healthcare Data Repositories
- Major components of the repository: Data Movement Pipelines, Data Quality & Transformation, Data Storage
- Decision-making areas in selecting technologies
- Analysis and comparison of popular tools
- Recommendations based on specific scenarios
- Ensure you have a PDF reader to view the presentation.
- For scripts, ensure you have the necessary runtime and libraries installed.
- Clone this repository to your local machine.
- Navigate to the directory containing the presentation and open `data_management_storage_and_architecture_5th_october_Vaibhav_Kulkarni.pdf`.
- Review the raw data and scripts as required.
If you have suggestions or would like to contribute to this project, please open an issue or submit a pull request.
This project is licensed under the MIT License.