
A project in the course Large-scale Computing and Data Analysis @ Aalto University

Primary language: TeX. License: MIT.

Large Scale Computing Data Analysis

A project-based course @ Aalto University

Topic: Human-in-the-loop for connecting ML pipelines and workflow-based data analysis in large-scale computing

0 Preliminary Plan

  1. A literature review on the current status of human-in-the-loop in ML pipelines and workflow-based data analysis
    • May draw on other fields if little information is available.
  2. Based on the literature review, identify one area to investigate further
  3. Identify potential improvement(s) in that area.
  4. Suggest and develop the concepts for the improvements (a software proof of concept may not fit within the scope of this course).
  5. Write and present results.

1 Literature Review

The reviewed literature is listed in the folder literature.

2 Steps

  1. Basic principles of Human-in-the-loop (HITL) systems.
  2. Choose an application domain.
  3. Apply the principles in analyzing one (or several) frameworks/pipelines in the application domain.
  4. Identify the gap.
  5. Suggest and implement some improvements.
  6. Write and report the results.

3 Status

  • 15.5.2021:
    • Complete the literature review. Choose the Decision Support System framework for the rest of the project.
    • Identify the application domain: system support for astroinformatics research at Aalto.
  • 18.5.2021:
    • Complete the first interview with Maarit's team.
    • Understand the basic pipeline of the solar active region prediction project.
    • Identify the potential human-in-the-loop applications.
  • 19.5.2021:
    • Talk to Linh about the overall system, and about message dispatching and receiving for asynchronously changing the system configuration based on user needs.
  • 21.5.2021:
    • Meeting. Maarit will send detailed information about the system.
  • 23.5.2021:
    • Maarit sent some information about the system. Not yet checked.
  • 26.5.2021:
    • Check Maarit's email and produce an overview graph of the system and the proposed message dispatcher for asynchronously modifying the system configuration.
  • 27.5.2021:
    • Inform Maarit that the source code in the GitLab repo is not accessible. Look at the magnetogram files (in FITS format) in Allas. Briefly read an introduction to the FITS format.
  • 4.6.2021 - 11.6.2021:
    • Review the project code and prepare a presentation of the prototype.
  • 15.6.2021:
    • Meet with Maarit's team to clarify some details of the f-modes code.
  • 16.6.2021 - 8.7.2021:
    • Modify the code and integrate it into the workflow of Maarit's team.
  • 9.7.2021:
    • Final presentation
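
The message dispatcher mentioned in the status log (19.5.2021 and 26.5.2021) can be illustrated with a minimal sketch. This is not the project's actual code: the names ConfigMessage, PipelineConfig, dispatcher, pipeline and the "threshold" key are all hypothetical, and an in-process asyncio queue stands in for whatever messaging layer the real system uses. It only shows the idea of a user changing a configuration value while the pipeline keeps running.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ConfigMessage:
    """A user request to change one configuration key (hypothetical format)."""
    key: str
    value: object


@dataclass
class PipelineConfig:
    """Shared configuration that the pipeline reads at the start of each step."""
    values: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def apply(self, msg: ConfigMessage) -> None:
        # Record (key, old value, new value) so a human can audit the change.
        self.history.append((msg.key, self.values.get(msg.key), msg.value))
        self.values[msg.key] = msg.value


async def dispatcher(queue: asyncio.Queue, config: PipelineConfig) -> None:
    """Receive messages and apply them until a None sentinel arrives."""
    while True:
        msg = await queue.get()
        if msg is None:
            break
        config.apply(msg)


async def pipeline(config: PipelineConfig, steps: int) -> list:
    """Toy pipeline: each step reads the current threshold, then yields control."""
    seen = []
    for _ in range(steps):
        seen.append(config.values["threshold"])
        await asyncio.sleep(0)  # yield so the dispatcher can run between steps
    return seen


async def run_demo() -> tuple:
    config = PipelineConfig(values={"threshold": 0.5})
    queue = asyncio.Queue()
    # Assumption: the user's change is already queued when the pipeline starts.
    await queue.put(ConfigMessage("threshold", 0.8))
    await queue.put(None)
    task = asyncio.create_task(dispatcher(queue, config))
    seen = await pipeline(config, steps=3)
    await task
    return seen, config


if __name__ == "__main__":
    seen, config = asyncio.run(run_demo())
    print(seen, config.history)
```

The pipeline never blocks on the user: it reads whatever configuration is current at each step, and the dispatcher applies changes in between. Keeping a history of changes is one possible way to make human interventions traceable.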