/CENT

Prediction Analysis of COVID-19 Genome Beyond Omicron Variant on Cloud

Primary language: Dart. License: Apache License 2.0 (Apache-2.0).

About

Team 2133 is developing an application that provides information and predictions about COVID-19 and its variants in a given geographic region. We aim to help the general public make informed decisions about travel and to predict likely COVID-19 genomic mutations in a specific area. Epidemiologists and COVID-19 researchers could also use it to assess recent variants through multiple genome alignment and to analyze developments with a predictive model that highlights the components of the genome affected by mutations.

Demo

The Team

Name | Email | Roles
Bennett Burns | bburns39@gatech.edu | Summarizer, Compromiser, Encourager
Andrew Friedman | afriedman38@gatech.edu | Information Giver, Harmonizer
Noel Igbokwe | nigbokwe3@gatech.edu | Information Seeker, Summarizer
John Rehme | jrehme3@gatech.edu | Information Seeker, Feeling Expresser
Nihar Shah | nshah400@gatech.edu | Energizer, Clarifier
Max Tang | mtang80@gatech.edu | Information Giver, Harmonizer
Daniel Varzari | dvarzari3@gatech.edu | Energizer, Clarifier, Compromiser, Organizer, Information Seeker

Team Resources & Installation Guides

Release Notes

Sprint 5 Demo Video

Features

  • Multi-node compute resource for notebooks
  • Confidence Model is now reachable as an endpoint (not yet connected to the frontend)
  • Confidence Model now displays its results through historical metrics and a confusion matrix
  • MLflow set up in Azure to record and deploy model experiments as versioned histories
  • API calls to the backend to receive accurate and full accession data
  • iOS and macOS support
  • New versions of Variant Card, Country Card, and (now) Continent Card - support for multi-card activity and map zoom
  • New Settings page to adjust different parts of the app - account, accessibility, and more
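
For readers unfamiliar with the confusion matrix mentioned for the Confidence Model, the following is a minimal sketch of how one can be tallied from true and predicted labels. It does not reflect the project's actual model or notebooks; the label names and sample data are hypothetical and exist only for illustration.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a nested dict where matrix[actual][predicted] is a count."""
    counts = Counter(zip(y_true, y_pred))
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}


def accuracy(matrix):
    """Fraction of samples that fall on the diagonal of the matrix."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[label][label] for label in matrix)
    return correct / total if total else 0.0


# Hypothetical labels and predictions (assumption, not project data).
y_true = ["high", "high", "low", "low", "low"]
y_pred = ["high", "low", "low", "low", "high"]

matrix = confusion_matrix(y_true, y_pred, labels=["high", "low"])
for actual, row in matrix.items():
    print(actual, row)
print("accuracy:", accuracy(matrix))
```

Rows are the actual labels and columns are the predicted labels, so off-diagonal cells show which classes the model confuses with each other.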

Bug Fixes

  • Worked around memory issues within the notebooks by using smaller training sets
  • Fixed Google Maps zoom behavior
  • Opening multiple cards is now intended behavior rather than a bug

Known Issues

  • Unable to pad the CSR matrix into a form the MLflow API accepts - could be solved by moving the frontend to Azure as well
  • Unable to run workloads that need large blocks of memory, such as the unified tables in the clustering notebook
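
As a rough illustration of the first issue, the sketch below expands a CSR (compressed sparse row) matrix into dense rows zero-padded to a fixed width, which can then be serialized as JSON. This is only one possible reading of the problem: the notes do not say what request shape the endpoint expects, so the function name `csr_to_padded_rows` and the target width are assumptions. It uses plain Python lists in place of a sparse-matrix library.

```python
import json


def csr_to_padded_rows(data, indices, indptr, target_cols):
    """Expand CSR arrays into dense rows, zero-padded to target_cols columns."""
    rows = []
    for r in range(len(indptr) - 1):
        row = [0.0] * target_cols
        # Entries for row r live in data[indptr[r]:indptr[r + 1]].
        for k in range(indptr[r], indptr[r + 1]):
            col = indices[k]
            if col >= target_cols:
                raise ValueError(f"column {col} does not fit in {target_cols} columns")
            row[col] += data[k]  # duplicate entries are summed
        rows.append(row)
    return rows


# The 2 x 3 matrix [[1, 0, 2], [0, 0, 3]] in CSR form (illustrative data).
data = [1.0, 2.0, 3.0]
indices = [0, 2, 2]
indptr = [0, 2, 3]

padded = csr_to_padded_rows(data, indices, indptr, target_cols=5)
payload = json.dumps(padded)  # the wrapping an endpoint needs is not shown here
print(payload)
```

Padding with extra columns does not change the stored non-zero values. Densifying does cost memory, so this approach only suits small request batches.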

Sprint 4 Demo Video

Features

  • Added new variant card - selecting a variant from a region pulls up additional info about the genome sequence, a link to the variant on NCBI, and the ability to add the variant to a user's Saved list
  • New relational database created in Azure - will allow final connection between backend and frontend to populate website with data
  • New PostgreSQL commands allow us to filter genomes by region, date, etc. - 650,000 genomes have been separated out so far
  • [Minor] Moved FAQ menu to new button in bottom right of home screen - cleans up home screen tabs selection
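
The region and date filtering described above might look something like the following parameterized query. This is a sketch under stated assumptions: the table name `genomes`, its columns, and the accession values are hypothetical, and an in-memory SQLite database stands in for the Azure-hosted PostgreSQL database so the example runs on its own.

```python
import sqlite3


def filter_genomes(conn, region, start_date, end_date):
    """Return accessions collected in a region within an inclusive date range."""
    query = (
        "SELECT accession FROM genomes "
        "WHERE region = ? AND collection_date BETWEEN ? AND ? "
        "ORDER BY collection_date"
    )
    # Placeholders keep user-supplied values out of the SQL text.
    return [row[0] for row in conn.execute(query, (region, start_date, end_date))]


# Hypothetical schema and rows (assumption, not the project's real data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE genomes ("
    "accession TEXT PRIMARY KEY, region TEXT, collection_date TEXT)"
)
conn.executemany(
    "INSERT INTO genomes VALUES (?, ?, ?)",
    [
        ("ACC0001", "Europe", "2022-01-15"),
        ("ACC0002", "Europe", "2022-03-02"),
        ("ACC0003", "Asia", "2022-01-20"),
    ],
)

result = filter_genomes(conn, "Europe", "2022-01-01", "2022-01-31")
print(result)
```

Dates are stored as ISO 8601 strings here, so string comparison matches chronological order. A PostgreSQL table would more likely use a native date column.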

Bug Fixes

  • Data cleaning has allowed us to finally store and work with genomic sequence data
  • Made "card" feature universal - can now instantiate and remove cards at will

Known Issues

  • New variant card throws a null error for users who are not logged in
  • API request needs to be created for relational database

Sprint 3

Features

  • Saved variant table added to user accounts - will allow saving of variants in future version of application
  • Added animated Google Maps functionality when selecting a region

Bug Fixes

  • Fixed Google Maps Controller to support changing of map

Known Issues

  • Still able to open multiple region cards - due to unique implementation to support card feature
  • Still running data cleaning - expected to finish this week or next due to the sheer volume of data
  • Architecting the model has proven more difficult than initially presumed, at least to the degree of accuracy desired

Sprint 2 Demo Video

Features

  • Multiselect for variants - can select group of variants to copy from a list displayed to the user
  • New copy & compare button – copies list of selected variants and can open BLAST to compare
  • Azure Databricks setup – gives us a place to run our notebooks
  • Azure Functions – allows us to port data as JSON from blob storage to a notebook

Bug Fixes

  • Country name now correctly shows in Variant View for the selected country

Known Issues

  • Able to open the same region card multiple times
  • Title for region cards shrinks when too long
  • FASTA format is not sufficient to give us all the data we need – consider .gbff
  • Azure is not connected as endpoint

Sprint 1 Demo Video

Features

  • Region card view – extends our region search functionality and is where variant views first appear
  • Variant view – redirects to database link
  • Variant table view – can view more variants for a particular region

Bug Fixes

  • Fixed Google Maps agreement issue (was out-of-date)

Known Issues

  • Able to open the same region card multiple times
  • Title for region cards shrinks when too long