elimuinformatics/vcf2fhir

Conversion method should return the converted data, and rely on another method to write data onto a file

rhdolin opened this issue · 3 comments

Currently, vcf2fhir converts the data to HL7 FHIR format and exports it to a JSON file. The converted JSON for all records is held in memory until it is exported at the end.

Evaluation Required:
The in-memory storage needed for the FHIR JSON when converting a very large VCF file.

VCF files can be gigabytes in size, so it would be better to write the converted FHIR JSON for each record to the file before moving to the next record, rather than holding everything in memory. The major complexity in doing this is handling phase relationship JSON blocks, which span multiple records.
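As a rough illustration of the split described in the title, here is a minimal sketch in which one function returns converted data lazily and a second function writes it out record by record. The names `convert_records` and `write_json_array`, the input dict keys, and the shape of the output resource are all hypothetical and are not part of the vcf2fhir API. Phase relationship blocks are not handled here; they would still need some buffering across records.

```python
import io
import json
from typing import Dict, Iterable, Iterator, TextIO


def convert_records(records: Iterable[Dict]) -> Iterator[Dict]:
    """Hypothetical converter: yield one FHIR-like dict per VCF record.

    The real conversion logic is far richer; this only shows the lazy shape.
    """
    for rec in records:
        yield {
            "resourceType": "Observation",
            "id": f"{rec['chrom']}-{rec['pos']}",
        }


def write_json_array(items: Iterable[Dict], out: TextIO) -> int:
    """Write items as a JSON array, serializing one element at a time.

    Only the current element is held in memory as a string.
    Returns the number of elements written.
    """
    count = 0
    out.write("[")
    for item in items:
        if count:
            out.write(",")
        out.write(json.dumps(item))
        count += 1
    out.write("]")
    return count


# Usage: any text stream works; a real caller would pass an open file.
records = [{"chrom": "chr1", "pos": 100}, {"chrom": "chr2", "pos": 200}]
buffer = io.StringIO()
written = write_json_array(convert_records(records), buffer)
```

Because `convert_records` is a generator, a caller who wants the data in memory can still call `list(convert_records(...))`, while a caller with a very large file can pass the generator straight to the writer.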

Other Options:

  1. Throw an exception if the in-memory JSON blob approaches the maximum capacity allowed by the system, instead of producing a heap dump.
  2. Update the README to advise users to provide a conversion region, so that only a limited number of records is converted for a very large VCF file.
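Option 1 could look roughly like the sketch below, which tracks the serialized size of the accumulated resources and raises once a caller-chosen budget is exceeded. `BoundedBuffer`, `ConversionTooLargeError`, and `max_bytes` are hypothetical names, not existing vcf2fhir code, and serialized size is only a proxy for real memory use, so the budget would need to be set conservatively.

```python
import json
from typing import Dict, List


class ConversionTooLargeError(Exception):
    """Hypothetical error raised when the in-memory output exceeds its budget."""


class BoundedBuffer:
    """Collect converted resources while tracking their serialized size."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.items: List[Dict] = []

    def append(self, resource: Dict) -> None:
        # Approximate the cost of a resource by its UTF-8 JSON length.
        size = len(json.dumps(resource).encode("utf-8"))
        if self.used_bytes + size > self.max_bytes:
            raise ConversionTooLargeError(
                f"output would exceed {self.max_bytes} bytes; "
                "consider converting a smaller region"
            )
        self.items.append(resource)
        self.used_bytes += size
```

Failing early with a clear message would let the caller retry with a conversion region, which ties in with option 2.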

I would like to work on this issue. Please guide me through it.

This particular issue would require a lot of understanding and design work before moving to implementation. In my opinion, we should not fix this unless someone really wants it.

Okay. I will work on some other beginner-friendly issue.