PoonLab/covizu

Need a new toy dataset for front end devs

Closed this issue · 5 comments

Building the front-end database (#487) with the actual data takes about 30 minutes.

We need to create a dataset that developers can use to run an instance locally. The repository currently includes a test dataset, but it is very small.

The current test set has only 100 sequences; we need to increase this to close to 1M.
The open data feed and VirusSeq database may be good resources for generating this set, since they are relatively free of usage restrictions.
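
One way to draw a fixed-size toy set from a much larger sequence file without loading it all into memory is reservoir sampling. The sketch below is only an illustration and is not the script mentioned in this issue. It assumes the source data is available as FASTA text, and the names `iter_fasta` and `reservoir_sample` are hypothetical helpers, not part of covizu.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def iter_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines.

    Assumption: input is plain FASTA; other feed formats need their own parser.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reservoir_sample(records: Iterable[Record], k: int, seed: int = 0) -> List[Record]:
    """Return up to k records chosen uniformly at random in a single pass."""
    rng = random.Random(seed)  # fixed seed so the toy set is reproducible
    sample: List[Record] = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = rec
    return sample


# Small in-memory demonstration with made-up records.
toy_lines = []
for n in range(10):
    toy_lines.append(f">seq{n}")
    toy_lines.append("ACGT" * (n + 1))

subset = reservoir_sample(iter_fasta(toy_lines), k=3)
print(len(subset))
```

For a real file, the same call would take an open file handle in place of `toy_lines`, with `k` set to the target size (close to 1M here).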

@GopiGugan has a script prepared for generating test data; check that it works and then pass it to @SandeepThokala.

@GopiGugan please place the new dataset on the webserver under /covizu/data so that devs can retrieve it and we don't have to stuff a large file into the repo.

Also document the availability and location of the data file in the CONTRIBUTING doc.

Also please deposit the scripts used to generate this dataset into this repo's scripts folder.