HTAN Data Portal

This repo contains the code for the Human Tumor Atlas Network Data Portal

Framework

This is a Next.js project bootstrapped with create-next-app

Backend

All data comes from Synapse. A Python script generates a JSON file that contains all the metadata. There is currently no backend: the site is fully static, i.e. all filtering happens on the frontend.

Update Data Files

cd data
# Run the script that pulls all the HTAN metadata
# It outputs a JSON in public/syn_data.json and a JSON with links to metadata in data/syn_metadata.json
python get_syn_data.py
# Replace BulkWES -> BulkDNA-seq (this is a temp fix)
gsed -i 's/BulkWES/BulkDNA-seq/g' ../public/syn_data.json && gsed -i 's/BulkWES/BulkDNA-seq/g' ../data/syn_metadata.json
cd ..
# Convert the resulting JSON to a more efficient structure for visualization
./node_modules/.bin/ncc run data/processSynapseJSON.ts --transpile-only
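
The gsed call above assumes GNU sed is installed under that name (as Homebrew does on macOS; on most Linux systems the binary is plain sed). As a portable alternative, the same temporary BulkWES fix could be sketched in Python. The helper name replace_in_file is hypothetical and not part of this repo; it is a minimal sketch, not the project's own tooling.

```python
from pathlib import Path


def replace_in_file(path, old="BulkWES", new="BulkDNA-seq"):
    """Replace every occurrence of `old` with `new` in a text file, in place.

    Returns the number of occurrences that were replaced.
    """
    file_path = Path(path)
    text = file_path.read_text(encoding="utf-8")
    count = text.count(old)
    # Only rewrite the file when there is something to change.
    if count:
        file_path.write_text(text.replace(old, new), encoding="utf-8")
    return count
```

Run from the repo root, replace_in_file("public/syn_data.json") and replace_in_file("data/syn_metadata.json") would cover the same two files as the gsed command. Like the sed expression, this is a plain substring replacement, so longer names that contain BulkWES are rewritten too.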

Export to bucket

At the moment all data is hosted on S3 for production, because Vercel has a file size limit. To update it:

1. Gzip the file (note that it's already gzipped in the repo).
2. Remove the ".gz" extension so it's just ".json", and rename the file to include the current date in the filename.
3. Upload the file to the S3 bucket "htanfiles" (part of the schultz AWS org).
4. Set two metadata settings on the file: Content-Encoding=gzip and Content-Type=application/json.
5. Once the file is up, change the path in /lib/helpers.ts.

Or run steps 1-4 as a single command:

MY_AWS_PROFILE=inodb
aws s3 cp processed_syn_data.json.gz s3://htanfiles/processed_syn_data_$(date "+%Y%m%d").json --profile=${MY_AWS_PROFILE} --content-encoding gzip --content-type=application/json --acl public-read
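
The gzip and rename steps above (compress, then use a dated name without the ".gz" suffix) can also be sketched with the Python standard library. This is only an illustration: dated_object_name and gzip_file are hypothetical helpers, not part of this repo, and the upload itself is still left to the aws s3 cp command.

```python
import gzip
import shutil
from datetime import date


def dated_object_name(day, stem="processed_syn_data"):
    """Build the object name: stem, then YYYYMMDD, then .json (no .gz)."""
    return f"{stem}_{day:%Y%m%d}.json"


def gzip_file(src, dst):
    """Gzip-compress src into dst, streaming in chunks to limit memory use."""
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
```

For example, dated_object_name(date(2021, 3, 5)) gives processed_syn_data_20210305.json, the same shape of name that $(date "+%Y%m%d") produces in the command above. The object keeps a ".json" name even though its bytes are gzipped, which is why the Content-Encoding=gzip setting is needed for clients to decode it.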

Testing

There are currently no automated tests other than building the project, so be careful when merging to master.

Getting Started

First, run the development server:

npm run dev
# or
yarn dev

Open http://localhost:3000 with your browser to see the result.

You can start editing any page. The page auto-updates as you edit the file.

Debugging processSynapseJSON

Add debugger; somewhere in the code. Then run:

./node_modules/.bin/ncc build --source-map --no-source-map-register data/processSynapseJSON.ts

Followed by:

node --inspect-brk dist/index.js

Now you can attach a debugger to it, e.g. from VS Code.

Learn More about Next.js

To learn more about Next.js, take a look at the following resources:

Next.js Documentation - learn about Next.js features and API.
Learn Next.js - an interactive Next.js tutorial.

Deployment

The app is deployed using Vercel (formerly the ZEIT Now Platform), from the creators of Next.js.

jaeddy/htan-portal