Process | Who | Outputs |
---|---|---|
Accept data from TA3 | sd2eadm | Deposit to Ingests |
Develop and test ETL processes | <etl_team> | Archive job results to User Temp, User Data, or Staging while working |
Run production ETL processes | <etl_team> | Archive jobs produced by public apps to Processed and ensure the public read ACL is set |
Nickname | System | Path | Purpose | Read | Write |
---|---|---|---|---|---|
Ingests | data-sd2e-community | /ingest | Original data | All | sd2eadm |
Processed | data-sd2e-community | /processed | Accepted data+products | All | sd2eadm |
Reference | data-sd2e-community | /reference | Common reference data | All | sd2eadm,vaughn,jfonner,ngaffney |
Sample | data-sd2e-community | /sample | Samples and examples | All | sd2eadm,... |
Staging | data-sd2e-community | /processed_staging | Staging area for data+products | All | sd2eadm,<etl_team>* |
User Data | data-sd2e-projects-users | / | Collaborative storage for users | * | * |
User Temp | data-tacc-work-uname | * | User-specific high-speed storage |
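
As a minimal sketch, the nicknames in the table above can be resolved to `agave://system/path` URIs, the form Agave generally accepts for file locations. The `Location` type, the `LOCATIONS` mapping, and the `agave_uri` helper are illustrative names, not part of any SD2 tooling. User Temp is left out because its system name embeds a username and its path is a wildcard.

```python
from typing import NamedTuple


class Location(NamedTuple):
    system: str  # Agave storage system ID, as listed in the table
    path: str    # base path on that system


# Nickname -> location, copied from the table above.
LOCATIONS = {
    "Ingests": Location("data-sd2e-community", "/ingest"),
    "Processed": Location("data-sd2e-community", "/processed"),
    "Reference": Location("data-sd2e-community", "/reference"),
    "Sample": Location("data-sd2e-community", "/sample"),
    "Staging": Location("data-sd2e-community", "/processed_staging"),
    "User Data": Location("data-sd2e-projects-users", "/"),
}


def agave_uri(nickname: str, relative_path: str = "") -> str:
    """Build an agave:// URI for a path under a nicknamed location."""
    loc = LOCATIONS[nickname]  # raises KeyError for unknown nicknames
    base = loc.path.rstrip("/")
    rel = relative_path.strip("/")
    full = f"{base}/{rel}" if rel else (base or "/")
    return f"agave://{loc.system}{full}"
```

For example, `agave_uri("Staging", "run1")` points at `run1` under `/processed_staging` on `data-sd2e-community`.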
Google Docs Sheet with the following columns:
- Path (relative to original upload conventions)
- Public apps to process
- Output naming conventions
- Participants
- Status
Permissions follow the Agave API ACL model for portability and future-proofing. Tutorials on applying and using it in SD2 abound.
- READ
- WRITE
- EXECUTE (na)
- READ_WRITE
- READ_EXECUTE (na)
- WRITE_EXECUTE (na)
- ALL (READ_WRITE_EXECUTE)
- NONE - Removes access
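
The permission names above can be captured in an enumeration so that a typo is caught before any request is sent. The sketch below only builds the URL and body for a permission update and sends nothing. The `/files/v2/pems/system/...` route reflects the Agave files permissions API as commonly documented and should be checked against the tenant's documentation. `BASE_URL` is a placeholder and `pems_request` is a hypothetical helper.

```python
from enum import Enum

BASE_URL = "https://api.example.org"  # placeholder, not a real tenant URL


class Permission(str, Enum):
    READ = "READ"
    WRITE = "WRITE"
    EXECUTE = "EXECUTE"
    READ_WRITE = "READ_WRITE"
    READ_EXECUTE = "READ_EXECUTE"
    WRITE_EXECUTE = "WRITE_EXECUTE"
    ALL = "ALL"
    NONE = "NONE"  # removes access


def pems_request(system, path, username, permission, recursive=False):
    """Return (url, body) for a permission update; nothing is sent."""
    perm = Permission(permission)  # raises ValueError for unknown names
    url = f"{BASE_URL}/files/v2/pems/system/{system}/{path.lstrip('/')}"
    body = {
        "username": username,
        "permission": perm.value,
        "recursive": recursive,
    }
    return url, body
```

Calling `pems_request("data-sd2e-community", "/processed/run1", "public", "READ", recursive=True)` yields the payload that would grant public read access on that path.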
- uname - yours or someone else's TACC username
- etl_team - any of the following people:
  - vaughn
  - ngaffney
  - mweston
  - jeg
  - meslami
  - jfonner
  - wallen
- public - special user granting access to all authorized usernames
- world - special user granting world-readable access
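
Since `etl_team` is shorthand for a group of people rather than an actual account, a script that applies ACLs would need to expand it into individual usernames first. The following is one possible sketch; `ETL_TEAM` mirrors the list above, and `expand_users` is a hypothetical helper.

```python
from typing import Iterable, List

# Members of etl_team, as listed above.
ETL_TEAM = ("vaughn", "ngaffney", "mweston", "jeg", "meslami", "jfonner", "wallen")


def expand_users(names: Iterable[str]) -> List[str]:
    """Replace 'etl_team' or '<etl_team>' with its members.

    Order is preserved and duplicates are dropped. Other names,
    including the special users 'public' and 'world', pass through.
    """
    expanded: List[str] = []
    for name in names:
        key = name.strip().strip("<>")
        members = ETL_TEAM if key == "etl_team" else (key,)
        for member in members:
            if member not in expanded:
                expanded.append(member)
    return expanded
```

With this, `expand_users(["sd2eadm", "<etl_team>"])` returns `sd2eadm` followed by the seven team members.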
Name | App ID | Host | Purpose | Lead | Public | Shared |
---|---|---|---|---|---|---|
FastQC | fastqc-0.5.0 | maverick,wrangler | QC report for NGS data | Vaughn | No | na |
FCS-TASBE | fcs-tasbe-0.2.0u4 | jetstream | Summarize Flow data | Gentile/Vaughn | X | na |
Kallisto | kallisto-0.43.1u3 | maverick | Quantify RNAseq data | Vaughn | X | na |
LCMS | lcms-0.1.0u4 | maverick | Summarize LCMS data | Weston | X | na |
MSF | msf-0.1.0u3 | maverick | Summarize MS data | Weston | X | |
Sailfish | sailfish-0.10.1u3 | maverick | Quantify RNAseq data | Vaughn | X | na |
SortmeRNA | sortmerna-0.0.1 | maverick,wrangler | Filter rRNA from demux, trimmed RNAseq | Gaffney | No | vaughn |
TrimSortmeRNA | trimsortmerna-0.1.0 | maverick,wrangler | Trimmomatic + rRNA filtering | Gaffney | No | No |
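
To illustrate how a production run might archive results to Processed, here is a rough sketch of a job definition for one of the apps marked public in the table above. The `archive`, `archiveSystem`, and `archivePath` fields follow the Agave job request format as commonly documented. The input key `input_file` is purely hypothetical, since each app defines its own input names, and whether `archivePath` needs a leading slash should be verified against the system's configuration.

```python
# App IDs marked "X" in the Public column of the table above.
PUBLIC_APPS = {
    "fcs-tasbe-0.2.0u4",
    "kallisto-0.43.1u3",
    "lcms-0.1.0u4",
    "msf-0.1.0u3",
    "sailfish-0.10.1u3",
}


def job_definition(app_id, name, inputs, archive_subdir):
    """Build a job request dict that archives output under /processed."""
    if app_id not in PUBLIC_APPS:
        raise ValueError(f"{app_id} is not listed as a public app")
    return {
        "name": name,
        "appId": app_id,
        "inputs": inputs,  # keys depend on the app; not specified here
        "archive": True,
        "archiveSystem": "data-sd2e-community",
        "archivePath": "/processed/" + archive_subdir.strip("/"),
    }


job = job_definition(
    "kallisto-0.43.1u3",
    "kallisto-demo",
    {"input_file": "agave://data-sd2e-community/ingest/example.fastq"},  # hypothetical
    "demo/run1",
)
```

The resulting dictionary can be serialized to JSON for submission. The public read ACL on the archive path still has to be set separately.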
Request access and join the development effort at the Reactors-ETL GitHub repo.