Utilities to manage Gen3 schemas, projects and submissions.
# optionally
$ python3 -m venv venv ; source venv/bin/activate
$ pip install gen3_util
$ gen3_util
msg: Version 0.0.4
Note: requires the magic library (libmagic). If it is not already installed, you will see a warning like this:
Requires libmagic installed on your system to determine mime-types
Error: 'failed to find libmagic. Check your installation'
For installation instructions see https://github.com/ahupp/python-magic#installation
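To check for the library before running the utility, a small script along these lines may help. It is a minimal sketch, assuming the Python binding is the python-magic package linked above, which raises ImportError when libmagic cannot be loaded; the function name libmagic_available is hypothetical and not part of gen3_util.

```python
def libmagic_available() -> bool:
    """Return True if python-magic imports cleanly, i.e. libmagic was found."""
    try:
        # python-magic raises ImportError when the libmagic shared library is missing.
        import magic  # noqa: F401
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if libmagic_available():
        print("libmagic found")
    else:
        print("libmagic not found; see https://github.com/ahupp/python-magic#installation")
```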
Note: some environments will raise a _ctypes error. If so, install Python 3.9.16.
$ gen3_util --help
Usage: gen3_util [OPTIONS] COMMAND [ARGS]...
Gen3 Management Utilities
Options:
--config TEXT Path to config file. GEN3_UTIL_CONFIG
--format [yaml|json|text] Result format. GEN3_UTIL_FORMAT [default: yaml]
--cred TEXT See https://uc-cdis.github.io/gen3-user-
doc/appendices/api-gen3/#credentials-to-query-
the-api. GEN3_API_KEY
--state_dir TEXT Directory for file transfer state
GEN3_UTIL_STATE_DIR [default: ~/.gen3/gen3_util]
--help Show this message and exit.
Commands:
projects Manage Gen3 projects.
buckets Manage Gen3 buckets.
meta Manage meta data.
files Manage file buckets.
access Manage access requests.
config Configure this utility.
- Leverages Gen3Auth.
- Store the credentials.json file in ~/.gen3/credentials.json, or specify its location with either the GEN3_API_KEY environment variable or the --cred parameter.
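The lookup described above can be sketched in Python as follows. This is an illustration only: the precedence order (--cred first, then GEN3_API_KEY, then the default path) is an assumption, and resolve_credentials_path is a hypothetical helper, not a gen3_util function.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CREDENTIALS = "~/.gen3/credentials.json"


def resolve_credentials_path(cred: Optional[str] = None,
                             env: Optional[Mapping[str, str]] = None) -> Path:
    """Pick the credentials file location.

    Assumed precedence (not confirmed by the tool's documentation):
    explicit --cred value, then GEN3_API_KEY, then the default path.
    """
    env = os.environ if env is None else env
    chosen = cred or env.get("GEN3_API_KEY") or DEFAULT_CREDENTIALS
    return Path(chosen).expanduser()
```

For example, resolve_credentials_path() with neither --cred nor GEN3_API_KEY set falls back to the credentials.json file under ~/.gen3.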
I need to verify connectivity.
$ gen3_util projects ping
msg: OK connected to endpoint https://aced-training.compbio.ohsu.edu
I need to see what projects exist
$ gen3_util projects ls
endpoint: https://aced-training.compbio.ohsu.edu
msg: OK
projects:
- /programs
- /programs/aced
- /programs/aced/project
- /programs/aced/project/MCF10A
- /programs/aced/projects
- /programs/aced/projects/Alcoholism
- /programs/aced/projects/Alzheimers
- /programs/aced/projects/Breast_Cancer
- /programs/aced/projects/Colon_Cancer
- /programs/aced/projects/Diabetes
- /programs/aced/projects/HOP
- /programs/aced/projects/Lung_Cancer
- /programs/aced/projects/MCF10A
- /programs/aced/projects/NVIDIA
- /programs/aced/projects/Prostate_Cancer
- /programs/aced/projects/ohsu_download_testing
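The listing mixes program-level paths with project-level ones. A short sketch for picking out the projects is shown here; it assumes the paths have already been parsed into a list of strings, and that a project id is written as program-project (as in aced-MyExperiment elsewhere in this document). Both helper names are hypothetical.

```python
from typing import List, Optional, Tuple


def split_project_path(path: str) -> Optional[Tuple[str, str]]:
    """Return (program, project) for '/programs/<program>/projects/<project>', else None."""
    parts = path.strip("/").split("/")
    if len(parts) == 4 and parts[0] == "programs" and parts[2] == "projects":
        return parts[1], parts[3]
    return None


def project_ids(paths: List[str]) -> List[str]:
    """Build '<program>-<project>' ids (assumed convention) from resource paths."""
    pairs = (split_project_path(p) for p in paths)
    # Drop paths that are not project-level entries.
    return [f"{program}-{project}" for program, project in filter(None, pairs)]
```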
I need to see what buckets are associated with the commons
$ gen3_util buckets ls
buckets:
GS_BUCKETS: {}
S3_BUCKETS:
aced-default:
endpoint_url: https://minio-default.compbio.ohsu.edu
region: us-east-1
aced-manchester:
endpoint_url: https://minio-manchester.compbio.ohsu.edu
region: us-east-1
aced-ohsu:
endpoint_url: https://minio-ohsu.compbio.ohsu.edu
region: us-east-1
aced-stanford:
endpoint_url: https://minio-stanford.compbio.ohsu.edu
region: us-east-1
aced-ucl:
endpoint_url: https://minio-ucl.compbio.ohsu.edu
region: us-east-1
endpoint: https://aced-training.compbio.ohsu.edu
msg: OK
I need to create a project
$ gen3_util projects touch aced-MyExperiment
projects:
aced-MyExperiment:
exists: true
messages:
- Created program:aced Program is updated!
I need to assign default policies to that project
$ gen3_util projects add policies aced-MyExperiment
msg: Approve these requests to assign default policies to aced-MyExperiment
commands:
- gen3_util access update 24f047d7-0e7c-43c6-bab6-61e2d385c71a SIGNED
- gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED
I need to add a user to that project
$ gen3_util projects add user aced-MyExperiment linus.pauling@osu.edu
msg: Approve these requests to add linus.pauling@osu.edu to aced-MyExperiment
commands:
- gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED
Before proceeding, I need to sign those requests
$ gen3_util access update xxxxxx SIGNED
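When several requests are pending, the request ids can be pulled out of the commands list rather than copied by hand. The sketch below assumes the result has already been loaded into a dict with the msg and commands keys shown in the output above (for instance by parsing --format json output; that the JSON uses the same keys is an assumption). It only builds the argument lists and does not run anything; the helper names are hypothetical.

```python
import re
from typing import Dict, List

UUID_RE = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")


def pending_request_ids(result: Dict) -> List[str]:
    """Collect unique request ids, in first-seen order, from a result's 'commands' list."""
    seen: Dict[str, None] = {}
    for command in result.get("commands", []):
        for request_id in UUID_RE.findall(command):
            seen.setdefault(request_id, None)
    return list(seen)


def sign_commands(request_ids: List[str]) -> List[List[str]]:
    """Build one argv list per request, suitable for passing to subprocess.run."""
    return [["gen3_util", "access", "update", rid, "SIGNED"] for rid in request_ids]
```

Keeping the commands as argument lists avoids shell quoting issues if they are later executed with subprocess.run.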
I want to create a simple project structure with a set of files
$ gen3_util meta import dir tests/fixtures/dir_to_study/ tmp/foo --project_id aced-MyExperiment
summary:
ResearchStudy:
count: 1
DocumentReference:
count: 5
size: 6013814
msg: OK
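To double-check the summary, the generated file can be inspected directly. The sketch below assumes the import writes tmp/foo/DocumentReference.ndjson with one FHIR DocumentReference per line, and that file sizes are stored under content[].attachment.size as in the FHIR specification; summarize_document_references is a hypothetical helper, not part of gen3_util.

```python
import json
from pathlib import Path
from typing import Dict


def summarize_document_references(ndjson_path: Path) -> Dict[str, int]:
    """Count DocumentReference records and total their attachment sizes in bytes."""
    count = 0
    size = 0
    with open(ndjson_path, encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # tolerate blank lines
            resource = json.loads(line)
            count += 1
            # Assumed layout: sizes live under content[].attachment.size.
            for content in resource.get("content", []):
                size += content.get("attachment", {}).get("size", 0)
    return {"count": count, "size": size}
```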
I need to do something a bit more complex: for example, I want to create a project structure with a set of files, but I need to specify the Patient and Specimen based on the path of the file.
$ gen3_util meta import dir tests/fixtures/dir_to_study_with_meta/ tmp/foometa --project_id aced-foometa --plugin_path ./tests/unit/plugins
tests/fixtures/dir_to_study_with_meta/
├── file-2.csv
├── p1
│ ├── s1
│ │ └── file-3.pdf
│ ├── s2
│ │ └── file-4.tsv
│ └── s3
│ └── file-5
└── p2
└── s4
└── file-1.txt
This will produce the following meta data:
summary:
ResearchStudy:
count: 1
Patient:
count: 2
Specimen:
count: 4
DocumentReference:
count: 5
size: 6013814
For more, see test_meta_plugin.
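The path-based mapping can be illustrated with a small standalone sketch. It does not use the plugin interface (which is not described here); it only assumes that the first directory level names the Patient and the second names the Specimen, as the example tree suggests. Both function names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, Optional, Set, Tuple


def patient_and_specimen(relative_path: str) -> Tuple[Optional[str], Optional[str]]:
    """Read '<patient>/<specimen>/<file>' from a path relative to the study directory."""
    parts = PurePosixPath(relative_path).parts
    if len(parts) >= 3:
        return parts[0], parts[1]
    return None, None  # top-level files belong to no patient or specimen


def summarize(paths: Iterable[str]) -> Dict[str, int]:
    """Count distinct patients, distinct specimens, and documents."""
    patients: Set[str] = set()
    specimens: Set[Tuple[str, str]] = set()
    documents = 0
    for path in paths:
        documents += 1
        patient, specimen = patient_and_specimen(path)
        if patient and specimen:
            patients.add(patient)
            specimens.add((patient, specimen))
    return {"Patient": len(patients), "Specimen": len(specimens), "DocumentReference": documents}
```

Applied to the five files in the tree above, this gives 2 patients, 4 specimens and 5 documents, matching the counts in the summary.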
I need to upload those files to the instance
$ gen3_util files cp --ignore_state --project_id aced-MyExperiment tmp/foo/DocumentReference.ndjson bucket://aced-development-ohsu-data-bucket
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 5.74M/5.74M [00:03<00:00, 1.71MB/s, elapsed=0:00:02.056022, file=6f8101]
info:
- Wrote state to ~/.gen3/gen3-util-state/state.ndjson
msg: OK
I need to upload the meta data about those files to the instance
$ gen3_util meta cp tmp/foo bucket://aced-development-ohsu-data-bucket --project_id aced-MyExperiment
msg: Uploaded /var/folders/2c/hffqqtr94nv64tjy0xrl38r89k1sty/T/tmpacozhhoo/_aced-MyExperiment_meta.zip
I need to request or manage access to a project
$ gen3_util access
Usage: gen3_util access [OPTIONS] COMMAND [ARGS]...
Manage access requests.
Options:
--help Show this message and exit.
Commands:
touch Create a request for read access.
update Update the request's approval workflow.
ls List current user's requests.
cat Show details of a specific request.
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -r requirements-dev.txt
pip install -e .
- fixtures - data for testing environment
tests/fixtures/
└── custom_config
└── config.yaml # testing configuration
- test parameters
tests/
├── integration
│ └── conftest.py
└── unit
└── conftest.py
- running tests
$ pytest --cov=gen3_util
88%
- pre-commit tests
A reasonable set of checks, including the unit tests, runs prior to each commit. You can also run these checks on demand:
$ pre-commit install
$ pre-commit run --all-files
debug statements (python)................................................Passed
check python ast.........................................................Passed
fix utf-8 byte order marker..............................................Passed
check json...........................................(no files to check)Skipped
detect private key.......................................................Passed
check yaml...............................................................Passed
check for added large files..............................................Passed
check that scripts with shebangs are executable..........................Passed
check for case conflicts.................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
mixed line ending........................................................Passed
run our unit tests.......................................................Passed
- PyPI
# update pypi
# pypi credentials - see https://twine.readthedocs.io/en/stable/#environment-variables
export TWINE_USERNAME= # the username to use for authentication to the repository.
export TWINE_PASSWORD= # the password to use for authentication to the repository.
# this could be maintained as so: export $(cat .env | xargs)
rm -r dist/
python3 setup.py sdist bdist_wheel
twine upload dist/*