Getting more information out of trident list
After talking with Stephan today, we thought it would be nice to have the option of getting more information about each package out of trident list. This would streamline integration into data analysis, and also into downstream processes (like thetis).
Below is an example of a TSV with information contained in the POSEIDON.yml of 2 packages as a reference:
poseidon_version package_dir title description contributors package_version last_modified genotype_format geno_file geno_file_chksum snp_file snp_file_chksum ind_file ind_file_chksum snp_set janno_file janno_file_chksum sequencing_source_file sequencing_source_file_chksum bib_file bib_file_chksum readme_file changelog_file
2.5.0 /Users/lamnidis/poseidon_packages/community-archive/2018_OlaldeNature 2018_OlaldeNature Ancient genomes from the Bell Beaker period in Europe. Originally AADR v42.4. Ayshin Ghalichi (ghalichi@shh.mpg.de) 2.1.1 2023-07-11 PLINK 2018_OlaldeNature.bed e11e8a7ef0b74e964732db0cbe5046f4 2018_OlaldeNature.bim 7a7ef4d4f9c78a0bba32a329b6162dbd 2018_OlaldeNature.fam 95f51d4ef3797b556e6c0154bf8d443d 1240K 2018_OlaldeNature.janno
2.5.0 /Users/lamnidis/poseidon_packages/community-archive/2018_Lamnidis_Fennoscandia 2018_Lamnidis_Fennoscandia Ancient genomes from Finland and Russia. Thiseas Lamnidis (lamnidisi@shh.mpg.de) 2.1.0 2023-07-04 PLINK 2018_Lamnidis_Fennoscandia.bed 74d8d52d45a0d2f6ed1212af5d2f4268 2018_Lamnidis_Fennoscandia.bim 10fe736b07171086524ec92dc5e06a22 2018_Lamnidis_Fennoscandia.fam 90c1b106d15bceccc1e25c34d3060d75 1240K 2018_Lamnidis_Fennoscandia.janno
trident list --remote --packages --raw already shows some of this information, so adding more columns to the output with a dedicated flag would do the trick.
Sounds like a good idea to me 👍
What would be a solid interface for this? Just a --verbose (?) flag that adds all of these columns to the output? Or a more sophisticated argument to request specific columns?
IMO spitting out all the info in the YAML file with one flag is enough. It's easy enough to select a subset of columns downstream if need be.
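To illustrate the downstream selection mentioned above, here is a minimal Python sketch. It assumes the extended output would be tab-separated with a header row, like the example TSV in this issue. The function name select_columns and the sample text are hypothetical and not part of trident.

```python
import csv
import io


def select_columns(tsv_text, wanted):
    """Keep only the `wanted` columns from tab-separated text with a header row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Fail early if a requested column is not in the header.
    missing = [name for name in wanted if name not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return [{name: row[name] for name in wanted} for row in reader]


# Hypothetical sample, shaped like a few columns of the example TSV above.
sample = (
    "title\tpackage_version\tlast_modified\n"
    "2018_OlaldeNature\t2.1.1\t2023-07-11\n"
    "2018_Lamnidis_Fennoscandia\t2.1.0\t2023-07-04\n"
)

for row in select_columns(sample, ["title", "package_version"]):
    print(row["title"], row["package_version"])
```

Since any subset can be pulled out this cheaply, a single flag that emits every column seems sufficient.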
The main thing for my use case here is the package_directory column, which is not in the YAML, but implicitly known (as the path to the file).
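As a rough sketch of what "implicitly known" means here: the package directory is just the parent of wherever the POSEIDON.yml was found. The helper below is hypothetical and only assumes that each package lives in its own directory under some base directory; it is not how trident itself discovers packages.

```python
from pathlib import Path


def find_package_dirs(base_dir):
    """Return the sorted parent directories of every POSEIDON.yml under base_dir."""
    # The YAML file does not store its own location; the directory is
    # derived from the path at which the file was found.
    return sorted(p.parent for p in Path(base_dir).rglob("POSEIDON.yml"))
```

Calling find_package_dirs on a local copy of an archive would yield one path per package, which could then fill a package_dir column alongside the fields read from each YAML file.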