seqcode/cegr-galaxy

Revisit the format of the results sent to PEGR


dshao commented

Currently, PEGR contains a lot of ad hoc code to process and parse the results from Galaxy, and that code is specific to each individual tool. It might make sense to move that logic to the Galaxy side, since the tools are defined in Galaxy.

  1. Currently, datasets are formatted as
    "datasets": [
      {
        "type": "string",
        "id": "string",
        "uri": "uri"
      },
      // other datasets
    ]

The current "type" values include "html", "txt", and "png", which are not always informative, so PEGR has to work out what each dataset is (e.g. memeFig, fourColor) from the type and the tool category. It would be easier on the PEGR side if Galaxy could send the datasets in the following format

"datasets":{"bamRaw": "uri"}
or
"datasets":{"memeFile":"uri", "memeFig":"uri"}
or
"datasets":{ "fourColor": ["uri1", "uri2"]}
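
A minimal sketch of how the Galaxy side might build the keyed form from the current list. The `ROLE_BY_TOOL_AND_TYPE` table, the tool category names in it, and the `to_keyed_datasets` helper are hypothetical and exist only for illustration; only the key names (memeFile, memeFig, fourColor) come from the proposal above.

```python
# Hypothetical lookup: (tool category, dataset type) -> dataset role.
# The tool category names are placeholders, not actual Galaxy tool ids.
ROLE_BY_TOOL_AND_TYPE = {
    ("meme", "txt"): "memeFile",
    ("meme", "png"): "memeFig",
    ("four_color_plot", "png"): "fourColor",
}


def to_keyed_datasets(tool_category, datasets):
    """Convert [{"type", "id", "uri"}, ...] into {role: uri or [uri, ...]}."""
    keyed = {}
    for dataset in datasets:
        role = ROLE_BY_TOOL_AND_TYPE.get((tool_category, dataset["type"]))
        if role is None:
            continue  # no role known for this dataset; skipped in this sketch
        if role not in keyed:
            keyed[role] = dataset["uri"]
        elif isinstance(keyed[role], list):
            keyed[role].append(dataset["uri"])
        else:
            # Second dataset with the same role: switch to a list of URIs.
            keyed[role] = [keyed[role], dataset["uri"]]
    return keyed
```

With this sketch a role maps to a single URI when one dataset has it and to a list when several do, which matches the fourColor example above. Whether PEGR would prefer a list in every case is an open question.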

  2. Parameters currently include escaped special characters, e.g. "\".

  3. Only two parameters are actually processed and used in PEGR: peakCallingParam and peakPairsParam. Is it possible for Galaxy to send the processed parameters?

  4. Statistics are enclosed in a list []. Why?
    e.g. [{},{"read":2,"adapterDimerCount":0.0}]
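
If the list wrapper around the statistics turns out to be unnecessary, one way to flatten it before sending is sketched below. The `merge_statistics` helper is hypothetical, and it assumes the entries in the list never carry conflicting keys; if they do, the later entry wins here.

```python
def merge_statistics(statistics):
    """Merge a list of statistics dicts into a single dict (later entries win)."""
    merged = {}
    for entry in statistics:
        merged.update(entry)
    return merged


# The example from this issue, flattened into one object.
flat = merge_statistics([{}, {"read": 2, "adapterDimerCount": 0.0}])
```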

A new branch called "clean_outputs" has been created. I have implemented the updates to the "datasets" key, and I am going to update "Parameters" and "Statistics" next.