InformaticsMatters/squonk2-data-manager-job-tester

Need to support molecules type

We are in the process of adding support for inputs of type molecules.
Such an input would be defined like this:

          queries:
            title: Query molecules
            mime-types:
            - squonk/x-smiles
            - text/csv
            type: molecules

This is identical to a file input except for the type property.

The values that would be submitted for this are described here: InformaticsMatters/squonk2-data-manager-ui#876 (comment)

Trying this, I find that jote does not handle the molecules type: it expects every input to be a file from the data directory. The error you get is:

$ jote -m manifest-im-virtual-screening.yaml -j similarity-screen-rdkit -c rdkit
# Using manifest "manifest-im-virtual-screening.yaml"
# Found 25 tests
# Limiting to Collection "rdkit"
# Limiting to Job "similarity-screen-rdkit"
  ---
+ collection=rdkit job=similarity-screen-rdkit test=simple-execution
> definition filename=None
> run-level=Undefined
> image=informaticsmatters/vs-prep:latest
> image-type=simple
> command=/code/screen.py --input '100.smi' --queries 'O=C(CCc1c[nH]c2ccccc12)OCc3ccccc3' --output 'screened.smi'      --descriptor 'rdkit' --metric 'tversky' --threshold '0.8'     --interval 10000
# Compose: Creating test environment...
# Compose: docker-compose (1.29.2, build unknown)
# Compose: Created
# Copying inputs (from "${PWD}/data")...
# + data/100.smi
# + O=C(CCc1c[nH]c2ccccc12)OCc3ccccc3
! FAILURE
! Input file O=C(CCc1c[nH]c2ccccc12)OCc3ccccc3 must start with "data/"

We need to think through the implications of this, but it seems that jote needs an explicit understanding of the molecules type.

The proposed solution is to provide a type-handling module that replicates the Job's ability to handle this type, i.e. it either passes the queries string in as-is or copies a user-provided test file from the repository (just like any other test-based input file).
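
A minimal sketch of what such a type-handling function might look like, to make the proposal concrete. This is not jote's actual code: `ResolvedInput`, `resolve_input` and `DATA_PREFIX` are hypothetical names. It also assumes that a molecules value starting with `data/` refers to a test file and that anything else is a literal molecule string, and that the Job sees a copied file under its path with the `data/` prefix removed (as the `100.smi` example in the log suggests).

```python
from dataclasses import dataclass
from typing import Optional

# Prefix taken from the error message in the log above.
DATA_PREFIX = "data/"


@dataclass
class ResolvedInput:
    """Outcome of resolving one test input (hypothetical structure)."""

    copy_from: Optional[str]  # repository path to copy into the test, or None
    command_value: str  # value substituted into the Job command


def resolve_input(value: str, input_type: str) -> ResolvedInput:
    """Decide whether a test input is a file to copy or a literal value.

    Assumption: only inputs declared with type "molecules" may carry a
    literal string; every other input must be a file under data/.
    """
    if value.startswith(DATA_PREFIX):
        # A test file: copy it, and give the command the name without the prefix.
        return ResolvedInput(copy_from=value, command_value=value[len(DATA_PREFIX):])
    if input_type == "molecules":
        # A literal molecule string: nothing to copy, pass it through unchanged.
        return ResolvedInput(copy_from=None, command_value=value)
    raise ValueError(f'Input file {value} must start with "{DATA_PREFIX}"')
```

With this sketch, `resolve_input("O=C(CCc1c[nH]c2ccccc12)OCc3ccccc3", "molecules")` would pass the SMILES through untouched, while the same value declared with any other type would still fail with the existing error.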