draeger-lab/TFpredict

TFpredict prokaryote

Hello,
I would like to use TFpredict for prokaryotes, but I keep running into all sorts of problems.

Here is the summary of what I've found:

The downloaded release 1.3 for prokaryotes (md5sum 30d0bab360671f19b8015f5b876354a1) reports its version as 1.2. Is the latest version for prokaryotes actually 1.2?

Testing with the provided test vector (with blastPath specified on the command line), the run ends with

java.lang.NullPointerException
        at java.base/java.io.Reader.<init>(Reader.java:167)
        at java.base/java.io.InputStreamReader.<init>(InputStreamReader.java:72)
        at io.BasicTools.readStream2List(BasicTools.java:451)
        at io.BasicTools.readResource2List(BasicTools.java:418)
        at io.BasicTools.readResource2List(BasicTools.java:408)
        at modes.Predict.prepareInput(Predict.java:352)
        at modes.Predict.main(Predict.java:187)
        at main.TFpredictMain.main(TFpredictMain.java:123)

I was able to trace this to resources missing from the .jar archive (e.g. domainsTFpred.txt). Once I copied the contents of the 1.3 release zip file into the jar and repacked it, the exception disappeared. However, I have no idea whether these files are the correct ones for the prokaryote branch.
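
The trace above is consistent with a classpath lookup that returns null for a missing resource and is then passed straight to an InputStreamReader, whose constructor rejects a null stream. The sketch below illustrates that failure mode and a null check that reports the missing file by name. It is not TFpredict's actual code: the class ResourceCheck and the helper readResourceLines are hypothetical, and the location of domainsTFpred.txt at the root of the jar is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ResourceCheck {

    // Hypothetical helper (not part of TFpredict): reads a classpath resource
    // line by line and fails with a descriptive message if it is missing.
    static List<String> readResourceLines(String name) throws IOException {
        // getResourceAsStream returns null when the resource is not on the classpath.
        InputStream in = ResourceCheck.class.getClassLoader().getResourceAsStream(name);
        if (in == null) {
            // Without this check, new InputStreamReader(in) would throw a
            // NullPointerException that does not mention the resource name.
            throw new IOException("Resource not found on classpath: " + name);
        }
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // The resource name is taken from this issue; its path inside the jar is assumed.
        String name = args.length > 0 ? args[0] : "domainsTFpred.txt";
        try {
            System.out.println(readResourceLines(name).size() + " lines read from " + name);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running this with the jar on the classpath would show whether a given resource name resolves, without going through a full prediction run.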

With the altered jar file I managed to get to the phase where InterProScan is running; however, the InterProScan step never finishes (yes, I've waited quite a while).

Next, I installed a local copy of InterProScan and used that instead; now the run ends with

java.lang.ArrayIndexOutOfBoundsException: Index 11 out of bounds for length 1
        at ipr.IPRextract.getSeq2DomainMap(IPRextract.java:57)
        at ipr.IPRextract.getSeq2DomainMap(IPRextract.java:46)
        at modes.Predict.runInterproScan(Predict.java:485)
        at modes.Predict.main(Predict.java:189)
        at main.TFpredictMain.main(TFpredictMain.java:123)
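
"Index 11 out of bounds for length 1" is what Java reports when an array with a single element is indexed at position 11. Assuming the parser splits each line of the InterProScan output on tabs and reads column index 11 (I have not confirmed this in the source), a line without any tab characters, such as a header, a log message, or an empty line, would produce exactly this error. The sketch below shows a bounds-checked column lookup. The class IprTsvCheck, the method column, and the sample lines are hypothetical.

```java
import java.util.Optional;

public class IprTsvCheck {

    // Hypothetical helper: returns the tab-separated field at the given index,
    // or an empty Optional if the line has too few columns.
    static Optional<String> column(String line, int index) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        String[] fields = line.split("\t", -1);
        if (index < 0 || index >= fields.length) {
            return Optional.empty();
        }
        return Optional.of(fields[index]);
    }

    public static void main(String[] args) {
        // Placeholder lines for illustration only, not real InterProScan output.
        String wellFormed = String.join("\t",
            "f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9", "f10", "f11", "f12");
        String noTabs = "a line without any tab characters";

        System.out.println(column(wellFormed, 11).orElse("<missing>"));
        System.out.println(column(noTabs, 11).orElse("<missing>"));
    }
}
```

Checking the local InterProScan output for lines with fewer than 12 tab-separated fields would confirm or rule out this explanation.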

I've also tried out the release for eukaryotes. This appeared to work better (no missing files); however, I encountered an issue with the web version of InterProScan: the run on the test vector ends with

# continued output
/tmp/TFpredict_2548192364576464553_basedir/iprscan5-S20211026-090003-0157-94737396-p2m.tsv.txt (No such file or directory)

ls /tmp/TFpredict_2548192364576464553_basedir
iprscan5-S20211026-090003-0157-94737396-p2m.tsv.tsv  query.fasta
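
The listing shows that the result was written with a .tsv.tsv suffix while the program looks for .tsv.txt, so the two sides seem to disagree only on the file extension. One way to tolerate such a mismatch is to look the result up by its job ID prefix rather than by an exact file name. The sketch below illustrates that idea; it is not how TFpredict currently locates the file. The class FindIprResult and the method findByPrefix are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class FindIprResult {

    // Hypothetical helper: lists all entries in dir whose file name starts
    // with the given prefix, regardless of the suffix that follows.
    static List<Path> findByPrefix(Path dir, String prefix) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                if (entry.getFileName().toString().startsWith(prefix)) {
                    matches.add(entry);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: FindIprResult <directory> <job-id-prefix>");
            return;
        }
        // Example arguments: the temporary base directory and the job ID from the log.
        for (Path match : findByPrefix(Paths.get(args[0]), args[1])) {
            System.out.println(match);
        }
    }
}
```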

When I used a locally installed InterProScan, the test vector run finished.


I also have a few minor notes:

  • the TFpredict prokaryotes release does not print to stdout; instead, all output goes automatically to output.txt (this differs from the TFpredict eukaryotes release)
  • what should be provided as a path is inconsistent: for BLAST, it appears to be the parent directory of bin/[blast executables]; for InterProScan, it appears to be the executable script itself. This was not obvious from the documentation
  • I've also tried the ant build for prokaryotes, and that didn't work either.

Could you please advise me on how to get TFpredict for prokaryotes working?