lhatsk/AlphaLink

FDR Description in Arg parse is potentially wrong

smturzo opened this issue · 3 comments

In the file: predict_with_crosslink.py
The help text for the following argument looks wrong. The number of CPUs cannot be a floating-point value. What is fdr and what does it mean?
parser.add_argument("--fdr", type=float, default=0.05, help="""Number of CPUs with which to run alignment tools""")

Sorry, it is indeed wrong, thanks for letting me know! I updated the text to clarify.

The FDR is the false discovery rate, a confidence estimate (see e.g., https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5944926/). There are dataset-level FDRs and link-level FDRs. The --fdr argument corresponds to the dataset-level FDR and is only used if the link-level FDR (last column in the CSV) is not given. The link-level FDR, by contrast, is a confidence estimate that one particular crosslink, say (i=10, j=40), is correct. A dataset-level FDR of 5% means that we expect 5% noise in the data set.
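
To make the fallback concrete, here is a minimal sketch of how a dataset-level FDR could fill in for a missing link-level FDR. It is not the actual code in predict_with_crosslink.py: the help text, the function names build_parser and load_crosslinks, and the comma-separated "i,j,fdr" layout are all assumptions made for illustration. The only detail taken from this thread is that the link-level FDR is the last column of the CSV.

```python
import argparse
import csv
import io


def build_parser():
    # Hypothetical parser; only the --fdr flag and its default come from the thread.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--fdr",
        type=float,
        default=0.05,
        help="Dataset-level false discovery rate, used only for crosslinks "
             "that have no link-level FDR in the CSV",
    )
    return parser


def load_crosslinks(handle, dataset_fdr):
    """Return (i, j, fdr) tuples from an assumed 'i,j[,fdr]' CSV layout."""
    links = []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        i, j = int(row[0]), int(row[1])
        # Use the link-level FDR when the last column is present,
        # otherwise fall back to the dataset-level value.
        has_link_fdr = len(row) > 2 and row[2].strip() != ""
        fdr = float(row[2]) if has_link_fdr else dataset_fdr
        links.append((i, j, fdr))
    return links


# Example: the first crosslink carries its own FDR, the second does not.
args = build_parser().parse_args(["--fdr", "0.05"])
example_csv = io.StringIO("10,40,0.01\n12,55\n")
links = load_crosslinks(example_csv, args.fdr)
print(links)  # [(10, 40, 0.01), (12, 55, 0.05)]
```

In this sketch the crosslink (10, 40) keeps its own link-level FDR of 0.01, while (12, 55) inherits the dataset-level 0.05 passed via --fdr.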

Thank you!! Appreciate it.

Let me know how it's working out for you. Curious to hear about more real-world usage. :-)