[BUG] documentation issue: Statistical Testing / SAM
smoothdeveloper opened this issue · 8 comments
I'd like to propose some fixes to this piece of the documentation (broken link, image, missing data, maybe some terms don't fit the image anymore).
https://fslab.org/FSharp.Stats/Testing.html#SAM
My main problem is that there is no TestDataSAM.txt
.
@zieglerSe, wondering if by chance you have a copy of this file?
Of course!
The TestDataSAM.txt file is located in FSharp.Stats\tests\FSharp.Stats.Tests\data\TestDataSAM.txt
.
Also, you can find a copy of it attached here:
TestDataSAM.txt
@zieglerSe sorry for having missed it in the repository and thanks for the other PR to adjust the docs, I'll be able to proceed if I see further adjustments to suggest.
In #272 where I'm suggesting some changes in the overall code (not SAM only), there is seemingly a missing check for -infinity
in the implementation of SAM, could you confirm if this is an oversight in which case I'll try to adjust a test case that would cover it and add the check for -infinity
.
Thanks!
@smoothdeveloper no problem!
I saw #272 and checked it - the missing check for -infinity
was an oversight. Feel free to add the changes, otherwise i'll take care of it!
I'll get to it unless it is breaking anyone right now or is urgent, as I'd like to just grasp the thing some more and get familiar with how tests are done in the repository.
A small note about the SAM
record, it has the ID
field hardcoded to string
but I believe we'd want to make this a generic type parameter (that is IComparable
).
I've not looked how it would impact client code, but unless there is a "ID" of items we put in SAM is settled to be a string, always, I think it is worth adjusting to make it generic.
The main caveat is, people who haven't used statically typed languages, with generics, may find this a bit "head scratching" at first.
If there is interest, I'll make separate issue and check the outcome of making this record field generic.
I agree, that a generic type for the ID field would be beneficial. The only draw back I can see is, that the type would contain a generic type annotation which makes the type a little bit more complex but the benefit definetely greater.
Testing.SAM.SAMResult becomes Testing.SAM.SamResult<'a>
While in most cases the identifier would be a UniProt accession number (string
) or some other kind of domain-id, it definetely is possible to have e.g. BioFSharp.IO.FastA.FastaItem<BioArray<Nucleotide>>
as bioitem identifier.
If you want, you can take care of it, or I can quickly modify the SAM type🚀
@bvenn, I'll make a PR for the ID
field so we can review the impact, you are right, it bubbles to SAMResult as well. Thanks for the feedback!
Closing the issue as the fixed documentation has made it online :)