factsmission/synospecies

NaN, i.e. treatments with no dates

Closed this issue · 14 comments

@retog @nleanba
Why do we get this NaN?


https://synospecies.plazi.org/#Lestes+barbara

Let me know ASAP, so I can explain at the meeting this coming Tuesday.

@gsautter @retog @nleanba might this have something to do with recent changes in the use of baseAuthority vs authority https://github.com/plazi/ggi/issues/266?

The treatment dates are connected indirectly, via the publication's date. The relevant part of the SPARQL queries used to find them is:

        ?treat treat:publishedIn ?publ .
        ?publ dc:date ?date .

So I don't think it's related to the authority problems. I'm not at a computer right now, so it's difficult for me to check the specific treatment's data, but I can try to find out more later.

retog commented

I can't reproduce the issue. Please reopen if the problem occurs again.

Checking up on some old issues; I'm still seeing NaN as a year in SynoSpecies.

Keeping an eye on this issue.
https://synospecies.plazi.org/#Tyrannosaurus+rex
This search appears to run without end and produces many NaN values in the year columns.

Checking back, any new ideas about this NaN issue? Are you able to reproduce? (I can, https://synospecies.plazi.org/#Tyrannosaurus+rex)

retog commented

I saw the NaN, but then it didn't reappear on reload.
@nleanba could you look into this? Where does year.year get computed?

Hi Reto,
Interesting that the NaN doesn't appear for you upon reload. That's not the case for me.
Sorry for the delay in responding, this is a busy teaching period. Appreciate your attention to my cluster of issues.
Best regards,
Jeremy

Now we get many of those NaNs for https://synospecies.plazi.org/#Manospondylus+gigas, in this case using the LINDAS endpoint. However, all these NaNs refer to the same treatment: https://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542

Initial observation: these NaNs are produced by SynoLib, not by a bug in SynoSpecies. I will investigate further there.

@nleanba not sure whether you got the email last week regarding Stardog (the triplestore engine at LINDAS): in case this has something to do with timeouts in Stardog, please report it to Adrian Gschwend, who is working with Stardog on this.

No, this was a bug on our side.

@retog can you update (?) synolib so that Synospecies makes use of this change?

(Please note that this will only change these NaNs to a placeholder value, but it should remove all the duplicates)

I have found a related problem in the data of the above-mentioned treatment, though I am unsure whether the XML or the XSLT is at fault.

If we look at the file https://github.com/plazi/treatments-rdf/blob/main/data/03/98/87/039887A7FFB1DE57FCB3FF54FB68C542.ttl we see the following two relevant sections:
(L.12)

<http://dx.doi.org/10.1007/s11692-022-09561-5>
    dc:creator "Iv, W. Scott Persons", "Paul, Gregory S.", "Van Raalte, Jay" ;
    dc:date "2022" ;
    dc:title "The Tyrant Lizard King, Queen and Emperor: Multiple Lines of Morphological and Stratigraphic Evidence Support Subtle Evolution and Probable Speciation Within the North American Genus Tyrannosaurus" ;
    bibo:endPage "179" ;
    bibo:issue "2" ;
    bibo:journal "Evolutionary Biology" ;
    bibo:pubDate "2022-03-01" ;
    bibo:startPage "156" ;
    bibo:volume "49" ;
    a fabio:JournalArticle .

and (L.101)

<http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542>
    trt:augmentsTaxonConcept <http://taxon-concept.plazi.org/id/Animalia/Tyrannosaurus_rex_Osborn_1905> ;
    trt:publishedIn <https://doi.org/10.1007/s11692-022-09561-5> ;
    dc:creator "Iv, W. Scott Persons", "Paul, Gregory S.", "Van Raalte, Jay" ;
    dwc:basisOfRecord <http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542/BHI%204100%3F>, <http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542/BHI%206230>, <http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542/BHI%206233>, <http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542/BHI%206435>, <http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542/BHI%206436>, <http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542/CM%209340>, <http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542/MNHUK%20R7994%3F>, <http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542/RSM%202523.8%3F> ;
    a trt:Treatment .

These two are not linked, but they should be. Compare the URL of the publication, <http://dx.doi.org/10.1007/s11692-022-09561-5>, with the URL the treatment uses for trt:publishedIn: <https://doi.org/10.1007/s11692-022-09561-5>.

The only differences are http vs https and dx.doi.org vs doi.org.
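
To make the mismatch concrete, here is a minimal TypeScript sketch. It is not SynoLib code: `normalizeDoiUrl` is a hypothetical helper introduced only for illustration, and it assumes that both URL forms are meant to identify the same DOI.

```typescript
// Hypothetical helper, not part of SynoLib: reduce a DOI URL to the bare DOI
// so that the http/https and dx.doi.org/doi.org variants compare equal.
function normalizeDoiUrl(url: string): string {
  const trimmed = url.trim();
  const match = /^https?:\/\/(?:dx\.)?doi\.org\/(.+)$/i.exec(trimmed);
  // Leave anything that is not a doi.org URL untouched.
  if (match === null) return trimmed;
  // DOI names are case-insensitive, so compare them in lower case.
  return match[1].toLowerCase();
}

// The two URLs from the Turtle file above.
const publicationUri: string = "http://dx.doi.org/10.1007/s11692-022-09561-5";
const publishedInUri: string = "https://doi.org/10.1007/s11692-022-09561-5";

// As plain strings (and therefore as RDF IRIs) the two do not match.
const sameAsIris = publicationUri === publishedInUri;
// After normalization they refer to the same DOI.
const sameAsDois =
  normalizeDoiUrl(publicationUri) === normalizeDoiUrl(publishedInUri);

console.log(sameAsIris, sameAsDois);
```

A triplestore compares IRIs character by character, so a normalization like this would have to happen when the RDF is generated, not at query time.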

@retog can you figure out where this discrepancy between the two URLs comes from? It seems like both should be the same.

This missing link is the reason synolib/synospecies cannot associate a date with these treatments, which, due to a synolib bug, produced the NaNs.

Please also note that all of the following treatments have the exact same problem (that is, they all are trt:publishedIn <https://doi.org/10.1007/s11692-022-09561-5>): http://treatment.plazi.org/id/039887A7FFB1DE57FF5AF8B0FAB7C790, http://treatment.plazi.org/id/039887A7FFB1DE51FCB3F8B0FDBFC4D2, http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FD82FB97C376, http://treatment.plazi.org/id/039887A7FFB1DE57FCB3F9FDFC32C074, http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FB8EFB5AC129, http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542
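
The chain from the missing link to a NaN can be sketched as follows. This is only an illustration of one plausible mechanism, not SynoLib's actual code: the `Triple` type and the functions `objectOf`, `dateOfTreatment`, `naiveYear` and `safeYear` are all hypothetical, and the real bug may have looked different.

```typescript
// Illustrative only: these types and names are assumptions, not SynoLib's code.
type Triple = { s: string; p: string; o: string };

const treatment = "http://treatment.plazi.org/id/039887A7FFB1DE57FCB3FF54FB68C542";

// The two relevant statements from the Turtle file, with the mismatched URLs.
const triples: Triple[] = [
  { s: "http://dx.doi.org/10.1007/s11692-022-09561-5", p: "dc:date", o: "2022" },
  { s: treatment, p: "trt:publishedIn", o: "https://doi.org/10.1007/s11692-022-09561-5" },
];

function objectOf(s: string, p: string): string | undefined {
  return triples.find((t) => t.s === s && t.p === p)?.o;
}

// Follow treatment -> publication -> date, like the two SPARQL patterns do.
function dateOfTreatment(t: string): string | undefined {
  const publ = objectOf(t, "trt:publishedIn");
  return publ === undefined ? undefined : objectOf(publ, "dc:date");
}

// Naive parsing: a missing date silently turns into NaN.
function naiveYear(date: string | undefined): number {
  return parseInt(date as string, 10);
}

// Guarded parsing: a missing or unparsable date becomes an explicit "unknown".
function safeYear(date: string | undefined): number | undefined {
  if (date === undefined) return undefined;
  const year = parseInt(date, 10);
  return Number.isNaN(year) ? undefined : year;
}

const date = dateOfTreatment(treatment); // no date found: the URLs differ
console.log(naiveYear(date), safeYear(date));
```

With the guarded version, the caller can decide how to display an unknown year (for example with a placeholder) instead of showing NaN.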

retog commented

@retog can you update (?) synolib so that Synospecies makes use of this change?

@nleanba, published.