wpoa/JATS-to-Mediawiki

Redundant square brackets around footnote links

Klortho opened this issue · 2 comments

Moved from konrad#15
Quoting @jgmorse:

MediaWiki auto-wraps footnote links in square brackets; if the source data explicitly wraps elements in square brackets, then you end up with some ugly rendering like '[[5]]'. Since the outside brackets are text-node children of the xref's parent element, and xref is in the content model for a wide array of elements, scrubbing out those outside square brackets is very difficult (impossible?) in XSLT. More viable solutions:
a. pre- or post-processing of source data to remove brackets around elements
b. investigate alternative MediaWiki rendering options
c. remove the brackets in XSLT (would require a tortured template structure if it's even possible)
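
As a rough illustration of option (a), here is a minimal post-processing sketch in Python that strips the brackets from the generated wikitext. The function name `strip_ref_brackets` is hypothetical and not part of this repository. The sketch assumes each bracket pair encloses exactly one `<ref>` element whose content has no nested markup; bracketed lists such as `[<ref>…</ref>,<ref>…</ref>]` are left untouched.

```python
import re

# "[" + a single <ref ...>...</ref> with no nested tags inside + "]".
# Assumption: the ref body is plain text (e.g. "1"), so [^<]* is enough.
_BRACKETED_REF = re.compile(r'\[\s*(<ref\b[^>]*>[^<]*</ref>)\s*\]')


def strip_ref_brackets(wikitext: str) -> str:
    """Drop square brackets that directly wrap a single <ref> element."""
    return _BRACKETED_REF.sub(r'\1', wikitext)


sample = 'HIV-2 and SIV [<ref name="B1">1</ref>]. The role of'
print(strip_ref_brackets(sample))
# HIV-2 and SIV <ref name="B1">1</ref>. The role of
```

Ordinary wiki links such as `[[Page]]` are not matched, because the pattern requires `<ref` right after the opening bracket.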

See the imported article here: https://en.wikisource.org/wiki/Wikisource:WikiProject_Open_Access/Programmatic_import_from_PubMed_Central/The_Vpr_protein_from_HIV-1:_distinct_roles_along_the_viral_life_cycle

This is PMC554975, downloadable from ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/6f/7e/Retrovirology_2005_Feb_22_2_11.tar.gz.

Problem example: 1, HIV-2 and SIV [<ref name="B1">1</ref>]. The role of

@jgmorse It is tricky but not impossible. The trick is to match on the text nodes that begin or end with a square bracket and have a preceding or following xref sibling.
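
The same idea can be sketched outside XSLT. This is not the stylesheet's actual template, only a Python illustration using the standard library's `xml.etree.ElementTree`, where the text before an element lives in the parent's `.text` or the previous sibling's `.tail`, and the text after it lives in the element's own `.tail`. The function name `strip_xref_brackets` and the sample XML are assumptions made for illustration. Only a single xref directly wrapped in brackets is handled.

```python
import xml.etree.ElementTree as ET


def strip_xref_brackets(root: ET.Element) -> None:
    """Remove a '[' right before and a ']' right after each xref, in place."""
    for parent in root.iter():
        prev = None  # previous sibling element, if any
        for child in list(parent):
            if child.tag == "xref":
                # Text before the xref: parent.text or the previous sibling's tail.
                before = (prev.tail if prev is not None else parent.text) or ""
                after = child.tail or ""
                # Only act when both brackets are present around this xref.
                if before.rstrip().endswith("[") and after.lstrip().startswith("]"):
                    before = before.rstrip()[:-1]
                    if prev is not None:
                        prev.tail = before
                    else:
                        parent.text = before
                    child.tail = after.lstrip()[1:]
            prev = child


# Hypothetical JATS fragment modeled on the problem example above.
p = ET.fromstring(
    '<p>HIV-2 and SIV [<xref ref-type="bibr" rid="B1">1</xref>]. The role of</p>'
)
strip_xref_brackets(p)
print(repr(p.text), repr(p[0].tail))
# 'HIV-2 and SIV ' '. The role of'
```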

@Daniel-Mietchen, this is really a freaking nasty transformation problem, and definitely the kind of thing that JATS4R should address. The square brackets are part of the punctuation, and should be part of the presentation layer (in other words, the transformation).

Anyways, I can solve this by adding yet another special case, and I'll do that now.