Support HTML markup in submission titles

Question

Closed this issue 2 years ago · 84 comments

Italics display in the TOC title (see The Poetic Life-Form: An Analysis on the Role of Elegy and Form in In Memoriam):
http://journals.sfu.ca/courses/index.php/eng435/index

But not on the article page, where the tags themselves are displayed:
http://journals.sfu.ca/courses/index.php/eng435/article/view/7

Edits

Metadata support details listed by @asmecher #2564 (comment)

PRs

App
pkp-lib --> #8584
ui-library --> pkp/ui-library#252
ojs --> pkp/ojs#3731
ops --> pkp/ops#464
omp --> pkp/omp#1329

Plugins
OAI DC, MarcXML, RSS/ATOM [HTML Tag Support : NO] --> No Change Required
DC(DublinCore) [HTML Tag Support : NO] --> No Change Required
googleScholar[HTML Tag Support : NO] --> pkp/googleScholar#11 [PR CLOSED]
orcidProfile [HTML Tag Support : NO] --> pkp/orcidProfile#231
crossref-ojs [HTML Tag Support : YES] --> pkp/crossref-ojs#25
crossref-ops [HTML Tag Support : YES] --> pkp/crossref-ops#22
medra[HTML Tag Support : NO] --> pkp/medra#6 [PR CLOSED]
DataCite [HTML Tag Support : NO] --> touhidurabir/ojs@e0676e5
DOAJ [HTML Tag Support : NO] --> touhidurabir/ojs@b080ef1
PubMed[HTML Tag Support : YES] --> touhidurabir/ojs@ae06853
jatsTemplate [HTML Tag Support : YES] --> pkp/jatsTemplate#22
oaiJats [HTML Tag Support : YES] --> pkp/oaiJats#33
ONIX [HTML Tag Support : NO] --> pkp/omp#1355

Answer 1 · 2017-06-02T18:24:24.000Z

In OJS 2.x, we used to allow HTML tags in article titles, but never (IIRC) dealt with what that meant downstream, e.g. in OAI consumers. Plus it causes problems for users who legitimately want to write e.g. p < 0.05 in titles; they would need to write p &lt; 0.05 in the text field, which is totally unintuitive.

I suggest that if we do support this, we need to support it properly (e.g. with a rich text editor).

Answer 2 · 2017-06-20T05:59:50.000Z

I have been looking for a rich text editor for a text input field, but to my surprise there seems to be none?

I suggest two solutions:

1. Allow a limited set of markdown in the title. The problem here is of course usability. (I think a Drupal module uses this approach for the same problem: https://www.drupal.org/project/html_title)
2. Use a textarea for the title fields instead of an input, with a custom set of tinymce buttons and a stricter html filter (see http://wpsnipp.com/index.php/page/add-tinymce-editor-to-postpage-title-input-field/). This is a bit ugly, but will probably give better results.

Answer 3 · 2017-06-20T15:43:28.000Z

I'd prefer option 2 -- and we already have two presentations for TinyMCE ("extended" vs. normal or something like that), so adding a third configuration for minimal controls wouldn't be too invasive. I suspect titles would be a good driving use case for choosing which buttons should be available -- suggest B/I/U, perhaps super/subscript, and a button to edit the markup. Even just facilitating copy/paste into these fields in a rich format will be a big win, I think.

Answer 4 · 2017-11-14T04:09:24.000Z

Hi @asmecher,
in biology this is a very important issue because the scientific name of any species has to be written in italics every time! Right now, titles show the raw tags around the name everywhere except in the TOC, as @stranack says. Resolving this does not require implementing TinyMCE; for now it would be sufficient to display the title the way the TOC does, similar to OJS 2.4.

What do you think?

Answer 5 · 2017-11-15T22:19:29.000Z

@t4x0n, the problem comes when we provide data to 3rd-party services/tools, like Google Scholar, OAI-PMH, CrossRef, etc. Most of those will expect plaintext, not HTML, and most authors will expect to have to enter plaintext, not HTML. If we standardize on accepting HTML, then it'll be necessary to strip tags from the feeds that go to 3rd-party standards. But without a tool like TinyMCE to facilitate the entry of HTML, we run the risk of having it strip legitimate uses of <, >, and & from titles.

Answer 6 · 2017-11-16T13:17:02.000Z

@asmecher I understand, but maybe one option would be to offer 3rd parties an XML file instead of HTML? ...I don't know, I am thinking of the SciELO system. Here is an example article: http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0717-66432017000100001&lng=es&nrm=iso&tlng=en

Answer 7 · 2017-11-16T19:43:10.000Z

@t4x0n, I think the "proper" solution actually won't be so much work -- mostly it's just finding a configuration of TinyMCE that looks good when used to replace an <input type="text"> element.

Answer 8 · 2017-11-16T20:43:47.000Z

mmm ok! But @asmecher, that does not resolve the 3rd-party standards, right?

In OJS 2 we just put html tags for italics and it was not a problem, but TinyMCE is probably a better solution. I am not familiar with the OJS code yet, so when I tried a similar solution for another issue I couldn't get it to work :/ Thanks for the reply!

Answer 9 · 2017-11-16T20:50:04.000Z

If we had TinyMCE support in the title fields...

Authors would be able to naively type <, >, and & characters and they'd be HTML-encoded properly
Authors could use the WYSIWYG tools to enter rich content without needing to know HTML, or paste them in from rich-capable sources
OJS would predictably know all stored titles were in HTML, so
- it could down-convert to plaintext when needed (there are definitely a few hairy considerations here)
- it could present them internally with rich content

Best of all worlds, as long as

TinyMCE can be persuaded to look/behave well as a single-line input, and
we can reliably convert HTML to text for formats not supporting HTML. (This is already implemented and used in various fields, but titles are so important that we may need to review how well this works.)
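A minimal sketch of the two conversions listed above, assuming titles are stored as HTML. The helper names `escapeTitle` and `titleToPlainText` are hypothetical (they are not OJS or TinyMCE functions), and the regex-based tag stripping is a simplification of whatever converter OJS actually uses.

```javascript
// Hypothetical helpers for illustration; not part of OJS or TinyMCE.

// Encode characters that are special in HTML, so naively typed text is stored safely.
function escapeTitle(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Down-convert a stored HTML title to plain text: drop tags, then decode entities.
// "&amp;" is decoded last so that "&amp;lt;" becomes "&lt;" rather than "<".
function titleToPlainText(html) {
  return html
    .replace(/<\/?[a-zA-Z][^>]*>/g, "")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&");
}

const stored = "Growth of <em>E. coli</em> at " + escapeTitle("p < 0.05");
console.log(stored); // Growth of <em>E. coli</em> at p &lt; 0.05
console.log(titleToPlainText(stored)); // Growth of E. coli at p < 0.05
```

The "hairy considerations" are not covered here: nested or malformed markup, other entities, and math would all need more than two regular expressions.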

Answer 10 · 2017-11-16T20:56:38.000Z

Excellent! I did not know that the system can convert html to plain text!

Answer 11 · 2017-11-17T09:51:49.000Z

Hi,

Basically this should do it: #3074 (of course needs some cleaning, the layout is problematic because of the prefix field and we should add rows="1" to make those fields smaller and maybe change the font size for those fields)

However, I noticed that there is a bug in the code that probably prevents the custom toolbar from being used.

For example the abstract field should use a special "rich" toolbar

It is called here:
https://github.com/pkp/pkp-lib/blob/master/templates/submission/submissionMetadataFormTitleFields.tpl#L23

And defined here:
https://github.com/pkp/pkp-lib/blob/master/js/classes/Handler.js#L700

The toolbar contents are defined here: https://github.com/pkp/pkp-lib/blob/master/js/controllers/SiteHandler.js#L150

However, at least for me, the bullist and numlist buttons do not appear in 3.1, even though the abstract field should have them since they are included in the richToolbar.

@NateWr, I see that you have done work with the toolbar settings, can you confirm that this is a bug or am I missing something?

Answer 12 · 2017-11-17T10:44:19.000Z

It's been a long time since I fiddled with that so I don't have an idea off the top of my head. I'd check that the $FBV_rich variable is being set properly, and that the bullist and numlist buttons are the right names.

Answer 13 · 2017-11-17T10:44:19.000Z

@ajnyga it is the same for me: I am not seeing the bullist and numlist buttons in 3.1, and image upload is not working for me either #3064

Some days ago, I was trying to create another toolbar for issue #2979, a toolbar without JBImages, but I couldn't get it to work... maybe it is the bug you mention

Answer 14 · 2017-11-17T10:49:04.000Z

hmm, maybe it is missing plugins: https://community.tinymce.com/communityQuestion?id=90661000000If5QAAS

Answer 15 · 2017-11-17T10:58:41.000Z

Ok, so the richToolbar is definitely loading, just without those two buttons. I added advlist and lists to the list of plugins, but it does not seem to make a difference.

It is the richToolbar because it includes the superscript button which the default toolbar does not have.

But something is wrong there anyway, but I do not have the time to look at it just now.

So not sure why my pr above is not working. There is something fishy here...

Answer 16 · 2017-11-17T16:39:10.000Z

Hi @asmecher, it also looks like we have inconsistent escaping of the article title, i.e. in the issue TOC we have $article->getLocalizedTitle()|strip_unsafe_html whereas in the article summary we have $article->getLocalizedTitle()|escape

Answer 17 · 2017-11-17T20:01:27.000Z

@ajnyga, the reason the lists tools weren't being displayed was that the lists plugin was missing from TinyMCE's initialization. I've added it: 0c3d3ec

Answer 18 · 2017-11-17T20:05:06.000Z

Hi,
Yes, I tried that as well, but could not get it to work. Maybe it was a cache issue or something,

Anyway, I understood from the link above that the advlist plugin is also needed. But is it working without it as well?

Answer 19 · 2017-11-17T20:10:43.000Z

I got the tools to display with the changes I committed (on a control that was set to rich="extended"). You'll need to turn off minification (so your uncompiled Javascript is read instead of the compiled version) and flush the browser cache for changes to appear.

Answer 20 · 2018-01-12T09:01:33.000Z

Hi @asmecher (and @NateWr, see especially the editor height issues below)

Here is a new pr. You were right that it was a cache issue that caused the earlier problems.
#3265

I did not test yet what ends up in the database when saving the form, or in what cases and how the html should be stripped from the titles. But these should not be that hard to add.

The visual layout is of course something that could be enhanced.

Note that I set the min_height value for tinymce. I had difficulty resizing the tinymce height until I noticed that there is code that calculates the height from the number of textarea rows: https://github.com/pkp/pkp-lib/blob/master/js/controllers/SiteHandler.js#L220

However, if that value is below 100px there is a default min_height value in tinymce that returns the height back to 100px. But you can deal with that by setting a custom min_height in the settings like I did now. I used 20px because now the setting of 1 row will return 20px height like you would expect based on the code here: https://github.com/pkp/pkp-lib/blob/master/js/controllers/SiteHandler.js#L220. BUT there could be cases where the editor will now be smaller than expected, for example the setting of 3 rows will return 60px when it used to return 100px, which was the default min_height.
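The arithmetic above can be sketched as follows. This assumes 20px per textarea row (implied by "1 row will return 20px") and that min_height simply acts as a lower bound; `editorHeight` is a hypothetical function, not the code in SiteHandler.js.

```javascript
// Assumption taken from the comment above: each textarea row contributes 20px.
const PX_PER_ROW = 20;

// The requested height is clamped from below by min_height.
function editorHeight(rows, minHeight) {
  return Math.max(rows * PX_PER_ROW, minHeight);
}

console.log(editorHeight(1, 100)); // 100 (default min_height wins)
console.log(editorHeight(1, 20)); // 20  (custom min_height lets one row through)
console.log(editorHeight(3, 100)); // 100 (previous behaviour for 3 rows)
console.log(editorHeight(3, 20)); // 60  (3 rows now render smaller than before)
```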

Another thing with the editor heights is that I noticed no visible effect when using for example height=$fbvStyles.height.TALL in the textarea settings. I did create a new ONELINE setting there, but maybe those could/should be removed altogether?

Answer 21 · 2018-01-12T20:31:45.000Z

Excellent start, @ajnyga! @NateWr, would you mind tinkering a little with this?

Answer 22 · 2018-01-15T12:36:34.000Z

Great work @ajnyga. I do have a few concerns about going ahead with this. Sorry to jump in so late.

What happens when a user hits the enter key?
What happens when a user pastes from a Word document or somewhere that carries with it HTML formatting (like styles)?
Do we have a legitimate use-case for allowing bold and underlines?
What exactly is being saved? In my experience, TinyMCE wraps text in <p> by default.

I understand that this is an important feature. But we're taking something bulletproof (a plain-text field) and introducing scope for lots of human error, some of it critical -- like if a bad title is handed off to citation/indexing handlers.

In my experience, taking something wide-ranging and trying to lock it down will lead us into a lot of maintenance. Every new device that comes out ends up breaking old hacks or supporting new types of entry we need to lock down.

I'd be tempted to explore with the Substance people what their experience is with needs like this, how they go about sanitizing a rich text area, and see whether there might be some option which is more limiting to users. Should I reach out to them?

Answer 23 · 2018-01-15T12:59:38.000Z

Hi,

You are right that the input here would need a 100% sure way of cleaning the input.

I think there should be the settings needed to take care of this in tinymce. Not sure what the right combination would be: https://www.tinymce.com/docs-3x/reference/configuration/Configuration3x@forced_root_block/ and https://www.tinymce.com/docs-3x/reference/configuration/Configuration3x@force_br_newlines/
Probably this would help limit the copy-pasted styles (but would need some testing) https://www.tinymce.com/docs-3x/reference/configuration/Configuration3x@valid_elements/
No, I do not have one, just for the three other settings I added there. I realize that people will be tempted to bold the whole title now :D But maybe math would be something that people would like to see.
I think setting this to false would take care of that: https://www.tinymce.com/docs-3x/reference/configuration/Configuration3x@forced_root_block/

As I mentioned above, I actually tried to look for a very simple text input editor for this purpose, but really could not find one. So tinymce (or something similar) seemed like the only alternative. By all means do ask the Substance people about this, because they have probably talked about similar use cases. For us (journal.fi) this is not a big issue. One of our journals does use italics in their title, but have been doing so since 2.x and just add the tags there.

Answer 24 · 2018-01-15T14:34:52.000Z

Are you able to try those config options? My sense from reading them is that the user will still be able to insert line breaks (<br>), but it's worth trying out.

Stripping attributes may be more or less successful. I remember it used to be awful at it but not sure if that's still the case.

Answer 25 · 2018-01-15T14:41:15.000Z

Yes sure, but go ahead and contact Substance in case they have something better in mind.

Answer 26 · 2018-01-15T14:52:08.000Z

It says something that this old PKP issue is on the first page of results when I search for "tinymce restrict to single line": https://pkp.sfu.ca/bugzilla/bugslist/show_bug.cgi?id=6759

Answer 27 · 2018-01-16T10:45:48.000Z

Hi @NateWr

#3265

Referring to your list above

Enter key is now disabled
I added a list of valid tags. Even if you paste text that is divided into paragraphs or has line breaks, the editor will remove those upon pasting. Note that this would have to be tested with other browsers as well, not just Firefox. Also, I would be inclined to add a PHP filter that would be applied upon saving the data to make sure that only the limited tags will end up in the database.
I removed bold and underline. If requested, these can be added easily
The default wrapping <p> is now removed from the oneline editor.

So I think that this is now working fairly well.
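For readers following along, the four changes might look roughly like the settings below. This is a sketch, not the contents of #3265: the option names (`valid_elements`, `forced_root_block`, `branding`, `menubar`, `toolbar`, `setup`) are TinyMCE options from the 4.x/5.x era, but the values, the `onelineSettings` name and the selector in the trailing comment are assumptions.

```javascript
// Sketch of settings for a one-line title editor; values are assumptions.
const onelineSettings = {
  menubar: false,
  branding: false,
  // Bold and underline left out, as described above.
  toolbar: "italic superscript subscript",
  // Only these elements are kept; other tags are stripped on input or paste.
  valid_elements: "em,i,sup,sub",
  // Do not wrap the content in a <p> (supported in older TinyMCE versions).
  forced_root_block: false,
  setup(editor) {
    // Swallow the Enter key so the field stays on a single line.
    editor.on("keydown", (event) => {
      if (event.keyCode === 13) {
        event.preventDefault();
      }
    });
  },
};

// In a browser with TinyMCE loaded, this would be passed along with a selector:
// tinymce.init({ selector: "textarea.title", ...onelineSettings });
```

A server-side filter on save, as suggested above, would still be needed, since client-side settings can be bypassed.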

Answer 28 · 2018-01-16T10:49:54.000Z

🎉 Let's remove the "powered by" too. I think it's a config. We can keep them for the big ones, but the less clutter the better on these wee fields.

Also, let's explore inline mode to reduce the iframes we're loading as well as the space the fields take up in a form.

Answer 29 · 2018-01-16T11:05:35.000Z

I added the branding setting, but it is a bit weird, because it affects all the editors, also the one in the abstract. I am not sure why it does so although I am only applying it to the oneline editor.

With the inline mode, I am not sure what you have in mind? Are you thinking that there would be a default text like "insert title here" inside a div and that the editor would be enabled upon clicking that div? Would that work with the current solution you have for multilingual fields?

Answer 30 · 2018-01-16T11:13:29.000Z

My assumption is that the inline mode will do two things:

It will not use an iframe, which might improve our performance.
It will not display a toolbar by default. Only when the field is focused will the toolbar "hover" above the cursor. So when not in focus, the field should appear exactly like a regular text input.

If that doesn't work out, or you have trouble getting it going, let me know and I'll try to play around with it.

Answer 31 · 2018-01-16T11:41:20.000Z

So I tried to add div to submissionMetadataFormTitleFields.tpl and added settings.inline = true; and settings.selector = "#divname" to Handler.js to the custom oneline editor settings I have added there, but this was not enough. I am not sure what was missing.

Answer 32 · 2018-01-17T09:51:56.000Z

To get the inline working, we need to use a <div> (or any block element) instead of a <textarea>. I was able to get this going with a few changes. It looks like your branch is a bit out-dated so I wasn't able to issue a clean PR, but you can just cherry-pick this commit in:

NateWr@5f1460c

It still needs some work. Here's what it looks like:

So still todo:

Remove the top menu bar
Figure out how to re-instate the multilingual input support
Remove the focus styling around the element (in css: outline: 0)

Answer 33 · 2018-01-17T09:52:39.000Z

Oh, I see now what you said about every tinymce going inline... Hmm...

Answer 34 · 2018-01-17T09:55:54.000Z

Ok, so we just needed to explicitly set inline to false each time we load the tinymce, otherwise the old value in the settings lingered. You can cherry-pick this commit in: NateWr@eeb646a
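The lingering value is the usual shared-object problem. A small sketch of the failure and of the fix, using a hypothetical `sharedDefaults` object in place of whatever settings object pkp-lib actually reuses:

```javascript
// Hypothetical defaults shared by every editor on the page.
const sharedDefaults = { branding: false, toolbar: "bold italic" };

// Buggy pattern: overrides are written into the shared object, so a value
// set for one editor is still there when the next editor is configured.
function buggySettings(overrides) {
  return Object.assign(sharedDefaults, overrides);
}

// Fix: build a fresh object and reset inline explicitly every time.
function freshSettings(overrides) {
  return { ...sharedDefaults, inline: false, ...overrides };
}

buggySettings({ inline: true }); // title field asks for inline mode
console.log(buggySettings({}).inline); // true: the abstract editor inherits it
console.log(freshSettings({}).inline); // false
console.log(freshSettings({ inline: true }).inline); // true
```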

Answer 35 · 2018-03-31T16:16:13.000Z

Hi @NateWr

See #3544 for the latest pr. It seems to be working with one locale, but two form locales caused an error. Did not have time to check it closer yet...

edit: was able to fix the multilingual support by changing the jquery selector. One small problem though: if you click on the title field the editor buttons appear, and when you leave the field, the editor buttons do not disappear. They do disappear if you edit the secondary locale last. So it probably needs a secondary function for hiding the editor buttons?

edit2: this commit a6e84d9#diff-83df7d7f7c760e14fa91323920849101 solves the problem partially. The toolbar is now hidden, but it is not initialized if you try to focus it right after that. If you click something else and try again it works. So it needs an additional show but I could not figure out the right place and/or the right selector. You probably know the multilingual popover code a lot better...

Answer 36 · 2018-04-06T07:40:21.000Z

Maybe you already know this, but in OJS 3.1.1 titles with italics still look like this:

I thought this had been solved in this version

Answer 37 · 2018-04-06T14:54:33.000Z

Yes, italics/HTML in article titles isn't supported yet.

Answer 38 · 2019-04-11T14:47:17.000Z

Hi all, just checking on this. Has there been any progress on this, or on #3544?

Answer 39 · 2019-04-11T14:57:24.000Z

I think I got that almost ready back then, only some validation things were not working. Can't promise that I have time to return to this right now, but it should not take much work when I have the time. Anyone can step in and continue of course.

Answer 40 · 2019-06-03T17:37:28.000Z

Just a heads-up here: the title and subtitle fields will be migrated to the new Vue.js forms system as part of some UI/UX changes related to versioning. So this issue will likely be addressed in that context, as a new field type in the UI Library (which also uses TinyMCE, so we should be able to draw on the work done here).

Answer 41 · 2019-06-03T17:38:50.000Z

thanks, been kind of waiting for that.

Answer 42 · 2019-10-21T19:44:00.000Z

Hi all,
just to understand the current state of this issue: will it be possible to implement TinyMCE in titles in OJS 3.1.x, or in OJS 3.2 because of the migration to the new Vue.js forms?

Answer 43 · 2019-10-22T08:50:08.000Z

Unfortunately, I don't think it will make it into OJS 3.2.

Answer 44 · 2019-10-23T03:57:23.000Z

ok, thanks @NateWr but please do not forget this issue for the future ;)

Answer 45 · 2020-02-06T10:33:00.000Z

Hi @NateWr,
from my perspective as a scientist, the ability to italicize words or letters in article titles is of great importance. Is there a chance to prioritize this feature and implement it already in OJS 3.2? I think the community really needs it. We (currently hosting eight journals) are frequently asked for this feature.
Cheers, Tim

Answer 46 · 2020-02-06T11:31:08.000Z

It's too late for 3.2 but I've applied the Community Priority label to increase the likelihood that this will get attention in the future.

Answer 47 · 2020-02-06T13:40:37.000Z

I almost got this working with the old forms. Now that the new forms are implemented will try to solve this using them. We have a need for it as well.

Answer 48 · 2020-07-01T23:12:50.000Z

@t4x0n, the problem comes when we provide data to 3rd-party services/tools, like Google Scholar, OAI-PMH, CrossRef, etc. Most of those will expect plaintext, not HTML, and most authors will expect to have to enter plaintext, not HTML. If we standardize on accepting HTML, then it'll be necessary to strip tags from the feeds that go to 3rd-party standards. But without a tool like TinyMCE to facilitate the entry of HTML, we run the risk of having it strip legitimate uses of <, >, and & from titles.

CrossRef seems friendly to basic markup in the title -- bold, italic, underline, overline, superscript, subscript, small caps, and typewriter text (monospaced font):

https://www.crossref.org/education/content-registration/crossrefs-metadata-deposit-schema/face-markup/

Answer 49 · 2020-07-02T00:12:15.000Z

Just to note that superscript/subscript and some mathematics seem to be already supported via Unicode (typeset externally then pasted into OJS), see for example:

https://forum.pkp.sfu.ca/t/add-superscript-to-journal-title/30469/4

So it's mainly bold and italics that would seem to have no support so far.

Answer 50 · 2020-09-11T13:18:59.000Z

I really second this request. I'm currently preparing the first issue of a new journal, and they frequently refer to other titles in their titles, or they have transliterations in their titles.

Answer 51 · 2020-11-14T14:00:52.000Z

Adding a big plus one for this feature. It has been requested by several journals where they are used to being able to italicize book or poem titles in article titles.

@NateWr having missed the 3.2 release, do you know if it is likely to make it into 3.3?

Thanks

Answer 52 · 2020-11-14T20:12:05.000Z

We have journals that requested this feature from us as well.
And others that use it in OJS 2.

Answer 53 · 2020-11-16T11:10:50.000Z

No, I don't think this will be making it into 3.3, unfortunately.

Answer 54 · 2021-01-22T16:10:31.000Z

Hello!

I am a newbie to PKP (3.x). I am moving a science magazine to OJS and have run into this problem.

For the transfer I have created an xml importer that already works perfectly. The magazine has many titles with HTML formatting. The exporter moves them smoothly and protects problem characters appropriately. Everything is fine, but in the titles the problem reported here appears.

In my opinion, there are two different levels in this matter. The output and input of these formats.
For my work, the entry does not give a problem, because the importer enters them correctly through xml.

The problem is the output, and there is clearly something inconsistent here. In the table of contents the formatting is interpreted correctly, therefore it is perfectly possible. However, in the independent articles, in the title and in the citation, the html markup appears.

The output looks like it could at least be revised to make it consistent across TOC and individual items.

This would already solve (more than) half the problem, in my humble opinion.

Cheers!

Answer 55 · 2021-01-25T11:12:26.000Z

Thanks @adguah, you're right there is an inconsistency in the default theme regarding how titles are handled. In a summary, we use strip_unsafe_html, which will leave some HTML tags in place:

https://github.com/pkp/ojs/blob/master/templates/frontend/objects/article_summary.tpl#L43

In the full article page, we use escape which will escape them all:

https://github.com/pkp/ojs/blob/master/templates/frontend/objects/article_details.tpl#L91

@adguah One of the things holding us back here is an understanding of how other downstream services handle HTML in titles. For example, how are these displayed in Google Scholar or other indexing services? It looks like Crossref supports some HTML markup (see comment above).

@asmecher would it be appropriate for us to use strip_unsafe_html on the article landing page for now? Should we consider a new Smarty filter specifically for titles with a more limited tag set?

Answer 56 · 2021-01-25T17:03:11.000Z

Thanks @NateWr,

Just an opinion.

On the full article page, the article title appears in three places (plus meta name = "DC.Title" and meta name = "DC.Title.Alternative").

In the DOM document.title the best option, I think, is to remove all html tags. The escape option makes the tags visible, but not interpreted. It hampers readability without providing any benefit.

In the article title and in the citation box, the best option (in my opinion, the correct one) is to apply strip_unsafe_html just like in the TOC issue page (article_summary?).

I do not know the use of the meta elements. If it is important for all this, maybe you could include one more with plain text, removing all the html tags (meta name = "DC.Title.Text"?).

Google and Google Scholar seem to take the title primarily from the DOM document.title, and somehow they are also able to identify text that functions as a title inside the document, even within a pdf. As far as I have been able to verify, this identification removes the html formatting without problem.

Escaping the DOM document.title does not make this task easier, but rather, in my opinion, complicates it by making the tags appear as part of the text. It has the same effect for human readers in the title of the article and in the citation box, since it does nothing more than show the tags in a visible way.

Here are some screen captures showing the results:

This comes from the DOM document.title:

This is the article title:

And this is the citation box:

In the issue TOC, the article title appears as it should:

Please take all of this just as comments, for whatever use they may be.

Thanks again,

Alberto

Answer 57 · 2021-01-25T23:36:25.000Z

@asmecher would it be appropriate for us to use strip_unsafe_html on the article landing page for now? Should we consider a new Smarty filter specifically for titles with a more limited tag set?

Until we have a proper solution in place for this, we should be using escape to display titles throughout so that special characters are not accidentally interpreted as HTML (e.g. p<15; Mr & Mrs Smith). Variations from this are strictly bugs until we add proper support for rich text editing in titles and, as part of that process, convert all existing titles from plain-text to HTML during upgrade by passing them through htmlspecialchars. (Relaxing existing escaping now will cause havoc when the htmlspecialchars step is applied, leading to some titles being double-escaped.)

Meta tags, the OAI interface, CrossRef, and any other consumer of title data would need to be adjusted at the same time to use PKPString::html2text (or a better alternative) so that they would continue to receive plain-text titles as they expect. (I don't believe any of these toolsets expect HTML in titles.)
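The double-escaping risk mentioned above can be seen in a few lines. `htmlSpecialChars` here is a simplified JavaScript stand-in for PHP's htmlspecialchars (it handles only &, <, > and the double quote), written for illustration only.

```javascript
// Simplified stand-in for PHP's htmlspecialchars; "&" must be replaced first.
function htmlSpecialChars(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// A plain-text title is converted correctly during upgrade.
console.log(htmlSpecialChars("Mr & Mrs Smith: p<15"));
// Mr &amp; Mrs Smith: p&lt;15

// A title where someone already entered HTML (because display was relaxed)
// is escaped too, so the tags and entities become literal text afterwards.
console.log(htmlSpecialChars("<em>Homo sapiens</em>"));
// &lt;em&gt;Homo sapiens&lt;/em&gt;
console.log(htmlSpecialChars("Mr &amp; Mrs Smith"));
// Mr &amp;amp; Mrs Smith
```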

Answer 58 · 2021-03-31T17:58:56.000Z

+1 for hosted journal that would be interested in this.

Answer 59 · 2021-08-23T21:30:58.000Z

Reviewing the field, I think we have several viable options:

UTF-8 entities for simple formatting, e.g. chemical formulas and basic polynomial mathematics. https://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts:

Unicode has subscripted and superscripted versions of a number of characters including a full set of Arabic numerals. These characters allow any polynomial, chemical and certain other equations to be represented in plain text without using any form of markup like HTML or TeX.

These can be pasted into TinyMCE (or even plain text fields) and don't require additional support.
Basic HTML formatting, in particular to support the use of italics in biology journals. TinyMCE can support this adequately out of the box. Example:
MathJax-type embedded LaTeX for complex (mathematical) formulas. Example:

There is no single toolset that will meet all of these requirements.
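To make the first option concrete: the subscript digits live at U+2080 to U+2089, so a formula can be written without any markup. `toSubscriptDigits` is a hypothetical helper, shown only to illustrate what "typeset externally, then paste" produces.

```javascript
// Map ASCII digits 0-9 to the Unicode subscript digits U+2080..U+2089.
function toSubscriptDigits(text) {
  return text.replace(/[0-9]/g, (digit) =>
    String.fromCharCode(0x2080 + Number(digit))
  );
}

console.log(toSubscriptDigits("H2SO4")); // H₂SO₄
```

This only works for characters that Unicode happens to provide, which is why it does not help with italic species names.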

Proposal: Support all 3 options.

Continue to support Unicode rich text for simple cases without changes.
Turn all title fields into TinyMCE rich text fields with the editor adapted for one-line entry.
For those requiring math support, implement a MathJax plugin that:
- adds support for MathJax integration atop TinyMCE using something like https://github.com/dimakorotkov/tinymce-mathjax
- adds MathJax javascript to the front end to aid in presentation
Downstream consumers will need content either stripped of HTML content (strip_tags), or if they support richer options, adapted as needed:
- OAI DC, MarcXML, RSS/ATOM, etc: convert to plaintext
- CrossRef: Limited HTML-like markup and MathML supported
- JATS: Limited HTML-like markup, MathML and TeX supported
- ORCID: Convert to plaintext
- DOAJ: Convert to plaintext
- DataCite: Convert to plaintext (schema does not support, but formatting observed in the wild)
- PubMed: See JATS
- Google Scholar: Convert to plaintext

Challenge: How do math formulas get converted to plaintext? Proposal: We don't. Inline formulas get sent downstream in embedded LaTeX notation. Caveat emptor but it is not too bad a form if displayed in plaintext.
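The consumer list above amounts to a lookup table plus a conversion step. A rough sketch, with the table transcribed from the list; `TITLE_FORMAT`, `stripTags` and `titleFor` are hypothetical names, and the per-consumer markup conversion (for example to JATS elements) and entity decoding are left out.

```javascript
// Consumer -> what it should receive, transcribed from the list above.
const TITLE_FORMAT = {
  "OAI DC": "plaintext",
  MarcXML: "plaintext",
  "RSS/ATOM": "plaintext",
  CrossRef: "markup",
  JATS: "markup",
  ORCID: "plaintext",
  DOAJ: "plaintext",
  DataCite: "plaintext",
  PubMed: "markup", // "See JATS"
  "Google Scholar": "plaintext",
};

// Simplified tag removal; embedded LaTeX would pass through untouched.
function stripTags(html) {
  return html.replace(/<\/?[a-zA-Z][^>]*>/g, "");
}

function titleFor(consumer, htmlTitle) {
  // Unknown consumers get plain text, the conservative default.
  const format = TITLE_FORMAT[consumer] || "plaintext";
  return format === "markup" ? htmlTitle : stripTags(htmlTitle);
}

const title = "A survey of <i>Canis lupus</i>";
console.log(titleFor("DOAJ", title)); // A survey of Canis lupus
console.log(titleFor("CrossRef", title)); // A survey of <i>Canis lupus</i>
```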

Answer 60 · 2021-08-23T23:30:12.000Z

ORCID supports Unicode, too, e.g.:

https://orcid.org/0000-0001-7361-1628

"A representative of RΓ(N,T) for higher dimensional twists of ℤpr(1)"

for:

maybe the conversion from MathML to Unicode can be done via "AsciiMath":

https://en.wikipedia.org/wiki/AsciiMath
https://github.com/plurimath/mathml2asciimath
https://github.com/learningobjectsinc/mathml-to-asciimath

Answer 61 · 2021-08-26T09:48:32.000Z

Turn all title fields into TinyMCE rich text fields with the editor adapted for one-line entry.

One challenge here is that text copy-pasted into such a field will come with associated HTML. So whether we like it or not, we'll get <p>, <div> and <span> tags thrown into the bargain. We'll need to do some work to mitigate this when text is pasted into such a field.

Answer 62 · 2021-08-26T09:49:53.000Z

Additional downstream consumers to consider:

DataCite
PubMed
RSS
Google Scholar (meta tags on landing pages)

Answer 63 · 2021-08-28T01:09:20.000Z

So whether we like it or not, we'll get <p>, <div> and <span> tags thrown into the bargain.

Yes, I think we can define our very few supported elements (strong, underline, italics) and strip out any other tags using the same approach we use for abstracts etc.

Additional downstream consumers to consider:

I'll add these to the comment above.

Answer 64 · 2021-09-01T11:58:38.000Z

Yes, I think we can define our very few supported elements (strong, underline, italics) and strip out any other tags using the same approach we use for abstracts etc.

We'll probably need to do something client-side, at the moment of pasting in. That's because TinyMCE is always bound to a contenteditable="true" field, which will make it behave more like a textarea. For example, line breaks can cause text in a "single line" field to disappear since only the cursor's current position is visible.
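A sketch of what such paste-time cleanup could do: keep a small allowlist of inline tags, drop their attributes, turn block-level tags into spaces and collapse line breaks. The allowlist and the name `cleanPastedTitle` are assumptions, and a regex is used only to keep the example short; in the browser a DOM-based sanitizer or TinyMCE's own paste handling would be more robust (text such as "a<b and c>d" would fool this version).

```javascript
// Assumed allowlist; the thread discusses several candidate sets.
const ALLOWED = new Set(["b", "strong", "i", "em", "u", "sup", "sub"]);
// Block-level tags are replaced by a space so adjacent words do not merge.
const BLOCK = new Set(["p", "div", "br"]);

function cleanPastedTitle(html) {
  return html
    .replace(/<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g, (match, slash, name) => {
      const tag = name.toLowerCase();
      if (ALLOWED.has(tag)) return `<${slash}${tag}>`; // keep tag, drop attributes
      return BLOCK.has(tag) ? " " : ""; // drop everything else
    })
    // Collapse line breaks and runs of whitespace into single spaces.
    .replace(/\s+/g, " ")
    .trim();
}

const pasted =
  '<p style="color:red">On <em class="x">Homo sapiens</em></p>\n' +
  "<div>and <span>others</span></div>";
console.log(cleanPastedTitle(pasted)); // On <em>Homo sapiens</em> and others
```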

Answer 65 · 2021-09-01T12:25:07.000Z

I would suggest italics, subscript and superscript. In my experience, bold or underline are not relevant in article titles.


Answer 66 · 2022-01-31T07:37:00.000Z

So it's mainly bold and italics that would seem to have no support so far.

The Unicode Mathematical Alphanumeric Symbols block offers Latin characters in italics and/or bold, such as 𝐀, 𝐴, 𝑨. A caveat (from Wikipedia):

Unicode expressly recommends that these characters not be used in general text as a substitute for presentational markup; the letters are specifically designed to be semantically different from each other. Unicode does not include a set of normal serif letters in the set. Still they have found some usage on social media, for example by people who want a stylized user name.

So non-math italics/bold, such as in the names of biological species, would need a non-Unicode solution.

Answer 67 · 2022-09-16T12:49:33.000Z

Hi @asmecher and @NateWr, I have been reviewing this long and old discussion and, just to wrap up a bit: Unicode support is only a partial solution to this problem. Bold and italic would require adapting TinyMCE to work with title fields, as described above in #2564 (comment).
Do we know whether the new version of TinyMCE supports single-line input? Would it be worth checking back with the TinyMCE community?

Answer 68 · 2023-01-30T09:43:30.000Z

@asmecher @NateWr PRs
pkp-lib --> #8584
ui-library --> pkp/ui-library#252
ojs --> pkp/ojs#3731

For now, these PRs only add a minimal HTML editor (bold, italic, superscript and subscript only) and the ability to render and parse that markup in views. The metadata deposit part still needs work.

@asmecher, I need a high-level overview of the metadata deposit process and the relevant code sections. @NateWr, I need some help with the UI part.

Answer 69 · 2023-01-30T16:27:53.000Z

@touhidurabir I've added some initial comments based on a read of the code. Let me know when you're ready for me to test it all out in the browser. 👍

Answer 70 · 2023-02-03T18:05:41.000Z

@touhidurabir and I were talking and realized that we need to consider how prefix and subtitle fields are handled, and what happens when these are mixed into the "full title" field. (The combined "full title" is currently served via the API, Dublin Core and other metadata crosswalks, various parts of the UI, etc.)

I propose:

The subtitle field should get the same rich text treatment as the title field; the prefix field should remain text-only.
The PKPPublication::getLocalizedFullTitle and PKPPublication::getFullTitles functions should get a new optional parameter $format, supporting both text and HTML as options, and defaulting to text (for backwards compatibility).
The deprecated PKPSubmission::getLocalizedFullTitle, PKPSubmission::getFullTitle functions should remain unchanged (thus always returning text), but where they're used in the main codebase, they should be replaced by equivalent PKPPublication:: calls. (And I would be open to removing these functions entirely, since they've been deprecated since 3.2.0.)
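
A rough sketch of how such a `$format` parameter could behave, assuming a simplified stand-in class. `PublicationSketch`, its constructor, and the `': '` subtitle separator are hypothetical and are not the real PKPPublication API.

```php
<?php
// Hypothetical stand-in for a publication object; not the real PKPPublication.
class PublicationSketch
{
    public function __construct(
        private string $prefix,   // text-only field
        private string $title,    // may contain limited HTML
        private string $subtitle  // may contain limited HTML
    ) {
    }

    // $format defaults to 'text' so existing callers keep getting plain text.
    public function getFullTitle(string $format = 'text'): string
    {
        $asHtml = $format === 'html';

        // The prefix is plain text, so escape it when building HTML output.
        $prefix = $asHtml ? htmlspecialchars($this->prefix) : $this->prefix;
        // Title and subtitle may carry markup; drop it for text output.
        $title = $asHtml ? $this->title : strip_tags($this->title);
        $subtitle = $asHtml ? $this->subtitle : strip_tags($this->subtitle);

        $full = trim($prefix . ' ' . $title);
        return $subtitle === '' ? $full : $full . ': ' . $subtitle;
    }
}

$p = new PublicationSketch('The', 'Genome of <i>E. coli</i>', 'A Review');
echo $p->getFullTitle(), PHP_EOL;        // plain text, markup removed
echo $p->getFullTitle('html'), PHP_EOL;  // keeps the <i> markup
```

Defaulting to text means code that was written before this change, including third-party plugins, never receives unexpected markup.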

Answer 71 · 2023-02-06T09:49:45.000Z

@NateWr, the UI part is pretty much done, so you can try it out to see whether any further improvement is needed. I am still looking into a way to run validation rather than sanitization. However, we still need to finalise the proposal by @asmecher.

Answer 72 · 2023-02-07T09:34:15.000Z

supporting both text and HTML as options, and defaulting to text (for backwards compatibility).

@asmecher, with text as the default for backward compatibility, we will be forced to update the calls to getLocalizedFullTitle and getFullTitles in many places in the current codebase, as in:

...->getLocalizedFullTitle(null, 'html');
// or
...->getFullTitles(null, 'html');

If we set the default format to html, would that break backward compatibility too much?

Answer 73 · 2023-02-07T14:18:53.000Z

@asmecher, changes have been pushed based on your proposal. Please review.

Answer 74 · 2023-02-07T22:35:33.000Z

If we set the default format to html, would that break backward compatibility too much?

Yes, I fear this will cause mass destruction e.g. for third-party code. I'd rather leave the defaults backwards-compatible (text) and explicitly specify where HTML is allowed.

Feel free to start using named parameters rather than reiterating default values. For example, rather than:

$publication->getLocalizedFullTitle(null, 'html');

...you could use...

$publication->getLocalizedFullTitle(format: 'html');

I'd prefer not to have 'html' hard-coded each time; historically we've used constants, and once we set a baseline of PHP8.1 we'll be able to use enumerations. (Unfortunately enumerations aren't extensible, e.g. by plugins, so we'll have to use them cautiously.)
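
Combining named parameters with constants might look roughly like this. The constant names and the `fullTitle` function are invented for illustration, and treating the first optional parameter as a locale is an assumption; the names that end up in pkp-lib may differ. Named arguments require PHP 8.0 or later.

```php
<?php
// Hypothetical constants and function; the real names in pkp-lib may differ.
const TITLE_FORMAT_TEXT = 'text';
const TITLE_FORMAT_HTML = 'html';

function fullTitle(string $title, ?string $locale = null, string $format = TITLE_FORMAT_TEXT): string
{
    // $locale is unused in this sketch; it only mirrors the optional first
    // argument that the calls above pass as null.
    return $format === TITLE_FORMAT_HTML ? $title : strip_tags($title);
}

$title = 'Notes on <i>Homo sapiens</i>';

// Default call: plain text, as older callers expect.
echo fullTitle($title), PHP_EOL;

// Named argument: skips $locale without repeating its default value.
echo fullTitle($title, format: TITLE_FORMAT_HTML), PHP_EOL;
```

With a constant, a typo in the format name fails loudly as an undefined constant instead of silently falling back to the text branch.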

Answer 75 · 2023-02-13T15:20:58.000Z

@touhidurabir, I've prepared some PRs with the UX changes we discussed here:

touhidurabir#3
touhidurabir/ui-library#1

Answer 76 · 2023-02-13T17:22:40.000Z

@NateWr, I noticed that this PR depends on a PR for tinymce where you added a new CSS file, content_oneline.css, but it hasn't been merged into tinymce yet. If I merge this right now, will it work as expected, or does "pkp/pkp-lib#2564 Add stylesheet for oneline TinyMCE field" also need to be merged to make it work?

Answer 77 · 2023-02-13T17:29:43.000Z

Sorry @touhidurabir! I've converted that PR so it can be merged now. Are you able to merge it? If not, you might want to cherry-pick that commit into your own fork for now.

Answer 78 · 2023-02-13T20:19:10.000Z

@asmecher, I have merged the PR by @NateWr. You can test it now.

Answer 79 · 2023-02-15T01:37:34.000Z

@touhidurabir and @NateWr, this looks and behaves wonderfully. I've taken a quick spin through a few import/export formats and the OAI interface and it looks to be working well there. We'll definitely do some testing during RC1 and RC2 but I'd be happy for us to get this merged. @touhidurabir, are you still working on aspects of it?

Answer 80 · 2023-02-15T11:05:56.000Z

@asmecher, no more work is going on for this one. I have rebased it all. Before the rebase all tests were green, but now cypress/tests/integration/Statistics.cy.js is failing for OJS and OMP, and the reason seems to be the one you mentioned in Mattermost. If all tests pass, it is good to merge.

Answer 81 · 2023-02-16T00:34:27.000Z

All merged except pkp/oaiJats#33 -- there's a comment there. Thanks, @touhidurabir!

Answer 82 · 2023-02-16T19:30:22.000Z

@asmecher, please check the updated PR at pkp/oaiJats#33 for OAI JATS.

Answer 83 · 2023-02-16T21:00:15.000Z

All merged! Thanks again, @touhidurabir.

Answer 84 · 2023-03-03T06:45:18.000Z

@asmecher, please review the ONIX update for OMP at pkp/omp#1355.