speechmarkdown/speechmarkdown-js

How to remove just Speechmarkdown markup but leave Markdown (i.e. NOT back to plain text)?

Closed this issue · 5 comments

Hi,
Do you have a formatter or technique for either removing only the SpeechMarkdown (SM) markup from a Markdown-based document, or for letting external Markdown interpreters filter or lint the SM markup?

The idea is that I'd rather not duplicate the text of a Markdown document. Instead, I'd include SM markup within it and use the single SM+MD text to drive both UI/Web and speech.

Any ideas how I could do this?

Could the SSML XML be name-spaced?

I understand what you are saying: a single doc with MD & SM that can be formatted to SSML for text-to-speech and MD for UI/Web. I have typically used 2 fields in my CMS for that: displayText & speechText. For my skills, the text has been different enough that 2 fields made sense.

Do you have a list of SM tags that you would want to support?

The current implementation of SM would most likely conflict with MD in the same document. If you could figure out an SM -> MD mapping, you could write a formatter, since SM is changed into an intermediate form before the formatter is applied.

For example, you could have:
SM -> MD
emphasis: strong -> bold *
emphasis: medium -> italic

If the mappings were limited, a custom formatter might work.
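
A minimal sketch of what such a limited mapping could look like, done here as a plain string rewrite rather than as a real speechmarkdown-js formatter. Everything in it is an assumption for illustration: the function and table names are hypothetical, it assumes the standard-form `(text)[emphasis:"level"]` syntax, and it assumes the levels are spelled `strong` and `moderate`.

```typescript
// Hypothetical SM -> MD table: emphasis level to Markdown delimiter.
const emphasisToMarkdown: Record<string, string> = {
  strong: "**", // bold
  moderate: "*", // italic
};

// Assumed standard-form modifier: (text)[emphasis:"level"], quotes optional.
const EMPHASIS = /\(([^()]+)\)\[emphasis:\s*["']?(\w+)["']?\]/g;

function smEmphasisToMarkdown(input: string): string {
  return input.replace(EMPHASIS, (_match: string, text: string, level: string) => {
    const delim = emphasisToMarkdown[level];
    // Levels with no Markdown equivalent: keep the text, drop the SM modifier.
    return delim === undefined ? text : `${delim}${text}${delim}`;
  });
}
```

With this table, `A (really)[emphasis:"strong"] good idea.` becomes `A **really** good idea.`, and a level that is not in the table just leaves the bare text behind.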

I'm an SSML newbie, so unfortunately I can't suggest a set of tags I'd want to remove/transform. Basically, I'd be offering non-IT-literate individuals the ability to extend their MD presentations with voiceovers without having to duplicate their work.

Now, with a little analysis, as I think you're suggesting, I might like to leave some 'overlapping' SM markup in there for UI/visual rendering, but I'd want to process out '[break:100ms]'-like constructs so they're not visible in MD renderings.
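
As a rough sketch of the "process out" step, assuming the break tags always look like `[break:100ms]` (optionally with a quoted value) and that a plain regex pass over the raw text is acceptable. The function name is hypothetical, and this is not part of speechmarkdown-js:

```typescript
// Matches [break:100ms], [break:"2s"], [break:'strong'] and similar,
// plus one space or tab directly in front of the tag so no double space is left behind.
const BREAK_TAG = /[ \t]?\[break:\s*["']?[\w.]+["']?\]/g;

function stripBreaks(markdown: string): string {
  // Everything else, including ordinary Markdown markup, is left untouched.
  return markdown.replace(BREAK_TAG, "");
}
```

So `Welcome. [break:100ms] Let's begin.` would render from `Welcome. Let's begin.`. One caveat: a raw regex pass would also strip a tag that appears inside a code block or inline code, which may not be wanted.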

[Ok, I'm brain-dumping here and this is probably more for me than yourself but] maybe I could use a Remark plugin to remove the SM markup, in a middleware or your formatters:

  • remark-redact — conceal text matching a regex
  • remark-redactable — write plugins to redact content from a Markdown document, then restore it later
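
A sketch of the plugin idea, written in the general shape of a remark/unified transformer (a factory that returns a function receiving the syntax tree) but without depending on either package above. The node type below is a hand-written assumption covering only the fields used here, the plugin name is hypothetical, and it assumes the parser leaves `[break:100ms]` inside a single text node:

```typescript
// Minimal structural type for an mdast-like node (assumption: only what is used below).
interface MdNode {
  type: string;
  value?: string;
  children?: MdNode[];
}

const SM_BREAK = /[ \t]?\[break:\s*["']?[\w.]+["']?\]/g;

// Plugin-shaped factory: returns a transformer that edits the tree in place.
function remarkStripSpeechBreaks() {
  return function transformer(tree: MdNode): void {
    const visit = (node: MdNode): void => {
      // Only plain text nodes are rewritten; code and inlineCode nodes are skipped.
      if (node.type === "text" && typeof node.value === "string") {
        node.value = node.value.replace(SM_BREAK, "");
      }
      (node.children || []).forEach(visit);
    };
    visit(tree);
  };
}
```

Working on the tree rather than the raw string means a `[break:100ms]` shown as an example inside a code block survives, while the same tag in running text is removed.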

I've got some related ideas about possibly renaming/name-spacing/transforming your SM markup during my usage scenario (observing that Markdown is only interpreted text), thus making it easier to identify and remove, kind of like XML name-spacing.

arjan commented

Closing issue due to no activity