Filters that alter document structure
I have a complex HTML document that I've read into pandoc, and I'm trying to write filters that will isolate the content I'm after. Examples include dropping certain Divs entirely, or replacing Tables with just the content of their rows.
I can write filters of the type Pandoc -> Pandoc, which is workable for changing the top-level structure of a document, but would become very tedious when Blocks are nested. I could also write my functions to return Null :: Block when removing Blocks, but that doesn't feel like the right way to do it. Or is that perhaps precisely why Null is there in the first place?
I'd like to be able to write functions of type [Block] -> [Block] to use in filters, but I get the error: No instance for (Text.Pandoc.Walk.Walkable [Block] Pandoc).
I tried to think about how to write that instance, but it's hard to combine walking lists of Blocks with applying the function to them.
So I feel I'm either missing something obvious, or going about it in the wrong way. Is there perhaps a simple solution to what I want to do? (Sorry if this is a stupid or naive question)
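For illustration, here is a minimal sketch of the [Block] -> [Block] idea that avoids the missing Walkable instance by using the generic bottomUp traversal from Text.Pandoc.Generic instead of walk. The class names "sidebar" and "wrapper" and the helper names rewrite and restructure are hypothetical, chosen only for this example. Tables are left out because their constructor differs between pandoc-types versions.

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- OverloadedStrings lets the class-name literals work whether the
-- installed pandoc-types uses String or Text for attributes.
import Text.Pandoc.Definition
import Text.Pandoc.Generic (bottomUp)

-- Hypothetical rule: map one Block to zero or more Blocks.
--   * a Div with class "sidebar" is dropped entirely
--   * a Div with class "wrapper" is replaced by its contents
--   * everything else is kept unchanged
rewrite :: Block -> [Block]
rewrite (Div (_, classes, _) contents)
  | "sidebar" `elem` classes = []
  | "wrapper" `elem` classes = contents
rewrite b = [b]

-- Apply the rule to every list of Blocks in the document,
-- including lists nested inside other Blocks.
restructure :: Pandoc -> Pandoc
restructure = bottomUp (concatMap rewrite :: [Block] -> [Block])
```

Because the generic traversal also visits the tails of each list, rewrite may be applied to the same Block more than once, so the rule should give the same result when repeated (dropping and unwrapping both do). This avoids returning Null as a placeholder, at the cost of a slower traversal than walk.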
This is really a question rather than a bug report, and should go on
pandoc-discuss, where many people can help (and also benefit from any
answers that are given)! If a concrete suggestion for a change to
pandoc-types comes out of the discussion, then you could put it here.
+++ Mark Szepieniec [Nov 15 14 14:48 ]:
I have a complex html document that I've read into pandoc, and I'm trying to write filters that will isolate the content I'm after. Some examples of this are dropping certain Divs entirely, or replacing Tables by just the content of their rows.
I can write filters of the type Pandoc -> Pandoc, which is workable for changing the top-level structure of a document, but would become very tedious when Blocks are nested. I could also write my functions to return Null :: Block when removing Blocks, but that doesn't feel like the right way to do it. Or is that perhaps precisely why Null is there in the first place?
I'd like to be able to write functions of type [Block] -> [Block] to use in filters, but I get the error: No instance for (Text.Pandoc.Walk.Walkable [Block] Pandoc).
I tried to think about how to write that instance, but it's hard to combine walking lists of Blocks with applying the function to them.
So I feel I'm either missing something obvious, or going about it in the wrong way. Is there perhaps a simple solution to what I want to do? (Sorry if this is a stupid or naive question)