jgm/pandoc-types

Filters that alter document structure

Closed this issue · 1 comment

I have a complex HTML document that I've read into pandoc, and I'm trying to write filters that isolate the content I'm after. Examples include dropping certain Divs entirely, or replacing Tables with just the content of their rows.

I can write filters of type Pandoc -> Pandoc, which is workable for changing the top-level structure of a document but becomes very tedious when Blocks are nested. I could also have my functions return Null :: Block when removing Blocks, but that doesn't feel like the right way to do it. Or is that precisely why Null is there in the first place?

I'd like to be able to write functions of type [Block] -> [Block] to use in filters, but I get the error:

No instance for (Text.Pandoc.Walk.Walkable [Block] Pandoc).

I tried to think about how to write that instance, but it's hard to combine walking lists of Blocks with applying the function to them.

So I feel I'm either missing something obvious, or going about it in the wrong way. Is there perhaps a simple solution to what I want to do? (Sorry if this is a stupid or naive question)

jgm commented

This is really a question rather than a bug report, and should go on
pandoc-discuss, where many people can help (and also benefit from any
answers that are given)! If a concrete suggestion for a change to
pandoc-types comes out of the discussion, then you could put it here.
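
For anyone landing here with the same question, one possible approach is sketched below. It relies on bottomUp from Text.Pandoc.Generic, which is generic over any Data type and therefore accepts a [Block] -> [Block] function without a Walkable instance. The class names "sidebar" and "wrapper" and the helpers isSidebar, dropSidebars, unwrap and cleanup are made up for illustration. Only Div is matched, since its constructor shape is assumed to be the same across pandoc-types versions. Treat this as a sketch, not as the recommended way.

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- OverloadedStrings lets the class-name literals work whether Attr
-- carries String or Text in the installed pandoc-types.
import Text.Pandoc.Definition
import Text.Pandoc.Generic (bottomUp)
import Text.Pandoc.JSON (toJSONFilter)

-- Hypothetical predicate: a Div carrying the (made-up) class "sidebar".
isSidebar :: Block -> Bool
isSidebar (Div (_, classes, _) _) = "sidebar" `elem` classes
isSidebar _                       = False

-- Remove matching Divs from a single list of blocks.
dropSidebars :: [Block] -> [Block]
dropSidebars = filter (not . isSidebar)

-- Hypothetical rule: replace a Div of class "wrapper" by its children;
-- every other block is kept as a one-element list.
unwrap :: Block -> [Block]
unwrap (Div (_, classes, _) bs)
  | "wrapper" `elem` classes = bs
unwrap b = [b]

-- bottomUp applies the list function to every [Block] in the document,
-- at any nesting depth (block quotes, list items, other Divs, ...).
cleanup :: Pandoc -> Pandoc
cleanup = bottomUp (concatMap unwrap . dropSidebars)

-- Reads a JSON-encoded document on stdin, writes the result to stdout.
main :: IO ()
main = toJSONFilter cleanup
```

One caveat: because a Haskell list is built from nested tails, bottomUp applies the function to each tail of a list as well as to the whole list, so the function should be idempotent. Both pieces above are, but something like duplicating every block would not be. Later pandoc-types releases appear to provide Walkable instances for [Block], in which case walk could be used the same way; that is worth checking against the installed version.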

+++ Mark Szepieniec [Nov 15 14 14:48 ]:

[quoted copy of the original post above]

