Support Pandoc 3.0
Pandoc 3.0 and 3.0.1 depend on pandoc-types 1.23 (>= 1.23, < 1.24). The new document model is:
data Pandoc = Pandoc !Meta ![Block]
newtype Meta = Meta {unMeta :: Map Text MetaValue}
data MetaValue
  = MetaMap !(Map Text MetaValue)
  | MetaList ![MetaValue]
  | MetaBool !Bool
  | MetaString !Text
  | MetaInlines ![Inline]
  | MetaBlocks ![Block]
type ListAttributes = (Int, ListNumberStyle, ListNumberDelim)
data ListNumberStyle
  = DefaultStyle
  | Example
  | Decimal
  | LowerRoman
  | UpperRoman
  | LowerAlpha
  | UpperAlpha
data ListNumberDelim = DefaultDelim | Period | OneParen | TwoParens
type Attr = (Text, [Text], [(Text, Text)])
newtype Format = Format Text
newtype RowHeadColumns = RowHeadColumns Int
data Alignment
  = AlignLeft | AlignRight | AlignCenter | AlignDefault
data ColWidth = ColWidth !Double | ColWidthDefault
type ColSpec = (Alignment, ColWidth)
data Row = Row !Attr ![Cell]
data TableHead = TableHead !Attr ![Row]
data TableBody = TableBody !Attr !RowHeadColumns ![Row] ![Row]
data TableFoot = TableFoot !Attr ![Row]
type ShortCaption = [Inline]
data Caption = Caption !(Maybe ShortCaption) ![Block]
data Cell = Cell !Attr !Alignment !RowSpan !ColSpan ![Block]
newtype RowSpan = RowSpan Int
newtype ColSpan = ColSpan Int
data Block
  = Plain ![Inline]
  | Para ![Inline]
  | LineBlock ![[Inline]]
  | CodeBlock !Attr !Text
  | RawBlock !Format !Text
  | BlockQuote ![Block]
  | OrderedList !ListAttributes ![[Block]]
  | BulletList ![[Block]]
  | DefinitionList ![([Inline], [[Block]])]
  | Header !Int !Attr ![Inline]
  | HorizontalRule
  | Table !Attr
          !Caption
          ![ColSpec]
          !TableHead
          ![TableBody]
          !TableFoot
  | Figure !Attr !Caption ![Block]
  | Div !Attr ![Block]
data QuoteType = SingleQuote | DoubleQuote
type Target = (Text, Text)
data MathType = DisplayMath | InlineMath
data Inline
  = Str !Text
  | Emph ![Inline]
  | Underline ![Inline]
  | Strong ![Inline]
  | Strikeout ![Inline]
  | Superscript ![Inline]
  | Subscript ![Inline]
  | SmallCaps ![Inline]
  | Quoted !QuoteType ![Inline]
  | Cite ![Citation] ![Inline]
  | Code !Attr !Text
  | Space
  | SoftBreak
  | LineBreak
  | Math !MathType !Text
  | RawInline !Format !Text
  | Link !Attr ![Inline] !Target
  | Image !Attr ![Inline] !Target
  | Note ![Block]
  | Span !Attr ![Inline]
data Citation
  = Citation {citationId :: !Text,
              citationPrefix :: ![Inline],
              citationSuffix :: ![Inline],
              citationMode :: !CitationMode,
              citationNoteNum :: !Int,
              citationHash :: !Int}
data CitationMode = AuthorInText | SuppressAuthor | NormalCitation
Very few changes overall:
- ! is a strictness flag that doesn't matter for us; we can filter it out.
- No more Null block.
- A new figure block: Figure Attr Caption [Block].
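As a rough illustration of "filtering out" the strictness flag, here is a minimal Python sketch that strips the ! markers from a type declaration given as text. The helper name strip_strictness and the regular expression are assumptions made for this example; they are not taken from the library's code.

```python
import re

# Assumption: a strictness flag is a "!" written directly in front of a type,
# that is, before an identifier, "[" or "(".
_STRICTNESS_FLAG = re.compile(r"!(?=[A-Za-z\[\(])")


def strip_strictness(declaration: str) -> str:
    """Return the declaration text with Haskell strictness flags removed."""
    return _STRICTNESS_FLAG.sub("", declaration)


print(strip_strictness("data Pandoc = Pandoc !Meta ![Block]"))
# data Pandoc = Pandoc Meta [Block]
print(strip_strictness("| Figure !Attr !Caption ![Block]"))
# | Figure Attr Caption [Block]
```

Lines without a flag, such as "| HorizontalRule", pass through unchanged.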
Need to update:
- the documentation: API + Pandoc's Markdown.
- the list of extensions that are recognized (e.g. Jupyter notebooks).
- add a cookbook for notebooks (get rid of the old, manual one? Or keep it?)
First: great library! Thank you. I am playing around with it right now and have this issue when using the pandoc.write command:
ERROR:pydoxtools.document_base:problem with extractor 'full_text'
Traceback (most recent call last):
File "/home/dev/git/pydoxtools/pydoxtools/document_base.py", line 513, in x
res = extractor_func._mapped_call(self, *args, config_params=params, **kwargs)
File "/home/dev/git/pydoxtools/pydoxtools/document_base.py", line 157, in _mapped_call
output = self(*args, **mapped_kwargs)
## ▼▼▼▼▼▼▼▼▼ pandoc-relevant part ▼▼▼▼▼▼▼▼▼:
File "/home/dev/git/pydoxtools/pydoxtools/extract_pandoc.py", line 64, in __call__
full_text = pandoc.write(pandoc_document, format=output_format)
File "/home/dev/.cache/pypoetry/virtualenvs/pydoxtools-UuJZOkke-py3.10/lib/python3.10/site-packages/pandoc/__init__.py", line 355, in write
pandoc(options)
File "/home/dev/.cache/pypoetry/virtualenvs/pydoxtools-UuJZOkke-py3.10/lib/python3.10/site-packages/plumbum/commands/base.py", line 113, in __call__
return self.run(args, **kwargs)[1]
File "/home/dev/.cache/pypoetry/virtualenvs/pydoxtools-UuJZOkke-py3.10/lib/python3.10/site-packages/plumbum/commands/base.py", line 252, in run
return p.run()
File "/home/dev/.cache/pypoetry/virtualenvs/pydoxtools-UuJZOkke-py3.10/lib/python3.10/site-packages/plumbum/commands/base.py", line 215, in runner
return run_proc(p, retcode, timeout)
File "/home/dev/.cache/pypoetry/virtualenvs/pydoxtools-UuJZOkke-py3.10/lib/python3.10/site-packages/plumbum/commands/processes.py", line 304, in run_proc
return _check_process(proc, retcode, timeout, stdout, stderr)
File "/home/dev/.cache/pypoetry/virtualenvs/pydoxtools-UuJZOkke-py3.10/lib/python3.10/site-packages/plumbum/commands/processes.py", line 17, in _check_process
proc.verify(retcode, timeout, stdout, stderr)
File "/home/dev/.cache/pypoetry/virtualenvs/pydoxtools-UuJZOkke-py3.10/lib/python3.10/site-packages/plumbum/machines/base.py", line 27, in verify
raise ProcessExecutionError(
plumbum.commands.processes.ProcessExecutionError: Unexpected exit code: 64
Command line: | /usr/bin/pandoc -t markdown -o /tmp/tmph_3tsfz8/output -f json /tmp/tmph_3tsfz8/input.js
Stderr: | JSON parse error: Error in $: Incompatible API versions: encoded with [1,22,2,1] but attempted to decode with [1,23].
All I am doing is calling pandoc.write on a previously opened document:
pandoc_document = pandoc.read(raw_content, format="docx")
pandoc.write(pandoc_document, format="markdown")
It works when installing pandoc 2.19.2 (https://github.com/jgm/pandoc/releases/tag/2.19.2),
but if I use version 3.0.1 (https://github.com/jgm/pandoc/releases/tag/3.0.1)
I get the error. So I am not sure whether it's related to the Python library or to pandoc itself.
(I forgot to mention that I am running the latest version of your library, 2.3, on an Ubuntu 22.04 system.)
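For readers hitting the same failure: the stderr line above says the JSON document was encoded for pandoc-types 1.22.2.1 while the pandoc binary expected 1.23. A small sketch, assuming the message keeps exactly the wording shown in the traceback, extracts both versions so they can be compared. The function name parse_api_versions is made up for this example.

```python
import re

# Assumption: the error text has the same wording as the stderr line above.
_VERSIONS = re.compile(
    r"encoded with \[([\d,]+)\] but attempted to decode with \[([\d,]+)\]"
)


def parse_api_versions(stderr: str):
    """Return (encoded, decoder) version tuples, or None if nothing matches."""
    match = _VERSIONS.search(stderr)
    if match is None:
        return None
    encoded, decoder = (
        tuple(int(part) for part in group.split(","))
        for group in match.groups()
    )
    return encoded, decoder


message = (
    "JSON parse error: Error in $: Incompatible API versions: "
    "encoded with [1,22,2,1] but attempted to decode with [1,23]."
)
encoded, decoder = parse_api_versions(message)
print(encoded, decoder)            # (1, 22, 2, 1) (1, 23)
print(encoded[:2] == decoder[:2])  # False: 1.22 vs 1.23
```

The mismatch in the first two components is what points at the library side (the writer of the JSON) rather than at pandoc itself.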
Hi @yeus! 👋
Thanks for the report! 🙏 It's probably an issue with the Python library, since the 2.3 version of this project (the latest available on PyPI) does not support pandoc 3.x yet (see https://boisgera.github.io/pandoc/changelog/).
Last time I had some free time for the project, the conda packaging of pandoc 3.x was not ready, which was a (soft) blocker for me; I have started to take the new document model into account in the main branch, but nothing has been tested.
Now that pandoc 3.1.1 is available for conda, I should be able to make some progress on the tests and release on PyPI.
Cheers,
SB
Hello @yeus,
Could you try again with the current (beta) version of the Pandoc Python Library on PyPI?
$ pip install pandoc==2.4b0
Cheers,
SB
Will do! I will come back in a couple of days, as soon as I have the time. Thanks for helping out!
Hi, it seems to work! 👍
Here are the versions I am using:
pip install pandoc==2.4b0
pandoc.configuration:
{'auto': True,
'path': '/usr/bin/pandoc',
'version': '3.1.2',
'pandoc_types_version': '1.23'}
Python 3.10
Thank you!
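A quick way to sanity-check such a setup is to test the reported pandoc_types_version against the range quoted at the top of this issue (>= 1.23, < 1.24). The sketch below works on a plain dictionary copied from the output above; the helper names parse_version and supports_new_model are assumptions for this example, not part of the library.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.23' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_new_model(configuration):
    """True if pandoc-types is in the range >= 1.23, < 1.24."""
    version = parse_version(configuration["pandoc_types_version"])
    return (1, 23) <= version < (1, 24)


# Values copied from the configuration reported above.
configuration = {
    "auto": True,
    "path": "/usr/bin/pandoc",
    "version": "3.1.2",
    "pandoc_types_version": "1.23",
}
print(supports_new_model(configuration))  # True
```

With the version from the earlier error message ("1.22.2.1"), the same check returns False.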