Annoyed that Pandoc doesn't correctly handle figure labels? Well, this module is just for you!
ltmd uses regular expressions to extract figures, references, and mathematics, and processes them separately from Pandoc so that figure references and similar constructs are preserved.
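To make the idea concrete, here is a minimal sketch of the extract-and-restore technique, covering only \ref{...} commands. It is not ltmd's actual implementation: the function names `extract_refs` and `restore_refs`, the placeholder token format, and the choice to restore references as `[@label]` are all assumptions made for illustration.

```python
import re

# Hypothetical patterns, not taken from ltmd itself.
REF_PATTERN = re.compile(r"\\ref\{(.*?)\}")
TOKEN_PATTERN = re.compile(r"REFPLACEHOLDER(\d+)END")


def extract_refs(tex):
    # Swap every \ref{...} for a plain-text token that a converter
    # such as Pandoc should pass through untouched, and remember the label.
    labels = []

    def stash(match):
        labels.append(match.group(1))
        return f"REFPLACEHOLDER{len(labels) - 1}END"

    return REF_PATTERN.sub(stash, tex), labels


def restore_refs(text, labels):
    # Put the stored labels back, here in [@label] form (an assumption).
    return TOKEN_PATTERN.sub(lambda m: f"[@{labels[int(m.group(1))]}]", text)


tex = r"See Figure \ref{fig:plot} and Figure \ref{fig:map}."
stripped, labels = extract_refs(tex)
# In a real pipeline the converter would run on `stripped` at this point.
restored = restore_refs(stripped, labels)
print(restored)
```

The explicit END suffix on each token keeps token 1 from being confused with token 10 when the labels are restored.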
Use:
python3 preprocess.py <input> <output>
For example, to generate the test markdown, we use:
python3 preprocess.py test.tex test.md
The module can also be used as an API, through the two objects it provides, PreProcess and PostProcess.
Use them as follows:
pre_processed = ltmd.PreProcess(input_text)
pandocced = ltmd.run_pandoc(pre_processed.parsed_text)
post_processed = ltmd.PostProcess(pandocced, pre_processed.parsed_data)
The final output string can then be read from post_processed.parsed_text.
It is also possible to use a wrapper function in ltmd that converts an input file to a markdown output file in one call:
ltmd.inputoutput.parse_file(input_filename, output_filename)
Requirements:
- python3 (no python2 version will ever be made available)
- pandoc somewhere in your path
- pypandoc
- Preprocessing:
  python3 preprocess.py input.tex output.md debug
- Remove labels in figures. In Sublime Text, open output.md and find using RegEx:
  \\label\{fig:(.*?)\}
  Replace with: (nothing)
- Change figure references. In Sublime Text, open output.md and find using RegEx:
  \[@fig:(.*?)\]
  Replace with:
  {@fig:$1}
- Align figures in the center. In Sublime Text, open output.md and find using RegEx:
  !\[(.*?)\]\((.*?)\){#fig:(.*?)}
  Replace with:
  <div align=center>\n![$1]\($2){#fig:$3}\n</div>
- Fix multi-citations. In Sublime Text, open output.md and find using RegEx:
  \[@(.*?), (.*)\]
  Replace with:
  [@$1; @$2]
  Repeat this replacement several times, since each pass converts only one comma per citation.
- Convert to docx:
  pandoc --filter pandoc-fignos --filter pandoc-citeproc --bibliography=mybib.bib --csl=elsevier-harvard.csl output.md -o output.docx
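The manual find-and-replace steps above could also be scripted. The following is a rough sketch using Python's `re` module rather than Sublime Text. The function name `postprocess_markdown` is hypothetical and not part of ltmd, and the multi-citation pattern is a slightly stricter variant of the one above (it uses `[^\]]` instead of `.` so a match cannot run past a closing bracket).

```python
import re

# Stricter variant of the multi-citation pattern: stays inside one [...] group.
MULTI_CITE = re.compile(r"\[@([^\]]*?), ([^\]]*)\]")


def postprocess_markdown(text):
    # Remove leftover \label{fig:...} commands.
    text = re.sub(r"\\label\{fig:(.*?)\}", "", text)
    # Turn [@fig:name] into {@fig:name}.
    text = re.sub(r"\[@fig:(.*?)\]", r"{@fig:\1}", text)
    # Wrap each labelled figure in a centering div.
    text = re.sub(
        r"!\[(.*?)\]\((.*?)\)\{#fig:(.*?)\}",
        r"<div align=center>\n![\1](\2){#fig:\3}\n</div>",
        text,
    )
    # Each pass converts one comma per citation, so repeat until nothing changes.
    while True:
        text, count = MULTI_CITE.subn(r"[@\1; @\2]", text)
        if count == 0:
            break
    return text


sample = (
    "![A plot](plot.png){#fig:plot}\\label{fig:plot}\n\n"
    "See [@fig:plot] and [@smith2020, jones2021, lee2022]."
)
print(postprocess_markdown(sample))
```

The loop replaces the "several times" instruction: it stops once a pass makes no substitutions. The citation keys and file name in `sample` are made up for the example.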