JessicaTegner/pypandoc

Ignoring alt text when converting from docx to txt

caphefalumi opened this issue · 1 comment

Currently, when I convert from docx to txt, the alt text of each image is output along with the paragraphs as something like "[ALT TEXT]". How do I exclude the alt text?
Here is my code:
pypandoc.convert_file(docx_path, 'plain', extra_args=['--wrap=none'], outputfile='output.txt')

From the pandoc user guide:

A link immediately preceded by a ! will be treated as an image. The link text will be used as the image’s alt text:
![la lune](lalune.jpg "Voyage to the moon")

![movie reel]

[movie reel]: movie.gif
Extension: implicit_figures
An image with nonempty alt text, occurring by itself in a paragraph, will be rendered as a figure with a caption. The image’s alt text will be used as the caption.
![This is the caption](/url/of/image.png)
[...]
If you just want a regular inline image, just make sure it is not the only thing in the paragraph. One way to do this is to insert a nonbreaking space after the image:
![This image won't be a figure](/url/of/image.png)\
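
The excerpt above describes how alt text is attached to images in Markdown, but it does not show how to drop it from plain-text output. One possible approach, offered as a sketch rather than an official pypandoc recipe, is to convert the docx to pandoc's JSON AST, remove every `Image` element, and then render the result as `plain`. The helper names `strip_images` and `docx_to_plain_without_images` are made up for this example. The sketch also assumes the alt text comes only from inline `Image` elements; captions that pandoc stores elsewhere (for instance in a `Figure` block) are left untouched.

```python
import json


def strip_images(node):
    """Return a copy of a pandoc JSON AST fragment with all Image inlines removed."""
    if isinstance(node, list):
        # Image is an inline element, so it always appears as a list item.
        return [
            strip_images(child)
            for child in node
            if not (isinstance(child, dict) and child.get("t") == "Image")
        ]
    if isinstance(node, dict):
        return {key: strip_images(value) for key, value in node.items()}
    return node


def docx_to_plain_without_images(docx_path, output_path):
    """Hypothetical wrapper: docx -> JSON AST -> drop images -> plain text file."""
    # Imported here so strip_images can be used without pandoc installed.
    import pypandoc

    ast = json.loads(pypandoc.convert_file(docx_path, "json"))
    ast["blocks"] = strip_images(ast["blocks"])
    pypandoc.convert_text(
        json.dumps(ast),
        "plain",
        format="json",
        extra_args=["--wrap=none"],
        outputfile=output_path,
    )
```

Calling `docx_to_plain_without_images(docx_path, 'output.txt')` would then replace the `convert_file` call from the question. Removing an image can leave a stray space where it used to be, which this sketch does not clean up.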