JessicaTegner/pypandoc

Feature request: compatibility with pathlib.Path

RensDimmendaal opened this issue · 1 comment

Hey there, thanks for the great project!

I'm using it now and I've run into an issue: the project is not compatible with pathlib.Path.

When I run this:

from pathlib import Path
import pypandoc
fpath = Path("/path/to/my/file.docx")
output = pypandoc.convert_file(fpath,"markdown_strict")

I get this back:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Input In [27], in <cell line: 1>()
----> 1 output = pypandoc.convert_file(fpath,"markdown_strict")

File ~/.pyenv/versions/miniforge3-4.10.3-10/envs/my-project/lib/python3.9/site-packages/pypandoc/__init__.py:155, in convert_file(source_file, to, format, extra_args, encoding, outputfile, filters, verify_format, sandbox, cworkdir)
    150     return _convert_input(discovered_source_files[0], format, 'path', to, extra_args=extra_args,
    151                       outputfile=outputfile, filters=filters,
    152                       verify_format=verify_format, sandbox=sandbox,
    153                       cworkdir=cworkdir)
    154 else: # behavior for multiple  files or file patterns
--> 155     format = _identify_format_from_path(discovered_source_files[0], format)
    156     return _convert_input(discovered_source_files, format, 'path', to, extra_args=extra_args,
    157                       outputfile=outputfile, filters=filters,
    158                       verify_format=verify_format, sandbox=sandbox,
    159                       cworkdir=cworkdir)

I'm pretty sure this happens because pypandoc checks whether fpath is either a string or a list. It would be nice if it also accepted pathlib.Path objects, both single files and pathlib generators (e.g. pathlib.Path("my/dir/").glob("*.docx")).

Pathlib has some other nice benefits. For example, you can check whether the file exists (Path.exists()).

As a workaround, I just convert the Paths to strings in my own code.
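
For anyone hitting the same error, that workaround might look roughly like the sketch below. The helper names to_source and docx_sources are made up for illustration and are not part of pypandoc; the only assumption about pypandoc is that convert_file accepts a string or a list of strings, as described above.

```python
from pathlib import Path


def to_source(path):
    """Return the plain-string form of a single path (hypothetical helper)."""
    return str(path)


def docx_sources(directory):
    """Collect the *.docx files in a directory as a sorted list of strings.

    Path.glob() returns a generator, so it is materialised into a list here.
    """
    return sorted(str(p) for p in Path(directory).glob("*.docx"))


fpath = Path("/path/to/my/file.docx")
source = to_source(fpath)

# With pypandoc installed, the string would then be passed on as before:
# output = pypandoc.convert_file(source, "markdown_strict")
```

The same goes for the glob case: docx_sources("my/dir/") gives a list of strings that can be handed to convert_file in place of the generator.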

If you're open to this change, I'd be happy to submit a pull request.

Hi there.

Yes, I'm very much open to this.
I think the solution here is just to check whether it's a pathlib.Path object and, if it is, convert it to a string.
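
A minimal sketch of that check, for discussion only. This is not pypandoc's actual code: normalize_source is a hypothetical name, and where such a step would go inside convert_file is left open. It uses os.PathLike and os.fspath from the standard library, so any path-like object is covered and not just pathlib.Path.

```python
import os


def normalize_source(source):
    """Hypothetical helper: turn path-like input into plain strings.

    A single str or os.PathLike object becomes one string; anything else is
    treated as an iterable of paths (list, tuple, generator) and becomes a
    list of strings.
    """
    if isinstance(source, (str, os.PathLike)):
        return os.fspath(source)
    return [os.fspath(item) for item in source]
```

Strings pass through unchanged, so existing callers would see no difference, and a generator such as pathlib.Path("my/dir/").glob("*.docx") ends up as an ordinary list.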