Bug in load_data when using full path
yoeldk opened this issue · 2 comments
yoeldk commented
This code would fail:
full_path = 'C:\\temp\\A\\test.pdf'
documents = pdf_loader.load_data(full_path)
However, if a relative path is given, it works fine.
It looks like the issue is in file_reader.py:63
is_url = urlparse(path_or_url).scheme != ""
With a full Windows path, the parsed scheme is the drive letter (C in this case), so the path is treated as a URL instead of a local file path.
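To illustrate, here is a minimal standalone check of how `urlparse` handles the three kinds of input. It only uses the standard library; the `example.com` URL and the variable names are made up for the demonstration and are not taken from file_reader.py.

```python
from urllib.parse import urlparse

# Sample inputs (illustrative values only, not from the library's code).
windows_path = "C:\\temp\\A\\test.pdf"
relative_path = "test.pdf"
web_url = "https://example.com/test.pdf"

# urlparse treats everything before the first ":" as the scheme
# (lowercased), so the drive letter ends up there.
windows_scheme = urlparse(windows_path).scheme    # 'c'
relative_scheme = urlparse(relative_path).scheme  # ''
web_scheme = urlparse(web_url).scheme             # 'https'

# This mirrors the check quoted above: a non-empty scheme is read as "URL".
print(windows_scheme != "")   # True  -> path is wrongly treated as a URL
print(relative_scheme != "")  # False -> handled as a local path
print(web_scheme != "")       # True  -> correctly treated as a URL
```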
STageAmp commented
I am facing the same problem. Did you find a workaround?
parvpareek commented
You could just change the code to:
is_url = urlparse(path_or_url).scheme != "" and len(urlparse(path_or_url).scheme) > 2
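As a rough sketch of that workaround, the check could be wrapped in a small helper that parses the input once. The function name `looks_like_url` is hypothetical and not part of the library; the length threshold follows the suggestion above and assumes that no URL scheme you need is one or two characters long.

```python
from urllib.parse import urlparse


def looks_like_url(path_or_url: str) -> bool:
    """Hypothetical helper: guess whether the input is a URL or a local path."""
    scheme = urlparse(path_or_url).scheme
    # A one-letter scheme is almost certainly a Windows drive letter,
    # so only schemes longer than two characters count as URLs here.
    return len(scheme) > 2


print(looks_like_url("C:\\temp\\A\\test.pdf"))           # False
print(looks_like_url("test.pdf"))                        # False
print(looks_like_url("https://example.com/test.pdf"))    # True
```

A stricter variant would compare the scheme against an explicit set such as `{"http", "https"}`, at the cost of rejecting other URL schemes.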