[BUG] CalledProcessError when running the sample code
Closed · 1 comment
Summary
Running the sample code fails with a CalledProcessError.
Did you read the FAQ?
- I have read the FAQ
Did you search GitHub Discussions?
- I have searched the discussions
(Optional) PDF URL
No response
About your environment
import tabula
pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
tabula.read_pdf(pdf_path, stream=True)
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
Cell In[7], line 3
1 import tabula
2 pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
----> 3 tabula.read_pdf(pdf_path, stream=True)
File /opt/homebrew/lib/python3.11/site-packages/tabula/io.py:395, in read_pdf(input_path, output_format, encoding, java_options, pandas_options, multiple_tables, user_agent, use_raw_url, pages, guess, area, relative_area, lattice, stream, password, silent, columns, relative_columns, format, batch, output_path, force_subprocess, options)
392 raise ValueError(f"{path} is empty. Check the file, or download it manually.")
394 try:
--> 395 output = _run(
396 tabula_options,
397 java_options,
398 path,
399 encoding=encoding,
400 force_subprocess=force_subprocess,
401 )
402 finally:
403 if temporary:
File /opt/homebrew/lib/python3.11/site-packages/tabula/io.py:82, in _run(options, java_options, path, encoding, force_subprocess)
79 elif set(java_options) - IGNORED_JAVA_OPTIONS:
80 logger.warning("java_options is ignored until rebooting the Python process.")
---> 82 return _tabula_vm.call_tabula_java(options, path)
...
--> 571 raise CalledProcessError(retcode, process.args,
572 output=stdout, stderr=stderr)
573 return CompletedProcess(process.args, retcode, stdout, stderr)
CalledProcessError: Command '['java', '-Djava.awt.headless=true', '-Dfile.encoding=UTF8', '-jar', '/opt/homebrew/lib/python3.11/site-packages/tabula/tabula-1.0.5-jar-with-dependencies.jar', '--stream',
What did you do when you faced the problem?
pip install tabula-py
Ran the code below and got the error shown under "Actual behavior".
Code
import tabula
pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
tabula.read_pdf(pdf_path, stream=True)
Expected behavior
The table is read from the PDF.
Actual behavior
CalledProcessError Traceback (most recent call last)
Cell In[7], line 3
1 import tabula
2 pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
----> 3 tabula.read_pdf(pdf_path, stream=True)
File /opt/homebrew/lib/python3.11/site-packages/tabula/io.py:395, in read_pdf(input_path, output_format, encoding, java_options, pandas_options, multiple_tables, user_agent, use_raw_url, pages, guess, area, relative_area, lattice, stream, password, silent, columns, relative_columns, format, batch, output_path, force_subprocess, options)
392 raise ValueError(f"{path} is empty. Check the file, or download it manually.")
394 try:
--> 395 output = _run(
396 tabula_options,
397 java_options,
398 path,
399 encoding=encoding,
400 force_subprocess=force_subprocess,
401 )
402 finally:
403 if temporary:
File /opt/homebrew/lib/python3.11/site-packages/tabula/io.py:82, in _run(options, java_options, path, encoding, force_subprocess)
79 elif set(java_options) - IGNORED_JAVA_OPTIONS:
80 logger.warning("java_options is ignored until rebooting the Python process.")
---> 82 return _tabula_vm.call_tabula_java(options, path)
...
--> 571 raise CalledProcessError(retcode, process.args,
572 output=stdout, stderr=stderr)
573 return CompletedProcess(process.args, retcode, stdout, stderr)
CalledProcessError: Command '['java', '-Djava.awt.headless=true', '-Dfile.encoding=UTF8', '-jar', '/opt/homebrew/lib/python3.11/site-packages/tabula/tabula-1.0.5-jar-with-dependencies.jar', '--stream',
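The traceback above is cut off before any message from the Java process, so the actual cause is not visible. As a rough sketch of how to surface it: `subprocess.CalledProcessError` carries the child's `stderr` when the caller captured it. The helper `run_and_report` and the failing child command below are hypothetical stand-ins for the `java` call, not part of tabula-py.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (returncode, text).

    On success the text is stdout; on failure it is the child's stderr,
    which is usually where the real error message lives.
    """
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return done.returncode, done.stdout
    except subprocess.CalledProcessError as err:
        # err.stderr is populated because capture_output=True was used.
        return err.returncode, err.stderr


# Hypothetical failing child process standing in for the java invocation.
failing_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('boom'); sys.exit(3)",
]
code, message = run_and_report(failing_cmd)
print(code, message)
```

The same pattern applies around `tabula.read_pdf`: catch `subprocess.CalledProcessError` and inspect `err.stderr`, assuming the library captured it when launching Java.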
Related issues
No response
Please follow the issue template. You didn't provide an appropriate answer for "About your environment".
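For "About your environment", tabula-py provides `tabula.environment_info()`, whose output can be pasted into the issue. When tabula-py itself is not importable, a standard-library sketch along the following lines gathers similar details. The function name `collect_environment` and the dictionary keys are illustrative choices, not part of tabula-py.

```python
import platform
import shutil
import subprocess


def collect_environment():
    """Gather basic details that are useful in a bug report."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "java_path": shutil.which("java"),  # None if java is not on PATH
        "java_version": None,
    }
    if info["java_path"]:
        try:
            proc = subprocess.run(
                [info["java_path"], "-version"],
                capture_output=True,
                text=True,
                timeout=30,
            )
            # `java -version` normally writes its banner to stderr.
            info["java_version"] = (proc.stderr or proc.stdout).strip() or None
        except (OSError, subprocess.TimeoutExpired) as exc:
            info["java_version"] = f"could not run java: {exc}"
    return info


for key, value in collect_environment().items():
    print(f"{key}: {value}")
```

A missing `java_path`, or an error message in `java_version`, would point to a Java installation problem rather than a tabula-py bug.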