chezou/tabula-py

[BUG] issue just running sample code

Issue closed · 1 comment

Summary

Running the sample code fails with a CalledProcessError.

Did you read the FAQ?

  • I have read the FAQ

Did you search GitHub Discussions?

  • I have searched the discussions

(Optional) PDF URL

No response

About your environment

import tabula
pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
tabula.read_pdf(pdf_path, stream=True)

---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
Cell In[7], line 3
      1 import tabula
      2 pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
----> 3 tabula.read_pdf(pdf_path, stream=True)

File /opt/homebrew/lib/python3.11/site-packages/tabula/io.py:395, in read_pdf(input_path, output_format, encoding, java_options, pandas_options, multiple_tables, user_agent, use_raw_url, pages, guess, area, relative_area, lattice, stream, password, silent, columns, relative_columns, format, batch, output_path, force_subprocess, options)
    392     raise ValueError(f"{path} is empty. Check the file, or download it manually.")
    394 try:
--> 395     output = _run(
    396         tabula_options,
    397         java_options,
    398         path,
    399         encoding=encoding,
    400         force_subprocess=force_subprocess,
    401     )
    402 finally:
    403     if temporary:

File /opt/homebrew/lib/python3.11/site-packages/tabula/io.py:82, in _run(options, java_options, path, encoding, force_subprocess)
     79 elif set(java_options) - IGNORED_JAVA_OPTIONS:
     80     logger.warning("java_options is ignored until rebooting the Python process.")
---> 82 return _tabula_vm.call_tabula_java(options, path)
...
--> 571         raise CalledProcessError(retcode, process.args,
    572                                  output=stdout, stderr=stderr)
    573 return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command '['java', '-Djava.awt.headless=true', '-Dfile.encoding=UTF8', '-jar', '/opt/homebrew/lib/python3.11/site-packages/tabula/tabula-1.0.5-jar-with-dependencies.jar', '--stream',

What did you do when you faced the problem?

Installed tabula-py with pip install tabula-py, then ran the code below, which raised an error.

Code

import tabula
pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
tabula.read_pdf(pdf_path, stream=True)

Expected behavior

tabula.read_pdf reads the table from the PDF.

Actual behavior


CalledProcessError                        Traceback (most recent call last)
Cell In[7], line 3
      1 import tabula
      2 pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
----> 3 tabula.read_pdf(pdf_path, stream=True)

File /opt/homebrew/lib/python3.11/site-packages/tabula/io.py:395, in read_pdf(input_path, output_format, encoding, java_options, pandas_options, multiple_tables, user_agent, use_raw_url, pages, guess, area, relative_area, lattice, stream, password, silent, columns, relative_columns, format, batch, output_path, force_subprocess, options)
    392     raise ValueError(f"{path} is empty. Check the file, or download it manually.")
    394 try:
--> 395     output = _run(
    396         tabula_options,
    397         java_options,
    398         path,
    399         encoding=encoding,
    400         force_subprocess=force_subprocess,
    401     )
    402 finally:
    403     if temporary:

File /opt/homebrew/lib/python3.11/site-packages/tabula/io.py:82, in _run(options, java_options, path, encoding, force_subprocess)
     79 elif set(java_options) - IGNORED_JAVA_OPTIONS:
     80     logger.warning("java_options is ignored until rebooting the Python process.")
---> 82 return _tabula_vm.call_tabula_java(options, path)
...
--> 571         raise CalledProcessError(retcode, process.args,
    572                                  output=stdout, stderr=stderr)
    573 return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command '['java', '-Djava.awt.headless=true', '-Dfile.encoding=UTF8', '-jar', '/opt/homebrew/lib/python3.11/site-packages/tabula/tabula-1.0.5-jar-with-dependencies.jar', '--stream',
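
The traceback is cut off before any message from the Java process itself, and that message is usually what identifies the cause. As a minimal sketch, the stderr attached to a CalledProcessError can be inspected as follows. The failing child command is a stand-in built from the Python interpreter, not the real tabula-java invocation, and run_and_report is a hypothetical helper name.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (returncode, decoded stderr) instead of raising."""
    try:
        subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except subprocess.CalledProcessError as exc:
        # exc.stderr holds the child's stderr as bytes when it was captured.
        return exc.returncode, (exc.stderr or b"").decode(errors="replace")
    return 0, ""


# Stand-in for a failing child process (assumption: not the real java command).
failing_cmd = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
code, err = run_and_report(failing_cmd)
print(code, err)
```

Wrapping the tabula.read_pdf call in a similar try/except and printing exc.stderr may show what Java reported, assuming the library captured that stream.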

Related issues

No response

Please follow the issue template. You didn't provide an appropriate answer for "About your environment".
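
For reference, the environment details requested here can be gathered with a few lines of standard-library Python. tabula-py's documentation describes a tabula.environment_info() helper for this purpose. The sketch below is a standalone approximation, not that helper's implementation, and collect_environment is a hypothetical name.

```python
import platform
import shutil
import subprocess
import sys
from importlib import metadata


def collect_environment():
    """Gather details a maintainer typically needs to reproduce a report."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    try:
        info["tabula-py"] = metadata.version("tabula-py")
    except metadata.PackageNotFoundError:
        info["tabula-py"] = "not installed"
    java = shutil.which("java")
    if java is None:
        info["java"] = "not found on PATH"
    else:
        # `java -version` writes its report to stderr, not stdout.
        proc = subprocess.run([java, "-version"], capture_output=True, text=True)
        info["java"] = (proc.stderr or proc.stdout).strip()
    return info


for key, value in collect_environment().items():
    print(f"{key}: {value}")
```

Pasting this output under "About your environment" gives the Python, Java, and tabula-py versions along with the operating system.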