Access Excel Tables with Python | Samuel Oranyeli
Opened this issue ยท 18 comments
Access Excel Tables with Python | Samuel Oranyeli
Extract data from a defined table in a spreadsheet
hello! thank you for posting this code....was the only article i could find that shows how to bring in a table from an excel sheet with multiple tables!
i tried using your code and got an error:
AttributeError Traceback (most recent call last)
in
6 ws = wb['Sheet1']
7
----> 8 {key : value for key, value in ws.tables.items()}
9
10 for table in ws._tables:
AttributeError: 'Worksheet' object has no attribute 'tables'
what am i doing wrong?(file exists and there is 2 tables on it--)
from openpyxl import Workbook
from openpyxl import load_workbook
from openpyxl.worksheet.table import Table, TableStyleInfo
wb = load_workbook(filename=r'C:\Users\Keith\Downloads\20220123_pythondlparameters.xlsx')
ws = wb['Sheet1']
{key : value for key, value in ws.tables.items()}
Hi, @brianblakeley. Apologies for replying this late. I just saw your message. Have you resolved the issue? I have wrapped this into a function, that should make it easier to use. Let me know if you still have this issue. Again my sincere apologies for the late reply.
This is brilliant! Thank you very much!
Hello @samukweku. Im trying to use your xlsx.table function from janitor to accomplish the above, however I
m unable to import the xlsx.table function. I have successfully downloaded / installed pyjanitor and import the package into the jupyter notebook, however I can failures when I try to import xlsx.table from janitor. Do you have any advice?
@DanielPerkins472 thanks for getting in touch.
can you share your script?
import janitor as in
jn.xlsx_table
Below is the script that I had used
!pip install pyjanitor
from janitor import xlsx_table
returns error
ImportError Traceback (most recent call last)
in
----> 1 from janitor import xlsx_table
ImportError: cannot import name 'xlsx_table' from 'janitor' (C:\Users\perkind2\Anaconda3\lib\site-packages\janitor_init_.py)
I did more searching and found another thread where it looks like someone else asked a similar question (link below). Your response there was that the version of pyjanitor was outdated and to uninstall and then reinstall the latest version from github. I looked into the io.py file downloaded and it does in fact look like there is no xlsx_table function.
I tried using the pip common to download / install pyjanitor from your below post but was met with the below error
Script
!pip install git+https://github.com/pyjanitor-devs/pyjanitor.git
Returns error
Collecting git+https://github.com/pyjanitor-devs/pyjanitor.git
Cloning https://github.com/pyjanitor-devs/pyjanitor.git to c:\users\perkind2\appdata\local\temp\pip-req-build-ffliyi_i
Running command git clone -q https://github.com/pyjanitor-devs/pyjanitor.git 'C:\Users\perkind2\AppData\Local\Temp\pip-req-build-ffliyi_i'
ERROR: Error [WinError 2] The system cannot find the file specified while executing command git clone -q https://github.com/pyjanitor-devs/pyjanitor.git 'C:\Users\perkind2\AppData\Local\Temp\pip-req-build-ffliyi_i'
ERROR: Cannot find command 'git' - do you have 'git' installed and in your PATH?
https://samukweku.github.io/data-wrangling/python_pandas_access_excel_tables_with_python.html
@DanielPerkins472 Ahhhh ... You are on windows it seems. I think you should install git on your machine then retry the installation. Hopefully soon we will do a release on pyjanitor and you do not have to go through this. Lemme know how it goes
With this method, my excel formulas are being imported instead of the resultant values. Is there a way to fix this?
Hi @YehudaColton can you share a sample of the file, and the code used?
Hi @samukweku, I was able to get your dev version of janitor installed. I am now getting a new error when I try to import the xlsx_table function though. Any thoughts?
from janitor import xlsx_table
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-e5fafa9ca390> in <module>
----> 1 from janitor import xlsx_table
~\Github\pyjanitor\janitor\__init__.py in <module>
5 pass
6
----> 7 from .functions import * # noqa: F403, F401
8 from .io import * # noqa: F403, F401
9 from .math import * # noqa: F403, F401
~\Github\pyjanitor\janitor\functions\__init__.py in <module>
17
18
---> 19 from .add_columns import add_columns
20 from .also import also
21 from .bin_numeric import bin_numeric
~\Github\pyjanitor\janitor\functions\add_columns.py in <module>
1 import pandas_flavor as pf
2
----> 3 from janitor.utils import check, deprecated_alias
4 import pandas as pd
5 from typing import Union, List, Any, Tuple
~\Github\pyjanitor\janitor\utils.py in <module>
209 "The keyword argument '{argument}' of '{func_name}' is deprecated."
210 ),
--> 211 error: bool = True,
212 ) -> Callable:
213 """
TypeError: 'type' object is not subscriptable
Hi @DanielPerkins472 let me have a look. This is on a windows machine right? If you have access to a Unix machine, could you try and see if it gives the same error?
Hi @samukweku that's correct this is on a windows machine. I don`t have any Unix experience but will see what I can find out.
@DanielPerkins472 I just tested it on my Unix machine and works fine. unfortunately I do not have access to a windows machine to test it out.
Hi @YehudaColton can you share a sample of the file, and the code used?
wb = load_workbook(file_name, data_only=True)
I modified the code slightly by utilizing this 'data_only' tag and it fixed the issue. Thank you for the response however, as well as the wonderful code.
@DanielPerkins472 if you can, create a virtual environment, install only pyjanitor and see if you still get this error message. Let's isolate it and see if it is an OS issue.
I just tested on a windows VM, in a virtual environment and it works fine.
@samukweku I ended up just using the openpyxl method above in this thread and was successful. I had an outdated version of anaconda / python that was causing issues so I had to completely uninstall / reinstall and then the openpyxl method worked for me.
@samukweku This is a great resource! One query: these solutions require the tables to be "named". Is there some way to pick up contiguous cells as a table from a sheet where they have not been named as a specific table. Hoping I have explained the issue well!