Convert .docx tables to pandas DataFrames

Primary language: Python. License: MIT.

docs2df

An opinionated module that allows you to:

  • Convert tables with numeric values from multiple .docx files into pandas DataFrames
  • Preprocess those values based on the text in their columns and rows
  • Aggregate tables across documents based on their context
  • Normalize column names
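
The steps above can be sketched roughly as follows. This is a minimal illustration, not the docs2df API: every function name here (normalize_column_name, rows_to_dataframe, read_docx_tables, aggregate) is hypothetical, and the choices of treating the first row as the header and the first column as text labels are assumptions. Reading the files relies on the python-docx package (Document, .tables, .rows, .cells, .text).

```python
from __future__ import annotations

import re

import pandas as pd


def normalize_column_name(name: str) -> str:
    """Lowercase a header and collapse non-alphanumeric runs into underscores."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def rows_to_dataframe(rows: list[list[str]]) -> pd.DataFrame:
    """Build a DataFrame from table rows; the first row is assumed to be the header."""
    header, *body = rows
    df = pd.DataFrame(body, columns=[normalize_column_name(h) for h in header])
    # Assumption: the first column holds text labels, the rest hold numbers.
    for col in df.columns[1:]:
        # Drop thousands separators, then turn anything non-numeric into NaN.
        cleaned = df[col].str.replace(",", "", regex=False).str.strip()
        df[col] = pd.to_numeric(cleaned, errors="coerce")
    return df


def read_docx_tables(path: str) -> list[list[list[str]]]:
    """Return every table in a .docx file as rows of cell text (needs python-docx)."""
    from docx import Document  # imported lazily so the helpers above work without it

    return [
        [[cell.text for cell in row.cells] for row in table.rows]
        for table in Document(path).tables
    ]


def aggregate(paths: list[str]) -> pd.DataFrame:
    """Stack the tables of several documents, tagging each row with its source file."""
    frames = []
    for path in paths:
        for rows in read_docx_tables(path):
            frames.append(rows_to_dataframe(rows).assign(source=path))
    # pd.concat raises ValueError if no tables were found at all.
    return pd.concat(frames, ignore_index=True)
```

Calling aggregate(["report_2020.docx", "report_2021.docx"]) (file names are placeholders) would return a single DataFrame with a source column identifying each row's document. The actual module may decide headers, label columns, and grouping differently; sample.py is the authoritative reference.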

See sample.py for more details.