csvtype is a Python package with pybind11 bindings to C++ code that infers the column types of a CSV file. It processes the file line by line, so memory usage stays low while inference remains fast.
The column types can be defined using PCRE regex patterns.
You can find the documentation at: https://csvtype.readthedocs.io/en/latest/
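To illustrate the idea of regex-defined column types combined with line-by-line processing, here is a minimal pure-Python sketch. It does not use csvtype's API: the function `infer_column_types`, the `PATTERNS` table, and the type names are hypothetical, and Python's `re` module stands in for PCRE.

```python
import csv
import io
import re

# Hypothetical type patterns, listed from most to least specific.
# These are illustrative only, not csvtype's built-in patterns.
PATTERNS = [
    ("int", re.compile(r"[+-]?\d+")),
    ("float", re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")),
    ("date", re.compile(r"\d{4}-\d{2}-\d{2}")),
]


def infer_column_types(lines, fallback="string"):
    """Infer one type name per column from an iterable of CSV lines.

    Only the header and a short list of still-viable type names per
    column are kept in memory, so the input can be streamed row by row.
    """
    reader = csv.reader(lines)
    header = next(reader)
    # Every column starts with all pattern names as candidates.
    candidates = [[name for name, _ in PATTERNS] for _ in header]
    seen = [False] * len(header)  # has the column had a non-empty cell?
    for row in reader:
        for i, cell in enumerate(row[: len(header)]):
            if cell == "":
                continue  # treat empty cells as missing values
            seen[i] = True
            # Keep only the candidates whose pattern matches the whole cell.
            candidates[i] = [
                name for name, rx in PATTERNS
                if name in candidates[i] and rx.fullmatch(cell)
            ]
    # The first surviving candidate is the most specific one.
    return {
        col: (names[0] if names and was_seen else fallback)
        for col, names, was_seen in zip(header, candidates, seen)
    }


sample = io.StringIO(
    "id,price,day,name\n"
    "1,2.5,2020-01-31,a\n"
    "2,3,2020-02-01,b\n"
)
print(infer_column_types(sample))
# {'id': 'int', 'price': 'float', 'day': 'date', 'name': 'string'}
```

Because each row only narrows the candidate list, the memory footprint depends on the number of columns and patterns, not on the number of rows.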
Requirements: macOS >= 10.13 or (basically) any Linux distro (i686 or x86-64), with 3.6 <= Python <= 3.8.
Run
pip install csvtype
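If the install fails, it can help to check the interpreter and platform against the requirements stated above. The helper below is a small sketch, not part of csvtype; `is_supported` is a hypothetical name, and it checks only the Python version and the operating system family, not the macOS release or the CPU architecture.

```python
import platform
import sys


def is_supported(version=None, system=None):
    """Return True if Python is 3.6 to 3.8 and the OS is macOS or Linux."""
    # Compare only (major, minor); default to the running interpreter.
    version = tuple(version or sys.version_info)[:2]
    # platform.system() reports "Darwin" on macOS and "Linux" on Linux.
    system = system or platform.system()
    return (3, 6) <= version <= (3, 8) and system in ("Darwin", "Linux")


if __name__ == "__main__":
    print("supported" if is_supported() else "not supported")
```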
Requirements: macOS or (basically) any Linux distro, the Boost dynamic libraries and headers (boost and boost-devel), and a clang/g++ compiler that supports C++11 or newer.
- Clone this repository
- Run:
python setup.py build
pip install -e .
TODO
Create a virtual environment first.
python3 -m venv venv
source venv/bin/activate
Then install the necessary packages:
pip install -r requirements.txt
Documentation is auto-generated from the NumPy-style docstrings in csvtype/inferencer.py.
To build:
cd csvtype/docs
make html
Run make lint from the root folder.
Run make test from the root folder.