LLMIX

Large Language Models for Information eXtraction

A Compilation of Notes on the Use of Large Language Models (LLMs) for Information Extraction

Joseph F. Vergel-Becerra | joefaver.dev

About • Features • Contribute

About

llm-information-extraction is a Python library for training, testing, and reporting on the FTL-Pricing predictive models. It is designed to train and generate the machine learning and deep learning models that predict the base transportation cost of the FTL modality in the United States and Canada.

Features

llm-information-extraction is built on Python 3.11 and uses pandas, numpy, scikit-learn, matplotlib, seaborn, and plotly, among others, to preprocess the data, build the machine learning models, and visualize the results.

For development, the library uses:

Formatting with black
Import sorting with isort
Linting with flake8
Git hooks that run all the above with pre-commit
Testing with pytest
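
As a rough illustration of how the tools listed above fit together, the sketch below runs the same checks in sequence from Python. The `CHECKS` table, the `run_checks` helper, and the choice of flags are assumptions made for illustration and are not part of this repository; in day-to-day use, pre-commit already runs the formatter, import sorter, and linter on each commit.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical check table: one command line per tool named in the list above.
CHECKS: list[list[str]] = [
    ["black", "--check", "."],       # report formatting problems without rewriting files
    ["isort", "--check-only", "."],  # report unsorted imports without rewriting files
    ["flake8"],                      # lint the current directory
    ["pytest"],                      # run the test suite
]


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[[Sequence[str]], int] = subprocess.call,
) -> list[str]:
    """Run each command in order and return the names of the tools that failed."""
    failed = []
    for command in checks:
        # A non-zero exit code means the tool reported a problem.
        if runner(command) != 0:
            failed.append(command[0])
    return failed
```

Calling `run_checks()` inside the project's virtual environment returns an empty list when every tool passes. If a tool is not installed, `subprocess.call` raises `FileNotFoundError` rather than returning an exit code.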

Contribute

Before enabling pipenv, make sure you have Python 3.11 installed. If your installed version is different, you can create a conda environment with:

# Create and activate a Python 3.11 virtual environment
$ conda create -n py311 python=3.11
$ conda activate py311
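
To confirm that the activated environment provides the expected interpreter, a small check like the following can be run. This is a minimal sketch: the `is_supported_python` helper and the `REQUIRED` constant are hypothetical names, and the exact match on major and minor version is an assumption, since this README only states that Python 3.11 is required.

```python
import sys

# Version named in this README; treated here as an exact major.minor requirement (assumption).
REQUIRED = (3, 11)


def is_supported_python(version=None, required=REQUIRED) -> bool:
    """Return True when the major.minor part of `version` equals `required`."""
    if version is None:
        version = sys.version_info
    # Only the first two components (major, minor) are compared.
    return tuple(version[:2]) == tuple(required)
```

For example, `is_supported_python()` evaluates the running interpreter, while `is_supported_python((3, 9, 18))` returns `False`.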

Now you can manage the project dependencies with Pipenv. To create the virtual environment and install all dependencies, run:

# Install pipx if pipenv and cookiecutter are not installed
$ python3 -m pip install pipx
$ python3 -m pipx ensurepath

# Install pipenv using pipx
$ pipx install pipenv

# Create pipenv virtual environment
$ pipenv shell

# Install dependencies
$ pipenv install --dev

Once the dependencies are installed, we need to notify Jupyter of this new Python environment by creating a kernel:

$ ipython kernel install --user --name KERNEL_NAME
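
One way to verify that the kernel was registered is to inspect the output of `jupyter kernelspec list --json`. The sketch below parses that JSON and looks for the kernel name; the `kernel_is_registered` helper is a hypothetical name, and the case-insensitive comparison is a cautious assumption about how kernel names are stored.

```python
import json


def kernel_is_registered(name: str, listing_json: str) -> bool:
    """Check whether `name` appears in `jupyter kernelspec list --json` output."""
    # The listing is expected to hold kernel names as keys under "kernelspecs".
    specs = json.loads(listing_json).get("kernelspecs", {})
    # Compare case-insensitively (assumption) so KERNEL_NAME and kernel_name both match.
    return name.lower() in {key.lower() for key in specs}


# Usage sketch (requires Jupyter to be installed in the active environment):
#   import subprocess
#   listing = subprocess.run(
#       ["jupyter", "kernelspec", "list", "--json"],
#       capture_output=True, text=True, check=True,
#   ).stdout
#   kernel_is_registered("KERNEL_NAME", listing)
```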

joefavergel/llm-information-extraction
