
Enforce structured output from LLMs 100% of the time

Primary language: Python. License: MIT.

Trex

Transformer Regular EXpressions

Transform unstructured to structured data

Trex transforms your unstructured data into structured data: just specify a regex or context-free grammar, and we'll intelligently restructure your data so that it conforms to that schema.
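To make "conforms to that schema" concrete for the regex case, here is a minimal sketch using only Python's standard `re` module. It does not call Trex; the date pattern and the `conforms` helper are hypothetical names chosen for illustration. It only shows what it means for a whole string to match a pattern.

```python
import re

# Hypothetical target format: an ISO-style date such as 2001-09-11.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def conforms(text: str, pattern: re.Pattern) -> bool:
    """Return True only if the entire string matches the pattern."""
    return pattern.fullmatch(text) is not None


print(conforms("2001-09-11", DATE_PATTERN))  # True
print(conforms("9/11/2001", DATE_PATTERN))   # False
```

`fullmatch` is used rather than `search` because a structured-output constraint applies to the whole string, not to a substring of it.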

Installation

To experiment with Trex, check out the playground.

To install the Python client:

pip install git+https://github.com/automorphic-ai/trex.git

If you'd like to self-host this in your own cloud / with your own model, email us.

Usage

To use Trex, you'll need an API key, which you can get by signing up for a free account at automorphic.ai.

import trex

tx = trex.Trex('<YOUR_AUTOMORPHIC_API_KEY>')
prompt = '''generate a valid json object of the following format:

{
    "name": "string",
    "age": "number",
    "height": "number",
    "pets": pet[]
}

in the above object, name is a string corresponding to the name of the person, age is a number corresponding to the age of the person, height is a number corresponding to the height of the person in inches as an integer, and pets is an array of pets.

where pet is defined as:
{
    "name": "string",
    "species": "string",
    "cost": "number",
    "dob": "string"
}

in the above object name is a string corresponding to the name of the pet, species is a string corresponding to the species of the pet, cost is a number corresponding to the cost of the pet, and dob is a string corresponding to the date of birth of the pet.

given the above, generate a valid json object containing the following data: one human named dave 30 years old 5 foot 8 with a single dog pet named 'trex'. the dog cost $100 and was born on 9/11/2001.
'''

json_schema = {
    "type": "object",
    "properties": {
        "name": {
            "type": "string"
        },
        "age": {
            "type": "number"
        },
        "height": {
            "type": "number"
        },
        "pets": {
            "type": "array",
            "items": [{
                "type": "object",
                "properties": {
                    "name": {
                        "type": "string"
                    },
                    "species": {
                        "type": "string"
                    },
                    "cost": {
                        "type": "number"
                    },
                    "dob": {
                        "type": "string"
                    }
                }
            }]
        }
    }
}

print(tx.generate_json(prompt, json_schema=json_schema).response)
# the above produces output such as:
# {
#     "name": "dave",
#     "age": 30,
#     "height": 58,
#     "pets": [
#         {
#             "name": "trex",
#             "species": "dog",
#             "cost": 100,
#             "dob": "2008-10-27"
#         }
#     ]
# }
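
If you want to double-check a response on the client side, something like the following sketch may help. It assumes the `.response` value is a JSON string, as the printed output above suggests. The names `PERSON_FIELDS`, `PET_FIELDS`, `check_fields`, and `check_person` are hypothetical helpers, not part of Trex. The check is also stricter than the schema above, since it treats every field as required.

```python
import json

# Expected fields and the Python types they map to after json.loads.
PERSON_FIELDS = {"name": str, "age": (int, float), "height": (int, float), "pets": list}
PET_FIELDS = {"name": str, "species": str, "cost": (int, float), "dob": str}


def check_fields(obj, fields):
    """Return a list of problems found in obj; an empty list means it passed."""
    if not isinstance(obj, dict):
        return [f"expected an object, got {type(obj).__name__}"]
    problems = []
    for key, expected in fields.items():
        if key not in obj:
            problems.append(f"missing field: {key}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(obj[key], bool) or not isinstance(obj[key], expected):
            problems.append(f"wrong type for field: {key}")
    return problems


def check_person(raw: str):
    """Parse a JSON string and check it against the person/pet layout."""
    person = json.loads(raw)
    problems = check_fields(person, PERSON_FIELDS)
    if isinstance(person, dict) and isinstance(person.get("pets"), list):
        for i, pet in enumerate(person["pets"]):
            problems += [f"pets[{i}]: {p}" for p in check_fields(pet, PET_FIELDS)]
    return problems


sample = (
    '{"name": "dave", "age": 30, "height": 58, "pets": '
    '[{"name": "trex", "species": "dog", "cost": 100, "dob": "2008-10-27"}]}'
)
print(check_person(sample))  # prints []
```

In practice you would pass the response string from `tx.generate_json(...)` to `check_person` instead of the hard-coded `sample`.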

Roadmap

  • Structured JSON generation
  • Structured custom CFG generation
  • Structured custom regex generation
  • SIGNIFICANT speed improvements
  • Generation from JSON schema
  • Auto-prompt generation for unstructured ETL
  • More intelligent models

Join our Discord or email us if you're interested in Trex, need help using it, have ideas, or want to contribute.

Follow us on Twitter for updates.