Context-aware structured outputs. Search your documents or the web for specific data and get it back in JSON or Markdown.

Marly


Marly allows your agents to extract tables and text from your PDFs, PowerPoint files, and other documents in a structured format, making it easy for them to take subsequent actions (a database call, an API call, creating a chart, etc.).


🚀 Features

  • 📄 Give your agents the ability to find what's relevant in large documents, extract it, and get it back in JSON with a single API call.
  • 🔍 Extract data based on multiple schemas from numerous documents without a vector database or specifying page numbers.
  • 🔄 Built-in caching to enable instant retrieval of previously extracted schemas, allowing for rapid repeat extractions without reprocessing the original documents.
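
To make the caching feature above concrete, here is a minimal sketch of how repeat extractions can be served without reprocessing a document. It is not Marly's implementation: `cache_key`, `cached_extract`, and the `extract_fn` callback are hypothetical names, and the in-memory dictionary stands in for whatever store the platform actually uses. The assumption is that a result can be keyed by the document contents plus the schema.

```python
import hashlib
import json

# Hypothetical in-memory cache; a real deployment would likely use a persistent store.
_cache = {}


def cache_key(document_bytes, schema):
    """Derive a stable key from the document contents and the schema."""
    digest = hashlib.sha256()
    digest.update(document_bytes)
    # sort_keys makes the key independent of the order of the schema's fields.
    digest.update(json.dumps(schema, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


def cached_extract(document_bytes, schema, extract_fn):
    """Return a cached result if present; otherwise call extract_fn once and store it."""
    key = cache_key(document_bytes, schema)
    if key not in _cache:
        _cache[key] = extract_fn(document_bytes, schema)
    return _cache[key]
```

With this layout, the first call for a given document and schema pays the extraction cost, and later calls with the same pair return the stored result.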

🧰 What is a Schema?

A schema is a set of key-value pairs describing what needs to be extracted from a particular document (JSON format).

📋 Example Schema
{
    "Firm": "The name of the firm",
    "Number of Funds": "The number of funds managed by the firm",
    "Commitment": "The commitment amount in millions of dollars",
    "% of Total Comm": "The percentage of total commitment",
    "Exposure (FMV + Unfunded)": "The exposure including fair market value and unfunded commitments in millions of dollars",
    "% of Total Exposure": "The percentage of total exposure",
    "TVPI": "Total Value to Paid-In multiple",
    "Net IRR": "Net Internal Rate of Return as a percentage"
}
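
As a rough illustration, the schema above can be built as a plain Python dictionary and serialized to JSON. The `validate_schema` helper is hypothetical and not part of the Marly package; it only checks the shape described here (a flat mapping of field names to descriptions) under the assumption that both keys and values are non-empty strings.

```python
import json

# Each key is a field to extract; each value describes that field in plain language.
schema = {
    "Firm": "The name of the firm",
    "Number of Funds": "The number of funds managed by the firm",
    "Commitment": "The commitment amount in millions of dollars",
    "% of Total Comm": "The percentage of total commitment",
    "Exposure (FMV + Unfunded)": "The exposure including fair market value and unfunded commitments in millions of dollars",
    "% of Total Exposure": "The percentage of total exposure",
    "TVPI": "Total Value to Paid-In multiple",
    "Net IRR": "Net Internal Rate of Return as a percentage",
}


def validate_schema(candidate):
    """Hypothetical check: a flat, non-empty dict of non-empty string keys and values."""
    if not isinstance(candidate, dict) or not candidate:
        raise ValueError("schema must be a non-empty dict")
    for key, description in candidate.items():
        if not isinstance(key, str) or not key.strip():
            raise ValueError(f"invalid field name: {key!r}")
        if not isinstance(description, str) or not description.strip():
            raise ValueError(f"missing description for field: {key!r}")


validate_schema(schema)

# Serialize to the JSON form shown above.
schema_json = json.dumps(schema, indent=4)
print(schema_json)
```

Writing each description as a full phrase, including units such as "in millions of dollars", keeps the schema readable on its own.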

🎯 Use Cases

  • 💼 Financial Report Analysis: Extract key financial metrics from quarterly PDF reports.
  • 📊 Customer Feedback Processing: Categorize feedback from various document types.
  • 🔬 Research Assistant: Process research papers, extracting methodologies and findings.
  • 🧠 Legal Contract Parsing: Extract key legal terms and conditions from contracts.

🛠️ Getting Started

Install the Python Package


To install the Python package, run the following command:

pip install marly

Build the Platform


To build the platform from source, run the following command:

docker-compose up --build

Run an Example Extraction or Notebook

  1. Navigate to the example scripts or example notebooks folder:

    cd example_scripts

    or

    cd example_notebooks
  2. Run the example extraction script:

    python azure_example.py

📚 Documentation

For more detailed information, please refer to our documentation.


🤝 Contributing

We welcome contributions! Please see our Contributing Guide for more details.

📄 License

This project is licensed under the MIT License.