Low-code Python-Based ETL for structured and unstructured data.
To install amphi-etl, run the following command:
pip install amphi-etl
Note
If you prefer to install Amphi's JupyterLab extension, use pip install jupyterlab-amphi in your environment. More information here.
To start Amphi, simply run:
amphi start
Use the -w and -p parameters to specify your workspace (the directory where you can access files and create pipelines on your system) and the port to use:
amphi start -w /your/workspace/path -p 8888
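If you want to start Amphi from a Python script, for example as part of a local setup routine, the command above can be assembled programmatically. The following is a minimal sketch: the helper names build_start_command and launch are hypothetical, and it assumes that -w and -p can each be omitted independently, which the commands above suggest but do not state.

```python
import subprocess


def build_start_command(workspace=None, port=None):
    """Build the argument list for `amphi start` with optional -w and -p."""
    command = ["amphi", "start"]
    if workspace is not None:
        command += ["-w", str(workspace)]
    if port is not None:
        command += ["-p", str(port)]
    return command


def launch(workspace=None, port=None):
    """Run Amphi in the foreground; requires amphi-etl to be installed."""
    return subprocess.run(build_start_command(workspace, port), check=True)
```

Calling launch("/your/workspace/path", 8888) would then be equivalent to the command line shown above.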
Note
Amphi focuses on structured and unstructured data manipulation for data and AI pipelines. It aims to empower data scientists and data engineers to easily develop pipelines with an intuitive low-code interface while generating Python code you can deploy anywhere.
Modern ETL for the AI age:
- 🧑‍💻 Low-code: Accelerate data and AI pipeline development and reduce maintenance time.
- 🐍 Python-code Generation: Generate native Python code leveraging common libraries such as pandas, DuckDB and LangChain that you can run anywhere.
- 🔒 Private and Secure: Self-host Amphi on your laptop or in the cloud for complete privacy and security over your data.
Structured & Unstructured
- 🔢 Structured - Import data from various sources, including CSV and Parquet files, as well as databases. Transform structured data using aggregation, filters, joins, SQL queries, and more. Export the transformed data into common files or databases.
- 📝 Unstructured - Extract data from PDFs, Word documents, and websites (HTML). Perform parsing, chunking and embedding processing. Load the processed data into vector stores such as Pinecone and ChromaDB.
- 🔁 Convert - Easily convert structured data into unstructured documents for vector stores, and vice versa, for RAG pipelines.
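To make the three steps above concrete, here is a small sketch of the same flow in plain Python: read structured rows, filter them, convert each row into a text document, and split the documents into overlapping chunks. It is not the code Amphi generates (which relies on libraries such as pandas, DuckDB and LangChain). It uses only the standard library, and the sample data and all function names are hypothetical. Embedding and loading into a vector store are left out because they depend on external services.

```python
import csv
import io

# Hypothetical sample data; a real pipeline would read a file or database.
SAMPLE_CSV = """id,product,category,review
1,Kettle,kitchen,Boils quickly and the handle stays cool.
2,Lamp,office,Bright enough for reading but the switch feels fragile.
3,Mug,kitchen,Keeps coffee warm for about an hour.
"""


def read_rows(csv_text):
    """Structured input: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def filter_rows(rows, category):
    """Structured transform: keep only rows in the given category."""
    return [row for row in rows if row["category"] == category]


def row_to_document(row):
    """Convert: flatten one structured row into a text document."""
    return f"{row['product']} ({row['category']}): {row['review']}"


def chunk_text(text, chunk_size, overlap):
    """Unstructured step: split text into fixed-size character chunks
    where consecutive chunks share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


rows = filter_rows(read_rows(SAMPLE_CSV), "kitchen")
documents = [row_to_document(row) for row in rows]
chunks = [c for doc in documents for c in chunk_text(doc, chunk_size=40, overlap=10)]
```

Character-based chunking is the simplest option. Splitting on sentences or tokens is common in practice, but the overlap idea stays the same.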
Features In Progress
- Custom components - Add the ability to develop your own component and wrap configured ones
- Implement connections - Add the ability to securely create connections to reuse in components
- Developer documentation - Write comprehensive documentation to allow extensions
- Save Components - Save components and reuse them in other pipelines
- Use and Innovate: Try Amphi and share your use case with us. Your real-world usage and feedback help us improve our product.
- Voice Your Insights: Encounter a glitch? Have a query? Share them by submitting issues and help us enhance the user experience.
- Shape the Future: Have code enhancements or feature ideas? We invite you to propose pull requests and contribute directly.
Every contribution, big or small, is celebrated. Join us in our mission to refine and elevate the world of ETL for data and AI. 😃
Amphi is available as an extension for JupyterLab, and Amphi ETL is itself built on JupyterLab, so JupyterLab extensions can also be installed in Amphi ETL.
- JupyterLab - JupyterLab computational environment.
- jupyterlab-git - A Git extension for JupyterLab.
Copyright © 2024 - present Amphi Labs.
This project is ELv2 licensed.