Backend repository and playground for experimental datagovmy AI/ML services:
- 👨💻 Open API Documentation Assistant (See in action)
- 📈 MyDataGPT Assistant (Coming Soon)
- Install pyenv and then use it to install the Python version in
.python-version
. - Create virtual environment in root directory with
python -m venv env
- Activate virtual environment.
- Install pip-tools first with
python -m pip install pip-tools
. - Run
make init
to install all dependencies for this project. - Create your own
.env
file from.env.example
. - Navigate to the application folder to run (eg.
src/assistant
) and runuvicorn app:app --reload
- Interact with the chat endpoint is at
/chat
This project has the following dependencies:
Click on the link of the respective projects to find out how to set them up for your environment.
Built to assist developers in getting started with using the data.gov.my open API. The docs assistant is a Retrieval Augmented Generation (RAG) application powered by OpenAI's gpt-3.5-turbo
model. Its data pipeline indexes .mdx
files from the API documentation in the datagovmy-front repository and stores embeddings in a Weaviate vectorstore for retrieval.
This assistant is part of a bigger effort to build a one-stop data assistant for the nation's open data designed to eventually answer data queries and show insights on all data released on data.gov.my. Similar to the docs assistant, it is also an RAG application that leverages a Weaviate vector index loaded with metadata. This is currently under development. Stay tuned!
- Bahasa Malaysia (BM) language support - in our testing with BM queries, GPT-3.5 tends to lean towards responses that sound more like Bahasa Indonesia despite our best efforts in prompting. YMMV, but more work to be done here!
Full understanding of the Data Catalogue API fields (coming soon)
This is an experimental product that utilizes the OpenAI API. It is provided for testing and educational purposes only. The government and its representatives make no warranties or guarantees regarding the accuracy, completeness, or suitability of the information provided by this product.
Thank you for your willingness to contribute to this free and open source project by the Malaysian public sector! When contributing, consider first discussing your desired change with the core team via GitHub issues or discussions!
data.gov.my is licensed under MIT
Copyright © 2023 Government of Malaysia