/DB-GPT

Primary LanguageJavaScriptApache License 2.0Apache-2.0

LLM As Database Administrator

NewsFeaturesQuickStartCasesCommunityRelevant Projects



🧗 Database administrators (DBAs) play a crucial role in managing, maintaining and optimizing a database system to ensure data availability, performance, and reliability. However, it is hard and tedious for DBAs to manage a large number of database instances. Thus, we propose D-Bot, a LLM-based database administrator that can acquire database maintenance experience from textual sources, and provide reasonable, well-founded, in-time diagnosis and optimization advice for target databases.

What's New

Features

  • Well-Founded Diagnosis: D-Bot can provide founded diagnosis by utilizing relevant database knoweledge (with document2experience).

  • Practical Tool Utilization: D-Bot can utilize both monitoring and optimization tools to improve the maintenance capability (with tool learning and tree of thought).

  • In-depth Reasoning: Compared with vanilla LLMs, D-Bot will achieve competitive reasoning capability to analyze root causes (with multi-llm communications).



A demo of using D-Bot

db_diag.mp4

QuickStart

Current version is developed from agentverse and bmtools, to which we previously contributed.



Prerequisites

  • PostgreSQL v12 or higher

    Add database settings into config.ini and rename into my_config.ini:

    [postgresql]
    host = xxx.xxx.xxx.xxx
    port = 5432
    user = xxx
    password = xxx
    dbname = postgres

    Additionally, install extensions like pg_stat_statements (track slow queries) and pg_hint_plan (optimize physical operators)

  • Prometheus and Grafana (tutorial)

Installation

Step1: Install python packages.

pip install -r requirements.txt

Step2: Configure environment variables.

  • Export your OpenAI API key
# Export your OpenAI API key
export OPENAI_API_KEY="your_api_key_here"
  • If accessing openai service via vpn, execute this command:
export https_proxy=http://127.0.0.1:7890 http_proxy=http://127.0.0.1:7890 all_proxy=socks5://127.0.0.1:7890

Preparation

Diagnosis Knowledge

  • Extract knowledge from both code (./knowledge_json/knowledge_from_code) and documents (./knowledge_json/knowledge_from_document).

Tool Usage

  • Extract hundreds of tool APIs to carry out different monitoring/optimization functions (./tool_learning/tool_apis).

    • Check the update results and sync to api.py.
  • Start bmtools service.

cd tool_learning/bmtools
python host_local_tools.py
  • Test tool usage (basic tree of thought algorithm within ./tool_learning/tree_of_thought).
cd tool_learning/tree_of_thought
python test_database.py

History messages may take up many tokens, and so carefully decide the turn number.

Anomaly Generation & Detection

Within the anomaly_scripts directory, we offer scripts that could incur typical anomalies, e.g.,

(1) ./run_benchmark_tpcc.sh or ./run_db_exception.sh

Example Anomalies: INSERT_LARGE_DATA, IO_CONTENTION
monitoring dashboard


(2) ./run_benchmark_job.sh

Example Anomalies: POOR_JOIN_PERFORMANCE, CPU_CONTENTION
monitoring dashboard


(3) ./run_benchmark_tpch.sh

Example Anomalies: FETCH_LARGE_DATA (lineitem with 28GB); CORRELATED_SUBQUERY
monitoring dashboard

Diagnosis & Optimization

Command-line Interface

python3 main.py --task db_diag

Website Interface

We also provide a local website demo for this environment. You can launch it with

python3 main_demo.py --task db_diag

After successfully launching the local server, you can visit http://127.0.0.1:7860/ to trigger the diagnosis procedure (click the Start Autoplay button).

Prompt Template Generation

Derive high-quality prompt templates from a small number of collected samples (splitting into training and evaluation sets), e.g.,

  1. Prompts for index tuning optimization
cd prompt_template_scripts/index_tuning
./run.sh
  1. Prompts for query rewrite optimization
cd prompt_template_scripts/query_rewrite
./run.sh

Cases

INSUFFICIENT_MEMORY_ALLOCATION

POOR_JOIN_PERFORMANCE
case_poor_join5.mp4

Todo

  • Project cleaning
  • None-resource problem diagnosis
  • (framework update) Integrate components as a whole
  • Public generated anomaly training data
  • Fine-tune open-source Model
  • Support other databases like MySQL
  • Collect more knowledge and store in vector db (./knowledge_vector_db)

The listed items are urgent, which we will fix within this month.

Community

Relevant Projects

https://github.com/OpenBMB/AgentVerse

https://github.com/OpenBMB/BMTools

https://github.com/OpenBMB/ToolBench