DBEDA is an experimental data analysis framework designed for database performance monitoring in a Jupyter environment. This framework combines server and client components to collect, visualize, and analyze performance data from integrated databases.
- Start the DBEDA server and client using Docker Compose:
docker compose up
To set up the server component, follow these steps:
- Run these commands to set up the server:
service postgresql start cd /root/DBEDA/server pip install -r server_requirements.txt python3 server_collector.py
To set up the client component, follow these steps:
- Run these commands to set up the client:
service postgresql start cd /root/DBEDA/client pip install -r client_requirements.txt jupyter lab --allow-root
Click tutorial.ipynb
Register the configuration of the database for collecting performance data:
from client_side import *
config = connect_db(db_type='postgres', host='dbeda-client', database='test_cli', user='postgres', password='postgres', port='5434')
collect_performance_data(config)
Execute a widget to visualize the collected performance data:
visualize(config)
On the left, you can verify which table the collected performance data is currently stored in.
You can specify the performance table to visualize using the Tables widget and set the time interval with the Time Range widget.
In the Task widget, you can select various database performance analysis tasks. For basic performance metric charts, you can choose the 'metrics' task.
To visualize the performance data, use the Data widget to select the data, specify the type, and click the Draw button. The selected chart will then be added below.
The overall appearance of the visualization component is as follows:
Extract the desired performance data:
data = query_performance_data(config, table='os_metric', metrics='cpu_percent', task='metrics', recent_time_window='1 day')
df_metric = pd.DataFrame(data['metric'])
df
The data collected is displayed in the form of a DataFrame, similar to the image above.
Train a model, retrieve the trained model, and make predictions:
response = train(config, train_df, 'load prediction', pipeline='RNN')
get_trained_model(config, 'load prediction')
predicted = predict(config, 'load prediction', metric='tps', path="darts_TCN_20230523_150814.pickle")
Contributions to the DBEDA framework are welcome. If you have suggestions or improvements, please feel free to open issues or submit pull requests.