This repository provides a library for the VetBiz PowerBI dashboard's Python scripts.
- Python 3.9 or above
- Poetry
pip install poetry
poetry install
poetry build
Built wheel file is located in dist
folder
from vetbiz_extractor.utils.common import (
fetch_data_in_batches,
fetch_xero_journals_data_from_etani,
)
from vetbiz_extractor.core.insights_extractor import (
get_follow_up_consults,
get_dental_sales_after_consultation,
get_lapsed_clients,
get_filtered_active_customers,
)
# Etani database connection details
etani_db_user = "..."
etani_db_password = "..."
etani_db_server = "..."
etani_db_name = "TAZTECH_d434a453-c859-41d1-93b7-f367a1572792"
journals_table_names = [
"TAZTECH_CLIENT3_XEROBLUE_Journals",
"TAZTECH_CLIENT4_XEROBLUE_Journals",
"TAZTECH_CLIENT5_XEROBLUE_Journals",
"TAZTECH_CLIENT6_XEROBLUE_Journals",
"TAZTECH_CLIENT7_XEROBLUE_Journals",
"TAZTECH_CLIENT8_XEROBLUE_Journals",
"TAZTECH_CLIENT9_XEROBLUE_Journals",
]
A_Journals = fetch_xero_journals_data_from_etani(
etani_db_server,
etani_db_user,
etani_db_password,
etani_db_name,
journals_table_names,
)
# VetBiz data warehouse database connection details
db_user = "..."
db_password = "..."
db_host = "..."
db_name = "..."
sales_query
SELECT s.*,
p.practice_name,
c.clinic_name,
cu.customer_id,
cu.name,
d.date_field AS invoice_date,
(s.unit_cost + s.fixed_cost) AS total_cost,
(s.unit_sale + s.fixed_sale) AS total_sale,
d.year,
d.month,
d.month_desc,
d.month_short_desc,
pr.product_name
FROM f_sales s
LEFT JOIN d_practice p ON p.practice_tk = s.practice_tk
LEFT JOIN d_clinic c ON c.clinic_tk = s.clinic_tk
LEFT JOIN d_customer cu ON cu.customer_tk = s.customer_tk
LEFT JOIN d_product pr ON pr.product_tk = s.product_tk
LEFT JOIN d_date d ON d.date_tk = s.invoice_date_tk;
sales_data = fetch_data_in_batches(
query=sales_query,
db_host=db_host,
db_user=db_user,
db_password=db_password,
db_name=db_name,
)
customers_query
SELECT cu.*, d.date_field, d.year, d.month, d.month_desc, d.month_short_desc, p.practice_name, c.clinic_name
FROM d_customer cu
LEFT JOIN d_date d ON d.date_tk = cu.created_tk
LEFT JOIN d_practice p ON p.practice_tk = cu.practice_tk
LEFT JOIN d_clinic c ON c.clinic_tk = cu.clinic_tk
WHERE cu.active = 1 AND cu.created_tk IS NOT NULL;
customers_data = fetch_data_in_batches(
query=customers_query,
db_host=db_host,
db_user=db_user,
db_password=db_password,
db_name=db_name,
)
Customers from f_sales
table
customers_from_sales_data_query
SELECT s.visit_id, s.sale_id, s.clinic_tk, cus.customer_tk, cus.customer_id, cus.active,
pat.patient_id, p.practice_name, c.clinic_name, u.user_name, pr.product_name,
s.unit_cost, s.fixed_cost, s.unit_sale, s.fixed_sale, d.date_tk, d.date_field,
d.month, d.year, u.clinician
FROM f_sales s
JOIN d_user u ON u.user_tk = s.user_tk
JOIN d_customer cus ON cus.customer_tk = s.customer_tk
JOIN d_clinic c ON c.clinic_tk = s.clinic_tk
JOIN d_practice p ON p.practice_tk = c.practice_tk
JOIN d_product pr ON pr.product_tk = s.product_tk
JOIN d_date d ON d.date_tk = s.invoice_date_tk
JOIN d_patient pat ON pat.patient_tk = s.patient_tk
# Get filtered active customers
customers_from_sales_data = fetch_data_in_batches(
query=customers_from_sales_data_query,
db_host=db_host,
db_user=db_user,
db_password=db_password,
db_name=db_name,
)
# requires sales data as input
data = get_follow_up_consults(sales_data)
# requires sales data as input
data = get_dental_sales_after_consultation(sales_data)
# requires sales data as input
data = get_lapsed_clients(sales_data)
# requires customers from sales data as input
data = get_filtered_active_customers(customers_from_sales_data)
Fetches data from multiple Xero journals tables in the Etani SQL database and combines them into a single DataFrame.
Parameters:
db_server (str)
: The Etani database server address.db_user (str)
: The username for the Etani's database.db_password (str)
: The password for the Etani's database user.db_name (str)
: The name of the Etani's database.journals_tables_list (List[str])
: A list of journal table names to fetch data from.batch_size (int)
: An batch size for fetching data.query_limit (Optional[int])
: An optional limit on the number of rows per table.
Returns:
pd.DataFrame
: A DataFrame containing the combined data from the specified journal tables.
Fetch data from the database in batches.
Parameters:
query (str)
: SQL query to execute.db_user (str)
: Database user.db_password (str)
: Database password.db_host (str)
: Database host.db_port (int)
: Database port (default is 3306).db_name (str)
: Database name.batch_size (int)
: Number of rows to fetch per batch (default is 10000).
Returns:
pd.DataFrame
: A DataFrame with the fetched data.
Filter the sales data to retrieve follow-up consults within a specified days threshold.
Parameters:
sales_data (pd.DataFrame)
: DataFrame containing sales data.days_threshold (int)
: Number of days to define the follow-up threshold (default is 14 days).
Returns:
pd.DataFrame
: DataFrame filtered for follow-up consults within the specified days threshold.
Filter the sales data to retrieve dental sales made after consultations within a specified days threshold.
Parameters:
sales_data (pd.DataFrame)
: DataFrame containing sales data.days_threshold (int)
: Number of days to define the threshold for sales after consultation (default is 14 days).
Returns:
pd.DataFrame
: DataFrame filtered for dental sales made after consultations within the specified days threshold.
Identify and filter lapsed clients from the sales data. A lapsed client is defined as a client who has not made any purchases since a specified start year.
Parameters:
sales_data (pd.DataFrame)
: Pandas DataFrame containing sales data. It must include a 'last_purchase_date' column with dates of the last purchase.start_year (int)
: integer representing the start year from which to measure inactivity (default is 2018).
Returns:
pd.DataFrame
: A DataFrame filtered to include only the lapsed clients who have not made a purchase since the start year.
Filter the customers who have been active within a specified number of months since a given start year.
Parameters:
customers_from_sales_data_df (pd.DataFrame)
: DataFrame containing customer sales data. It must include a 'last_purchase_date' column.start_year (int)
: integer representing the start year from which to measure activity (default is 2020).months_threshold (int)
: Number of months within which a customer must have made a purchase to be considered active (default is 18 months).
Returns:
pd.DataFrame
: A DataFrame filtered to include only the active customers.