Bike Sharing API
For original README (Introduction of capstone project) visit OLD README
SETUP
Pengaturan environment dapat menggunakan environment.yml
(mengutamakan channel conda-forge
). Untuk pip dapat menggunakan requirements.txt
. Environment ini digunakan saat menggunakan VSCODE (ipykernel) dan test API (requests).
- Install langsung menggunakan conda
conda env create -f "https://raw.githubusercontent.com/taruma/bikeshareapi/main/environment.yml"
TEST FOR RUBRIC
1 [1] Created Flask App
- Menjalankan aplikasi flask
app.py
.
2 [2] Created functionality to read to get specific data from the database
- Mengambil data berdasarkan
id
dari databasetrips
.
3 [4] Created functionality to input new data into each table for the databases
- Memasukkan data
austin_bikeshare_trips_2021.csv
ke databasetrips
.
4 [3] Created static endpoints which return analytical result (must be different from point 2,3)
- Memperoleh daftar tipe subscriber yang diurutkan dari terbanyak hingga tersedikit. Return: JSON (dari
DataFrame
).
def top_subscribers(conn):
query = """
SELECT
trips.subscriber_type,
COUNT(subscriber_type) AS total_subscriber
FROM trips
GROUP BY trips.subscriber_type
ORDER BY total_subscriber DESC
"""
return pd.read_sql_query(query, conn)
# ...
### Static Endpoint
@app.route('/trips/top_subscribers/')
def route_trips_top_subscriber():
conn = make_connection()
top_list = top_subscribers(conn)
return top_list.to_json()
5 [3] Created dynamic endpoints which return analytical result (must be different from point 2,3,4)
- Menampilkan jumlah penggunaan bike pada hari Senin-Minggu pada stasiun tertentu (
start_station_id
), diurutkan dari yang tertinggi hingga terendah. RETURN: JSON (fromDataFrame
) atau Text (jika station tidak ditemukan).
def top_days_start_in_station(start_station_id, conn):
try:
return (
pd.read_sql_query(
f"""
SELECT bikeid, start_time, start_station_id
FROM trips
WHERE trips.start_station_id = {start_station_id}
""",
conn,
parse_dates='start_time'
).assign(
start_day_of_week=lambda x: x.start_time.dt.day_name()
).pivot_table(
index='start_day_of_week',
values='start_time',
aggfunc='count'
).sort_values('start_time', ascending=False)
)
except:
return f"STATION ID {start_station_id} NOT FOUND!"
### ...
### dynamic endpoints
@app.route('/trips/top_days_start_in_station/<station_id>')
def route_top_days_start_in_station(station_id):
conn = make_connection()
top_days = top_days_start_in_station(station_id, conn)
return top_days.to_json() if isinstance(top_days, pd.DataFrame) else top_days
6 [3] Created POST endpoint which receive input data, then utilize it to get analytical result (must be different from point 2,3,4,5)
- Mengirimkan informasi
subscriber_type
dan bulan (dalam bentuk angka00
) dan memperoleh informasi daftar informasi jumlah penggunaan bike, rata-rata + median + maksimum durasi penggunaan bike yang diagregasi setiap tahun dan hari Senin-Minggu. RETURN: JSON (dariDataFrame
).
def top_duration_mean(subscriber_type, month_number, conn):
return (
pd.read_sql_query(
f"""
SELECT *
FROM trips
WHERE
trips.subscriber_type LIKE '{subscriber_type}'
AND trips.start_time LIKE '%-{month_number}-%'
""",
conn,
parse_dates='start_time'
)
.assign(
start_year=lambda df: df.start_time.dt.year,
start_dow=lambda df: df.start_time.dt.day_name()
)
.groupby(['start_year', 'start_dow']).agg({
'bikeid': 'count',
'duration_minutes': ['median', 'mean', 'max'],
})
.droplevel(0, axis='columns')
.sort_values(['start_year', 'mean'], ascending=[False, False], axis='index')
.rename({'median': 'duration_median', 'mean':'duration_mean', 'max':'duration_max'}, axis='columns')
)
# ...
### post endpoint
@app.route('/trips/top_duration_mean', methods=['POST'])
def route_top_duration_mean():
data_input = request.get_json(force=True)
conn = make_connection()
return top_duration_mean(conn=conn, **data_input).to_json()
Kekurangan / TO-DO LIST
- penggunaan nama fungsi yang masih dibilang kurang singkat dan jelas. Seperti fungsi
top_subscriber()
yang sebaiknya menjaditop_subscribers_type()
. - endpoint harusnya menggunakan dash
-
dibandingkan_
. Info: Hyphen, underscore, or camelCase as word delimiter in URIs? - pada endpoint
/trips/top_duration_mean
, seharusnya jangan menggunakan unpacking dictionary pada fungsinya, untuk menghindari error ketika menerima keyword yang tidak ada. - memisahkan logic function ke file yang berbeda.
- validasi data sebelum diproses. (data yang hilang, dlsbnya).
- untuk operasi
count
seharusnya mengacu padabikeid
. - single output endpoint, jadi endpointnya mengeluarkan nilai, bukan
dataframe.to_json()
. - refactor kode. š¤£