[LoadBrightHub] Add new function to pull the cleaning log from BrightHub
stephenholleran opened this issue · 3 comments
stephenholleran commented
The BrightHub wind resource data management platform is opening up a new API to allow users to pull the cleaning logs for a particular measurement station. Without this new function, a user would have to log into BrightHub and manually download the cleaning log file and then load that file into the brightwind library.
I expect the new function would look something like the below:
def get_cleaning_log(measurement_station_uuid):
    """
    Get the cleaning logs for the measurement station.
    :param measurement_station_uuid: A specific measurement station's uuid.
    :type measurement_station_uuid: str
    :return: The cleaning logs for the measurement station.
    :rtype: pd.DataFrame
    """
    ........
    return cleaning_log_df
And be used this way:
# Get the cleaning logs and timeseries data from BrightHub.
cleaning_log = bw.LoadBrightHub.get_cleaning_log(measurement_station_uuid='9344e576-6d5a-45f0-9750-2a7528ebfa14')
data = bw.LoadBrightHub.get_data(measurement_station_uuid='9344e576-6d5a-45f0-9750-2a7528ebfa14')
# Apply the cleaning logs to the data resulting in a dataset that is ready to work with.
data_clnd = bw.apply_cleaning(data, cleaning_log, sensor_col_name='MeasurementName',
                              date_from_col_name='DateFrom', date_to_col_name='DateTo')
The new URL is yet to be released.
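For anyone unfamiliar with what applying a cleaning log involves, here is a minimal pandas sketch of the idea. It is not brightwind's actual `apply_cleaning` implementation: the function name `apply_cleaning_sketch`, the sample sensor names, and the choice to treat each period as inclusive of `DateFrom` and exclusive of `DateTo` are all assumptions made for illustration. Only the three column names come from the usage example above.

```python
import numpy as np
import pandas as pd


def apply_cleaning_sketch(data, cleaning_log, sensor_col_name="MeasurementName",
                          date_from_col_name="DateFrom", date_to_col_name="DateTo"):
    """Return a copy of `data` with flagged periods set to NaN (illustrative only)."""
    cleaned = data.copy()
    for _, row in cleaning_log.iterrows():
        sensor = row[sensor_col_name]
        if sensor not in cleaned.columns:
            continue  # skip log entries for sensors that are not in the data
        date_from = pd.to_datetime(row[date_from_col_name])
        date_to = pd.to_datetime(row[date_to_col_name])
        # Assumption: the period includes DateFrom and excludes DateTo.
        mask = (cleaned.index >= date_from) & (cleaned.index < date_to)
        cleaned.loc[mask, sensor] = np.nan
    return cleaned


# Hypothetical data: two sensors logged every 10 minutes.
index = pd.date_range("2020-01-01 00:00", periods=6, freq="10min")
data = pd.DataFrame({"Spd80mN": [5.0, 5.1, 5.2, 5.3, 5.4, 5.5],
                     "Dir78mS": [180.0, 181.0, 182.0, 183.0, 184.0, 185.0]},
                    index=index)
cleaning_log = pd.DataFrame({"MeasurementName": ["Spd80mN"],
                             "DateFrom": ["2020-01-01 00:10"],
                             "DateTo": ["2020-01-01 00:30"]})
data_clnd = apply_cleaning_sketch(data, cleaning_log)
```

In this sketch only the `Spd80mN` readings inside the flagged period are blanked out; the original `data` and the other sensor column are left unchanged.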
BiancaMorandi commented
@dancasey-ie, @shwetajoshi601 what is the URL that I should use to pull the cleaning logs for a particular measurement station?
stephenholleran commented
Suggested code:
import requests
import pandas as pd
from io import StringIO
import brightwind as bw


def get_cleaning_log(measurement_station_uuid):
    """
    Get the cleaning log from BrightHub for a particular measurement station.
    :param measurement_station_uuid: A specific measurement station's uuid.
    :type measurement_station_uuid: str
    :return: The cleaning logs for the measurement station.
    :rtype: pd.DataFrame
    """
    # Ask BrightHub for a pre-signed URL that points at the cleaning log file.
    response = bw.LoadBrightHub._brighthub_request(
        url_end="/measurement-locations/{}/cleaning-log".format(measurement_station_uuid))
    response_json = response.json()
    if 'Error' in response_json:  # catch if error comes back e.g. the measurement station uuid isn't found
        raise ValueError(response_json['Error'])
    pre_signed_url = response_json["url"]
    # Download the cleaning log CSV from the pre-signed URL and load it into a DataFrame.
    cleaning_log_response = requests.get(pre_signed_url)
    cleaning_log_response.raise_for_status()  # fail early if the download itself did not succeed
    return pd.read_csv(StringIO(cleaning_log_response.text))
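The two steps in the suggested code (checking the JSON response for an error, then parsing the downloaded CSV) could also be split into small helpers that can be tested without calling BrightHub. This is only a sketch: the helper names `extract_pre_signed_url` and `parse_cleaning_log` are hypothetical, the sample payloads are made up, and it assumes the CSV carries the `DateFrom` and `DateTo` columns used in the usage example earlier in this issue.

```python
from io import StringIO

import pandas as pd


def extract_pre_signed_url(response_json):
    """Return the pre-signed URL, or raise if the API reported an error."""
    if "Error" in response_json:
        raise ValueError(response_json["Error"])
    return response_json["url"]


def parse_cleaning_log(csv_text):
    """Parse cleaning log CSV text, converting the date columns to datetimes."""
    cleaning_log = pd.read_csv(StringIO(csv_text))
    # Assumption: the date columns are named DateFrom and DateTo.
    for col in ("DateFrom", "DateTo"):
        if col in cleaning_log.columns:
            cleaning_log[col] = pd.to_datetime(cleaning_log[col])
    return cleaning_log


# Hypothetical payloads standing in for real BrightHub responses.
url = extract_pre_signed_url({"url": "https://example.com/cleaning-log.csv"})
sample_csv = ("MeasurementName,DateFrom,DateTo\n"
              "Spd80mN,2020-01-01 00:10,2020-01-01 00:30\n")
cleaning_log = parse_cleaning_log(sample_csv)
```

Converting the date columns up front would save callers from having to do it before filtering by period, though whether that belongs in `get_cleaning_log` or in `apply_cleaning` is a design choice for the maintainers.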