brightwind-dev/brightwind

[LoadBrightHub] Add new function to pull the cleaning log from BrightHub

stephenholleran opened this issue · 3 comments

The BrightHub wind resource data management platform is opening up a new API that allows users to pull the cleaning logs for a particular measurement station. Without this new function, a user has to log into BrightHub, manually download the cleaning log file, and then load that file into the brightwind library.

I expect the new function would look something like this:

def get_cleaning_log(measurement_station_uuid):
    """
    Get the cleaning logs for the measurement station.
    
    :param measurement_station_uuid: A specific measurement station's uuid.
    :type measurement_station_uuid:  str
    :return:                         The cleaning logs for the measurement station.
    :rtype:                          pd.DataFrame
    """
    ........
    return cleaning_log_df

And be used this way:

# Get the cleaning logs and timeseries data from BrightHub.
cleaning_log = bw.LoadBrightHub.get_cleaning_log(measurement_station_uuid='9344e576-6d5a-45f0-9750-2a7528ebfa14')
data = bw.LoadBrightHub.get_data(measurement_station_uuid='9344e576-6d5a-45f0-9750-2a7528ebfa14')

# Apply the cleaning logs to the data resulting in a dataset that is ready to work with.
data_clnd = bw.apply_cleaning(data, cleaning_log, sensor_col_name='MeasurementName',
                              date_from_col_name='DateFrom', date_to_col_name='DateTo')
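
To make the expected shape of the cleaning log concrete, here is a rough sketch of the idea behind applying one. It is not brightwind's `apply_cleaning` implementation. The sample data, the sensor names and the helper `mask_flagged_periods` are all made up, and treating `DateTo` as exclusive is an assumption. Only the column names `MeasurementName`, `DateFrom` and `DateTo` come from the usage above.

```python
import numpy as np
import pandas as pd

# Made-up timeseries: one column per sensor, indexed by timestamp.
index = pd.date_range("2020-01-01 00:00", periods=6, freq="10min")
data = pd.DataFrame({"Spd_80m": [5.1, 5.3, 0.0, 0.0, 5.6, 5.8],
                     "Dir_78m": [180.0, 182.0, 181.0, 183.0, 184.0, 185.0]},
                    index=index)

# Made-up cleaning log with one flagged period for one sensor.
cleaning_log = pd.DataFrame({"MeasurementName": ["Spd_80m"],
                             "DateFrom": ["2020-01-01 00:20"],
                             "DateTo": ["2020-01-01 00:40"]})


def mask_flagged_periods(data, cleaning_log):
    """Hypothetical helper: set flagged periods to NaN for the named sensor column."""
    cleaned = data.copy()  # leave the raw data untouched
    for _, row in cleaning_log.iterrows():
        start = pd.Timestamp(row["DateFrom"])
        end = pd.Timestamp(row["DateTo"])
        # Assumption: DateFrom is inclusive and DateTo is exclusive.
        in_period = (cleaned.index >= start) & (cleaned.index < end)
        cleaned.loc[in_period, row["MeasurementName"]] = np.nan
    return cleaned


cleaned = mask_flagged_periods(data, cleaning_log)
```

With this made-up log, only the two `Spd_80m` readings at 00:20 and 00:30 are blanked; `Dir_78m` and the original `data` are left as they were.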

The new URL is yet to be released.

@dancasey-ie, @shwetajoshi601 what is the URL that I should use to pull the cleaning logs for a particular measurement station?

Suggested code:

import requests
import pandas as pd
from io import StringIO
import brightwind as bw

def get_cleaning_log(measurement_station_uuid):
    """
    Get the cleaning log from BrightHub for a particular measurement station.

    :param measurement_station_uuid: A specific measurement station's uuid.
    :type measurement_station_uuid:  str
    :return:                         The cleaning logs for the measurement station.
    :rtype:                          pd.DataFrame
    """
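    # Ask BrightHub for a pre-signed URL to the cleaning log file.
    response = bw.LoadBrightHub._brighthub_request(url_end="/measurement-locations/{}/cleaning-log"
                                                   .format(measurement_station_uuid))
    response_json = response.json()
    if 'Error' in response_json:  # catch if error comes back e.g. measurement_station_uuid isn't found
        raise ValueError(response_json['Error'])

    # Download the cleaning log from the pre-signed URL and load it into a DataFrame.
    pre_signed_url = response_json["url"]
    cleaning_log_response = requests.get(pre_signed_url)
    cleaning_log_response.raise_for_status()  # fail loudly if the download itself did not succeed
    return pd.read_csv(StringIO(cleaning_log_response.text))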
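
The error check and URL extraction in the suggested code can be exercised without calling BrightHub. Below is a minimal sketch, assuming the decoded response has the shape implied above: an `'Error'` key on failure and a `'url'` key on success. The helper name `extract_pre_signed_url` and both sample payloads are hypothetical and are not part of brightwind or the BrightHub API.

```python
def extract_pre_signed_url(response_json):
    """Return the pre-signed URL from a decoded response, or raise if an error came back."""
    if 'Error' in response_json:
        raise ValueError(response_json['Error'])
    return response_json['url']


# Hypothetical payloads, made up for illustration only.
ok_payload = {'url': 'https://example.com/cleaning-log.csv'}
error_payload = {'Error': 'Measurement station not found.'}

url = extract_pre_signed_url(ok_payload)

try:
    extract_pre_signed_url(error_payload)
    error_message = None
except ValueError as err:
    error_message = str(err)
```

Pulling this step into its own small function would let a unit test cover the error path without any network access.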