/HistoricalWeatherTW

Taiwan historical weather crawler

Primary language: Python. License: MIT.



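1   HistoricalWeatherTW: Taiwan historical weather crawler

This script crawls historical weather observations from the 觀測資料查詢 (observation data query) website.

Data source: 觀測資料查詢系統 (observation data query system)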

2   Usage

  1. Prepare station.csv.

    Note

    station.csv can be obtained from https://e-service.cwb.gov.tw/wdps/obs/state.htm

  2. Call the collect_weather_tw function to crawl the data and write the result:

    def collect_weather_tw(station_csv_path: Path, output_path,
                           end_date: datetime.date, begin_date: datetime.date,
                           query_format,
                           convert2num):

Refer to __init__.py for more help:

  1. Prepare config.yaml and pass its path as the input parameter.
  2. Run __init__.py.

2.1   Quick start

from Carson.Tool.HistoricalWeatherTW import collect_weather_tw, QueryFormat
from pathlib import Path
import datetime
import os

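if __name__ == '__main__':
    STATION_CSV = Path('../config/CSV/station_test.csv')
    OUTPUT_PATH = Path('../temp/year_result.csv')
    BEGIN_DATE = datetime.date(2019, 10, 1)
    END_DATE = datetime.date(2019, 10, 2)
    QUERY_FORMAT = QueryFormat.DAY
    CONVERT2NUM = True  # replace non-numeric codes such as V and T with numbers
    # The dates are passed by keyword so their order cannot be mixed up.
    collect_weather_tw(STATION_CSV, OUTPUT_PATH,
                       begin_date=BEGIN_DATE, end_date=END_DATE,
                       query_format=QUERY_FORMAT,
                       convert2num=CONVERT2NUM)
    if hasattr(os, 'startfile'):  # os.startfile exists only on Windows
        os.startfile(OUTPUT_PATH)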

3   Data

The output depends on the QueryFormat you choose (year, month, or day).

Note

The original data contains some non-numeric values, such as V (variable wind direction) and T (trace rainfall). When CONVERT2NUM is true, these values are replaced with numbers.
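As a rough illustration of the replacement described in the note, the sketch below maps special codes to numbers and parses everything else as a float. The names SPECIAL_CODES and to_number are hypothetical, and the numeric values chosen for T and V are placeholders, not the values the project actually uses.

```python
from typing import Dict, Union

# Hypothetical replacement table. The numbers the project really uses are not
# documented in this README, so these values are placeholders only.
SPECIAL_CODES: Dict[str, float] = {
    'T': 0.0,   # trace rainfall: assumed here to count as zero
    'V': -1.0,  # variable wind direction: assumed sentinel value
}


def to_number(cell: str, codes: Dict[str, float] = SPECIAL_CODES) -> Union[float, str]:
    """Return a float for a known code or numeric text; otherwise return the text unchanged."""
    text = cell.strip()
    if text in codes:
        return codes[text]
    try:
        return float(text)
    except ValueError:
        # Unknown non-numeric content is left as-is rather than guessed at.
        return text


row = ['25.3', 'T', 'V', '...']
print([to_number(cell) for cell in row])  # [25.3, 0.0, -1.0, '...']
```

Leaving unknown codes untouched keeps unexpected values visible in the output instead of silently turning them into numbers.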

4   Release note

4.1   v4.0

Encapsulated as an API.

4.2   v3.0

Features:

  • All output goes into a single file, which makes it easier to load into SQL.
  • The output header fields are read automatically from the web page instead of being hard-coded.
  • You can choose the query type (year, month, or day) to suit your needs.

Other:

  • Made the code easier to read.
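Since the single output file is meant to be easy to use with SQL, here is a minimal sketch of loading such a CSV into SQLite with the standard library. The sample data, the column names, and the helper load_csv_into_sqlite are hypothetical; the real header comes from the web page and will differ.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for the crawler's output file.
SAMPLE_CSV = (
    "station,date,temperature\n"
    "466920,2019-10-01,27.1\n"
    "466920,2019-10-02,26.4\n"
)


def load_csv_into_sqlite(csv_file, connection, table='weather'):
    """Create `table` from the CSV header and insert every data row as text."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # Quote identifiers so header names with spaces or symbols stay valid.
    columns = ', '.join('"{}" TEXT'.format(name.replace('"', '""')) for name in header)
    table_name = '"{}"'.format(table.replace('"', '""'))
    connection.execute(f'CREATE TABLE {table_name} ({columns})')
    placeholders = ', '.join('?' for _ in header)
    # Rows whose length does not match the header are skipped.
    rows = [r for r in reader if len(r) == len(header)]
    connection.executemany(f'INSERT INTO {table_name} VALUES ({placeholders})', rows)
    return len(rows)


conn = sqlite3.connect(':memory:')
inserted = load_csv_into_sqlite(io.StringIO(SAMPLE_CSV), conn)
print(inserted)  # number of rows inserted
```

To load a real file, open it with open(path, newline='', encoding='utf-8') and pass the file object in place of the io.StringIO sample; the encoding is an assumption and may need adjusting.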

4.3   v2.0

Added observation stations across all of Taiwan.

4.4   v1.0

First version.