Gidari is a "web-to-storage" tool for querying web APIs and persisting the resulting data onto local storage. A configuration file defines how this querying and storing should occur. Once you have a configuration file, you can initiate this transport using the command `gidari --config <configuration.yml>`. See here for a quick demonstration.
TODO
Using Gidari in command mode is a two-step process:
- Create a configuration file to instruct the binary on how to make the RESTful HTTP requests and where to store the data
- Run `gidari --config your_configuration.yml --verbose`
The `configuration.yml` file defines a set of rules for making RESTful HTTP requests and where to store the data. See here for example configurations.
Key | Required | Type | Description |
---|---|---|---|
url | T | string | The API base URL |
authentication | F | map | Data required for authenticating the web API HTTP Requests |
authentication.apiKey.passphrase | T | string | |
authentication.apiKey.Key | T | string | |
authentication.apiKey.Secret | T | string | |
authentication.auth2.Bearer | T | string | |
connectionString | T | list | List of connection strings for communication with storage |
rateLimit | T | map | Data required for limiting the number of requests per second, avoiding 429 errors |
rateLimit.burst | T | uint | Number of requests that can be made per second |
rateLimit.period | T | uint | Period for the rateLimit.burst |
truncate | F | bool | Truncate all tables in the database before performing upserts |
requests | F | list | List of requests to receive data from the web API for upserting into storage |
request.endpoint | T | string | Endpoint for making the RESTful API request |
request.table | F | string | Name of the table in the storage for upserting data. This field defaults to the last string in the endpoint path |
request.timeseries | F | map | Data required for upserting timeseries data, which are batched and can be resource intensive |
request.timeseries.startName | T | string | Name of the query/path parameter for the "start" datetime of the timeseries |
request.timeseries.endName | T | string | Name of the query/path parameter for the "end" datetime of the timeseries |
request.timeseries.period | T | uint | How often (in seconds) to build a new datetime range to batch. |
request.timeseries.layout | T | string | The layout for how to build a datetime to query over (e.g. RFC3339 would be "2006-01-02T15:04:05Z07:00") |
request.query | F | map | A hash of data that holds the query parameters for a request |
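
To make the table above concrete, here is a minimal sketch of what a `configuration.yml` might look like. It is an illustration only: the nesting is inferred from the dotted key names in the table (each `request.*` key is assumed to belong to an item of the `requests` list), and the URL, endpoints, connection string, query parameter, and all numeric values are hypothetical placeholders rather than values taken from the project.

```yaml
# Hypothetical configuration.yml; every value below is a placeholder.
url: https://api.example.com              # the API base URL

connectionString:
  # Assumed example of a NoSQL connection string; use the one for your storage.
  - mongodb://localhost:27017/example

rateLimit:
  burst: 3                                # number of requests (rateLimit.burst)
  period: 1                               # period for the burst (rateLimit.period)

truncate: false                           # do not truncate tables before upserting

requests:
  # "table" is omitted here, so it defaults to the last string in the
  # endpoint path ("accounts").
  - endpoint: /accounts

  # A timeseries request: the range is split into batches of "period" seconds,
  # and each batch is sent using the "start" and "end" parameter names.
  - endpoint: /products/example/candles
    table: candles
    timeseries:
      startName: start
      endName: end
      period: 18000
      layout: "2006-01-02T15:04:05Z07:00" # RFC3339 layout
    query:
      granularity: 60                     # hypothetical query parameter
```

The optional `authentication` map is left out of this sketch; if the API requires it, add it at the top level using the keys listed in the table.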
TODO
The NoSQL use case should require no overhead from the user. Just include the connection string in the `connectionString` list of the configuration file.
The `repository` and `proto` packages are the only packages in the application that expose a stable, public-facing API. Their purpose is to communicate CRUD requests to the storage devices used in web-to-storage transfers.
Follow this guide for information on contributing.
- Public REST APIs: https://documenter.getpostman.com/view/8854915/Szf7znEe
- Artwork by Victoria Trum