New report generation routes
As part of the audio event summary reports, the client needs information to create the species accumulation data plot, species composition plot, analysis coverage plots, and false colour spectrograms.
"accumulationData": [
{ "date": "2020-01-03T00:00:00.000Z", "countOfSpecies": 1, "error": 0.5 },
{ "date": "2020-01-04T00:00:00.000Z", "countOfSpecies": 2, "error": 0.5 }
],
"speciesCompositionData": [
// each of these data points covers one bin
{
"date": "2020-01-03T00:00:00.000Z",
"values": [
{ "tagId": 2, "ratio": 0.7 },
{ "tagId": 1, "ratio": 0.3 }
]
},
{
"date": "2020-01-04T00:00:00.000Z",
"values": [
{ "tagId": 1, "ratio": 0.5 },
{ "tagId": 4, "ratio": 0.5 }
]
}
// etc...
],
False colour spectrograms should be returned as a URL, or as a collection of URLs.
A collection is needed because false colour spectrograms can only be created for 24 hour periods; a multi-day report therefore needs a sorted collection of URLs, one per day.
"falseColorSpectrograms": [ "https://api.ecosounds.org/false_color123.png" ],
We should either:
- a: Expose each of these graphs under different atomic routes
- b: Expose these graphs under a common route (e.g.
/{project,region,site}/:id/audio_events/graphs
) and have filter conditions
It is also up for consideration to generate the species accumulationData
and speciesCompositionData
client side. However, this may be poor practice and would slow down the client.
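For scale, deriving the species accumulation curve client side would look roughly like the sketch below (the AudioEvent shape and the function name are illustrative, not part of any spec). The cost is not the computation itself but having to download every matching event first.

```typescript
// Illustrative sketch only: deriving species accumulation client side.
// Assumes events (with a date and tagId) have already been fetched, which
// for a month-long search span could be tens of thousands of records.
interface AudioEvent {
  date: string; // ISO 8601 day the event was detected
  tagId: number;
}

function accumulationData(events: AudioEvent[]): { date: string; countOfSpecies: number }[] {
  const seen = new Set<number>();
  const byDate = new Map<string, number>();

  for (const event of [...events].sort((a, b) => a.date.localeCompare(b.date))) {
    seen.add(event.tagId);
    byDate.set(event.date, seen.size); // running total of distinct tags so far
  }

  return Array.from(byDate, ([date, countOfSpecies]) => ({ date, countOfSpecies }));
}
```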
OK, a couple of notes:
In lieu of a generalized model of endpoint aggregation, each report will probably have its own endpoint.
This means you need to include the full payload you want me to represent in this report.
Also: the false colour image generation will be handled by a different means.
I'll be updating the ticket with the full payload, routes, and request structure needed for the "Ecoacoustic event summary report"
I've attached the full request, including the filter parameters the client will be sending, and the expected response in the dropdowns below
POST /reports/event_summary/filter
Request (Filter)
"filters": {
"project": {
"id": { "eq": 123 }
},
"region": {
"id": { "eq": 321 }
},
"sites": {
"in": [ 1, 2, 3, 4, 5, 6 ]
},
"startDate": "2020-01-01T00:00:00.000Z", // new virtual columns need to be added to the audio event api
"endDate": "2020-01-03T00:00:00.000Z",
// note: the two time-of-day conditions below reuse the same keys for illustration;
// a real request would combine the forms, since JSON objects cannot repeat keys
"startDate": {
"gt": {
"value": "12:12",
"expression": ["time_of_day", "local_tz"]
}
},
"endDate": {
"lte": {
"value": "12:12",
"expression": ["time_of_day", "local_tz"]
}
},
"or": [
{
"provenance.id": {
"eq": 2 // e.g. BirdNet
}
},
{
"provenance.id": {
"eq": 5 // e.g. Lance's recogniser
}
}
],
"score": {
"gteq": 0.7
},
"tags.id": {
"in": [ 1, 2, 3, 4, 5, 6 ]
},
"analysisJob.id": {
"eq": "system" // this can be a user defined analysis job
}
},
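The ["time_of_day", "local_tz"] expression is new, so to make the intended semantics concrete: the server would project each event's start time into the site's local timezone, keep only the time-of-day component, and compare that against the supplied "HH:mm" value. A client-side equivalent might look like the sketch below (illustrative only; matchesTimeOfDay is not a real API).

```typescript
// Illustrative only: what a ["time_of_day", "local_tz"] "gt" condition means.
function matchesTimeOfDay(eventIsoUtc: string, timeZone: string, gt: string): boolean {
  // Project the UTC instant into the site's local timezone, keeping "HH:mm".
  const local = new Intl.DateTimeFormat("en-GB", {
    timeZone,
    hour: "2-digit",
    minute: "2-digit",
    hourCycle: "h23",
  }).format(new Date(eventIsoUtc));
  // Zero-padded "HH:mm" strings order correctly under lexicographic comparison.
  return local > gt;
}
```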
Response
{
"generatedDate": "2023-01-03T00:00:00.000Z", // not required for the on-the-fly report generation needed at the moment, but for future saved reports this will be useful
"eventGroups": [
{
"provenanceId": 5, // provenance id, we need to pull the full model
"tagId": 1, // tag id, we need to get the tag text and tag name from the api pulled model
"detections": 23,
"binsWithDetections": 10,
"binsWithInterference": [
{
"name": "wind",
"value": 4
},
{
"name": "rain",
"value": 2
}
],
// this is the confidence plot for the event
"score": {
// the histogram will have 100 bins
"histogram": [ 0.1, 0.2, 0.3, 0.25 ],
"standardDeviation": 0.05,
"mean": 0.25,
"min": 0.1,
"max": 0.3
}
},
{
"provenanceId": 6,
"tagId": 1, // this same tag id links it to the identified event above from a different provenanceId
"detections": 11,
"binsWithDetections": 1,
"binsWithInterference": [
{
"name": "wind",
"value": 4
},
{
"name": "rain",
"value": 2
}
],
// this is the confidence plot for the event
"score": {
// the histogram will have 100 bins
"histogram": [ 0.1, 0.2, 0.3, 0.25 ],
"standardDeviation": 0.05,
"mean": 0.25,
"min": 0.1,
"max": 0.3
}
}
],
// requested through the FILTER /{project,region,site}/:id/audio_events/reports api route
"graphs": {
"accumulationData": [
// each of these data points covers one bin
{ "date": "2020-01-03T00:00:00.000Z", "countOfSpecies": 1, "error": 0.5 },
{ "date": "2020-01-04T00:00:00.000Z", "countOfSpecies": 2, "error": 0.5 }
],
"speciesCompositionData": [
// each of these data points covers one bin
{
"date": "2020-01-03T00:00:00.000Z",
"values": [
{ "tagId": 2, "ratio": 0.7 },
{ "tagId": 1, "ratio": 0.3 }
]
},
{
"date": "2020-01-04T00:00:00.000Z",
"values": [
{ "tagId": 1, "ratio": 0.5 },
{ "tagId": 4, "ratio": 0.5 }
]
}
// etc...
],
"analysisCoverage": [
// each of these data points covers one bin
{
"date": "2020-01-02T00:00:00.000Z",
"audioCoverage": 0.5,
"analysisCoverage": 0.4
}
]
},
"statistics": {
"totalSearchSpan": 2592000, // 1 month in seconds
"audioCoverageOverSearchSpan": 2092000, // there are a few missing recordings in this month
"analysisCoverageOverSearchSpan": 2002000, // a few of the audio recordings are being analysed or could not be processed
"countOfRecordingsAnalyzed": 100,
"coverageStartDay": "2020-01-01T00:00:00.000Z",
"coverageEndDay": "2020-01-31T00:00:00.000Z"
},
"locations": [ 1, 2, 3, 4 ] // these ids are site ids
}
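To make the score document above concrete, here is a hedged sketch of how the per-group summary could be computed. The fixed [0, 1] histogram domain and the function name are assumptions; the spec above only says the histogram will have 100 bins.

```typescript
// Sketch: computing a score summary (histogram + basic statistics) for one
// event group. Assumes scores are confidences in [0, 1].
interface ScoreSummary {
  histogram: number[];
  standardDeviation: number;
  mean: number;
  min: number;
  max: number;
}

function summarizeScores(scores: number[], binCount = 100): ScoreSummary {
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  const mean = scores.reduce((a, b) => a + b, 0) / scores.length;
  const variance = scores.reduce((a, s) => a + (s - mean) ** 2, 0) / scores.length;

  // Fixed [0, 1] domain so histograms from different provenances are comparable.
  const histogram = new Array(binCount).fill(0);
  for (const s of scores) {
    const bin = Math.min(Math.floor(s * binCount), binCount - 1);
    histogram[bin] += 1;
  }

  return { histogram, standardDeviation: Math.sqrt(variance), mean, min, max };
}
```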
Additionally, there will be a new provenance route for recognisers / event creators, most likely under
GET /provenance/:id
//? response
{
"id": 123,
"name": "BirdNet",
"version": "1.0.0",
"description": "An avian event detector",
"score": 0.5, //* stretch goal
"score_minimum": 0.1,
"score_maximum": 0.8
}
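On the client, the proposed response could be modelled as below. Provenance and parseProvenance are illustrative names, not part of the spec.

```typescript
// Sketch of a client-side model for the proposed GET /provenance/:id response.
interface Provenance {
  id: number;
  name: string;
  version: string; // string so versions like "1.0.0-beta0.1" are representable
  description: string;
  score?: number; // stretch goal
  score_minimum?: number;
  score_maximum?: number;
}

// Narrowing guard so a fetched payload can be used safely.
function parseProvenance(json: unknown): Provenance {
  const p = json as Partial<Provenance>;
  if (typeof p.id !== "number" || typeof p.name !== "string") {
    throw new Error("payload is not a provenance model");
  }
  return p as Provenance;
}
```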
In the request spec above, there is no field for binSize. How will we send bin size in the requests?
My assumption is it will be one of the following:
- A number in seconds representing the duration OR
{
"binSize": 23123124
}
- hard coded values ("day", "month", "year", "season") that can be derived server side
{
"binSize": "season"
}
Great question, currently undecided.
If we can imagine an argument for variable sized bins, then maybe seconds as a number.
We will probably go with an enum, mainly because calendar intervals (e.g. months) are irregularly sized.
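The irregularity argument can be made concrete: with an enum, the server can resolve each bucket against the calendar, which a single fixed number of seconds cannot express. A sketch (names illustrative, not the proposed API):

```typescript
// Sketch: calendar buckets have irregular lengths, so "month" cannot be
// replaced by one constant number of seconds.
type BucketSize = "day" | "month" | "year" | "season";

function bucketLengthInDays(size: BucketSize, year: number, monthIndex: number): number {
  switch (size) {
    case "day":
      return 1;
    case "month":
      // Day 0 of the following month is the last day of this month.
      return new Date(Date.UTC(year, monthIndex + 1, 0)).getUTCDate();
    case "year":
      // If Feb 29 exists (stays in month index 1), the year is a leap year.
      return new Date(Date.UTC(year, 1, 29)).getUTCMonth() === 1 ? 366 : 365;
    case "season":
      return 91; // approximation; real seasons also vary in length
  }
}
```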
Since time series is now in scope for the acoustic event report, the response should include a document for recording coverage and analysis coverage.
After discussion, I've done a mock-up of what the data structure will be client side (image below).
The ITimeSeriesGraph
will probably be implemented as an additional document under graphs.coverageData.
You've mentioned that false colour spectrograms will be handled by different means
Also: the false colour image generation will be handled by a different means.
I'm assuming that spectrograms will be fetched through either:
a. A new endpoint to which we can provide a date range, and get a list of spectrogram URLs returned
or
b. The spectrograms will be analysis result items, so we should just request the analysis results and perform a filter-map client side to extract and collate the spectrograms
https://github.com/QutEcoacoustics/baw-client/blob/master/src/app/visualize/visualize.js
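If option (b) is chosen, the client-side filter-map could look like the sketch below. The AnalysisResultItem shape and the file naming convention are assumptions, since the analysis result layout is not specified here.

```typescript
// Sketch of option (b): filter-map over analysis result items to collate
// false colour spectrogram URLs. Item shape and naming are assumptions.
interface AnalysisResultItem {
  name: string;
  path: string;
  type: "file" | "directory";
}

function extractSpectrogramUrls(items: AnalysisResultItem[], baseUrl: string): string[] {
  return items
    .filter((item) => item.type === "file" && /false.?colou?r.*\.png$/i.test(item.name))
    .map((item) => `${baseUrl}${item.path}`)
    .sort(); // sorted so the consecutive 24 hour periods render in order
}
```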
Finalised models & services for api
Event Provenances
Expected server-side model (event provenances)
id :bigint not null, primary key
name :string
version :string # (this is a string because we want to support version numbers like "1.0.0-beta0.1")
description :string
score :decimal # the example payloads use fractional scores (e.g. 0.5)
Expected responses
GET /provenance/:id
(show)
Request body:
This field is intentionally left blank
Response body:
{
"meta": {
"status": 200,
"message": "OK"
},
"data": model in JSON format
}
Example model for show request:
Client mock response
{
id: 1,
name: "Fake Audio Event Provenance",
version: "1.0",
description: "Mock Description",
score: 0.5
}
Standard API implementation of routes
GET /provenance
(list)
GET /provenance/filter
(filter)
POST /provenance
(create)
PATCH /provenance/:id
(update)
DELETE /provenance/:id
(delete)
Event summary report
Expected server-side model (event summary report)
id :bigint not null, primary key # the id field isn't used at the moment, however, it will be used when we add the ability to save reports
name :string
generated_date :datetime not null
statistics :statistics_sub_model
event_groups :event_group_sub_model[]
site_ids :bigint[]
region_ids :bigint[]
tag_ids :bigint[]
provenance_ids :bigint[]
graphs :graphs_sub_model
Report sub-models
statistics_sub_model
Client side sub-model for report statistics
total_search_span :integer
audio_coverage_over_span :integer
analysis_coverage_over_span :integer
count_of_recordings_analyzed :integer
coverage_start_day :datetime # the date and time of the first audio recording
coverage_end_day :datetime # the date and time of the last audio recording
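As a usage sketch, the coverage percentages the report UI shows fall straight out of this sub-model. The camelCase field names follow the earlier JSON payloads, and coverageFractions is an illustrative name.

```typescript
// Illustrative: deriving display percentages from the statistics sub-model.
interface ReportStatistics {
  totalSearchSpan: number; // seconds
  audioCoverageOverSpan: number; // seconds
  analysisCoverageOverSpan: number; // seconds
}

function coverageFractions(stats: ReportStatistics): { audio: number; analysis: number } {
  if (stats.totalSearchSpan <= 0) {
    return { audio: 0, analysis: 0 }; // guard against an empty search span
  }
  return {
    audio: stats.audioCoverageOverSpan / stats.totalSearchSpan,
    analysis: stats.analysisCoverageOverSpan / stats.totalSearchSpan,
  };
}
```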
event_group_sub_model
Client side sub-model for report event group
provenance_id :bigint
tag_id :bigint
detections :integer
buckets_with_detections :integer
score :integer
graphs_sub_model
Client side sub-model for report graphs
accumulation_data :accumulation_data_sub_model[]
species_composition_data :composition_data_sub_model[]
analysis_coverage_data :analysis_coverage_sub_model[]
coverage_data :coverage_sub_model
accumulation_data_sub_model
date :datetime
count :integer
error :integer
composition_data_sub_model
date :datetime
tag_id :bigint
ratio :decimal # present in the mock responses below
analysis_coverage_sub_model
date :datetime
audio_coverage :decimal # fractional in the example payloads
analysis_coverage :decimal
coverage_sub_model
failed_analysis_coverage :daterange[] # new data type (see below); arrays, per the mock responses
analysis_coverage :daterange[]
missing_analysis_coverage :daterange[]
recording_coverage :daterange[]
New data type "daterange"
daterange
startDate :datetime
endDate :datetime
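A sketch of the daterange shape in TypeScript, plus a helper showing how gaps such as missing_analysis_coverage could be derived from a search span and the covered ranges. coverageGaps is illustrative, not part of the spec; ranges are assumed half-open and ISO 8601, so plain string comparison orders them.

```typescript
// Sketch of the proposed "daterange" data type and a gap-finding helper.
interface DateRange {
  startDate: string; // ISO 8601
  endDate: string;
}

function coverageGaps(span: DateRange, covered: DateRange[]): DateRange[] {
  const sorted = [...covered].sort((a, b) => a.startDate.localeCompare(b.startDate));
  const gaps: DateRange[] = [];
  let cursor = span.startDate;

  for (const range of sorted) {
    if (range.startDate > cursor) {
      gaps.push({ startDate: cursor, endDate: range.startDate }); // uncovered stretch
    }
    if (range.endDate > cursor) cursor = range.endDate;
  }
  if (cursor < span.endDate) {
    gaps.push({ startDate: cursor, endDate: span.endDate }); // trailing gap
  }
  return gaps;
}
```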
Expected responses
POST /reports/audio_event_summary/filter
(filter show)
Example Request Body:
{
  "filters": {
    "and": [
      { "region.id": { "in": [1, 2, 3, 4] } },
      { "site.id": { "in": [1, 2, 3, 4] } },
      { "provenance.id": { "in": [1, 2, 3, 4] } },
      { "tag.id": { "in": [1, 2, 3, 4] } },
      { "score": { "gteq": 0.6 } },
      // possible values for this enum can be found here:
      // https://github.com/QutEcoacoustics/workbench-client/blob/master/src/app/components/reports/pages/event-summary/EventSummaryReportParameters.ts#L43-L50
      { "bucketSize": { "eq": "day" } },
      { "recordedEndDate": { "greaterThan": "2020-10-10" } },
      { "recordedDate": { "lessThan": "2020-10-11" } },
      {
        "recordedEndDate": {
          "greaterThan": {
            "expressions": ["local_offset", "time_of_day"],
            "value": "12:12"
          }
        }
      },
      {
        "recordedDate": {
          "lessThan": {
            "expressions": ["local_offset", "time_of_day"],
            "value": "12:13"
          }
        }
      }
    ]
  }
}
Client code that creates these filters
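For reference, a minimal sketch of how such a filter body could be assembled. buildFilter and its parameters are illustrative names, not the real client code linked above.

```typescript
// Sketch: assembling the "and" filter array from user-selected parameters.
type FilterCondition = Record<string, unknown>;

function buildFilter(params: {
  siteIds?: number[];
  provenanceIds?: number[];
  minScore?: number;
  bucketSize?: "day" | "month" | "year" | "season";
}): { filters: { and: FilterCondition[] } } {
  const and: FilterCondition[] = [];
  if (params.siteIds) and.push({ "site.id": { in: params.siteIds } });
  if (params.provenanceIds) and.push({ "provenance.id": { in: params.provenanceIds } });
  if (params.minScore !== undefined) and.push({ score: { gteq: params.minScore } });
  if (params.bucketSize) and.push({ bucketSize: { eq: params.bucketSize } });
  return { filters: { and } };
}
```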
Example Response Body:
{
"meta": {
"status": 200,
"message": "OK"
},
"data": event summary report model in JSON format
}
Example model for filter show request
Client mock response
{
site_ids: [3600, 3609, 3332, 3331],
region_ids: [14, 7],
tag_ids: [1, 1950, 39, 277],
provenance_ids: [1],
name: "Mock Event Summary Report",
generated_date: "2023-07-07T00:00:00.0000000",
event_groups: [
{
provenance_id: 1,
tag_id: 1,
detections: 55,
buckets_with_detections: 0.7,
score: {
histogram: [
0.91,
0.82, 0.71, 0.71, 0.62, 0.63, 0.54, 0.52, 0.51, 0.51, 0.41, 0.4,
0.3, 0.32, 0.22, 0.13,
],
standard_deviation: 0.2,
mean: 0.5,
min: 0.1,
max: 0.9,
},
},
{
provenance_id: 1,
tag_id: 1950,
detections: 55,
buckets_with_detections: 0.7,
score: {
histogram: [
0.1, 0.2, 0.3, 0.3, 0.6, 0.6, 0.5, 0.2, 0.5, 0.5, 0.4, 0.4, 0.3,
0.3, 0.5, 0.1, 0.7, 0.7, 0.6, 0.7, 0.8, 0.8, 0.9,
],
standard_deviation: 0.4,
mean: 0.6,
min: 0.4,
max: 0.98,
},
},
{
provenance_id: 1,
tag_id: 39,
detections: 55,
buckets_with_detections: 0.7,
score: {
histogram: [
0.2, 0.5, 0.4, 0.4, 0.3, 0.3, 0.6, 0.2, 0.4, 0.3, 0.1, 0.4, 0.3,
0.3, 0.3, 0.1, 0.6, 0.7, 0.8, 0.9,
],
standard_deviation: 0.1,
mean: 0.3,
min: 0.1,
max: 0.3,
},
},
{
provenance_id: 1,
tag_id: 277,
detections: 55,
buckets_with_detections: 0.7,
score: {
histogram: [
0.9, 0.1, 0.7, 0.7, 0.6, 0.3, 0.5, 0.3, 0.5, 0.2, 0.4, 0.4, 0.3,
0.3, 1, 0.9,
],
standard_deviation: 0.2,
mean: 0.5,
min: 0.1,
max: 0.9,
},
},
],
statistics: {
total_search_span: 256,
audio_coverage_over_span: 128,
analysis_coverage_over_span: 64,
count_of_recordings_analyzed: 221,
coverage_start_day: "2023-01-01T00:00:00.0000000",
coverage_end_day: "2023-12-01T00:00:00.0000000",
},
graphs: {
accumulation_data: [
{ date: "2023-05-22", count: 0, error: 0 },
{ date: "2023-05-23", count: 3, error: 0 },
{ date: "2023-05-24", count: 9, error: 1 },
{ date: "2023-05-25", count: 15, error: 1 },
{ date: "2023-05-26", count: 17, error: 2 },
{ date: "2023-05-27", count: 18, error: 2 },
{ date: "2023-05-28", count: 18, error: 2 },
{ date: "2023-05-29", count: 20, error: 3 },
{ date: "2023-05-30", count: 21, error: 3 },
],
species_composition_data: [
{ date: "2023-05-22", tag_id: 1, ratio: 0.55 },
{ date: "2023-05-22", tag_id: 39, ratio: 0.3 },
{ date: "2023-05-22", tag_id: 277, ratio: 0.15 },
{ date: "2023-05-23", tag_id: 1, ratio: 0.45 },
{ date: "2023-05-23", tag_id: 39, ratio: 0.2 },
{ date: "2023-05-23", tag_id: 277, ratio: 0.35 },
{ date: "2023-05-24", tag_id: 1, ratio: 0.05 },
{ date: "2023-05-24", tag_id: 39, ratio: 0.25 },
{ date: "2023-05-24", tag_id: 277, ratio: 0.7 },
{ date: "2023-05-25", tag_id: 1, ratio: 0.5 },
{ date: "2023-05-25", tag_id: 39, ratio: 0.2 },
{ date: "2023-05-25", tag_id: 277, ratio: 0.3 },
{ date: "2023-05-26", tag_id: 1, ratio: 0.25 },
{ date: "2023-05-26", tag_id: 39, ratio: 0.4 },
{ date: "2023-05-26", tag_id: 277, ratio: 0.35 },
{ date: "2023-05-27", tag_id: 1, ratio: 0.15 },
{ date: "2023-05-27", tag_id: 39, ratio: 0.3 },
{ date: "2023-05-27", tag_id: 277, ratio: 0.55 },
{ date: "2023-05-28", tag_id: 1, ratio: 0.1 },
{ date: "2023-05-28", tag_id: 39, ratio: 0.2 },
{ date: "2023-05-28", tag_id: 277, ratio: 0.7 },
{ date: "2023-05-29", tag_id: 1, ratio: 0.05 },
{ date: "2023-05-29", tag_id: 39, ratio: 0.15 },
{ date: "2023-05-29", tag_id: 277, ratio: 0.8 },
{ date: "2023-05-30", tag_id: 1, ratio: 0.05 },
{ date: "2023-05-30", tag_id: 39, ratio: 0.1 },
{ date: "2023-05-30", tag_id: 277, ratio: 0.85 },
],
coverage_data: {
recording_coverage: [
{ start_date: "2023-05-22", end_date: "2023-05-24" },
{ start_date: "2023-05-26", end_date: "2023-05-27" },
{ start_date: "2023-05-28", end_date: "2023-05-29" },
],
analysis_coverage: [
{ start_date: "2023-05-22", end_date: "2023-05-23" },
{ start_date: "2023-05-28", end_date: "2023-05-28" },
],
missing_analysis_coverage: [
{ start_date: "2023-05-23", end_date: "2023-05-24" },
{ start_date: "2023-05-28", end_date: "2023-05-29" }
],
failed_analysis_coverage: [
{ start_date: "2023-05-26", end_date: "2023-05-27" },
],
},
analysis_coverage_data: [
{ date: "2023-01-02", audio_coverage: 0.5, analysis_coverage: 0.5 },
{ date: "2023-01-03", audio_coverage: 0.6, analysis_coverage: 0.5 },
{ date: "2023-01-04", audio_coverage: 0.3, analysis_coverage: 0.2 },
{ date: "2023-01-05", audio_coverage: 0.4, analysis_coverage: 0.1 },
{ date: "2023-01-06", audio_coverage: 0.8, analysis_coverage: 0.5 },
{ date: "2023-01-07", audio_coverage: 0.2, analysis_coverage: 0.1 },
{ date: "2023-01-08", audio_coverage: 0.1, analysis_coverage: 0.0 },
],
}
}
New! Filter Show
Allows you to create a single model based on filter conditions sent in the body of a POST request.
Client side tests for this new API endpoint
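A minimal usage sketch of invoking the new endpoint follows; the route comes from the spec above, while the headers and the helper name are assumptions.

```typescript
// Sketch: build the request for the new "filter show" endpoint.
// POST is used (rather than GET) so filter conditions can travel in the body.
function filterShowRequest(filters: object) {
  return {
    url: "/reports/audio_event_summary/filter",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ filters }),
    },
  };
}

// Usage with fetch (hypothetical base URL):
// const { url, init } = filterShowRequest({ and: [{ score: { gteq: 0.6 } }] });
// const report = await (await fetch(`https://api.ecosounds.org${url}`, init)).json();
```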
Routes that are not implemented
These routes can be added when we add the ability to cache/save reports
GET /reports/audio_event_summary/:id
(show)
GET /reports/audio_event_summary
(list)
GET /reports/audio_event_summary/filter
(filter)
POST /reports/audio_event_summary
(create)
PATCH /reports/audio_event_summary/:id
(update)
DELETE /reports/audio_event_summary/:id
(delete)
Let me know if you have any questions about the finalized spec