Error using LocalTestingTool_2.0.0.jar with sampledata
I am trying to follow the instructions in Testing locally using Local Testing Tool but when I run the following command with the sampledata:
java -jar LocalTestingTool_2.0.0.jar \
--input_data_avro_file sampledata/output_debug_reports.avro \
--domain_avro_file sampledata/output_domain.avro \
--output_directory .
I get the error below:
2023-10-31 12:21:57:506 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.WorkerPullWorkService - Aggregation worker started
2023-10-31 12:21:57:545 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.WorkerPullWorkService - Item pulled
2023-10-31 12:21:57:555 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor - Reports shards detected by blob storage client: [output_debug_reports.avro]
2023-10-31 12:21:57:566 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor - Reports shards to be used: [DataLocation{blobStoreDataLocation=BlobStoreDataLocation{bucket=/Users/jonaquino/projects/aggregation-service/sampledata, key=output_debug_reports.avro}}]
2023-10-31 12:21:57:566 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.domain.OutputDomainProcessor - Output domain shards detected by blob storage client: [output_domain.avro]
2023-10-31 12:21:57:567 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.domain.OutputDomainProcessor - Output domain shards to be used: [DataLocation{blobStoreDataLocation=BlobStoreDataLocation{bucket=/Users/jonaquino/projects/aggregation-service/sampledata, key=output_domain.avro}}]
2023-10-31 12:21:57:575 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor - Job parameters didn't have a report error threshold configured. Taking the default percentage value 10.000000
return_code: "REPORTS_WITH_ERRORS_EXCEEDED_THRESHOLD"
return_message: "Aggregation job failed early because the number of reports excluded from aggregation exceeded threshold."
error_summary {
  error_counts {
    category: "REQUIRED_SHAREDINFO_FIELD_INVALID"
    count: 1
    description: "One or more required SharedInfo fields are empty or invalid."
  }
  error_counts {
    category: "NUM_REPORTS_WITH_ERRORS"
    count: 1
    description: "Total number of reports that had an error. These reports were not considered in aggregation. See additional error messages for details on specific reasons."
  }
}
finished_at {
  seconds: 1698780117
  nanos: 679576000
}
CustomMetric{nameSpace=scp/worker, name=WorkerJobCompletion, value=1.0, unit=Count, labels={Type=Success}}
2023-10-31 12:21:57:732 -0700 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.WorkerPullWorkService - No job pulled.
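For readers trying to interpret the REPORTS_WITH_ERRORS_EXCEEDED_THRESHOLD result above, here is a rough sketch of how a percentage threshold check of this kind might behave. It is not the aggregation service's actual code: the function name, the strict ">" comparison, and the total report count are all assumptions, since the log only shows the default threshold (10%) and the number of reports with errors (1).

```python
# Hypothetical illustration only; not taken from the aggregation service source.
def exceeds_error_threshold(num_errors, num_reports, threshold_percent=10.0):
    """Return True if the share of errored reports is above the threshold."""
    if num_reports <= 0:
        # Nothing to aggregate, so there is no percentage to compare.
        return False
    error_percent = (num_errors / num_reports) * 100.0
    # Assumption: the job fails when the percentage is strictly greater.
    return error_percent > threshold_percent


# Assumed scenario: the sample file holds a single report and it is invalid.
print(exceeds_error_threshold(num_errors=1, num_reports=1))    # True
# With 1 bad report out of 100, the default 10% threshold is not exceeded.
print(exceeds_error_threshold(num_errors=1, num_reports=100))  # False
```

Under these assumptions, a single invalid report in a very small input file is enough to fail the whole job at the default threshold.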
I have the same issue:
❯ java -jar LocalTestingTool_${VERSION}.jar \
--input_data_avro_file ~/Downloads/output_debug_reports.avro \
--domain_avro_file ~/Downloads/output_domain\ \(2\).avro \
--output_directory .
2023-11-01 12:23:57:021 +0900 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.WorkerPullWorkService - Aggregation worker started
2023-11-01 12:23:57:114 +0900 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.WorkerPullWorkService - Item pulled
2023-11-01 12:23:57:154 +0900 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor - Reports shards detected by blob storage client: [output_debug_reports.avro]
2023-11-01 12:23:57:216 +0900 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor - Reports shards to be used: [DataLocation{blobStoreDataLocation=BlobStoreDataLocation{bucket=/Users/me/Downloads, key=output_debug_reports.avro}}]
2023-11-01 12:23:57:219 +0900 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.domain.OutputDomainProcessor - Output domain shards detected by blob storage client: [output_domain (2).avro]
2023-11-01 12:23:57:221 +0900 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.domain.OutputDomainProcessor - Output domain shards to be used: [DataLocation{blobStoreDataLocation=BlobStoreDataLocation{bucket=/Users/me/Downloads, key=output_domain (2).avro}}]
2023-11-01 12:23:57:240 +0900 [WorkerPullWorkService] INFO com.google.aggregate.adtech.worker.aggregation.concurrent.ConcurrentAggregationProcessor - Job parameters didn't have a report error threshold configured. Taking the default percentage value 10.000000
return_code: "REPORTS_WITH_ERRORS_EXCEEDED_THRESHOLD"
return_message: "Aggregation job failed early because the number of reports excluded from aggregation exceeded threshold."
error_summary {
  error_counts {
    category: "REQUIRED_SHAREDINFO_FIELD_INVALID"
    count: 1
    description: "One or more required SharedInfo fields are empty or invalid."
  }
  error_counts {
    category: "NUM_REPORTS_WITH_ERRORS"
    count: 1
    description: "Total number of reports that had an error. These reports were not considered in aggregation. See additional error messages for details on specific reasons."
  }
}
finished_at {
  seconds: 1698809037
  nanos: 449472000
}
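As a side note for anyone matching the finished_at block against the timestamped log lines, the seconds and nanos fields can be converted to local time with a few lines of Python. This is a minimal sketch; the helper name finished_at_to_local is made up for illustration, and the UTC offset of +9 hours is read off the log lines above.

```python
from datetime import datetime, timedelta, timezone


# Hypothetical helper, not part of the LocalTestingTool.
def finished_at_to_local(seconds, nanos, utc_offset_hours):
    """Convert a seconds/nanos pair to a timezone-aware datetime."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    # datetime only has microsecond resolution, so nanos are truncated.
    return datetime.fromtimestamp(seconds, tz) + timedelta(microseconds=nanos // 1000)


# Values copied from the finished_at block above.
print(finished_at_to_local(1698809037, 449472000, 9).isoformat())
# 2023-11-01T12:23:57.449472+09:00
```

The result falls within the same second as the "Aggregation worker started" log line, so the job failed almost immediately after it was pulled.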
@maybellineboon The avro files come from this repo itself. The files are:
Thanks! We're looking into the files. It looks like there's a problem with output_debug_reports.avro. We will update you once we have more information.
Hi @maybellineboon. Is there any update?
Hi @JonathanAquino-NextRoll and @jen6,
Our engineers are still working on this and will publish the new avro file soon. In the meantime, please use the attached avro file to test the LocalTestingTool.
output_debug_reports.avro.zip
Thanks!
Maybelline
Yes, that file works - thank you
Sample data have been updated.