privacysandbox/aggregation-service

Could someone help me validate whether I am collecting the reports correctly (attribution-reporting, Node.js version)


Hello everyone, I'm currently building a version of attribution-reporting in Node.js. So far so good: I managed to complete the entire journey (triggering interactions with creatives, converting on the final website, and generating event-level and aggregatable reports).

But I've reached the part where I must store the aggregatable reports before sending them to the Aggregation Service, and I wanted to know if anyone else has done this report-collection step in Node.js.

Below is the code responsible for collecting and storing the reports (I used the documentation code written in Go as a reference).

*Spoiler: each report record I receive generates its own .avro file

const avro = require('avsc');

const REPORTS_AVRO_SCHEMA = {
    "name": "AvroAggregatableReport",
    "type": "record",
    "fields": [
        { "name": "payload", "type": "bytes" },
        { "name": "key_id", "type": "string" },
        { "name": "shared_info", "type": "string" }
    ]
};
const RECORD_SCHEMA = avro.Type.forSchema(REPORTS_AVRO_SCHEMA);

const registerAggregateReport = (req, res) => {
    try {
        // const report = req.body;
        // Example to illustrate what the request body would be
        const report = {
            "aggregation_coordinator_origin": "https://publickeyservice.msmt.aws.privacysandboxservices.com",
            "aggregation_service_payloads": [
                {
                    "key_id": "bbe6351f-5619-4c98-84b2-4a74fa1ae254",
                    "payload": "7K9SQLdROKqITmnrkgIDulfEXDAR76XUP4vc6uzxPwDycQql3AhR3dxeXdEw2gbUaIAldnu33RSN4SAFcFFKgDQkvnhFzPoxJjO2Yfw4osJ1S0Odp0smu0rC5k5GuG4oIu9YQofCPNmSD7KRVJ9Y6Lucz3BXoI3RQhpQkO31RDyxVJdBbJ8JiS2KBtu8naUf5Z+/mNNKp39ObsNbo7kQKI0TwyRJDSJKqv42Yi3ctoAhOT0eaaUtMfho67i9XaEtVnh8wB4Mi+nzlAfVsGIavP6aXWDe44IgKZvTS/zEKjI68+nzWkyfdRNOf7jtb2XnoB7k5iM+Yu9Ayk5ic/aT1eA1iPEzLvW/tNLcohne3UL2DefZoTLb5l9aludA7Qlf0g+kW9nuvUSmHBuTjE/fTY5s9uRExHH+b2Hjm2sL9DyrFZUFqcl/KLS+McgOT8I0ZTpPRmr+njW8+4b01Hsc2MpY3KKAn1jUDUE45pGbhj/Gqlb1ikJO9nNKS/nnWJgR7+3P8JEpHC2fkfEase4+vrNxZujWolYfTUxswJpiEZs1+fCOroEyyEY6Zjvx5qLbk+7wMNqCeCltDPA6c8WtAPtMreIUvKbco6XUUzaGSnvWLz6/WJqCxG4hjPOfcYAWXIwSboqvNyBHrRr4H5V7C0unSkIjd0j/GeB3ywgnKEqiihuvZ5PPw+O5aYqJdaR3QEFZtpLj+3Uv4OGn2+CvU1thV0A0H1XViP846Tfmb0jVejN1+ih+VO5cf/7T2TPz6oGO9sa6qitWtll5vhwxVyG3vniCo3xghGnUcHSP5ogfp6qgDGSgsGFqSvdiuOpQU+MG/HrCDUjvce0GoXJP6674UcurGxR9UKAnVwZyKRIj/q9qzUgxhWEFC3ssADMmxhZBs3X+rrAxKfhXD12MfuUluRTCzpCKZ9/YapnJQYjngGx7GIkfW6tw8eSCC8yO41vWyHGRz4nKlgNeQkwYafGPzXqUXjyEyiupMUlmSsU/zT52wdCQYLJbQg7xhNuLebb8qh9LW07jMho4Vo9DBP9l463uqA8hcZnJ"
                }
            ],
            "shared_info": "{\"api\":\"attribution-reporting\",\"attribution_destination\":\"https://cliente.com\",\"report_id\":\"4d82121f-7d62-4fa4-bda4-a70c9e850089\",\"reporting_origin\":\"https://attribution.ads.uol.com.br\",\"scheduled_report_time\":\"1714764978\",\"source_registration_time\":\"0\",\"version\":\"0.1\"}"
        };

        // Write one .avro file per payload (forEach, since the mapped
        // return value is never used).
        report.aggregation_service_payloads.forEach((payload, index) => {
            // The payload arrives base64-encoded; the Avro schema expects raw bytes.
            const payloadBytes = Buffer.from(payload.payload, 'base64');
            const record = {
                payload: payloadBytes,
                key_id: payload.key_id,
                shared_info: report.shared_info,
            };

            // Include the payload index so two payloads written in the same
            // millisecond don't overwrite each other.
            const outputFilename = `./reports/output_reports_${Date.now()}_${index}.avro`;
            const encoder = avro.createFileEncoder(outputFilename, RECORD_SCHEMA);
            encoder.write(record);
            encoder.end();
        });
        res.status(200).send('Report received successfully.');
    } catch (e) {
        console.error('Error processing report:', e);
        res.status(400).send('Failed to process report.');
    }
};

module.exports = {
    registerAggregateReport
};
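
For reference, here is roughly how the handler is wired up to receive reports from the browser. This is a minimal sketch: it assumes Express, and the ./registerAggregateReport module path is just illustrative.

const express = require('express');
const { registerAggregateReport } = require('./registerAggregateReport');

const app = express();
app.use(express.json());

// Chrome POSTs aggregatable reports as JSON to this well-known path
// on the reporting origin.
app.post(
    '/.well-known/attribution-reporting/report-aggregate-attribution',
    registerAggregateReport
);

app.listen(3000);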

*English is not my native language, so take it easy

As the death of third-party cookies is something that will affect everyone, it would be nice to have reference implementations in more commonly used languages such as Node.js, Java, etc. I hope this post can contribute to that in some way.

Hi @dbrito ,

This looks about right. I see that your payload is base64-decoded and converted to a byte string, that key_id and shared_info are strings, and that everything is converted into the Aggregation Service report Avro schema.
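
If it helps to double-check, you can read one of the generated files back with avsc and inspect the decoded record. A minimal sketch; the filename below is just an example:

const avro = require('avsc');

// Read a generated file back and print each decoded record.
avro.createFileDecoder('./reports/output_reports_1714764978000_0.avro')
    .on('data', (record) => {
        console.log(record.key_id);
        console.log(record.shared_info);
        console.log('payload bytes:', record.payload.length);
    });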

@maybellineboon thanks!
Just one question: the part where each entry in "aggregation_service_payloads" becomes its own .avro file is also correct, right?

In other words, there is no scenario where I will receive more than one entry in "aggregation_service_payloads" per report and have to store those records in the same .avro file, right?

Hi @dbrito,

You can either collect all the JSON reports and convert them into a single Avro file, or you can create separate Avro files. For better performance, it is recommended to split the reports across a number of files equal to the number of CPUs on your Aggregation Service instance. For example, if you have 1,000 reports and 6 CPUs, you would split the 1,000 reports across 6 files.
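
For illustration, here is a minimal sketch of that splitting strategy, assuming records is an array of already-decoded { payload, key_id, shared_info } objects and type is the Avro type from your snippet (the writeBatches name and file naming are hypothetical):

const avro = require('avsc');
const os = require('os');

// Split the collected records across one .avro file per CPU.
function writeBatches(records, type, numFiles = os.cpus().length) {
    const chunkSize = Math.ceil(records.length / numFiles);
    for (let i = 0; i < numFiles; i++) {
        const chunk = records.slice(i * chunkSize, (i + 1) * chunkSize);
        if (chunk.length === 0) break;
        const encoder = avro.createFileEncoder(`./reports/batch_${i}.avro`, type);
        chunk.forEach((record) => encoder.write(record));
        encoder.end();
    }
}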

Also, when batching, we do recommend following the batching strategies doc.

Thanks!
I'll take a look