Handling 500+ DataAssetType Values
nicklofaso opened this issue · 7 comments
We are receiving messages from a 3rd party which contain DataAssetTypes of 501, 502, etc. The 500+ range is defined in the openrtb spec for "exchange specific usage".
readBidRequests() fails to process messages containing these data types (stacktrace below).
I believe enum's cannot be extended in protobuf. What would be the best way to allow processing of these messages?
com.google.protobuf.UninitializedMessageException: Message missing required fields: type
at com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:501)
at com.google.openrtb.OpenRtb$NativeRequest$Asset$Data$Builder.build(OpenRtb.java)
at com.google.openrtb.OpenRtb$NativeRequest$Asset$Builder.setData(OpenRtb.java)
at com.google.openrtb.json.OpenRtbNativeJsonReader.readReqAssetField(OpenRtbNativeJsonReader.java:224)
at com.google.openrtb.json.OpenRtbNativeJsonReader.readReqAsset(OpenRtbNativeJsonReader.java:199)
at com.google.openrtb.json.OpenRtbNativeJsonReader.readNativeRequestField(OpenRtbNativeJsonReader.java:151)
at com.google.openrtb.json.OpenRtbNativeJsonReader.readNativeRequest(OpenRtbNativeJsonReader.java:113)
at com.google.openrtb.json.OpenRtbJsonReader.readNativeField(OpenRtbJsonReader.java:410)
at com.google.openrtb.json.OpenRtbJsonReader.readNative(OpenRtbJsonReader.java:396)
at com.google.openrtb.json.OpenRtbJsonReader.readImpField(OpenRtbJsonReader.java:316)
at com.google.openrtb.json.OpenRtbJsonReader.readImp(OpenRtbJsonReader.java:294)
at com.google.openrtb.json.OpenRtbJsonReader.readBidRequestField(OpenRtbJsonReader.java:159)
at com.google.openrtb.json.OpenRtbJsonReader.readBidRequest(OpenRtbJsonReader.java:145)
at com.google.openrtb.json.OpenRtbJsonReader.readBidRequest(OpenRtbJsonReader.java:126)
This seems like a larger problem than just the data asset types.
One possible solution is to add entries 500 to 999 in the protobuf enum values and mark them as exchange-specific. For example:
enum DataAssetType {
// Sponsored By message where response should contain the brand name
// of the sponsor.
// Format: Text; Max length: 25 or longer.
SPONSORED = 1;
// Descriptive text associated with the product or service being advertised.
// Format: Text; Max length: 140 or longer.
DESC = 2;
// Rating of the product being offered to the user.
// For example an app's rating in an app store from 0-5.
// Format: Number (1-5 digits) formatted as string.
RATING = 3;
// Number of social ratings or "likes" of product being offered to the user.
// Format: Number formatted as string.
LIKES = 4;
// Number downloads/installs of this product.
// Format: Number formatted as string.
DOWNLOADS = 5;
// Price for product / app / in-app purchase.
// Value should include currency symbol in localised format.
// Format: Number formatted as string.
PRICE = 6;
// Sale price that can be used together with price to indicate a discounted
// price compared to a regular price. Value should include currency symbol
// in localised format.
// Format: Number formatted as string.
SALEPRICE = 7;
// Phone number.
// Format: Formatted string.
PHONE = 8;
// Address.
// Format: Text.
ADDRESS = 9;
// Additional descriptive text associated with the product or service
// being advertised.
// Format: Text.
DESC2 = 10;
// Display URL for the text ad.
// Format: Text.
DISPLAYURL = 11;
// Text describing a 'call to action' button for the destination URL.
// Format: Text.
CTATEXT = 12;
// Exchange-specific values above 500.
EXCHANGE_SPECIFIC_DATA_ASSET_TYPE_500 = 500;
EXCHANGE_SPECIFIC_DATA_ASSET_TYPE_501 = 501;
EXCHANGE_SPECIFIC_DATA_ASSET_TYPE_502 = 502;
EXCHANGE_SPECIFIC_DATA_ASSET_TYPE_503 = 503;
...
Another option would be to add a configuration variable to ignore exchange-specific values & to drop the asset altogether. This does not seem like an optimal solution, but it may be acceptable for the majority of exchanges.
Right, the tradeoff of the strongly-typed enums is that they cannot accept extended values without further modification. There are a few alternatives:
-
Fork the proto and add extended values. Notice that you need to support multiple exchanges, which could use the same numeric value to mean different things, you can support that with a single proto with
option allow_alias = true;
; or you may find less confusing to use neutral names like VALUE_501, etc. -
Do nothing in the proto, and handle extended enum values as unknown fields (check protobuf docs on this).
The problem with option 2 is the JSON support library, it throws the exception above because the JSON reader does not support extended values. I think this can be improved, let me know if this support would be useful i.e. if you think the unknown-field proto feature is a good way of handling this, provided that JSOn mapping would work.
In some cases, I do not believe option 2 is a possibility, as the enum value is required. I believe the best way to handle this is to add support for the exchange-specific enum values in the protobuf, and to put it on the users of the library to determine the meaning of the value based on the exchange/partnership. This gives everyone the flexibility to handle these values gracefully without the need to fork the library.
I confirmed that PR #121 fixes my issue.
Is it possible to include this in the next release?
Sorry for taking some time to review this. And thanks for the patch, I understand how this is useful for scenarios like yours. But we cannot accept the patch, this is unfortunately not compatible with the design of this proto which admittedly contains a significant tradeoff in the area of strong typing vs. extended values, but it's a point that we consider important.
To put in a different way: if you need these nondescript enum names like EXCHANGE_SPECIFIC_LAYOUT_ID_591
, this is not much better than just declaring the enums as int32
and then using a plain numeric literal 591 instead of that long, ugly, inconvenient identifier. If you need sensible identifiers for the standard values, you can still have the enum defined in the proto (without the extended values) and write code like:
setLayout(OpenRtb.LayoutId.APP_WALL.getNumber())
The extended values also have some problems: it's a metric ton of new code (even if it's all autogenerated boilerplate), and they only cover a relatively short range of 500 values per enum (what happens if some other exchange that you did not consider is using even bigger values, add thousands of extra identifiers to all enums? the spec does not limit extended values to 999). However, it's a reasonable workaround to have as a fork, but we only don't want to commit to that as a supported solution and stable API for this problem so it just can't live in the main repo and the published maven libs.
As I mentioned here, we're planning to find a better workaround, maybe improving the JSON serializer, possibly reevaluating the decision of using typed enum fields instead of ints (this is the kind of review that needs to wait for a major release -- which is planned later this year: moving to proto3 syntax, which involves other major changes).
This is still an issue and is pretty annoying. It means we cannot parse some native bid requests from some exchanges. I am attaching a bid request (with any PII removed) that came from an exchange and causes an error to be thrown. I think this field should be changed to an int32 type, but a least some of the exchange specific enums suggested in the patch would help us work around this problem.
native.txt
I'm interested in this problem for how we address it in OpenRTB 3.0. It seems to me we should 1. keep the enum
definitions for commonly-used values though 2. use int32
for these fields (at least in OpenRTB 3.0) to allow for the use case described here. Please email me directly (strasler@rubiconproject.com) with feedback so we don't pollute this PR.