A Protocol Buffers standard for OpenRTB

OpenRTB has resolved integration headaches and increased the speed and scale of programmatic advertising. Up next: a standard Protocol Buffers representation, which we believe will make technical integrations for programmatic faster still.

The current OpenRTB specification uses JSON, which makes it easy for humans to understand. We needed to define attributes and make their definitions clear so that implementers were speaking the same language. Now computers need to read the OpenRTB spec: enter Protocol Buffers.

With the current JSON, every technology stack needs to implement its own internal representation of OpenRTB: what fields exist, their types, what values those fields can have, and relations between those fields. This representation needs to be parsed from and serialized to JSON in order to communicate with other technology stacks. Every time ad tech advances and OpenRTB evolves, engineers across dozens of companies need to update representation, validation, parsing, and serialization code.
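To make that maintenance burden concrete, here is a minimal sketch of the kind of hand-written parsing and type checking a JSON-based stack ends up carrying. The field table is a heavily trimmed, illustrative subset, and the function name is hypothetical rather than taken from any particular implementation.

```python
import json

# Illustrative subset only: (expected type, required?) per field.
# A real stack has to track every object and field in the spec.
BID_REQUEST_FIELDS = {
    "id": (str, True),
    "imp": (list, True),
    "tmax": (int, False),
    "test": (int, False),
}


def parse_bid_request(raw):
    """Parse JSON text and check field presence and types by hand."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("bid request must be a JSON object")
    for name, (expected_type, required) in BID_REQUEST_FIELDS.items():
        if name not in data:
            if required:
                raise ValueError(f"missing required field: {name}")
            continue
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"field {name} should be {expected_type.__name__}")
    return data


request = parse_bid_request('{"id": "req-123", "imp": [{"id": "1"}], "tmax": 120}')
print(request["tmax"])
```

Every new field in the specification means another entry in a table like this, repeated in each company's codebase.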

A Protocol Buffers standard for OpenRTB seeks to resolve this. Protocol Buffers can be read by computers and represented in code in an intuitive way, and it also handles validation, parsing, and serialization. While the markdown standard for OpenRTB remains the source of truth, everything in the standard is reproduced in Protocol Buffers in a way that is readable to both humans and machines.

There are three reasons why this is important:

This is an exciting prospect to us engineers who have spent far too much time writing and maintaining the code that translates between these models. It’s a cost that ad tech businesses likely don’t see, though it’s a drag on their ability to move quickly. With this standard, those engineers will be freed from that work and able to spend the time on things that differentiate their businesses.

A new fault line is forming, and we have an opportunity to correct it. While several ad tech companies have already embraced Protocol Buffers for OpenRTB, without a standard to coalesce around they have introduced their own schemas to represent the same data. If we do not act, we will need even more code to translate between those schemas.

Our industry is about turning bytes into money. Protocol Buffers shortens these messages by approximately 50%. It also takes less time to read and write them: only 40-50% of the time it takes to do the same for JSON. We can send them to each other more efficiently, and that impacts the bottom line. Our businesses run faster (less work to do) and cost less to run (less electricity).
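To show where the size savings come from, the sketch below hand-encodes three fields using the Protocol Buffers wire format (varints and length-delimited strings) and compares the result with compact JSON. The field numbers are made up for illustration and are not taken from the standard schema. This is not a benchmark: real code would use generated classes rather than a hand-rolled encoder.

```python
import json


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number, value):
    """Tag with wire type 0 (varint), followed by the value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number, text):
    """Tag with wire type 2 (length-delimited), byte length, then UTF-8 bytes."""
    payload = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(payload)) + payload


# Hypothetical field numbers: 1 = id, 2 = tmax, 3 = test.
binary = (
    encode_string_field(1, "req-123")
    + encode_int_field(2, 120)
    + encode_int_field(3, 1)
)
text = json.dumps(
    {"id": "req-123", "tmax": 120, "test": 1}, separators=(",", ":")
).encode("utf-8")

print(len(binary), len(text))
```

Field names are replaced by small numeric tags and integers are stored in binary rather than as digits, so a message dominated by short fields shrinks considerably. Messages dominated by long strings save less, since the string bytes themselves are the same in both encodings.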

Engineers reading this may point out that another IAB Tech Lab repo for Protocol Buffers for OpenRTB has been around for several years. While true, that was maintained on a best-efforts basis, and it is now being deprecated. When Protocol Buffers became the official binary representation of OpenRTB, this effort began to fold the standard schema into the OpenRTB specification itself. This release completes the process of creating a formal Protocol Buffers standard. The only breaking change is the handling of extensions; where the schemas relate to the main spec, the two are compatible. The new version makes a few improvements on the original:

Leverages Protocol Buffers Editions – this made it possible to update the schema syntax without losing some of the key features in the original (e.g. the “has” functions used to confirm which fields were populated in a message, and explicit default values).
Simplifies treatment of standard extensions – the old repo made all extensions equal. We’ve changed this to formalize the standard extensions so we can align on one implementation (and simplify the code to access them in the process).
Shareable extensions – companies can reserve index blocks, and remain free to define their own extensions. The index blocks guarantee no collisions between companies, so they will be able to share those extensions.
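The collision guarantee behind reserved index blocks can be sketched as a simple range check. The company names and index ranges below are hypothetical; the actual reservation process and block sizes are defined by the specification, not by this example.

```python
from itertools import combinations

# Hypothetical reservations: (owner, first index, last index), inclusive.
RESERVED_BLOCKS = [
    ("company_a", 1000, 1099),
    ("company_b", 1100, 1199),
    ("company_c", 1200, 1299),
]


def find_overlaps(blocks):
    """Return every pair of owners whose reserved ranges overlap."""
    overlaps = []
    for (owner_a, first_a, last_a), (owner_b, first_b, last_b) in combinations(blocks, 2):
        if first_a <= last_b and first_b <= last_a:
            overlaps.append((owner_a, owner_b))
    return overlaps


def owner_of(index, blocks):
    """Return the owner whose block contains the index, or None."""
    for owner, first, last in blocks:
        if first <= index <= last:
            return owner
    return None


print(find_overlaps(RESERVED_BLOCKS))
print(owner_of(1150, RESERVED_BLOCKS))
```

As long as the reserved blocks never overlap, any extension index maps to at most one owner, which is what lets two companies load each other's extensions into the same message without conflict.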

Engineers may also argue that JSON messages and gzip compression are just as good for efficiency and provide other benefits, such as human-readability. JSON remains a perfectly valid option, and continues to be the default encoding in the OpenRTB standard. We are not here to kill it, only to point out:

Protocol Buffers handles the deserialization of inbound messages, does it quickly, and, crucially, provides type safety. These are a great convenience: engineers have less code to write and maintain, which makes it less expensive in the long term as updates are made to OpenRTB. A company that uses JSON must maintain its own implementation, at minimum checking data types for all fields, and extending it for each new field.
The number of OpenRTB messages any one company sees today can be in the trillions. The total is far larger than that. The fraction of those messages being read by humans is vanishingly small. It is unwise to prioritize that use case, especially when there are tools (cf. the TextFormat Java class) to render Protocol Buffers data in human-readable format.
If you respond to OpenRTB requests, it’s important to consider gzip compression. While the compression factor will be lower for Protocol Buffers data than for JSON, it is still worth the effort for messages that have lots of text (e.g., the ad markup in the response), since both message encodings use UTF-8 for strings.
Translation between JSON and Protocol Buffers is relatively straightforward. The practical choice in ad tech applications is to pivot to Protocol Buffers as the internal representation, so the data can be sent and received as-is, or translated to and from JSON at the edge as needed.
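The gzip point above is easy to check with the standard library. The sketch below compresses a hypothetical bid response whose bids carry repetitive ad markup; the markup, URLs, and values are invented for illustration. It uses JSON only because that needs no extra dependencies. The same UTF-8 markup bytes would sit inside a Protocol Buffers string field, so the repetition that gzip exploits is present in either encoding.

```python
import gzip
import json

# Hypothetical ad markup; real creatives are usually much longer.
markup = (
    '<div class="ad"><a href="https://example.com/click">'
    '<img src="https://example.com/creative.png"></a></div>'
)

# Hypothetical response with several bids carrying similar markup.
response = {
    "id": "req-123",
    "seatbid": [
        {
            "bid": [
                {"id": str(i), "impid": "1", "price": 1.25, "adm": markup}
                for i in range(5)
            ]
        }
    ],
}

raw = json.dumps(response).encode("utf-8")
compressed = gzip.compress(raw)

print(len(raw), len(compressed))
```

Measuring both sizes on representative traffic, rather than on a toy payload like this one, is the only reliable way to decide whether the CPU cost of compression pays for itself.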

The new specification is available at https://github.com/InteractiveAdvertisingBureau/openrtb2.x/blob/proto-editions/2.6.md. This version is currently in public comment. If you have thoughts or suggestions, or want to support ongoing development, please email support@iabtechlab.com. We’d also love to have you join the Programmatic Supply Chain Working Group for further discussion.

Contributors

Stan Belov (Google)

Ken McMaster (Amazon)

Daniel Miller (Google)

Osvaldo Doederlein (Google)

Simon Trasler (Amazon)

Trent Underwood (Google)