Diffstat (limited to 'content/protobuf-apiv2.article')
-rw-r--r-- | content/protobuf-apiv2.article | 282 |
1 files changed, 282 insertions, 0 deletions
diff --git a/content/protobuf-apiv2.article b/content/protobuf-apiv2.article new file mode 100644 index 0000000..952d73c --- /dev/null +++ b/content/protobuf-apiv2.article @@ -0,0 +1,282 @@ +# A new Go API for Protocol Buffers +2 Mar 2020 +Tags: protobuf, technical +Summary: Announcing a major revision of the Go API for protocol buffers. +OldURL: /a-new-go-api-for-protocol-buffers + +Joe Tsai + +Damien Neil + +Herbie Ong + +## Introduction + +We are pleased to announce the release of a major revision of the Go API for +[protocol buffers](https://developers.google.com/protocol-buffers), +Google's language-neutral data interchange format. + +## Motivations for a new API + +The first protocol buffer bindings for Go were +[announced by Rob Pike](https://blog.golang.org/third-party-libraries-goprotobuf-and) +in March of 2010. Go 1 would not be released for another two years. + +In the decade since that first release, the package has grown and +developed along with Go. Its users' requirements have grown too. + +Many people want to write programs that use reflection to examine protocol +buffer messages. The +[`reflect`](https://pkg.go.dev/reflect) +package provides a view of Go types and +values, but omits information from the protocol buffer type system. For +example, we might want to write a function that traverses a log entry and +clears any field annotated as containing sensitive data. The annotations +are not part of the Go type system. + +Another common desire is to use data structures other than the ones +generated by the protocol buffer compiler, such as a dynamic message type +capable of representing messages whose type is not known at compile time. + +We also observed that a frequent source of problems was that the +[`proto.Message`](https://pkg.go.dev/github.com/golang/protobuf/proto?tab=doc#Message) +interface, which identifies values of generated message types, does very +little to describe the behavior of those types. 
When users create types +that implement that interface (often inadvertently by embedding a message +in another struct) and pass values of those types to functions expecting +a generated message value, programs crash or behave unpredictably. + +All three of these problems have a common cause, and a common solution: +The `Message` interface should fully specify the behavior of a message, +and functions operating on `Message` values should freely accept any +type that correctly implements the interface. + +Since it is not possible to change the existing definition of the +`Message` type while keeping the package API compatible, we decided that +it was time to begin work on a new, incompatible major version of the +protobuf module. + +Today, we're pleased to release that new module. We hope you like it. + +## Reflection + +Reflection is the flagship feature of the new implementation. Similar +to how the `reflect` package provides a view of Go types and values, the +[`google.golang.org/protobuf/reflect/protoreflect`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc) +package provides a view of values according to the protocol buffer +type system. + +A complete description of the `protoreflect` package would run too long +for this post, but let's look at how we might write the log-scrubbing +function we mentioned previously. + +First, we'll write a `.proto` file defining an extension of the +[`google.protobuf.FieldOptions`](https://github.com/protocolbuffers/protobuf/blob/b96241b1b716781f5bc4dc25e1ebb0003dfaba6a/src/google/protobuf/descriptor.proto#L509) +type so we can annotate fields as containing +sensitive information or not. + + syntax = "proto3"; + import "google/protobuf/descriptor.proto"; + package golang.example.policy; + extend google.protobuf.FieldOptions { + bool non_sensitive = 50000; + } + +We can use this option to mark certain fields as non-sensitive. 
+ + message MyMessage { + string public_name = 1 [(golang.example.policy.non_sensitive) = true]; + } + +Next, we will write a Go function which accepts an arbitrary message +value and removes all the sensitive fields. + + // Redact clears every sensitive field in pb. + func Redact(pb proto.Message) { + // ... + } + +This function accepts a +[`proto.Message`](https://pkg.go.dev/google.golang.org/protobuf/proto?tab=doc#Message), +an interface type implemented by all generated message types. This type +is an alias for one defined in the `protoreflect` package: + + type ProtoMessage interface{ + ProtoReflect() Message + } + +To avoid filling up the namespace of generated +messages, the interface contains only a single method returning a +[`protoreflect.Message`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc#Message), +which provides access to the message contents. + +(Why an alias? Because `protoreflect.Message` has a corresponding +method returning the original `proto.Message`, and we need to avoid an +import cycle between the two packages.) + +The +[`protoreflect.Message.Range`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc#Message.Range) +method calls a function for every populated field in a message. + + m := pb.ProtoReflect() + m.Range(func(fd protoreflect.FieldDescriptor, v protoreflect.Value) bool { + // ... + return true + }) + +The range function is called with a +[`protoreflect.FieldDescriptor`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc#FieldDescriptor) +describing the protocol buffer type of the field, and a +[`protoreflect.Value`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc#Value) +containing the field value. + +The +[`protoreflect.FieldDescriptor.Options`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc#Descriptor.Options) +method returns the field options as a `google.protobuf.FieldOptions` +message. 
+ + opts := fd.Options().(*descriptorpb.FieldOptions) + +(Why the type assertion? Since the generated `descriptorpb` package +depends on `protoreflect`, the `protoreflect` package can't return the +concrete options type without causing an import cycle.) + +We can then check the options to see the value of our extension boolean: + + if proto.GetExtension(opts, policypb.E_NonSensitive).(bool) { + return true // don't redact non-sensitive fields + } + +Note that we are looking at the field _descriptor_ here, not the field +_value_. The information we're interested in lies in the protocol +buffer type system, not the Go one. + +This is also an example of an area where we +have simplified the `proto` package API. The original +[`proto.GetExtension`](https://pkg.go.dev/github.com/golang/protobuf/proto?tab=doc#GetExtension) +returned both a value and an error. The new +[`proto.GetExtension`](https://pkg.go.dev/google.golang.org/protobuf/proto?tab=doc#GetExtension) +returns just a value, returning the default value for the field if it is +not present. Extension decoding errors are reported at `Unmarshal` time. + +Once we have identified a field that needs redaction, clearing it is simple: + + m.Clear(fd) + +Putting all the above together, our complete redaction function is: + + // Redact clears every sensitive field in pb. + func Redact(pb proto.Message) { + m := pb.ProtoReflect() + m.Range(func(fd protoreflect.FieldDescriptor, v protoreflect.Value) bool { + opts := fd.Options().(*descriptorpb.FieldOptions) + if proto.GetExtension(opts, policypb.E_NonSensitive).(bool) { + return true + } + m.Clear(fd) + return true + }) + } + +A more complete implementation might recursively descend into +message-valued fields. We hope that this simple example gives a +taste of protocol buffer reflection and its uses. + +## Versions + +We call the original version of Go protocol buffers APIv1, and the +new one APIv2. 
Because APIv2 is not backwards compatible with APIv1, +we need to use different module paths for each. + +(These API versions are not the same as the versions of the protocol +buffer language: `proto1`, `proto2`, and `proto3`. APIv1 and APIv2 +are concrete implementations in Go that both support the `proto2` and +`proto3` language versions.) + +The +[`github.com/golang/protobuf`](https://pkg.go.dev/github.com/golang/protobuf?tab=overview) +module is APIv1. + +The +[`google.golang.org/protobuf`](https://pkg.go.dev/google.golang.org/protobuf?tab=overview) +module is APIv2. We have taken advantage of the need to change the +import path to switch to one that is not tied to a specific hosting +provider. (We considered `google.golang.org/protobuf/v2`, to make it +clear that this is the second major version of the API, but settled on +the shorter path as being the better choice in the long term.) + +We know that not all users will move to a new major version of a package +at the same rate. Some will switch quickly; others may remain on the old +version indefinitely. Even within a single program, some parts may use +one API while others use another. It is essential, therefore, that we +continue to support programs that use APIv1. + + - `github.com/golang/protobuf@v1.3.4` is the most recent pre-APIv2 version of APIv1. + + - `github.com/golang/protobuf@v1.4.0` is a version of APIv1 implemented in terms of APIv2. + The API is the same, but the underlying implementation is backed by the new one. + This version contains functions to convert between the APIv1 and APIv2 `proto.Message` + interfaces to ease the transition between the two. + + - `google.golang.org/protobuf@v1.20.0` is APIv2. + This module depends upon `github.com/golang/protobuf@v1.4.0`, + so any program which uses APIv2 will automatically pick a version of APIv1 + which integrates with it. + +(Why start at version `v1.20.0`? To provide clarity. 
+We do not expect APIv1 to ever reach `v1.20.0`, +so the version number alone should be enough to unambiguously differentiate +between APIv1 and APIv2.) + +We intend to maintain support for APIv1 indefinitely. + +This organization ensures that any given program will use only a single +protocol buffer implementation, regardless of which API version it uses. +It permits programs to adopt the new API gradually, or not at all, while +still gaining the advantages of the new implementation. The principle of +minimal version selection means that programs may remain on the old +implementation until the maintainers choose to update to the new one +(either directly, or by updating a dependency). + +## Additional features of note + +The +[`google.golang.org/protobuf/encoding/protojson`](https://pkg.go.dev/google.golang.org/protobuf/encoding/protojson) +package converts protocol buffer messages to and from JSON using the +[canonical JSON mapping](https://developers.google.com/protocol-buffers/docs/proto3#json), +and fixes a number of issues with the old `jsonpb` package +that were difficult to change without causing problems for existing users. + +The +[`google.golang.org/protobuf/types/dynamicpb`](https://pkg.go.dev/google.golang.org/protobuf/types/dynamicpb) +package provides an implementation of `proto.Message` for messages whose +protocol buffer type is derived at runtime. + +The +[`google.golang.org/protobuf/testing/protocmp`](https://pkg.go.dev/google.golang.org/protobuf/testing/protocmp) +package provides functions to compare protocol buffer messages with the +[`github.com/google/go-cmp/cmp`](https://pkg.go.dev/github.com/google/go-cmp/cmp) +package. + +The +[`google.golang.org/protobuf/compiler/protogen`](https://pkg.go.dev/google.golang.org/protobuf/compiler/protogen?tab=doc) +package provides support for writing protocol compiler plugins.
+ +## Conclusion + +The `google.golang.org/protobuf` module is a major overhaul of +Go's support for protocol buffers, providing first-class support +for reflection, custom message implementations, and a cleaned-up API +surface. We intend to maintain the previous API indefinitely as a wrapper +of the new one, allowing users to adopt the new API incrementally at +their own pace. + +Our goal in this update is to improve upon the benefits of the old +API while addressing its shortcomings. As we completed each component of +the new implementation, we put it into use within Google's codebase. This +incremental rollout has given us confidence in both the usability of the new +API and the performance and correctness of the new implementation. We believe +it is production ready. + +We are excited about this release and hope that it will serve the Go +ecosystem for the next ten years and beyond!