Experiment, Simplify, Ship
01 Aug 2019
Tags: community, go2, proposals

Russ Cox

* Introduction

This is the blog post version of my talk last week at Gophercon 2019.

.iframe //www.youtube.com/embed/kNHo788oO5Y?rel=0 309 549

We are all on the path to Go 2, together, but none of us know exactly where that path leads or sometimes even which direction the path goes. This post discusses how we actually find and follow the path to Go 2. Here’s what the process looks like.

.html experiment/div-indent.html
.image experiment/expsimp1.png _ 179
.html experiment/div-end.html

We experiment with Go as it exists now, to understand it better, learning what works well and what doesn’t. Then we experiment with possible changes, to understand them better, again learning what works well and what doesn’t. Based on what we learn from those experiments, we simplify. And then we experiment again. And then we simplify again. And so on. And so on.

* The Four R’s of Simplifying

During this process, there are four main ways that we can simplify the overall experience of writing Go programs: reshaping, redefining, removing, and restricting.

*Simplify*by*Reshaping*

The first way we simplify is by reshaping what exists into a new form, one that ends up being simpler overall.

Every Go program we write serves as an experiment to test Go itself. In the early days of Go, we quickly learned that it was common to write code like this `addToList` function:

	func addToList(list []int, x int) []int {
		n := len(list)
		if n+1 > cap(list) {
			big := make([]int, n, (n+5)*2)
			copy(big, list)
			list = big
		}
		list = list[:n+1]
		list[n] = x
		return list
	}

We’d write the same code for slices of bytes, and slices of strings, and so on. Our programs were too complex, because Go was too simple.

So we took the many functions like `addToList` in our programs and reshaped them into one function provided by Go itself.
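To make the reshaping concrete, here is a minimal sketch that runs the hand-written `addToList` helper from the text side by side with the built-in `append`. The `main` function and the sample values are illustrative assumptions, not code from the talk.

```go
package main

import "fmt"

// addToList is the hand-written helper quoted in the text.
// It only works for []int; other element types need their own copy.
func addToList(list []int, x int) []int {
	n := len(list)
	if n+1 > cap(list) {
		// Grow the backing array, leaving room for future additions.
		big := make([]int, n, (n+5)*2)
		copy(big, list)
		list = big
	}
	list = list[:n+1]
	list[n] = x
	return list
}

func main() {
	var byHand, builtin []int
	for i := 0; i < 5; i++ {
		byHand = addToList(byHand, i)
		builtin = append(builtin, i) // same result, provided by the language
	}
	fmt.Println(byHand, builtin)

	// The one built-in function also covers other element types,
	// with no per-type helper to write.
	var names []string
	names = append(names, "go", "gofmt")
	fmt.Println(names)
}
```

The growth policy of `append` is an implementation detail and need not match the `(n+5)*2` rule in the helper; only the resulting slice contents are comparable.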
Adding `append` made the Go language a little more complex, but on balance it made the overall experience of writing Go programs simpler, even after accounting for the cost of learning about `append`.

Here’s another example. For Go 1, we looked at the very many development tools in the Go distribution, and we reshaped them into one new command.

	5a      8g
	5g      8l
	5l      cgo
	6a      gobuild
	6cov    gofix         →     go
	6g      goinstall
	6l      gomake
	6nm     gopack
	8a      govet

The `go` command is so central now that it is easy to forget that we went so long without it and how much extra work that involved.

We added code and complexity to the Go distribution, but on balance we simplified the experience of writing Go programs. The new structure also created space for other interesting experiments, which we’ll see later.

*Simplify*by*Redefining*

A second way we simplify is by redefining functionality we already have, allowing it to do more. Like simplifying by reshaping, simplifying by redefining makes programs simpler to write, but now with nothing new to learn.

For example, `append` was originally defined to read only from slices. When appending to a byte slice, you could append the bytes from another byte slice, but not the bytes from a string. We redefined append to allow appending from a string, without adding anything new to the language.

	var b []byte
	var more []byte
	b = append(b, more...) // ok

	var b []byte
	var more string
	b = append(b, more...) // ok later

*Simplify*by*Removing*

A third way we simplify is by removing functionality when it has turned out to be less useful or less important than we expected. Removing functionality means one less thing to learn, one less thing to fix bugs in, one less thing to be distracted by or use incorrectly. Of course, removing also forces users to update existing programs, perhaps making them more complex, to make up for the removal. But the overall result can still be that the process of writing Go programs becomes simpler.
An example of this is when we removed the boolean forms of non-blocking channel operations from the language:

	ok := c <- x  // before Go 1, was non-blocking send
	x, ok := <-c  // before Go 1, was non-blocking receive

These operations were also possible to do using `select`, making it confusing to need to decide which form to use. Removing them simplified the language without reducing its power.

*Simplify*by*Restricting*

We can also simplify by restricting what is allowed. From day one, Go has restricted the encoding of Go source files: they must be UTF-8. This restriction makes every program that tries to read Go source files simpler. Those programs don’t have to worry about Go source files encoded in Latin-1 or UTF-16 or UTF-7 or anything else.

Another important restriction is `gofmt` for program formatting. Nothing rejects Go code that isn’t formatted using `gofmt`, but we have established a convention that tools that rewrite Go programs leave them in `gofmt` form. If you keep your programs in `gofmt` form too, then these rewriters don’t make any formatting changes. When you compare before and after, the only diffs you see are real changes. This restriction has simplified program rewriters and led to successful experiments like `goimports`, `gorename`, and many others.

* Go Development Process

This cycle of experiment and simplify is a good model for what we’ve been doing the past ten years, but it has a problem: it’s too simple. We can’t only experiment and simplify.

We have to ship the result. We have to make it available to use. Of course, using it enables more experiments, and possibly more simplifying, and the process cycles on and on.

.html experiment/div-indent.html
.image experiment/expsimp2.png _ 326
.html experiment/div-end.html

We shipped Go to all of you for the first time on November 10, 2009. Then, with your help, we shipped Go 1 together in March 2012. And we’ve shipped twelve Go releases since then.
All of these were important milestones, to enable more experimentation, to help us learn more about Go, and of course to make Go available for production use.

When we shipped Go 1, we explicitly shifted our focus to using Go, to understand this version of the language much better before trying any more simplifications involving language changes. We needed to take time to experiment, to really understand what works and what doesn’t.

Of course, we’ve had twelve releases since Go 1, so we have still been experimenting and simplifying and shipping. But we’ve focused on ways to simplify Go development without significant language changes and without breaking existing Go programs. For example, Go 1.5 shipped the first concurrent garbage collector and then the following releases improved it, simplifying Go development by removing pause times as an ongoing concern.

At Gophercon in 2017, we announced that after five years of experimentation, it was again time to think about significant changes that would simplify Go development. Our path to Go 2 is really the same as the path to Go 1: experiment and simplify and ship, towards an overall goal of simplifying Go development.

For Go 2, the concrete topics that we believed were most important to address are error handling, generics, and dependencies. Since then we have realized that another important topic is developer tooling.

The rest of this post discusses how our work in each of these areas follows that path. Along the way, we’ll take one detour, stopping to inspect the technical detail of what will be shipping soon in Go 1.13 for error handling.

* Errors

It is hard enough to write a program that works the right way in all cases when all the inputs are valid and correct and nothing the program depends on is failing. When you add errors into the mix, writing a program that works the right way no matter what goes wrong is even harder.
As part of thinking about Go 2, we want to understand better whether Go can help make that job any simpler.

There are two different aspects that could potentially be simplified: error values and error syntax. We’ll look at each in turn, with the technical detour I promised focusing on the Go 1.13 error value changes.

*Error*Values*

Error values had to start somewhere. Here is the `Read` function from the first version of the `os` package:

	export func Read(fd int64, b *[]byte) (ret int64, errno int64) {
		r, e := syscall.read(fd, &b[0], int64(len(b)));
		return r, e
	}

There was no `File` type yet, and also no error type. `Read` and the other functions in the package returned an `errno`int64` directly from the underlying Unix system call.

This code was checked in on September 10, 2008 at 12:14pm. Like everything back then, it was an experiment, and code changed quickly. Two hours and five minutes later, the API changed:

	export type Error struct { s string }

	func (e *Error) Print() { … } // to standard error!
	func (e *Error) String() string { … }

	export func Read(fd int64, b *[]byte) (ret int64, err *Error) {
		r, e := syscall.read(fd, &b[0], int64(len(b)));
		return r, ErrnoToError(e)
	}

This new API introduced the first `Error` type. An error held a string and could return that string and also print it to standard error.

The intent here was to generalize beyond integer codes. We knew from past experience that operating system error numbers were too limited a representation, that it would simplify programs not to have to shoehorn all detail about an error into 64 bits. Using error strings had worked reasonably well for us in the past, so we did the same here. This new API lasted seven months.

The next April, after more experience using interfaces, we decided to generalize further and allow user-defined error implementations, by making the `os.Error` type itself an interface. We simplified by removing the `Print` method.
For Go 1 two years later, based on a suggestion by Roger Peppe, `os.Error` became the built-in `error` type, and the `String` method was renamed to `Error`. Nothing has changed since then. But we have written many Go programs, and as a result we have experimented a lot with how best to implement and use errors.

*Errors*Are*Values*

Making `error` a simple interface and allowing many different implementations means we have the entire Go language available to define and inspect errors. We like to say that [[https://blog.golang.org/errors-are-values][errors are values]], the same as any other Go value.

Here’s an example. On Unix, an attempt to dial a network connection ends up using the `connect` system call. That system call returns a `syscall.Errno`, which is a named integer type that represents a system call error number and implements the `error` interface:

	package syscall

	type Errno int64

	func (e Errno) Error() string { ... }

	const ECONNREFUSED = Errno(61)

	    ... err == ECONNREFUSED ...

The `syscall` package also defines named constants for the host operating system’s defined error numbers. In this case, on this system, `ECONNREFUSED` is number 61. Code that gets an error from a function can test whether the error is `ECONNREFUSED` using ordinary [[https://golang.org/ref/spec#Comparison_operators][value equality]].

Moving up a level, in package `os`, any system call failure is reported using a larger error structure that records what operation was attempted in addition to the error. There are a handful of these structures.
This one, `SyscallError`, describes an error invoking a specific system call with no additional information recorded:

	package os

	type SyscallError struct {
		Syscall string
		Err     error
	}

	func (e *SyscallError) Error() string {
		return e.Syscall + ": " + e.Err.Error()
	}

Moving up another level, in package `net`, any network failure is reported using an even larger error structure that records the details of the surrounding network operation, such as dial or listen, and the network and addresses involved:

	package net

	type OpError struct {
		Op     string
		Net    string
		Source Addr
		Addr   Addr
		Err    error
	}

	func (e *OpError) Error() string { ... }

Putting these together, the errors returned by operations like `net.Dial` can format as strings, but they are also structured Go data values. In this case, the error is a `net.OpError`, which adds context to an `os.SyscallError`, which adds context to a `syscall.Errno`:

	c, err := net.Dial("tcp", "localhost:50001")

	// "dial tcp [::1]:50001: connect: connection refused"

	err is &net.OpError{
		Op:   "dial",
		Net:  "tcp",
		Addr: &net.TCPAddr{IP: ParseIP("::1"), Port: 50001},
		Err: &os.SyscallError{
			Syscall: "connect",
			Err:     syscall.Errno(61), // == ECONNREFUSED
		},
	}

When we say errors are values, we mean both that the entire Go language is available to define them and also that the entire Go language is available to inspect them.

Here is an example from package net. It turns out that when you attempt a socket connection, most of the time you will get connected or get connection refused, but sometimes you can get a spurious `EADDRNOTAVAIL`, for no good reason. Go shields user programs from this failure mode by retrying. To do this, it has to inspect the error structure to find out whether the `syscall.Errno` deep inside is `EADDRNOTAVAIL`.
Here is the code:

	func spuriousENOTAVAIL(err error) bool {
		if op, ok := err.(*OpError); ok {
			err = op.Err
		}
		if sys, ok := err.(*os.SyscallError); ok {
			err = sys.Err
		}
		return err == syscall.EADDRNOTAVAIL
	}

A [[https://golang.org/ref/spec#Type_assertions][type assertion]] peels away any `net.OpError` wrapping. And then a second type assertion peels away any `os.SyscallError` wrapping. And then the function checks the unwrapped error for equality with `EADDRNOTAVAIL`.

What we’ve learned from years of experience, from this experimenting with Go errors, is that it is very powerful to be able to define arbitrary implementations of the `error` interface, to have the full Go language available both to construct and to deconstruct errors, and not to require the use of any single implementation.

These properties, that errors are values and that there is not one required error implementation, are important to preserve.

Not mandating one error implementation enabled everyone to experiment with additional functionality that an error might provide, leading to many packages, such as
[[https://godoc.org/github.com/pkg/errors][github.com/pkg/errors]],
[[https://godoc.org/gopkg.in/errgo.v2][gopkg.in/errgo.v2]],
[[https://godoc.org/github.com/hashicorp/errwrap][github.com/hashicorp/errwrap]],
[[https://godoc.org/upspin.io/errors][upspin.io/errors]],
[[https://godoc.org/github.com/spacemonkeygo/errors][github.com/spacemonkeygo/errors]],
and more.

One problem with unconstrained experimentation, though, is that as a client you have to program to the union of all the possible implementations you might encounter. A simplification that seemed worth exploring for Go 2 was to define a standard version of commonly-added functionality, in the form of agreed-upon optional interfaces, so that different implementations could interoperate.

*Unwrap*

The most commonly-added functionality in these packages is some method that can be called to remove context from an error, returning the error inside.
Packages use different names and meanings for this operation, and sometimes it removes one level of context, while sometimes it removes as many levels as possible.

For Go 1.13, we have introduced a convention that an error implementation adding removable context to an inner error should implement an `Unwrap` method that returns the inner error, unwrapping the context. If there is no inner error appropriate to expose to callers, either the error shouldn’t have an `Unwrap` method, or the `Unwrap` method should return nil.

	// Go 1.13 optional method for error implementations.

	interface {
		// Unwrap removes one layer of context,
		// returning the inner error if any, or else nil.
		Unwrap() error
	}

The way to call this optional method is to invoke the helper function `errors.Unwrap`, which handles cases like the error itself being nil or not having an `Unwrap` method at all.

	package errors

	// Unwrap returns the result of calling
	// the Unwrap method on err,
	// if err’s type defines an Unwrap method.
	// Otherwise, Unwrap returns nil.
	func Unwrap(err error) error

We can use the `Unwrap` method to write a simpler, more general version of `spuriousENOTAVAIL`. Instead of looking for specific error wrapper implementations like `net.OpError` or `os.SyscallError`, the general version can loop, calling `Unwrap` to remove context, until either it reaches `EADDRNOTAVAIL` or there’s no error left:

	func spuriousENOTAVAIL(err error) bool {
		for err != nil {
			if err == syscall.EADDRNOTAVAIL {
				return true
			}
			err = errors.Unwrap(err)
		}
		return false
	}

This loop is so common, though, that Go 1.13 defines a second function, `errors.Is`, that repeatedly unwraps an error looking for a specific target.
So we can replace the entire loop with a single call to `errors.Is`:

	func spuriousENOTAVAIL(err error) bool {
		return errors.Is(err, syscall.EADDRNOTAVAIL)
	}

At this point we probably wouldn’t even define the function; it would be equally clear, and simpler, to call `errors.Is` directly at the call sites.

Go 1.13 also introduces a function `errors.As` that unwraps until it finds a specific implementation type.

If you want to write code that works with arbitrarily-wrapped errors, `errors.Is` is the wrapper-aware version of an error equality check:

	err == target

	    →

	errors.Is(err, target)

And `errors.As` is the wrapper-aware version of an error type assertion:

	target, ok := err.(*Type)
	if ok {
		...
	}

	    →

	var target *Type
	if errors.As(err, &target) {
		...
	}

*To*Unwrap*Or*Not*To*Unwrap?*

Whether to make it possible to unwrap an error is an API decision, the same way that whether to export a struct field is an API decision. Sometimes it is appropriate to expose that detail to calling code, and sometimes it isn’t. When it is, implement Unwrap. When it isn’t, don’t implement Unwrap.

Until now, `fmt.Errorf` has not exposed an underlying error formatted with `%v` to caller inspection. That is, the result of `fmt.Errorf` has not been possible to unwrap. Consider this example:

	// errors.Unwrap(err2) == nil
	// err1 is not available (same as earlier Go versions)
	err2 := fmt.Errorf("connect: %v", err1)

If `err2` is returned to a caller, that caller has never had any way to open up `err2` and access `err1`. We preserved that property in Go 1.13.

For the times when you do want to allow unwrapping the result of `fmt.Errorf`, we also added a new printing verb `%w`, which formats like `%v`, requires an error value argument, and makes the resulting error’s `Unwrap` method return that argument.
In our example, suppose we replace `%v` with `%w`:

	// errors.Unwrap(err4) == err3
	// (%w is new in Go 1.13)
	err4 := fmt.Errorf("connect: %w", err3)

Now, if `err4` is returned to a caller, the caller can use `Unwrap` to retrieve `err3`.

It is important to note that absolute rules like “always use `%v` (or never implement `Unwrap`)” or “always use `%w` (or always implement `Unwrap`)” are as wrong as absolute rules like “never export struct fields” or “always export struct fields.” Instead, the right decision depends on whether callers should be able to inspect and depend on the additional information that using `%w` or implementing `Unwrap` exposes.

As an illustration of this point, every error-wrapping type in the standard library that already had an exported `Err` field now also has an `Unwrap` method returning that field, but implementations with unexported error fields do not, and existing uses of `fmt.Errorf` with `%v` still use `%v`, not `%w`.

*Error*Value*Printing*(Abandoned)*

Along with the design draft for Unwrap, we also published a [[https://golang.org/design/go2draft-error-printing][design draft for an optional method for richer error printing]], including stack frame information and support for localized, translated errors.

	// Optional method for error implementations
	type Formatter interface {
		Format(p Printer) (next error)
	}

	// Interface passed to Format
	type Printer interface {
		Print(args ...interface{})
		Printf(format string, args ...interface{})
		Detail() bool
	}

This one is not as simple as `Unwrap`, and I won’t go into the details here. As we discussed the design with the Go community over the winter, we learned that the design wasn’t simple enough. It was too hard for individual error types to implement, and it did not help existing programs enough. On balance, it did not simplify Go development.

As a result of this community discussion, we abandoned this printing design.

*Error*Syntax*

That was error values.
Let’s look briefly at error syntax, another abandoned experiment.

Here is some code from [[https://go.googlesource.com/go/+/go1.12/src/compress/lzw/writer.go#209][`compress/lzw/writer.go`]] in the standard library:

	// Write the savedCode if valid.
	if e.savedCode != invalidCode {
		if err := e.write(e, e.savedCode); err != nil {
			return err
		}
		if err := e.incHi(); err != nil && err != errOutOfCodes {
			return err
		}
	}

	// Write the eof code.
	eof := uint32(1)<