Diffstat (limited to 'content/declaration-syntax.article')
-rw-r--r-- content/declaration-syntax.article 211
1 files changed, 211 insertions, 0 deletions
diff --git a/content/declaration-syntax.article b/content/declaration-syntax.article
new file mode 100644
index 0000000..f1d0574
--- /dev/null
+++ b/content/declaration-syntax.article
@@ -0,0 +1,211 @@
+# Go's Declaration Syntax
+7 Jul 2010
+Tags: c, syntax, ethos
+Summary: Why Go's declaration syntax doesn't look like, and is much simpler than, C's.
+OldURL: /gos-declaration-syntax
+
+Rob Pike
+
+## Introduction
+
+Newcomers to Go wonder why the declaration syntax is different from the
+tradition established in the C family.
+In this post we'll compare the two approaches and explain why Go's declarations look as they do.
+
+## C syntax
+
+First, let's talk about C syntax. C took an unusual and clever approach
+to declaration syntax.
+Instead of describing the types with special syntax,
+one writes an expression involving the item being declared,
+and states what type that expression will have. Thus
+
+ int x;
+
+declares x to be an int: the expression 'x' will have type int.
+In general, to figure out how to write the type of a new variable,
+write an expression involving that variable that evaluates to a basic type,
+then put the basic type on the left and the expression on the right.
+
+Thus, the declarations
+
+ int *p;
+ int a[3];
+
+state that p is a pointer to int because '\*p' has type int,
+and that a is an array of ints because a[3] (ignoring the particular index value,
+which is punned to be the size of the array) has type int.
+
+What about functions? Originally, C's function declarations wrote the types
+of the arguments outside the parens, like this:
+
+ int main(argc, argv)
+ int argc;
+ char *argv[];
+ { /* ... */ }
+
+Again, we see that main is a function because the expression
+main(argc, argv) returns an int.
+In modern notation we'd write
+
+ int main(int argc, char *argv[]) { /* ... */ }
+
+but the basic structure is the same.
+
+This is a clever syntactic idea that works well for simple types but can get confusing fast.
+The famous example is declaring a function pointer.
+Follow the rules and you get this:
+
+ int (*fp)(int a, int b);
+
+Here, fp is a pointer to a function because if you write the expression
+(\*fp)(a, b) you'll call a function that returns int.
+What if one of fp's arguments is itself a function?
+
+ int (*fp)(int (*ff)(int x, int y), int b)
+
+That's starting to get hard to read.
+
+Of course, we can leave out the names of the parameters when we declare a function, so main can be declared
+
+ int main(int, char *[])
+
+Recall that argv is declared like this,
+
+ char *argv[]
+
+so you drop the name from the middle of its declaration to construct its type.
+It's not obvious, though, that you declare something of type char \*[] by
+putting its name in the middle.
+
+And look what happens to fp's declaration if you don't name the parameters:
+
+ int (*fp)(int (*)(int, int), int)
+
+Not only is it not obvious where to put the name inside
+
+ int (*)(int, int)
+
+it's not exactly clear that it's a function pointer declaration at all.
+And what if the return type is a function pointer?
+
+ int (*(*fp)(int (*)(int, int), int))(int, int)
+
+It's hard even to see that this declaration is about fp.
+
+You can construct more elaborate examples but these should illustrate some
+of the difficulties that C's declaration syntax can introduce.
+
+There's one more point that needs to be made, though.
+Because type and declaration syntax are the same,
+it can be difficult to parse expressions with types in the middle.
+This is why, for instance, C casts always parenthesize the type, as in
+
+ (int)M_PI
+
+## Go syntax
+
+Languages outside the C family usually use a distinct type syntax in declarations.
+Although it's a separate point, the name usually comes first,
+often followed by a colon.
+Thus our examples above become something like (in a fictional but illustrative language)
+
+ x: int
+ p: pointer to int
+ a: array[3] of int
+
+These declarations are clear, if verbose - you just read them left to right.
+Go takes its cue from here, but in the interests of brevity it drops the
+colon and removes some of the keywords:
+
+ x int
+ p *int
+ a [3]int
+
+There is no direct correspondence between the look of [3]int and how to
+use a in an expression.
+(We'll come back to pointers in the next section.) You gain clarity at the
+cost of a separate syntax.
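+
As a minimal sketch, the three declarations above can be written as ordinary `var` statements in a complete program. The function name `describe` and the sample values are assumptions made for illustration; they are not part of the original text.

```go
package main

import "fmt"

// describe declares the three variables from the text and reports their
// types. The name describe and the value 7 are illustrative only.
func describe() string {
	var x int    // x is an int
	var p *int   // p is a pointer to int
	var a [3]int // a is an array of 3 ints

	x = 7
	p = &x
	a[2] = *p // in expressions, the brackets follow a and the * precedes p

	return fmt.Sprintf("%T %T %T %d", x, p, a, a[2])
}

func main() {
	fmt.Println(describe()) // int *int [3]int 7
}
```

The `%T` verb prints each type exactly as it was written in the declaration, which is the left-to-right reading described here.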
+
+Now consider functions. Let's transcribe the declaration for main as it would read in Go,
+although the real main function in Go takes no arguments and returns no value:
+
+ func main(argc int, argv []string) int
+
+Superficially that's not much different from C,
+other than the change from `char` arrays to strings,
+but it reads well from left to right:
+
+function main takes an int and a slice of strings and returns an int.
+
+Drop the parameter names and it's just as clear - they're always first so there's no confusion.
+
+ func main(int, []string) int
+
+One merit of this left-to-right style is how well it works as the types
+become more complex.
+Here's a declaration of a function variable (analogous to a function pointer in C):
+
+ f func(func(int,int) int, int) int
+
+Or if f returns a function:
+
+ f func(func(int,int) int, int) func(int, int) int
+
+It still reads clearly, from left to right,
+and it's always obvious which name is being declared - the name comes first.
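+
The two function types above can be exercised in a short program. This is a sketch under assumptions: the helper names `apply` and `makeScaler`, and what they compute, are invented here only to give the variables something to hold.

```go
package main

import "fmt"

// apply has the first type from the text: it takes a function and an int
// and returns an int. The name and behavior are illustrative.
func apply(op func(int, int) int, n int) int {
	return op(n, n)
}

// makeScaler has the second type: it returns a func(int, int) int.
func makeScaler(op func(int, int) int, k int) func(int, int) int {
	return func(a, b int) int { return k * op(a, b) }
}

func main() {
	add := func(a, b int) int { return a + b }

	// Each declaration reads left to right, with the name first.
	var f func(func(int, int) int, int) int = apply
	var g func(func(int, int) int, int) func(int, int) int = makeScaler

	fmt.Println(f(add, 5))       // add(5, 5) = 10
	fmt.Println(g(add, 3)(1, 2)) // 3 * add(1, 2) = 9
}
```

The explicit types on `f` and `g` are redundant in practice, since Go would infer them, but they show the full declarations in use.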
+
+The distinction between type and expression syntax makes it easy to write and invoke closures in Go:
+
+ sum := func(a, b int) int { return a+b } (3, 4)
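+
A small runnable version of that one-liner may help. The wrapper name `sumNow` is an assumption for illustration. Because the function literal is called immediately, `sum` holds the int result rather than a function value.

```go
package main

import "fmt"

// sumNow mirrors the one-liner from the text: the function literal is
// invoked on the spot, so sum is an int. The name sumNow is illustrative.
func sumNow() int {
	sum := func(a, b int) int { return a + b }(3, 4)
	return sum
}

func main() {
	fmt.Println(sumNow()) // 7
}
```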
+
+## Pointers
+
+Pointers are the exception that proves the rule.
+Notice that in arrays and slices, for instance,
+Go's type syntax puts the brackets on the left of the type but the expression
+syntax puts them on the right of the expression:
+
+ var a []int
+ x = a[1]
+
+For familiarity, Go's pointers use the \* notation from C,
+but we could not bring ourselves to make a similar reversal for pointer types.
+Thus pointers work like this
+
+ var p *int
+ x = *p
+
+We couldn't say
+
+ var p *int
+ x = p*
+
+because that postfix \* would be confused with multiplication. We could have used the Pascal ^, for example:
+
+ var p ^int
+ x = p^
+
+and perhaps we should have (and chosen another operator for xor),
+because the prefix asterisk on both types and expressions complicates things
+in a number of ways.
+For instance, although one can write
+
+ []rune("hi")
+
+as a conversion, one must parenthesize the type if it starts with a \*:
+
+ (*int)(nil)
+
+Had we been willing to give up \* as pointer syntax, those parentheses would be unnecessary.
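+
The difference between the two conversions can be seen in a short sketch. The function name `conversions` is hypothetical, and `[]rune` is used for the slice conversion because it is a conversion from string that current Go accepts.

```go
package main

import "fmt"

// conversions puts a slice conversion, which needs no extra parentheses,
// next to a pointer conversion, which does. The name is illustrative.
func conversions() (int, bool) {
	r := []rune("hi") // type starts with [, so no parentheses are needed
	p := (*int)(nil)  // type starts with *, so it must be parenthesized

	// Without the parentheses, *int(nil) would parse as *(int(nil)),
	// a dereference of a conversion, which does not compile.
	return len(r), p == nil
}

func main() {
	n, isNil := conversions()
	fmt.Println(n, isNil) // 2 true
}
```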
+
+So Go's pointer syntax is tied to the familiar C form,
+but those ties mean that we cannot break completely from using parentheses
+to disambiguate types and expressions in the grammar.
+
+Overall, though, we believe Go's type syntax is easier to understand than C's, especially when things get complicated.
+
+## Notes
+
+Go's declarations read left to right. It's been pointed out that C's read in a spiral!
+See [The "Clockwise/Spiral Rule"](http://c-faq.com/decl/spiral.anderson.html) by David Anderson.