append() sometimes creates new buffers, and sometimes it doesn’t.
In the following example, we create a slice, assign it to a second variable, and use append() on both slices:
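A minimal sketch of such an example, assuming an empty slice literal as the starting point. The wrapper function zeroCapacity is a hypothetical name, added only so the result can be inspected:

```go
package main

import "fmt"

// zeroCapacity is a hypothetical wrapper around the four-line example.
func zeroCapacity() ([]int, []int) {
	a := []int{} // length 0, capacity 0
	b := a
	a = append(a, 1) // no room in the buffer: append allocates a new one
	b = append(b, 2) // b still has capacity 0, so it gets its own new buffer too
	return a, b
}

func main() {
	a, b := zeroCapacity()
	fmt.Println(a, b) // prints: [1] [2]
}
```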
Following the calls to append(), a and b can be treated as entirely separate slices. This is fairly intuitive.
However, by tweaking the state of the initial slice, we can create entirely different behavior:
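One way to sketch this, assuming the tweak is to create the same empty slice with spare capacity via make(). The wrapper name spareCapacity is hypothetical:

```go
package main

import "fmt"

// spareCapacity is a hypothetical wrapper; only its first line differs from
// the zero-capacity version of the same sequence.
func spareCapacity() ([]int, []int) {
	a := make([]int, 0, 10) // length 0, capacity 10
	b := a
	a = append(a, 1) // fits in place: writes 1 at index 0 of the shared buffer
	b = append(b, 2) // b still has length 0, so it writes 2 at the same index
	return a, b
}

func main() {
	a, b := spareCapacity()
	fmt.Println(a, b) // prints: [2] [2]
}
```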
A few things to note here:
- The code is virtually the same. Only the first line has changed.
- The content of the original slice is the same. Only its capacity has changed.
- We have a different outcome: following the calls to append(), a and b continue to refer to the same underlying buffer.
In both examples, the assignment expression b := a creates a second slice header, one that happens to have the same buffer pointer, length, and capacity as the first. As long as neither a nor b is reassigned, they are effectively aliases of the same slice. However, if either or both are reassigned in an append expression such as a = append(a, 1), then they may or may not continue to refer to the same buffer.
A slice’s capacity determines whether append() will create a new buffer. In our trivial example, slice capacity is immediately apparent, but that won’t necessarily be the case in more complex code. You might not know where your slices are coming from: they may arrive at your code as arbitrary function arguments, or they may be created based on user input or some other external state. You also can’t predict a slice’s capacity: it can diverge from the length of the slice either explicitly (calling make() with a specific length and capacity) or implicitly (via append() or re-slicing).
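To make the divergence concrete, here is a small sketch that prints length and capacity for each of those cases. The helper lenCap is a hypothetical name. The exact capacity that append() picks when it grows a slice is left to the runtime, so the sketch only checks that it grew:

```go
package main

import "fmt"

// lenCap is a hypothetical helper returning a slice's length and capacity.
func lenCap(s []int) [2]int {
	return [2]int{len(s), cap(s)}
}

func main() {
	made := make([]int, 2, 8) // explicit: length 2, capacity 8
	fmt.Println(lenCap(made)) // prints: [2 8]

	lit := []int{1, 2, 3}
	fmt.Println(lenCap(lit[:1])) // prints: [1 3] (keeps the rest of the buffer as capacity)
	fmt.Println(lenCap(lit[1:])) // prints: [2 2] (capacity shrinks from the front)

	grown := append(lit, 4) // no room left, so append allocates a larger buffer
	fmt.Println(len(grown), cap(grown) > cap(lit)) // prints: 4 true
}
```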
Unless you write explicit logic that reads the capacity of a slice, it is not really clear at runtime whether append() will retain the underlying buffer or create a new one. In other words, if you assign a slice to a second variable, and then append() to one of them, it’s not clear whether your two slices will still refer to the same memory.
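For illustration, the kind of explicit logic this refers to might look like the following sketch. The helper fitsInPlace is a hypothetical name. It relies on the rule that append() reuses the existing buffer whenever the new elements fit within the slice's capacity:

```go
package main

import "fmt"

// fitsInPlace is a hypothetical helper: it reports whether appending n more
// elements to s would fit in the existing buffer, in which case append()
// reuses that buffer instead of allocating a new one.
func fitsInPlace(s []int, n int) bool {
	return len(s)+n <= cap(s)
}

func main() {
	tight := []int{1, 2, 3}     // length 3, capacity 3
	roomy := make([]int, 3, 10) // length 3, capacity 10

	fmt.Println(fitsInPlace(tight, 1)) // prints: false (append would allocate)
	fmt.Println(fitsInPlace(roomy, 1)) // prints: true (append would write in place)
}
```

Almost nobody writes this check in practice, which is rather the point.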
The ambiguity here is bad. Direct assignment of slice variables, such as b := a in the previous examples, starts to seem like a code smell, and it’s tempting to avoid it altogether. But there are other ways in which a slice can end up referencing the same buffer as another slice.
For example, through re-slicing:
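A sketch of the re-slicing case, with a hypothetical wrapper name resliceThenAppend. The re-sliced b is shorter than a but inherits the remainder of a's buffer as capacity, so appending to it writes into a:

```go
package main

import "fmt"

// resliceThenAppend is a hypothetical wrapper for the re-slicing case.
func resliceThenAppend() ([]int, []int) {
	a := []int{1, 2, 3}
	b := a[:2]        // shares a's buffer: length 2, capacity 3
	b = append(b, 99) // fits in place, so it overwrites a[2]
	return a, b
}

func main() {
	a, b := resliceThenAppend()
	fmt.Println(a, b) // prints: [1 2 99] [1 2 99]
}
```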
Or re-assigning the result of an append():
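A sketch of that case, assuming the results of two append() calls on the same slice are assigned to new variables. The wrapper name appendTwice is hypothetical:

```go
package main

import "fmt"

// appendTwice is a hypothetical wrapper: both results are built on a's buffer.
func appendTwice() ([]int, []int) {
	a := make([]int, 0, 4)
	b := append(a, 1) // written into a's buffer at index 0
	c := append(a, 2) // same index, same buffer: overwrites the 1 that b sees
	return b, c
}

func main() {
	b, c := appendTwice()
	fmt.Println(b, c) // prints: [2] [2]
}
```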
Or, most treacherously, obscuring the same operations with functions, whose internal behavior may or may not be immediately apparent:
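As a sketch, here are two hypothetical helpers, firstTwo and withZero, whose signatures give no hint that their results share memory with their arguments:

```go
package main

import "fmt"

// firstTwo is a hypothetical helper that re-slices its argument.
func firstTwo(s []int) []int { return s[:2] }

// withZero is a hypothetical helper that appends to its argument.
func withZero(s []int) []int { return append(s, 0) }

func main() {
	a := []int{1, 2, 3}
	b := withZero(firstTwo(a)) // quietly overwrites a[2]
	fmt.Println(a, b)          // prints: [1 2 0] [1 2 0]
}
```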
One compelling rule of thumb that emerges is to always make full copies of slices when your code really means to use copies. Copying eliminates any ambiguity as to whether two slices refer to the same memory, because they simply won’t.
There are two bits of unpleasantness here, one small and one big. The small one is that copying is ugly in Go:
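For instance, a copy via the built-in make() and copy() takes two statements. The helper name clone below is hypothetical:

```go
package main

import "fmt"

// clone is a hypothetical helper showing the make-then-copy idiom.
func clone(s []int) []int {
	c := make([]int, len(s))
	copy(c, s)
	return c
}

func main() {
	a := []int{1, 2, 3}
	b := clone(a)
	b[0] = 99         // does not touch a: b has its own buffer
	fmt.Println(a, b) // prints: [1 2 3] [99 2 3]
}
```

The one-line alternative, append([]int(nil), a...), is shorter but hardly prettier.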
The bigger issue is that it’s not always clear when you “mean” to use a copy. Consider a function join that combines a prefix with a list of suffixes:
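One plausible shape for such a function is sketched below. The signature and the string element type are assumptions, chosen only for illustration:

```go
package main

import "fmt"

// join returns one slice per suffix, each meant to hold prefix followed by
// that suffix. Signature and element type are assumed for this sketch.
func join(prefix []string, suffixes ...[]string) [][]string {
	out := make([][]string, 0, len(suffixes))
	for _, s := range suffixes {
		out = append(out, append(prefix, s...))
	}
	return out
}

func main() {
	prefix := []string{"a", "b"}
	fmt.Println(join(prefix, []string{"c"}, []string{"d"})) // prints: [[a b c] [a b d]]
}
```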
This code is nice and clean. It is also incorrect. Check out what happens when you run it on slices of identical content but varying capacity:
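A sketch of such a run, using an assumed join() that appends each suffix directly onto the prefix. The two prefixes hold the same elements; only the second has spare capacity:

```go
package main

import "fmt"

// join is the naive version under discussion (signature assumed for this
// sketch): it appends each suffix directly onto prefix.
func join(prefix []string, suffixes ...[]string) [][]string {
	out := make([][]string, 0, len(suffixes))
	for _, s := range suffixes {
		out = append(out, append(prefix, s...))
	}
	return out
}

func main() {
	tight := []string{"a", "b"} // length 2, capacity 2

	roomy := make([]string, 2, 10) // length 2, capacity 10
	copy(roomy, tight)             // same content as tight

	// Every append must allocate, so each result gets its own buffer.
	fmt.Println(join(tight, []string{"c"}, []string{"d"})) // prints: [[a b c] [a b d]]

	// Every append fits in place, so both results share roomy's buffer and
	// the second suffix overwrites the first.
	fmt.Println(join(roomy, []string{"c"}, []string{"d"})) // prints: [[a b d] [a b d]]
}
```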
The key insight here is that append(prefix, s...) does not consistently generate new buffers when the slice capacity varies, which can lead to aliasing and memory clobbering when you didn’t want it.
Since we can’t rely on a consistently behaving append(), the appropriate thing to do is to make explicit copies, which ensures that we’re operating on fresh, non-aliased buffers:
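One way to write the defensive version is sketched below, under the same assumed signature: each result is built in a freshly allocated buffer, so nothing is ever appended onto the prefix itself.

```go
package main

import "fmt"

// join builds every result in its own freshly allocated buffer (signature
// assumed for this sketch), so the capacity of prefix no longer matters.
func join(prefix []string, suffixes ...[]string) [][]string {
	out := make([][]string, 0, len(suffixes))
	for _, s := range suffixes {
		joined := make([]string, 0, len(prefix)+len(s))
		joined = append(joined, prefix...) // copies prefix into the new buffer
		joined = append(joined, s...)
		out = append(out, joined)
	}
	return out
}

func main() {
	tight := []string{"a", "b"} // length 2, capacity 2

	roomy := make([]string, 2, 10) // length 2, capacity 10
	copy(roomy, tight)

	fmt.Println(join(tight, []string{"c"}, []string{"d"})) // prints: [[a b c] [a b d]]
	fmt.Println(join(roomy, []string{"c"}, []string{"d"})) // prints: [[a b c] [a b d]]
}
```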
With explicit copies in place, we get what we expected, whatever the capacity of the prefix.
I find the corrected version to be significantly uglier than the original, and not just ugly, but unpleasantly subtle, in a way that feels defensive rather than expressive. When you get used to it, you just have to shrug: the Go slice design seems like a best-of-evils compromise between high-level convenience (try doing a re-allocating append operation in pure C) and lower-level idioms (try using slices to modify the same buffer in Ruby).
As far as list abstractions go, Go slices are pretty leaky, and are probably best thought of as a thin film of syntactic sugar over a memory buffer. In other words, you really should know what’s going on internally before using them. Go’s own documentation admits to as much, and the official Go blog’s “Go slices: usage and internals” ought to be considered required reading for anyone using the language.