Go: Should I use a pointer instead of a copy of my structure?

image
Illustration created for A Journey With Go from an original gopher created by Renee French.

For many Go developers, systematically sharing structures through pointers, rather than copying the structure itself, seems like the best option in terms of performance. To understand the effect of using a pointer instead of a copy of the structure, we will consider two use cases.

Intensive data distribution


Let's look at a simple example where you want to share a structure in order to read its values:

type S struct {
  a, b, c int64
  d, e, f string
  g, h, i float64
}

This basic structure can be shared either by copy or by pointer:

func byCopy() S {
  return S{
     a: 1, b: 1, c: 1,
     e: "foo", f: "foo",
     g: 1.0, h: 1.0, i: 1.0,
  }
}

func byPointer() *S {
  return &S{
     a: 1, b: 1, c: 1,
     e: "foo", f: "foo",
     g: 1.0, h: 1.0, i: 1.0,
  }
}

Based on these two functions, we can write two benchmarks. The first passes the structure by copy:

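func BenchmarkMemoryStack(b *testing.B) {
  var s S

  // Record an execution trace to stack.out for later inspection.
  f, err := os.Create("stack.out")
  if err != nil {
     panic(err)
  }
  defer f.Close()

  err = trace.Start(f)
  if err != nil {
     panic(err)
  }

  // The structure is returned by value, so each iteration copies it.
  for i := 0; i < b.N; i++ {
     s = byCopy()
  }

  trace.Stop()

  b.StopTimer()

  // Read s so that the variable counts as used.
  _ = fmt.Sprintf("%v", s.a)
}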

The second, very similar to the first, passes the structure by pointer:

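func BenchmarkMemoryHeap(b *testing.B) {
  var s *S

  // Record an execution trace to heap.out for later inspection.
  f, err := os.Create("heap.out")
  if err != nil {
     panic(err)
  }
  defer f.Close()

  err = trace.Start(f)
  if err != nil {
     panic(err)
  }

  // Only the address of the structure is returned on each iteration.
  for i := 0; i < b.N; i++ {
     s = byPointer()
  }

  trace.Stop()

  b.StopTimer()

  // Read s so that the variable counts as used.
  _ = fmt.Sprintf("%v", s.a)
}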

Let's run the benchmarks:

go test ./... -bench=BenchmarkMemoryHeap -benchmem -run=^$ -count=10 > head.txt && benchstat head.txt
go test ./... -bench=BenchmarkMemoryStack -benchmem -run=^$ -count=10 > stack.txt && benchstat stack.txt

We get the following statistics:

name          time/op
MemoryHeap-4  75.0ns ± 5%

name          alloc/op
MemoryHeap-4   96.0B ± 0%

name          allocs/op
MemoryHeap-4    1.00 ± 0%

------------------

name           time/op
MemoryStack-4  8.93ns ± 4%

name           alloc/op
MemoryStack-4   0.00B

name           allocs/op
MemoryStack-4    0.00

Using a copy of the structure was about 8 times faster than using a pointer to it!

To understand why, let's look at the graphs generated by the trace:

image
Graph for the structure passed by copy

image
Graph for the structure passed by pointer

The first graph is pretty simple. Since the heap is not used, there is no garbage collector activity and there are no extra goroutines.

In the second case, using pointers causes the Go compiler to place the variable on the heap, which puts the garbage collector to work. If we zoom in on the graph, we see that the garbage collector takes up a significant part of the process:

image

This graph shows that the garbage collector starts every 4 ms.
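
The same effect can be observed without a trace by counting completed garbage collection cycles. The following is a minimal sketch, not the method used for the graphs above: the package-level `sink` variable and the helpers `gcCyclesDuring` and `allocateOnHeap` are hypothetical names introduced here, and the exact number of cycles depends on the machine and the Go version.

```go
package main

import (
	"fmt"
	"runtime"
)

// S mirrors the structure used in the article.
type S struct {
	a, b, c int64
	d, e, f string
	g, h, i float64
}

// sink is a package-level variable (an assumption of this sketch).
// Storing a *S in it forces the structure to be allocated on the heap.
var sink *S

// gcCyclesDuring reports how many garbage collection cycles completed
// while fn was running, using the NumGC counter from runtime.MemStats.
func gcCyclesDuring(fn func()) uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	fn()
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

// allocateOnHeap creates n structures whose addresses escape to the heap.
func allocateOnHeap(n int) {
	for j := 0; j < n; j++ {
		sink = &S{a: 1, b: 1, c: 1, e: "foo", f: "foo", g: 1.0, h: 1.0, i: 1.0}
	}
}

func main() {
	cycles := gcCyclesDuring(func() { allocateOnHeap(1_000_000) })
	fmt.Println("GC cycles while allocating on the heap:", cycles)
}
```

With the default garbage collector settings, the count should be well above zero, since every iteration leaves another unreachable structure behind for the collector to reclaim.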

If we zoom in again, we can get detailed information about what exactly is happening:

image

The blue, pink and red stripes are the phases of the garbage collector, and the brown ones are associated with allocation on the heap (marked “runtime.bgsweep” on the graph):

Sweeping is the reclaiming of memory associated with values in heap memory that were not marked as in use. This activity occurs when goroutines attempt to allocate new values in heap memory. The latency of sweeping is added to the cost of performing an allocation in heap memory and is not tied to any latencies associated with garbage collection.

www.ardanlabs.com/blog/2018/12/garbage-collection-in-go-part1-semantics.html

Even if this example is a bit extreme, it shows how expensive it can be to allocate a variable on the heap rather than on the stack. In our example, allocating the structure on the stack and copying it is much faster than creating it on the heap and sharing its address.
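
To check where a value ends up without running a full benchmark, the allocations per call can be counted with `testing.AllocsPerRun` from the standard library. This is a small sketch under a few assumptions: `valueSink` and `pointerSink` are hypothetical package-level variables added so that the results are actually stored, and `allocsByCopy` and `allocsByPointer` are helper names made up for this example.

```go
package main

import (
	"fmt"
	"testing"
)

// S mirrors the structure used in the article.
type S struct {
	a, b, c int64
	d, e, f string
	g, h, i float64
}

func byCopy() S {
	return S{a: 1, b: 1, c: 1, e: "foo", f: "foo", g: 1.0, h: 1.0, i: 1.0}
}

func byPointer() *S {
	return &S{a: 1, b: 1, c: 1, e: "foo", f: "foo", g: 1.0, h: 1.0, i: 1.0}
}

// Hypothetical sinks: storing the results in package-level variables
// keeps the calls from being optimized away and, for the pointer,
// guarantees that the structure escapes to the heap.
var (
	valueSink   S
	pointerSink *S
)

// allocsByCopy returns the average number of heap allocations per call
// when the structure is returned by value.
func allocsByCopy() float64 {
	return testing.AllocsPerRun(1000, func() { valueSink = byCopy() })
}

// allocsByPointer returns the average number of heap allocations per call
// when the structure is returned by pointer.
func allocsByPointer() float64 {
	return testing.AllocsPerRun(1000, func() { pointerSink = byPointer() })
}

func main() {
	fmt.Println("allocations per call, by copy:   ", allocsByCopy())    // prints 0
	fmt.Println("allocations per call, by pointer:", allocsByPointer()) // prints 1
}
```

These counts match the allocs/op column reported by the benchmarks: no allocation for the copy, one allocation for the pointer.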

If you are not familiar with the stack and the heap and want to know more about their internal details, you can find a lot of information on the Internet, for example, this article by Paul Gribble.

Things get even worse if we limit the program to a single processor with GOMAXPROCS=1:

name        time/op
MemoryHeap  114ns ± 4%

name        alloc/op
MemoryHeap  96.0B ± 0%

name        allocs/op
MemoryHeap   1.00 ± 0%

------------------

name         time/op
MemoryStack  8.77ns ± 5%

name         alloc/op
MemoryStack   0.00B

name         allocs/op
MemoryStack    0.00

While the benchmark that allocates on the stack did not change, the heap benchmark slowed down from 75 ns/op to 114 ns/op.
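
With a single processor, the garbage collector has to share that processor with the goroutine doing the allocations. A comparison of this kind can also be run from an ordinary program with `testing.Benchmark` and `runtime.GOMAXPROCS`. The sketch below is only an approximation of the benchmark above: it omits the trace, and `sink` and `benchHeapWithProcs` are hypothetical names. No expected timings are given because they vary from one machine to another.

```go
package main

import (
	"fmt"
	"runtime"
	"testing"
)

// S mirrors the structure used in the article.
type S struct {
	a, b, c int64
	d, e, f string
	g, h, i float64
}

// sink is a hypothetical package-level variable that keeps each
// structure reachable through a pointer, forcing a heap allocation.
var sink *S

func byPointer() *S {
	return &S{a: 1, b: 1, c: 1, e: "foo", f: "foo", g: 1.0, h: 1.0, i: 1.0}
}

// benchHeapWithProcs runs the heap benchmark with GOMAXPROCS set to procs
// and restores the previous setting before returning.
func benchHeapWithProcs(procs int) testing.BenchmarkResult {
	prev := runtime.GOMAXPROCS(procs)
	defer runtime.GOMAXPROCS(prev)

	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = byPointer()
		}
	})
}

func main() {
	// Each result prints the iteration count and the time per operation.
	fmt.Println("GOMAXPROCS=1:     ", benchHeapWithProcs(1))
	fmt.Println("GOMAXPROCS=NumCPU:", benchHeapWithProcs(runtime.NumCPU()))
}
```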

Intensive function calls


We will add two empty methods to our structure and adapt our benchmarks a little:

func (s S) stack(s1 S) {}

func (s *S) heap(s1 *S) {}

The benchmark with placement on the stack will create the structure and pass it by copy:

func BenchmarkMemoryStack(b *testing.B) {
  var s S
  var s1 S

  s = byCopy()
  s1 = byCopy()
  for i := 0; i < b.N; i++ {
     for j := 0; j < 1000000; j++ {
        s.stack(s1)
     }
  }
}

And the benchmark for the heap will pass the structure by pointer:

func BenchmarkMemoryHeap(b *testing.B) {
  var s *S
  var s1 *S

  s = byPointer()
  s1 = byPointer()
  for i := 0; i < b.N; i++ {
     for j := 0; j < 1000000; j++ {
        s.heap(s1)
     }
  }
}

As expected, the results are completely different now:

name          time/op
MemoryHeap-4  301µs ± 4%

name          alloc/op
MemoryHeap-4  0.00B

name          allocs/op
MemoryHeap-4   0.00

------------------

name           time/op
MemoryStack-4  595µs ± 2%

name           alloc/op
MemoryStack-4  0.00B

name           allocs/op
MemoryStack-4   0.00

Conclusion


Using a pointer instead of a copy of the structure in Go is not always a good idea. To choose good semantics for your data, I highly recommend reading the post on value/pointer semantics written by Bill Kennedy. It will give you a better idea of the strategies you can use with your structures and built-in types. In addition, profiling memory usage will definitely help you understand what is happening with your allocations and heap.
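
Beyond performance, the choice also changes behavior. As a brief illustration of the difference between value and pointer semantics, here is a sketch with a hypothetical `counter` type that is not part of the benchmarks above: a method with a value receiver works on a copy, while a method with a pointer receiver modifies the original.

```go
package main

import "fmt"

// counter is a hypothetical type used only for this illustration.
type counter struct {
	n int
}

// incByValue receives a copy of the counter, so the caller's value
// is left unchanged.
func (c counter) incByValue() {
	c.n++
}

// incByPointer receives the address of the counter, so the caller's
// value is modified.
func (c *counter) incByPointer() {
	c.n++
}

func main() {
	c := counter{}

	c.incByValue()
	fmt.Println(c.n) // prints 0

	c.incByPointer()
	fmt.Println(c.n) // prints 1
}
```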
