Yesterday I got a strange race condition reported after I wrote some additional tests for pgzip. At first I couldn’t really work out why I was getting the race reported, and I suspected that the race detector was reporting wrong line numbers. The race reported was boiled down like this:
type Writer struct {
prevTail []byte
pushedErr chan error
}
func (z *Writer) A() {
go func() {
for {
// ...
err := SomeCall()
if err != nil {
z.pushError(err) // <<<--- READ
return
}
}
}()
}
func (z *Writer) B() {
// ...
z.prevTail = c // <<<--- WRITE
}
func (z Writer) pushError(err error) {
z.pushedErr <- err
close(z.pushedErr)
}
It calls A()
and afterwards B()
, and the goroutine in A is running while B is called.
It seemed strange. The weird thing was, that I wasn't reading anything related to the write on the line. The err
variable was local for the goroutine. In case the function I called was inlined, I also checked the pushError
function, but it only did "safe" channel operations.
The z.prevTail
is not being accessed outside the calling goroutine, so this all seemed very strange to me.
However, the most experienced of you are probably already screaming the solution: We are calling the pushError()
receiver function with Writer
as a value instead of a pointer. When we call a receiver with a value we must remember that it implies a copy, which in term implies a read of all member values.
So the race detector was (of course) correct, however it was quite tricky to see the actual race, since it was hidden in a receiver copy. Unfortunately it is another blow for value receivers. I try to be 'nice' and do value receivers when i can, but sometimes it is tempting to make all of them pointer receivers.