The case of the subtle race condition

Yesterday I got a strange race condition reported after I wrote some additional tests for pgzip. At first I couldn’t really work out why I was getting the race reported, and I suspected that the race detector was reporting wrong line numbers. The race reported was boiled down like this:

type Writer struct {
    prevTail []byte
    pushedErr chan error
}

func (z *Writer) A() {
    go func() {
        for {
            // ...
            err := SomeCall()
            if err != nil {
                z.pushError(err)   //  <<<--- READ
                return
            }
        }
    }()
}

func (z *Writer) B() {
    // ...
    z.prevTail = c   // <<<--- WRITE
}

func (z Writer) pushError(err error) {
    z.pushedErr <- err
    close(z.pushedErr)
}

It calls A() and afterwards B(), and the goroutine in A is running while B is called.

It seemed strange. The weird thing was, that I wasn't reading anything related to the write on the line. The err variable was local for the goroutine. In case the function I called was inlined, I also checked the pushError function, but it only did "safe" channel operations.

The z.prevTail is not being accessed outside the calling goroutine, so this all seemed very strange to me.

However, the most experienced of you are probably already screaming the solution: We are calling the pushError() receiver function with Writer as a value instead of a pointer. When we call a receiver with a value we must remember that it implies a copy, which in term implies a read of all member values.

So the race detector was (of course) correct, however it was quite tricky to see the actual race, since it was hidden in a receiver copy. Unfortunately it is another blow for value receivers. I try to be 'nice' and do value receivers when i can, but sometimes it is tempting to make all of them pointer receivers.

Flattr this!

  • Anders Brander

    You’re implying that value receivers is to be preferred over pointer receivers. Why is that?

    • Klaus Post

      Value receivers have the advantage that in almost all other cases, you don’t risk side-effects, when dealing with basic types in a receiver, since you operate on a copy.

      Also, value receivers could potentially help some processor optimizations. I sort of see them as as “const” definitions in C.

      But I may just be grasping at straws to try to find a reason for why they are in the language.

      • Often it’s recommended to keep all receivers of the same type though, so if one is a pointer, make them all a pointer.