✌🏽 😵 👩🏻‍🤝‍👨🏽 GO Scheduler: Now Not Cooperative? ⚗️ 🌬️ ❌

If you read the release notes for version GO 1.14, you probably noticed some pretty interesting changes in the runtime of the language. So I was very interested in the item: "Goroutines are now asynchronously preemptible." It turns out that GO scheduler (scheduler) is now not cooperative? Well, after reading the corresponding proposal diagonally, curiosity was satisfied.

However, after a while I decided to research the innovations in more detail. I would like to share the results of these studies.

System requirements

The things described below require from the reader, in addition to knowledge of the GO language, additional knowledge, namely:

understanding of the principles of the scheduler (although I will try to explain below, “on the fingers”)
understanding how the garbage collector works
understanding what GO assembler is

In the end, I will leave a couple of links that, in my opinion, cover these topics well.

Briefly about the planner

First, let me remind you what cooperative and non-cooperative multitasking is.

With non-cooperative (crowding out) multitasking, we are all familiar with the example of the OS scheduler. This scheduler works in the background, unloads threads based on various heuristics, and instead of unloaded CPU time, other threads begin to receive.

The cooperative planner is characterized by a different behavior - he sleeps until one of the goroutines clearly wakes him up with a hint of readiness to give his place to another. The planner will then decide for himself whether it is necessary to remove the current goroutine from the context, and if so, whom to put in its place. That’s how the GO scheduler worked.

In addition, we consider the cornerstones that the scheduler operates with:

P - logical processors (we can change their number with the runtime.GOMAXPROCS function), on each logical processor one goroutine can be executed independently at one time.
M - OS threads. Each P runs on a thread from M. Note that P is not always equal to M, for example, a thread can be blocked by syscall and then another thread will be allocated for its P. And there is CGO and other and other nuances.
G - gorutins. Well here it is clear, G must be executed on each P and scheduler monitors this.

And the last thing you need to know, and when does the scheduler actually call goroutine? It's simple, usually instructions for calling are inserted by the compiler at the beginning of the body (prologue) of the function (a little later we will talk about this in more detail).

And what is the problem actually?

From the beginning of the article, you already understood that the principle of the scheduler’s work has changed in GO, let’s look at the reasons why these changes were made. Take a look at the code:

under the spoiler

func main() {
	runtime.GOMAXPROCS(1)
	go func() {
		var u int
		for {
			u -= 2
			if u == 1 {
				break
			}
		}
	}()
	<-time.After(time.Millisecond * 5) //    main   ,         

	fmt.Println("go 1.13 has never been here")
}

If you compile it with version GO <1.14, then the line “go 1.13 has never been here” you will not see on the screen. This happens because, as soon as the scheduler gives processor time to the goroutine with an infinite loop, it completely captures P, no function calls occur inside this goroutine, which means we will not wake the scheduler anymore. And only an explicit call to runtime.Gosched () will let our program terminate.

This is just one example where goroutine captures P and for a long time prevents other goroutines from executing on this P. More options when this behavior causes problems can be found by reading the proposal.

Parse proposal

The solution to this problem is quite simple. Let's do the same as in the OS scheduler! Just let GO run out the goroutine from P and put another one there, and for this we will use the OS tools.

OK, how to implement this? We will allow the runtime to send a signal to the flow on which goroutine works. We will register the processor of this signal on each stream from M, the task of the processor is to determine whether the current goroutine can be supplanted. If so, we will save its current state (registers and the state of the stack) and give resources to another one, otherwise we will continue to execute the current goroutine. It is worth noting that the concept with a signal is a solution for UNIX-base systems, while, for example, the implementation for Windows is slightly different. By the way, SIGURG was selected as a signal for sending.

The most difficult part of this implementation is determining whether goroutine can be forced out. The fact is that some places in our code should be atomic, from the point of view of the garbage collector. We call such places unsafe-points. If we squeeze goroutine at the moment of code execution from unsafe-point, and then GC starts up, then it will replace the state of our goroutine, taken in unsafe-point'e, and can do things. Let's take a closer look at the safe / unsafe concept.

Did you go there, GC?

In versions prior to 1.12, runtime Gosched used safe-points in places where you can definitely call the scheduler without fear that we would end up in the atomic section of the code for GC. As we have already said, safe-points data are located in the prologue of a function (but not of every function, mind you). If you disassembled the go-shn assembler, you could object - no obvious scheduler calls are visible there. Yes it is, but you can find the runtime.morestack call instruction there, and if you look inside this function, a scheduler call will be found. Under the spoiler, I will hide the comment from the GO sources, or you can find the assembler for morestack yourself.

found in source

Synchronous safe-points are implemented by overloading the stack bound check in function prologues. To preempt a goroutine at the next synchronous safe-point, the runtime poisons the goroutine's stack bound to a value that will cause the next stack bound check to fail and enter the stack growth implementation, which will detect that it was actually a preemption and redirect to preemption handling.

Obviously, when switching to a crowding-out concept, a crowding-out signal can catch our gorutin anywhere. But the GO authors decided not to leave safe-points, but declare safe-points everywhere! Well, of course, there is a catch, almost averywhere in fact. As mentioned above, there are some unsafe-points where we will not force out anyone. Let's write a simple unsafe-point.


j := &someStruct{}
p := unsafe.Pointer(j)
// unsafe-point start
u := uintptr(p)
//do some stuff here
p = unsafe.Pointer(u)
// unsafe-point end

To understand what the problem is, let's try on the skin of a garbage collector. Each time we go to work, we need to find out the root nodes (pointers on the stack and in the registers), with which we will begin marking. Since it is impossible to say in runtime whether 64 bytes in memory are a pointer or just a number, we turn to the stack and register cards (some cache with meta information), kindly provided to us by the GO compiler. The information in these maps allows us to find pointers. So, we were woken up and sent to work when GO performed line number 4. Arriving at the place and looking at the cards, we found that it was empty (and this is true, because uintptr from the point of view of GC is a number and not a pointer). Well, yesterday we heard about the allocation of memory for j, since now we can’t get to this memory - we need to clean it up, and after removing the memory we go to sleep.What's next? Well, the authorities woke up, at night, shouting, well, you yourself understood.

That's all with theory, I propose to consider in practice how all these signals, unsafe points and register cards and stacks work.

Let's move on to practice

I double-ran (go 1.14 and go 1.13) an example from the beginning of the article by the perf profiler in order to see what system calls are happening and compare them. The required syscall in the 14th version was found quite quickly:

15.652 ( 0.003 ms): main/29614 tgkill(tgid: 29613 (main), pid: 29613 (main), sig: URG                ) = 0

Well, obviously the runtime sent SIGURG to the thread on which goroutine is spinning. Taking this knowledge as a starting point, I went to look at the commits in GO to find where and for what reason this signal is sent, and also to find the place where the signal handler is installed. Let's start with sending, we will find the signal sending function in runtime / os_linux.go


func signalM(mp *m, sig int) {
	tgkill(getpid(), int(mp.procid), sig)
}

Now we find places in the runtime code, from where we send the signal:

when goroutine is suspend, if it is in a running state. The suspend request comes from the garbage collector. Here, perhaps, I will not add code, but it can be found in the file runtime / preempt.go (suspendG)
if the scheduler decides that goroutine is working too long, runtime / proc.go (retake)
```
if pd.schedwhen+forcePreemptNS <= now {
	signalM(_p_)
}
```
forcePreemptNS - constant equal to 10ms, pd.schedwhen - time when the scheduler for the pd stream was called last time
as well as all streams this signal is sent during a panic, StopTheWorld (GC) and a few more cases (which I have to bypass, because the size of the article will already go beyond the bounds)

We figured out how and when the runtime sends a signal to M. Now let's find the handler for this signal and see what the stream does when it is received.


func doSigPreempt(gp *g, ctxt *sigctxt) {
	if wantAsyncPreempt(gp) && isAsyncSafePoint(gp, ctxt.sigpc(), ctxt.sigsp(), ctxt.siglr()) {
		// Inject a call to asyncPreempt.
		ctxt.pushCall(funcPC(asyncPreempt))
	}
}

From this function it is clear that in order to “lock in” you need to go through 2 checks:

wantAsyncPreempt - we check whether G wants to be forced out, here, for example, the validity of the current goroutine status will be checked.
isAsyncSafePoint - check if it can be crowded out right now. The most interesting of the checks here is whether G is in a safe or unsafe point. In addition, we must be sure that the thread on which G is running is also ready to preempt G.

If both checks are passed, instructions will be called from the executable code that save state G and call the scheduler.

And more about unsafe

I propose to analyze a new example, it will illustrate another case with unsafe-point:

another endless program


//go:nosplit
func infiniteLoop() {
	var u int
	for {
		u -= 2
		if u == 1 {
			break
		}
	}
}

func main() {
	runtime.GOMAXPROCS(1)
	go infiniteLoop()
	<-time.After(time.Millisecond * 5)

	fmt.Println("go 1.13 and 1.14 has never been here")
}

As you might guess, the inscription “go 1.13 and 1.14 never been here” we will not see in GO 1.14. This is because we have explicitly forbidden to interrupt the infiniteLoop (go: nosplit) function. Such a ban is implemented just using unsafe-point, which is the entire body of the function. Let's see what the compiler generated for the infiniteLoop function.

Caution Assembler


        0x0000 00000 (main.go:10)   TEXT    "".infiniteLoop(SB), NOSPLIT|ABIInternal, $0-0
        0x0000 00000 (main.go:10)   PCDATA  $0, $-2
        0x0000 00000 (main.go:10)   PCDATA  $1, $-2
        0x0000 00000 (main.go:10)   FUNCDATA        $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (main.go:10)   FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (main.go:10)   FUNCDATA        $2, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (main.go:10)   XORL    AX, AX
        0x0002 00002 (main.go:12)   JMP     8
        0x0004 00004 (main.go:13)   ADDQ    $-2, AX
        0x0008 00008 (main.go:14)   CMPQ    AX, $3
        0x000c 00012 (main.go:14)   JNE     4
        0x000e 00014 (main.go:15)   PCDATA  $0, $-1
        0x000e 00014 (main.go:15)   PCDATA  $1, $-1
        0x000e 00014 (main.go:15)   RET

In our case, the PCDATA instruction is of interest. When the linker sees this instruction, it does not convert it to a "real" assembler. Instead, the value of the 2nd argument with the key equal to the corresponding programm counter (the number that can be observed to the left of the function name + line) will be placed in the register or stack map (determined by the 1st argument).

As we see on lines 10 and 15, we put the values $ 2 and -1 in the map $ 0 and $ 1, respectively. Let's remember this moment and take a look inside the isAsyncSafePoint function, to which I have already drawn your attention. There we will see the following lines:

isAsyncSafePoint


	smi := pcdatavalue(f, _PCDATA_RegMapIndex, pc, nil)
	if smi == -2 {
		return false
	}

It is in this place that we check whether goroutine is currently in the safe-point. We turn to the map of registers (_PCDATA_RegMapIndex = 0), and passing it the current pc we check the value, if -2 then G is not in a safe-point'e, which means it cannot be crowded out.

Conclusion

I stopped my “research” on this, I hope the article was useful for you too.
I post the promised links, but please be careful, because some of the information in these articles could be outdated.

GO scheduler - once and twice .

Assembler GO.

GO Scheduler: Now Not Cooperative?