A simple way to limit the number of simultaneous clients of a Go net/http server
This is a simple and easily generalizable way to put an upper bound on the maximum number of simultaneous clients to a Go net/http server or handler.
The idea is to use a counting semaphore, modeled with a buffered channel, to make new clients queue once n clients are already being served, where n is the size of the channel's buffer.
Ideally, we wouldn't want to limit the concurrency of our application, but in practice the underlying resources are finite, and forcing clients to queue beyond a certain limit gives us control over how those resources are utilized.
Let's say we have a simple HTTP handler that requests access to some expensive resource, like a database or complex computation:
package main

import (
    "io"
    "log"
    "net/http"
)

func main() {
    http.Handle("/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        res := getExpensiveResource()
        io.WriteString(w, res.String())
    }))
    log.Fatal(http.ListenAndServe(":8080", nil))
}
The handler can be requested by an unbounded number of clients, potentially exhausting our resources.
Let's add a counting semaphore that will gate entry into the handler:
func main() {
    const maxClients = 10
    sema := make(chan struct{}, maxClients)
    http.Handle("/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        sema <- struct{}{}
        defer func() { <-sema }()
        res := getExpensiveResource()
        io.WriteString(w, res.String())
    }))
    log.Fatal(http.ListenAndServe(":8080", nil))
}
We make a channel of type struct{} because we are only interested in the send/receive semantics of the channel, not its values. The first statement of the handler is a send on the channel, which will succeed for up to maxClients simultaneous requests. Think of the buffered channel as having empty slots: being able to send on it means that you can fill a slot and proceed. If there are no empty slots, in other words if the length of the channel is equal to its buffer size, then the send will block until a slot frees up. The next statement defers until after the handler has returned or panicked, and frees a slot by receiving from the channel.
If we have more than one handler to limit access to, we can move the semaphore into a middleware and wrap the original handler, leaving the body of it unchanged:
package main

import (
    "io"
    "log"
    "net/http"
)

func maxClients(h http.Handler, n int) http.Handler {
    sema := make(chan struct{}, n)
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        sema <- struct{}{}
        defer func() { <-sema }()
        h.ServeHTTP(w, r)
    })
}

func main() {
    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        res := getExpensiveResource()
        io.WriteString(w, res.String())
    })
    http.Handle("/", maxClients(handler, 10))
    log.Fatal(http.ListenAndServe(":8080", nil))
}
Note that this implementation will cause clients beyond the maximum number to queue without bound, until they hit the system limit of the listen(2) backlog.
This pattern can be used to control the amount of concurrency to any resource, not just net/http handlers.