The SO_REUSEPORT socket option
Without subscribers, LWN would simply not exist. Please consider signing up for a subscription and helping to keep LWN publishing.
By Michael Kerrisk, March 13
One of the features merged in the 3.x kernel development cycle was support for the SO_REUSEPORT socket option. The new socket option allows multiple sockets on the same host to bind to the same port, and is intended to improve the performance of multithreaded network server applications running on top of multicore systems.
Multiple servers (processes or threads) can bind to the same port if they each set SO_REUSEPORT on their socket before calling bind(). The requirement that the first server must specify this option prevents port hijacking: the possibility that a rogue application binds to a port already used by an existing server in order to capture some of its incoming connections or datagrams.
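As a minimal sketch of the rule described above, the following Python fragment sets the option on each socket before binding. The helper name `bind_reuseport`, the loopback address, and the use of port 0 to obtain a free port are assumptions made for illustration; the sketch also assumes a platform (such as Linux) where Python exposes `socket.SO_REUSEPORT`.

```python
import socket

def bind_reuseport(port, host="127.0.0.1", socktype=socket.SOCK_STREAM):
    """Create a socket, set SO_REUSEPORT, then bind it (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socktype)
    # The option has to be set before bind() for it to affect the bind.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    return sock

first = bind_reuseport(0)             # port 0: let the kernel pick a free port
port = first.getsockname()[1]
second = bind_reuseport(port)         # allowed: both sockets set the option

# A socket that does not set the option cannot bind to the same port.
plain = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    plain.bind(("127.0.0.1", port))
    plain_bind_succeeded = True
except OSError:
    plain_bind_succeeded = False
finally:
    plain.close()
```

The last part shows the other side of the rule: a socket that skips the option is refused the port, so sharing only happens between sockets that opted in.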
With TCP sockets, SO_REUSEPORT allows multiple listening sockets (normally each in a different thread) to be bound to the same port. Each thread can then accept incoming connections on the port by calling accept().
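The per-thread arrangement might look roughly like the sketch below, in which each worker thread owns its own listening socket on the shared port. The names `make_listener` and `serve`, the worker count, the short accept timeout, and the stop flag are all assumptions added so that the demonstration terminates; a real server would be structured differently.

```python
import socket
import threading

def make_listener(port, host="127.0.0.1"):
    """Listening TCP socket with SO_REUSEPORT set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(16)
    sock.settimeout(0.2)              # lets the worker poll the stop flag
    return sock

def serve(listener, worker_id, handled, stop):
    """Accept connections on this thread's own listening socket."""
    while not stop.is_set():
        try:
            conn, _peer = listener.accept()
        except socket.timeout:
            continue
        with conn:
            conn.sendall(b"worker %d\n" % worker_id)
        handled[worker_id] += 1       # each thread updates only its own slot

listeners = [make_listener(0)]        # port 0: kernel picks a free port
tcp_port = listeners[0].getsockname()[1]
listeners += [make_listener(tcp_port) for _ in range(3)]

stop = threading.Event()
handled = [0] * len(listeners)
threads = [threading.Thread(target=serve, args=(lsn, i, handled, stop))
           for i, lsn in enumerate(listeners)]
for t in threads:
    t.start()

# Twenty short client connections; the kernel chooses a listener for each.
for _ in range(20):
    with socket.create_connection(("127.0.0.1", tcp_port), timeout=5) as c:
        c.recv(64)

stop.set()
for t in threads:
    t.join()
for lsn in listeners:
    lsn.close()
```

After the run, `handled` holds the number of connections each worker served; how they are split across workers depends on the kernel's distribution and is not fixed by this code.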
This presents an alternative to the traditional approaches used by multithreaded servers that accept incoming connections on a single socket.
The first of the traditional approaches is to have a single listener thread that accepts all incoming connections and then passes these off to other threads for processing. The problem with this approach is that the listening thread can become a bottleneck in extreme cases. Given the connection rates at which that happens, it's unsurprising to learn that Tom, the author of the patches, works at Google.
The second of the traditional approaches used by multithreaded servers operating on a single port is to have all of the threads or processes perform an accept() call on a single listening socket in a simple event loop. With this technique, connections can be distributed unevenly across the threads: at Google, they have seen a factor-of-three difference between the thread accepting the most connections and the thread accepting the fewest connections; that sort of imbalance can lead to underutilization of CPU cores.
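For comparison, the shared-socket event loop described above can be sketched as follows: every worker thread calls accept() on the same listening socket, and no SO_REUSEPORT is involved. The names `accept_loop`, `accepted`, and `stop_workers`, along with the timeout used to end the demonstration, are illustrative assumptions.

```python
import socket
import threading

# One listening socket shared by every worker thread.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))       # port 0: kernel picks a free port
listener.listen(16)
listener.settimeout(0.2)              # lets workers poll the stop flag
shared_port = listener.getsockname()[1]

stop_workers = threading.Event()
accepted = [0, 0, 0, 0]               # connections accepted per worker

def accept_loop(worker_id):
    """The simple loop: accept a connection, process it, repeat."""
    while not stop_workers.is_set():
        try:
            conn, _peer = listener.accept()
        except socket.timeout:
            continue
        with conn:
            conn.sendall(b"ok")       # stand-in for real request processing
        accepted[worker_id] += 1

workers = [threading.Thread(target=accept_loop, args=(i,))
           for i in range(len(accepted))]
for w in workers:
    w.start()

for _ in range(20):
    with socket.create_connection(("127.0.0.1", shared_port), timeout=5) as c:
        c.recv(16)

stop_workers.set()
for w in workers:
    w.join()
listener.close()
```

Inspecting `accepted` afterwards shows how the connections happened to be split among the threads competing for the one socket; a small local run like this is not expected to reproduce the imbalance seen under heavy load.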
With UDP sockets, the traditional approach is that all threads would compete to perform recv() calls on a single shared socket. As with the second of the traditional TCP scenarios described above, this can lead to unbalanced loads across the threads. With SO_REUSEPORT, each thread can instead call recv() on its own socket bound to the port.
There are two other noteworthy points about Tom's patches. The first of these is a useful aspect of the implementation. Incoming connections and datagrams are distributed to the server sockets using a hash based on the 4-tuple of the connection, that is, the peer IP address and port plus the local IP address and port.
This means, for example, that if a client uses the same socket to send a series of datagrams to the server port, then those datagrams will all be directed to the same receiving server as long as it continues to exist.
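The following sketch illustrates that behavior with UDP on the loopback interface: four sockets share one port, a single client socket sends several datagrams, and the script records which server socket received each one. The helper name `udp_reuseport`, the socket and datagram counts, and the select() timeout are assumptions for illustration, and the result assumes a Linux kernel that distributes datagrams by 4-tuple hash as the article describes.

```python
import select
import socket

def udp_reuseport(port, host="127.0.0.1"):
    """UDP socket with SO_REUSEPORT set before bind() (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    return sock

servers = [udp_reuseport(0)]          # port 0: kernel picks a free port
udp_port = servers[0].getsockname()[1]
servers += [udp_reuseport(udp_port) for _ in range(3)]

# One client socket means one source address and port, so every datagram
# carries the same 4-tuple.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for i in range(8):
    client.sendto(b"msg %d" % i, ("127.0.0.1", udp_port))

# Drain all server sockets and note which one received each datagram.
received = {}                         # server index -> list of payloads
while True:
    ready, _, _ = select.select(servers, [], [], 0.5)
    if not ready:
        break
    for s in ready:
        data, _peer = s.recvfrom(64)
        received.setdefault(servers.index(s), []).append(data)

client.close()
for s in servers:
    s.close()
```

If the distribution works as described, `received` ends up with a single key, since all eight datagrams hash to the same server socket; which socket that is depends on the client's source port.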
This eases the task of conducting stateful conversations between the client and server.
The second point is a defect in the current TCP implementation. If the number of listening sockets bound to a port changes because new servers are started or existing servers terminate, it is possible that incoming connections can be dropped during the three-way handshake.
The problem is that connection requests are tied to a specific listening socket when the initial SYN packet is received during the handshake. If the set of listening sockets changes before the handshake completes, the final ACK may be routed to a different listening socket from the one holding the request. In this case, the client connection will be reset, and the server is left with an orphaned request structure. A solution to the problem is still being worked on, and may consist of implementing a connection request table that can be shared among multiple listening sockets.
SO_REUSEPORT seems to offer a useful alternative for squeezing the maximum performance out of network applications running on multicore systems, and thus is likely to be a welcome addition for some application developers.
Posted Mar 14: At the moment you need to have a Unix socket between the servers, send over the TCP socket file descriptor, start calling accept() in the new server, and then shut down the old one.
Posted Jun 8: So, no, this doesn't support seamless server restarts. Ironically, it's the BSD semantics that support seamless server restarts. That allows the old server to drain its queue and retire without worrying about any dropped connections.
Posted Mar 15: That probably explains it. If you used this technique on multiple threads accepting on the same traditional socket, you would be fixing one thing and breaking another. Today, if a thread is blocked in accept() and no other thread is, and a connection request arrives, the thread gets it.
Posted Feb 8: In other news, what Linux comes up with is setting standards, because there have not been any standards before.
Posted Mar 15: Hey, Linux is THE trendsetter here. Linux hackers and organizations behind its development are certainly not going to wait for some standards body like POSIX, IETF, or whoever to get off their keisters to address this issue!
Posted Aug 24: But generally, Unix "standards" have always trailed implementations.
Posted Mar 28: I am just wondering: does that really matter? I mean, all that matters is that whoever has CPU cycles should pick up the next workload.
Now I am not sure why it is actually relevant which thread is picking up the load. The problem of the CPU being underutilized seems to solve itself. I mean, if a core is saturated, then effectively another thread will pick up the next connection, and this should work well with what the CPU scheduler is already doing for balancing.
Posted Sep 19: Is this socket option valid and usable for SCTP also?
Posted Jul 26: That's because each socket has its own queue, and when the three-way handshake completes, a connection is assigned to exactly one of those queues.
That creates a race condition between accept(2) and close(2).
For example, if I have spawned a few stateless servers and a connection request arrives, would it simply sit in the queue until the original server is ready?
The most typical architecture for stateful servers is to spawn three of them on different physical machines for reliability purposes and have a load balancer perform sticky sessions.
Another option is to spawn stateless servers backed by a distributed cache. The ability to route connections and even datagrams to the same tuple is really an "application" level concern. What if the original server for the tuple is not running any more?
Credit where it's due.
Imagine a thread pinned to each core. For our case, which is an HTTP server with multiple threads, each one with its own epoll(7) queue, it's critical to decide just after the accept(2) which thread will work on that new connection to keep a balanced load.
I read this to mean that single-threaded processes that have established connections will not receive new connections because they are not waiting in accept(). After testing and reading some code, it's clear that processes that aren't in accept() continue to receive new connections.