Writing a TCP/IP Service

From Jonathan Gardner's Tech Wiki

Introduction

So you want to write a TCP/IP server? It's really not that hard, once you understand sockets.

Overview

Your project is going to have to do the following:

  1. Listen on one or more ports for incoming connections.
  2. On each connection, start a handler (a process, a thread, or a callback) to communicate with the client.
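Here is a minimal sketch of those two steps using Python's standard socket module. The names make_listener, serve, and handle are made up for illustration, and the upper-casing echo handler is just a stand-in for a real protocol. Note that this version handles one client at a time, which is exactly the limitation the rest of this page is about.

```python
import socket

def make_listener(host="127.0.0.1", port=0):
    """Step 1: bind and listen. Port 0 asks the OS for any free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    return listener

def handle(conn):
    """Hypothetical handler: echo one chunk back in upper case, then hang up."""
    with conn:
        data = conn.recv(1024)
        if data:
            conn.sendall(data.upper())

def serve(listener, max_connections=None):
    """Step 2: accept connections and hand each one to the handler.

    max_connections is an illustrative knob so the loop can end;
    a real server would normally loop forever.
    """
    served = 0
    while max_connections is None or served < max_connections:
        conn, _addr = listener.accept()  # blocks until a client connects
        handle(conn)                     # nobody else is served meanwhile
        served += 1
```

Calling serve(make_listener(port=8000)) would run it on port 8000; the port number is only an example.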

That's the core of it. Most of the remaining work comes from multi-tasking: the server has to keep listening for new connections while it talks to the clients it already has. That's not terribly difficult once you wrap your head around a few basic concepts.

Two Approaches: Threaded and Asynchronous

There are two approaches to the problem of multi-tasking on the server. Each has its own benefits and drawbacks, and each requires a particular coding style.

Threaded or Multi-Process Approach

The first is the threaded or multi-process model. In this model, each connection is given its own thread or process. The listening socket has its own thread or process as well, usually called the parent process or thread.
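As a rough sketch of this model, Python's standard socketserver module can give each connection its own thread while a parent thread sits in the accept loop. EchoHandler and start_threaded_server are hypothetical names, and the line-based upper-casing echo is only a placeholder protocol.

```python
import socketserver
import threading

class EchoHandler(socketserver.StreamRequestHandler):
    """Runs in its own thread, once per connection."""

    def handle(self):
        # Blocking reads are fine here: only this connection's thread waits.
        for line in self.rfile:
            self.wfile.write(line.upper())

def start_threaded_server(host="127.0.0.1", port=0):
    """Start the listening ("parent") thread and return the server object."""
    server = socketserver.ThreadingTCPServer((host, port), EchoHandler)
    server.daemon_threads = True  # don't let handler threads block exit
    parent = threading.Thread(target=server.serve_forever, daemon=True)
    parent.start()
    return server
```

The handler code reads like ordinary sequential code, which is the main attraction of this approach.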

Although this seems easier to program, you'll run into a number of roadblocks. First, sharing state among threads is not trivial. For something like HTTP, where each connection is mostly independent of every other connection, threading is a great way to go about the problem, and any state the threads do share is handled with some kind of database. But for servers like MUDs, this is a serious challenge, because every connection shares a lot of state with every other connection. Chat servers have the same problem.
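To give a feel for what "not trivial" means, here is a small sketch of shared state in a threaded server: a roster of who is online, guarded by a lock. The Roster class is invented for this example. Every piece of shared state in a MUD or chat server needs this kind of care.

```python
import threading

class Roster:
    """Hypothetical shared state: the set of names currently online."""

    def __init__(self):
        self._lock = threading.Lock()
        self._names = set()

    def join(self, name):
        # Hold the lock so the add and the count are seen as one step.
        with self._lock:
            self._names.add(name)
            return len(self._names)

    def leave(self, name):
        with self._lock:
            self._names.discard(name)

    def snapshot(self):
        # Return a copy so callers never iterate the live set.
        with self._lock:
            return sorted(self._names)
```

Each connection's handler thread would call join when its client logs in and leave when it disconnects.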

Second, a thread per connection won't solve the problem with asynchronous protocols such as IMAP, where a client can have several commands in flight on the same connection. For these, you have to choose whether to spawn a thread or a process for each command, or whether to use an asynchronous model.

Asynchronous Model

The asynchronous model is much more difficult to get right, but has some tremendous advantages.

First, by avoiding a thread or process per connection, and the context switches between them, the server can be many times more efficient in both time and memory than a threaded server. That is, the same amount of memory and processor resources can handle perhaps 10 times, or maybe 100 times, as many simultaneous connections.

Second, each connection can share a tremendous amount of state without resorting to a database or shared memory.
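A sketch of what that sharing looks like, using Python's asyncio (one possible asynchronous framework, picked here only for illustration). The clients set and handle_client function are hypothetical names. Because everything runs in one thread, the set is an ordinary variable and needs no lock.

```python
import asyncio

clients = set()  # every connected writer; plain in-process state

async def handle_client(reader, writer):
    """Relay each line from one client to all the others."""
    clients.add(writer)
    try:
        while line := await reader.readline():
            # No await inside this loop, so no other handler can
            # change `clients` while we iterate over it.
            for other in clients:
                if other is not writer:
                    other.write(line)
    finally:
        clients.discard(writer)
        writer.close()

async def main(host="127.0.0.1", port=8000):
    # The port is only an example value.
    server = await asyncio.start_server(handle_client, host, port)
    async with server:
        await server.serve_forever()
```

Running asyncio.run(main()) starts the relay. A real chat server would also want to drain slow clients rather than buffer without limit.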

The drawback is that the programming style is quite different from ordinary sequential code. You can never block on a disk or network operation. Any time an operation might block, you have to remember your state for that operation and continue it when the resource is ready to accept the operation without blocking.
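To make "remember your state" concrete, here is a sketch of a single-threaded echo server built on Python's standard selectors module. The function name run_event_loop and the stop_after parameter are invented for this example. Each connection's remembered state is a bytearray of data that still has to be written back; the loop only reads or writes when the selector says the socket is ready.

```python
import selectors
import socket

def run_event_loop(listener, stop_after=None):
    """Serve every client from one thread, driven by readiness events.

    stop_after is an illustrative knob: return once that many
    connections have closed. None means run forever.
    """
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data=None)
    closed = 0
    while stop_after is None or closed < stop_after:
        for key, events in sel.select(timeout=1.0):
            sock = key.fileobj
            if key.data is None:
                # The listening socket is readable: a client is waiting.
                conn, _addr = sock.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data=bytearray())
                continue
            pending = key.data  # remembered state: bytes still to send
            if events & selectors.EVENT_READ:
                chunk = sock.recv(4096)
                if not chunk:
                    # Client hung up; forget its state.
                    sel.unregister(sock)
                    sock.close()
                    closed += 1
                    continue
                pending += chunk.upper()
            if events & selectors.EVENT_WRITE and pending:
                sent = sock.send(pending)  # may send only part of it
                del pending[:sent]
            # Ask for write events only while something is left to send.
            wanted = selectors.EVENT_READ
            if pending:
                wanted |= selectors.EVENT_WRITE
            sel.modify(sock, wanted, data=pending)
    sel.close()
```

Compare this with a threaded handler: the read-then-write sequence that would be two lines of blocking code is split across loop iterations, with the bytearray carrying the state in between. Error handling (resets, partial protocol messages) is left out to keep the sketch short.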

See Using asyncore to write a mud.