Creating a Multi-Language, Serverless Chat Program - Part 1: Protocol Design

Written By: Nathan Baker

- 11 Aug 2006 -

Description: To explore how different languages approach GUI programming, network sockets, and object-oriented concepts, we will design a serverless chat protocol and then write clients for it in many different languages. This is the first installment, in which I describe the protocol and the project.

  1. Introduction
  2. Protocol Overview
  3. The Distributed Protocol
  4. The Network Format
  5. Outro

Protocol Overview

Since this is the first installment, I need to design the protocol and describe what I'm going to help you create. This will take some extra time, but it's very important to have the requirements firmly in mind before beginning a project. A good design can mean the difference between a successful project and a failed one. When you're designing something for network serialization, it's especially important to spec out exactly what needs to be sent over the wire and in what format. But enough of me talking about it; let's actually do it.

First off, serverless is a bit of a misnomer. Instead of there being just one server, each client has its own server built in. Technically, this is a distributed chat program, but that sounds a little formidable, so I chose the term serverless. The server communicates with its own client, exchanging data. The server also communicates with remote servers, exchanging data. Clients do not talk to remote servers, and remote servers do not talk to clients. Also, to get some terminology out of the way, each individual program instance (a client together with its built-in server) is called a node. So, to help you visualize, I designed a fancy-schmancy picture for you.

[Figure: Distributed (Serverless) Chat Diagram]

As you can see, there are two types of data being exchanged: users and messages. Although both of these are represented on the network as streams of bytes, we will differentiate between the two for architectural purposes. This brings me to one of my laws for designing an architecture: never go lower-level than you have to.

Also, you will see that I represented the network as an uninteresting ellipse labeled "Network", rather than replicating my client/server diagram. This is because, with a network application, it doesn't really matter what you are talking to, so long as it speaks your language. Someone might later decide to write a server for this chat protocol that is decoupled from the client. The distributed client and the standalone server would still be able to talk with one another. All the server sees is a bunch of network connections.

So now we have our architecture mostly set up, at least at a high level. Now let's talk about messages and users.

Users

A user is, well, someone who is using our chat program. There is generally one user per node, and that's the way our program is going to work too. A user is represented by a server instance. In other words, if my client wants to communicate with your client, it has to use your server to do so. Users also have names, and we'll also give users a status. Thus, a user is three things: a name and a status, which mean something to the user, and a server instance, which means something to the client.
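As a rough illustration of the "three things" above, here is a minimal Python sketch. The class names and the idea of identifying a server instance by host and port are my assumptions for the example; the article has not yet said how a server instance is represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerRef:
    """Hypothetical handle for a server instance (assumed here: host and port)."""
    host: str
    port: int


@dataclass
class User:
    name: str          # means something to the human user
    status: str        # means something to the human user
    server: ServerRef  # means something to the client: where messages for this user go


# Example values are made up for illustration.
me = User(name="Nathan", status="online", server=ServerRef("10.0.0.1", 4000))
```

The point of keeping `server` as a separate field is that it can later be swapped for a different server instance without touching the name or status.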

However, security is the big buzzword, and nobody wants more personal information given away than can be helped. Since we're a distributed protocol, we can get fancy. Let's say that I'm running a client. You connect to me. Now we can chat. You think this is way cool, so you invite your buddy, cleverly named Buddy, to connect as well. Buddy connects to you. Your server then informs my server that a new person, Buddy, has entered the chat. However, your server tells me Buddy's name and status, but for its server instance, it gives me itself. Therefore, I see two people chatting (besides myself), both on your server. Now, when I send a message, my server is smart enough to only send it one time, to your server. Your server then sends the message along to Buddy's server, as well as sending it to your client. Furthermore, if I were to send a message only to Buddy, it would go to your server, and your server would send it to Buddy.

This is a security tradeoff. On one hand, my private message is going through your server, so maybe you can snoop on it, or stop it entirely. However, I can't see anything about Buddy, so if he doesn't trust me, that's OK: you act as a proxy for him.
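To make the proxying idea concrete, here is a small in-process simulation in Python. It is only a sketch under stated assumptions: there are no real sockets, the `Node` class and its method names are invented for the example, and the nodes are assumed to form a chain or tree with no loops.

```python
class Node:
    """One program instance: a local user plus its built-in server (hypothetical sketch)."""

    def __init__(self, user_name):
        self.user_name = user_name
        self.neighbors = []  # servers we hold a direct connection to
        self.routes = {}     # user name -> neighbor server that represents that user to us
        self.inbox = []      # (sender, text) pairs handed to the local client

    def connect(self, other):
        """Link two nodes and exchange the users each side already knows about."""
        mine = [self.user_name, *self.routes]
        theirs = [other.user_name, *other.routes]
        self.neighbors.append(other)
        other.neighbors.append(self)
        for name in theirs:
            self.learn(name, via=other)
        for name in mine:
            other.learn(name, via=self)

    def learn(self, name, via):
        """Record that `name` lives on `via`, then announce it onward as living on us."""
        if name == self.user_name or name in self.routes:
            return
        self.routes[name] = via
        for n in self.neighbors:
            if n is not via:
                n.learn(name, via=self)  # we give ourselves as the server instance

    def broadcast(self, text, sender=None, came_from=None):
        """Deliver locally (if relayed) and forward once to every other neighbor."""
        sender = sender or self.user_name
        if came_from is not None:
            self.inbox.append((sender, text))
        for n in self.neighbors:
            if n is not came_from:
                n.broadcast(text, sender, came_from=self)

    def send_private(self, to, text, sender=None):
        """Pass a private message hop by hop toward the server that represents `to`."""
        sender = sender or self.user_name
        if to == self.user_name:
            self.inbox.append((sender, text))
        else:
            self.routes[to].send_private(to, text, sender)


me, you, buddy = Node("Me"), Node("You"), Node("Buddy")
you.connect(me)      # you connect to me
buddy.connect(you)   # Buddy connects to you
me.broadcast("hi")
me.send_private("Buddy", "psst")
```

After this runs, `me.routes` maps both "You" and "Buddy" to your node, so from my side both users appear to live on your server. The private message reaches Buddy's inbox only by passing through your `send_private`, which is exactly the spot where a less honest relay could read or drop it.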

Those of you who know a bit more about networking will realize that this works similarly to a routing algorithm. This means that message propagation is slower, but the server load is shared, and there is no centralized server to go down. There is also no centralized server to be raided, in case the government or an ISP decides to get uppity. However (as in the case of a netsplit on IRC), if one node goes down, all of a sudden there is a break in the chain, and the users on either side of it can no longer reach each other. There are some different, fancy ways to prevent this, but we'll ignore that issue for the sake of simplicity. This is already complicated enough.

Messages

All communication is done via messages that are passed from server to server. Some of these messages will be passed on to the client (such as chat messages), while others are merely one server telling another about some important action. The latter are called state messages. Some state messages may be passed on to the client (like notification of a user joining), and others may not (for example, when a node shuts down and all its connected nodes need to be connected to someone else).

The basic message will have a type. This type is simply a number from 0 to 255, where each number corresponds to a given message classification. For example, type 0 might be just a bare message, where the type is all that's necessary. Type 1 might contain text, and type 2 might contain a user status update. I say 'might' because we'll actually hammer out what each message means later. This also means that the protocol can be seamlessly extended--older clients that don't understand the new messages will just ignore them, while newer clients can choose to handle the new messages if they want. Extending the protocol this way can lead to some problems--for example, if one group says that message type 5 means "I am sending you a data transfer", and another uses type 5 to mean "here is my public key for a secure exchange", the two will misread each other's messages. If this were a formal specification, it would be necessary to resolve this confusion (most likely by increasing the range of message types and encouraging new types to use high numbers, decreasing the probability of collision). Since this is not a formal specification, we will ignore this detail.
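Here is one way the "ignore what you don't understand" rule could look in Python. Treat it as a sketch: the type numbers follow the 'might' examples above and are not final, the `Message` class, the handler names, and the bytes payload are assumptions of mine, and nothing here describes the actual network format.

```python
# Hypothetical type numbers, taken from the examples in the text.
TYPE_BARE = 0
TYPE_TEXT = 1
TYPE_STATUS = 2


class Message:
    """A message is a type in the range 0..255 plus an optional payload (assumed bytes)."""

    def __init__(self, msg_type, payload=b""):
        if not 0 <= msg_type <= 255:
            raise ValueError("message type must be between 0 and 255")
        self.msg_type = msg_type
        self.payload = payload


def handle_bare(msg):
    return "bare"


def handle_text(msg):
    return "text: " + msg.payload.decode("utf-8")


def handle_status(msg):
    return "status: " + msg.payload.decode("utf-8")


# Table of the types this particular client understands.
HANDLERS = {
    TYPE_BARE: handle_bare,
    TYPE_TEXT: handle_text,
    TYPE_STATUS: handle_status,
}


def dispatch(msg, handlers=HANDLERS):
    """Run the handler for a known type; silently ignore unknown types."""
    handler = handlers.get(msg.msg_type)
    if handler is None:
        return None  # an older client just skips messages it does not understand
    return handler(msg)
```

With this table, `dispatch(Message(TYPE_TEXT, b"hello"))` runs the text handler, while `dispatch(Message(5, b"..."))` returns `None` and nothing else happens. A newer client would extend the protocol simply by adding an entry for type 5 to its own table, which is also where the collision problem described above would show up.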

It should also be mentioned at this point that users, when exchanged between servers, will be contained in messages. This is not necessarily the case for communication between client and server.
