In the following sections, we will try to build up a very simple clustered architecture to be used with PigeonAir.
If you take a look at the documentation of the various modules, you will realize it is easy to build very different solutions.
Ok, in the previous sections we talked a lot about the various components of a mail cluster. Let's say we want to build a PigeonAir cluster, we have decided to use LDAP as the database backend, and we want a clear picture of the whole structure.
From a logical point of view, building a PigeonAir cluster mainly means deciding how many components of each type to use, which machine to put them on, which components to leave out, and which components should share the same hardware resources.
As you should know by now, PigeonDeliver uses modules to deliver emails. You can also distribute those modules among different machines, but that is not usually necessary and won't be discussed in this document.
So, let's start by drawing a diagram of all the components we may use.

All components we can use, in practice
In more descriptive words, we need to complete a puzzle, and the pieces we have are:
one or more balancers, to split the load.
one or more LDAP servers, to safely keep control data.
one or more SMTP/IMAP/POP3 proxies, to redirect connections to the correct servers.
one or more shared storages, to keep users' data and to split the load.
one or more web servers for the Web Mail.
one or more web servers for the Administrative Interface (Web Admin).
the real SMTP/IMAP/POP3 servers.
Ok, now a very simple example. We were just asked by a brand new ISP to build his mail server. The ISP himself doesn't care whether he gets a cluster or a single machine; he just wants something able to scale well if things go well, and he hopes things will go very well. So, we are wondering how to configure his server...
At first glance, we could just configure a normal mail server and be happy with that. However, the provider told us that he hopes things will go really well in a very short time, so he is probably planning to grow. Additionally, we really like the interfaces the folks of the PigeonAir project have developed, and we'd like to use them.
Ok, let's try to write down our hardware shopping list to build our mail server:
Uhm... balancer... do we need this? No: since we have a single server, we don't need it. And we can always change a bunch of DNS records or put a balancer in the middle later, if needed.
LDAP. We do need an LDAP server. So, 1 LDAP server.
IMAP/POP3 Proxy. We don't need it right now, but we may need it later.
Shared storage... what's that? We don't really need it... in case of growth, we'll use the provided proxy.
Well, ok, now we still need a web server for the Web Mail, one for the Web Admin, and the real SMTP/IMAP/POP3 server. With a single machine, all of them live on the same box.
Well, ok, and what if we wanted to start right away with a two-node cluster? Would we need something like 100 machines to build it? No. Two machines would still suffice: use the DNS as balancer and put all services on all machines. It is probably not the best solution, but splitting up services only pays off when the load gets really high: it simplifies the configuration of every single machine, and lets administrators fine-tune hardware requirements and optimize the load by adding or removing machines specifically configured for their own task.
Ok, so we have to install all the needed software, and finally install PigeonAir and everything else in a pretty straightforward setup. Not much to say: PigeonAir should support something like this almost out of the box, giving us something like:
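The "all services on all machines" layout can be written down as a small placement table. The following is only a sketch: the service names, host names and helper functions are made up for illustration and are not part of PigeonAir.

```python
# Hypothetical service names, loosely following the component list above.
SERVICES = ["ldap", "smtp", "imap_pop3", "webmail", "webadmin"]


def place_all_services(hosts):
    """Put every service on every host (the 'everything everywhere' layout)."""
    return {host: list(SERVICES) for host in hosts}


def hosts_running(placement, service):
    """Return the hosts that run a given service, in a stable order."""
    return sorted(h for h, services in placement.items() if service in services)


# Host names are placeholders.
single = place_all_services(["mail1.example.org"])
pair = place_all_services(["mail1.example.org", "mail2.example.org"])

# With the DNS acting as balancer, each service name would simply
# resolve to every host that runs that service.
print(hosts_running(pair, "smtp"))
```

Splitting services later on would just mean giving each host a shorter list, which is why starting with the simple layout does not close any door.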

Hardware setup of a single machine cluster
So, let's say the provider who uses our server actually grew as expected. We were thus contacted to prepare the service for new users.
Ok, so we finally need to expand our cluster: we basically have to add one node, without interrupting the service.
So, let's first prepare the second machine, connect it to the network, and slightly modify the original setup to make it work.
Before going on, let's rethink our previous service distribution. Since the load increased, we should also start thinking about the reliability of the services and things like that.
Basically, we need the same services as above. Since the load is still quite low, it doesn't make much sense to keep, for example, web services on one machine and let the other handle SMTP/POP3, so let's stick with the same distribution as above.
The only weird thing about this setup would be the LDAP server. We would end up with one LDAP database and two servers. While I believe there would be no problem in terms of CPU load, the database would certainly become a single point of failure (SPOF): a failure of the first server would bring both servers down. Additionally, the first server would access a local database, while the second one would access a remote database over the network.
Running a second LDAP server on the new node may seem like unnecessary overhead, but consider that:
the PigeonDeliver database structure requires few queries but transfers a lot of data
the caching layer in the datatree handling is not complete yet
on average, there will be few changes to the database compared to the number of accesses (a few accesses for every received email/POP3/IMAP connection, compared to a few writes for every administrative operation on the database).
We should also consider that the LDAP database is probably the most important data stored on our servers. Well, I mean, if we lose emails, it's bad... but if we lose the list of our users or their passwords, it is even worse, and we are certainly in big trouble.
So, how do we configure this second node? It is enough to enable the POP3/IMAP Proxy on the first node, replicate the configuration of the first node on the second, change a couple of settings (hostname, IP address, ...), set up the LDAP server to behave as a slave, and finally tell the balancer it can start redirecting connections to this node too.
Nothing more. That way, you can add as many nodes as you want. You just need to set up another machine with the same configuration, make sure the model still makes sense, and configure the balancer to connect to this machine too. The shared control data (the information stored in LDAP) will be enough for the second node to correctly talk with the first one (given the correct authentication key) and for the first one to know when to contact the second.
Translating all of this into command lines is left as an exercise to the reader. (At the time of writing, PigeonDeliver code is accessible only through TLA/CVS, so it doesn't make much sense to describe the practical steps to install and configure something that still uses a developers-only installation procedure -- %TODO%).
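The read-heavy access pattern described above is what makes a local copy of the directory attractive. Here is a minimal sketch of the idea, assuming reads are served by the copy on the node itself and writes go to a single master. The classes are in-memory stand-ins invented for this example; they are not a real LDAP client, and real replication is asynchronous.

```python
class Directory:
    """In-memory stand-in for one LDAP server (not a real LDAP client)."""

    def __init__(self, name):
        self.name = name
        self.entries = {}
        self.up = True


class NodeDirectoryAccess:
    """Route reads to the node's local copy and writes to the master."""

    def __init__(self, local, master):
        self.local = local
        self.master = master

    def lookup(self, user):
        # The common case (every delivery or login): answered locally,
        # with no network round trip and no dependency on the master.
        return self.local.entries.get(user)

    def update(self, user, attributes):
        # The rare case (administrative changes): must reach the master.
        if not self.master.up:
            raise RuntimeError("master directory unavailable")
        self.master.entries[user] = dict(attributes)
        # Replication is modelled as an immediate copy for simplicity.
        self.local.entries[user] = dict(attributes)


master = Directory("node1")
replica = Directory("node2")
node2 = NodeDirectoryAccess(local=replica, master=master)

node2.update("alice", {"home": "/mail/alice"})
master.up = False                 # the first node fails
print(node2.lookup("alice"))      # lookups on node2 still succeed
```

With a single shared database, the last lookup would fail together with the first node; with a local copy, only administrative changes have to wait.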
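The steps for adding a node can be sketched as plain data manipulation. Everything below is an assumption made for illustration: the configuration keys, the host names and the balancer pool are hypothetical and do not reflect PigeonAir's real configuration format.

```python
import copy


def derive_node_config(first_node, hostname, ip_address):
    """Clone the first node's settings and adjust the per-node values."""
    node = copy.deepcopy(first_node)
    node["hostname"] = hostname
    node["ip_address"] = ip_address
    # The new node keeps an LDAP copy fed by the first node.
    node["ldap"] = {"role": "slave", "master": first_node["hostname"]}
    return node


def add_to_balancer(pool, node):
    """Register the node's address with the balancer pool (idempotent)."""
    if node["ip_address"] not in pool:
        pool.append(node["ip_address"])
    return pool


# Placeholder values throughout.
node1 = {
    "hostname": "mail1.example.org",
    "ip_address": "192.0.2.10",
    "ldap": {"role": "master"},
    "imap_pop3_proxy": False,
    "services": ["smtp", "imap_pop3", "webmail", "webadmin"],
}

node1["imap_pop3_proxy"] = True   # enable the proxy on the first node
node2 = derive_node_config(node1, "mail2.example.org", "192.0.2.11")
pool = add_to_balancer(["192.0.2.10"], node2)
print(node2["ldap"], pool)
```

Adding a third node would repeat the last two calls with a new host name and address, which is the sense in which the procedure scales to as many nodes as needed.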
Ok, well... the final setup may look something like:

Hardware setup of a two-node cluster
Note that in this setup it may be a good idea to provide the two nodes with a second network interface and connect them through a private network, so that a higher load can be handled without overloading the public network.
With the setup described above, the load on the internal network will always be lower than the load on the external network, so as long as both networks have the same bandwidth, the internal one will not become a bottleneck. The same cannot be said when shared storage is used: on most setups of a certain size NFS does become a bottleneck, unless mixed solutions are used.
At the time of writing, the PigeonAir project does not provide any load balancer.
However, a load balancer is very easy to build and configure using existing technologies.
As a first choice, you could buy an appliance to handle the load balancing. Appliances are usually very efficient and very reliable, but take a look at your dealer's catalog to know more.
A second choice could be to set up Linux IPVS. IPVS is essentially a kernel module that lets you set up a Linux box as a network balancer, acting at a low level of the IP stack. IPVS is also very reliable, and with some work you can configure the whole cluster to use more than one balancer, for failover and for improved scalability. To know more about setting up Linux IPVS for PigeonAir, take a look at %TODO%.
As a last resort, you could simply use the DNS. The DNS is very easy to configure to balance requests over a cluster of mail servers, but it requires you to add many records in each of the virtual domains handled by PigeonAir, and it offers little control over how the load is split (e.g., if a node fails, it takes a long time for the DNS to stop sending connections to it, and manual intervention is required). However, it is extremely reliable and under normal conditions does a fairly good job. Keep also in mind that I wouldn't suggest using the DNS for any cluster with more than 2 to 4 nodes. Take a look at %TODO% to know how to configure a DNS to balance the load over a PigeonAir cluster.
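To make the DNS option more concrete, here is a toy model of round-robin answers for a single name. It is a simplification written for this document: real DNS servers differ in how they order records, and resolver caching (TTL) is not modelled at all. The class name and the addresses are placeholders.

```python
from collections import deque


class RoundRobinName:
    """Toy model of one DNS name with several address records."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def resolve(self):
        # Return all records, then rotate so the next query sees a
        # different address first. Most clients try the first one.
        answer = list(self._records)
        self._records.rotate(-1)
        return answer

    def remove(self, address):
        # The manual intervention needed when a node dies: until this
        # is called, the dead address keeps being handed out.
        self._records.remove(address)


pop3 = RoundRobinName(["192.0.2.10", "192.0.2.11"])
first = pop3.resolve()    # 192.0.2.10 listed first
second = pop3.resolve()   # 192.0.2.11 listed first
pop3.remove("192.0.2.11")
third = pop3.resolve()    # only the surviving node is left
print(first, second, third)
```

The sketch also shows the weakness mentioned above: nothing in the rotation notices a failed node, so a dead address keeps receiving its share of connections until somebody removes the record.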
Ok, let's see what happens when a remote server needs to send a mail to one of our users:
First of all, the remote server looks up the domain of the recipient (our user) in the DNS, looking for an MX record.
Provided it can find an MX record, the remote server will contact the server with the lowest preference value (that is, the highest priority). If more than one MX record has the same preference, the DNS will take care of shuffling them before returning the answer to the remote server.
If an appliance or an IPVS server is used as balancer, the MX record will point to a host name that resolves to the IP address of the balancer. The remote server will thus contact the balancer, which will take care of redirecting the connection to one of the nodes of the cluster.
The SMTP server of one of the nodes, probably Postfix, will receive the connection. It will look up the user in the database to verify that it is responsible for delivering emails to that user. If it is not, it will either refuse the mail, returning an error, or accept it anyway if configured as a relay. If it is, the mail will be queued for local delivery.
PigeonDeliver will be called to perform the local delivery. For more details about the delivery, please refer to %TODO%.
PigeonDeliver will call the modules involved with the delivery that can be executed directly by the local server (for example, the mailAntivirus or mailForward module). All the modules will be contacted using the server_map and service_map. Those maps allow single modules to reside on different machines.
Once the email gets to the mailStore module, depending on the cluster model being used and on the user's configuration, it will either:
forward the email to the mailStore service of the server where all the other emails of the user are kept
store the email directly on the shared storage
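The decisions taken along the delivery path above can be sketched as three small functions. This is an illustration only: the function names, return values and host names are invented here, and the "store locally" branch is an assumption for the case where the receiving node already is the user's home server.

```python
import random


def order_mx(records, rng=random):
    """Order (preference, host) pairs: lowest preference value first,
    hosts with equal preference shuffled among themselves."""
    by_preference = {}
    for preference, host in records:
        by_preference.setdefault(preference, []).append(host)
    ordered = []
    for preference in sorted(by_preference):
        group = list(by_preference[preference])
        rng.shuffle(group)
        ordered.extend(group)
    return ordered


def smtp_decision(recipient, local_users, relay_enabled=False):
    """What the receiving SMTP server does with a recipient."""
    if recipient in local_users:
        return "queue_local"
    return "relay" if relay_enabled else "reject"


def store_action(home_server, this_server, shared_storage):
    """What the mailStore step does once the mail reaches it."""
    if shared_storage:
        return ("store", "shared")
    if home_server == this_server:
        return ("store", "local")      # assumption, see the text above
    return ("forward", home_server)


mx = order_mx([(20, "backup.example.org"),
               (10, "mx1.example.org"),
               (10, "mx2.example.org")])
print(mx)
print(smtp_decision("alice", {"alice"}))
print(store_action("node2", "node1", shared_storage=False))
```

The two hosts with preference 10 may come out in either order, which is what spreads incoming connections across the nodes; the backup host is always tried last.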
Now, let's see what happens when a user tries to download his own emails:
First of all, his client will look for the IP address of the POP3/IMAP server in the DNS.
If DNS balancing is being used, the DNS server will return the list of all the available POP3/IMAP servers, with the servers listed in a different order for each request, in order to provide simple balancing.
The client will connect to the IP address of the IMAP/POP3 server. If there is a balancer in between, the balancer will be configured to redirect the connection to one of the POP3/IMAP Proxies.
The PigeonDeliver IMAP/POP3 Proxy will authenticate the user, look up the user's home directory in the database, and redirect the connection to the real IMAP/POP3 server. This server will be configured to accept connections only from redirectors, and will rely on the authentication data and configuration provided by the Proxy, in order to avoid unnecessary lookups. Proxy/Server connections are themselves authenticated using a simple challenge-response mechanism.
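The text does not say which challenge-response scheme protects the Proxy/Server connections. As an illustration of the general idea only, here is a minimal sketch that assumes a shared secret key and HMAC-SHA256; the function names and the key are placeholders, not PigeonDeliver's actual protocol.

```python
import hashlib
import hmac
import secrets


def new_challenge():
    """Server side: a fresh random value for every connection."""
    return secrets.token_bytes(16)


def respond(shared_key, challenge):
    """Proxy side: prove knowledge of the key without sending it."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key, challenge, response):
    """Server side: recompute the expected answer and compare safely."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


key = b"cluster-shared-secret"      # placeholder, not a real key
challenge = new_challenge()
response = respond(key, challenge)
print(verify(key, challenge, response))          # proxy knows the key
print(verify(b"wrong-key", challenge, response)) # impostor does not
```

Because the challenge changes on every connection, a captured response cannot be replayed later, and the key itself never travels over the network.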