MONARC-99/?? - version 2.2
This document compiles some observations on the Objectivity/DB AMS protocol. It is based on tests I ran with Objectivity 4.0.2 on Solaris in February 1999; the part about large VArrays over the AMS is based on tests done in June 1999 with Objectivity 5.1 on Solaris. Thanks to Eva Arderiu Ribera for information on earlier large-VArray tests over the AMS done by Marcin Nowak.
Note that this document does not cover the FTO/DRO option: I tested remote access to unreplicated databases only, and FTO/DRO may introduce additional protocol complications.
As TCP/IP is the underlying AMS transport mechanism, all the usual TCP/IP issues apply to AMS communications. Some machines have kernel limits on the number of TCP/IP connections. Also, on high-latency links the relatively small TCP window size can become a bottleneck before the available bandwidth is saturated; in that case multiple remote clients, each with its own connection to the AMS, would be needed to saturate the link. The feedback mechanisms inside TCP/IP, which ensure that any one connection takes only its fair share of the available bandwidth, may cause nonlinear behaviour in some cases. For performance issues related to TCP/IP, see the usual literature.
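The window-size bottleneck mentioned above is the familiar bandwidth-delay product limit; a minimal sketch (the function name and the 64 KB window figure are my own illustration, not measurements from these tests):

```python
def tcp_window_throughput_limit(window_bytes, rtt_seconds):
    """Upper bound on single-connection TCP throughput (bytes/s)
    imposed by the window size on a high-latency link: at most one
    window of data can be in flight per round trip."""
    return window_bytes / rtt_seconds

# Example: a 64 KB window over a 0.25 s round trip caps one
# connection at 64 * 1024 / 0.25 bytes/s = 256 KB/s, no matter
# how much raw bandwidth the link has.
limit = tcp_window_throughput_limit(64 * 1024, 0.25)
```

This is why several clients, each with its own connection (and hence its own window in flight), can together fill a link that a single connection cannot.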
For reading and writing of large (>64 KB) VArray contents, there is a difference between local and remote databases. With a local database, the large VArray contents are read and written in a single filesystem I/O operation, no matter what the VArray size is. When the client reads or writes the contents of a large VArray through the AMS, this happens as a sequence of AMS communications: the client divides the VArray into 64 KB chunks and does one AMS communication for each chunk in turn. This 64 KB chunk size is independent of the federation page size (anywhere from 2 to 64 KB). If 64 KB is not divisible by the chosen page size, the chunk size may actually be smaller, rounded down to a multiple of the page size; I did not check this.
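The chunking just described can be sketched as follows. Note that the rounding-down of the chunk size to a multiple of the page size is, as said above, my unverified guess, not confirmed protocol behaviour:

```python
def ams_varray_chunks(varray_bytes, page_size, max_chunk=64 * 1024):
    """Sketch of how a client might split a large VArray into
    per-request chunks for transfer over the AMS.

    Returns a list of (offset, length) pairs, one per AMS
    communication. The rounding-down to a page-size multiple is
    an unverified assumption."""
    chunk = (max_chunk // page_size) * page_size
    return [(off, min(chunk, varray_bytes - off))
            for off in range(0, varray_bytes, chunk)]

# A 200 KB VArray with 32 KB pages: 64 KB is a multiple of the
# page size, so it is moved in three full 64 KB chunks plus one
# 8 KB remainder -- four AMS communications in all.
chunks = ams_varray_chunks(200 * 1024, 32 * 1024)
```

Contrast this with the local case, where the same 200 KB would be a single filesystem I/O operation.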
Objectivity clients always bring the whole VArray into memory when its first element is accessed, whether the array is in a local or a remote database.
I have never seen a client request two or more pages at once: it always requests a single page and waits for it to arrive. Reading is therefore synchronous and non-pipelined, so one can expect extra delays as the round trip time between the client and the AMS grows.
At the time (of day) of the test, the CERN-Caltech round trip time was 0.25 seconds. Looking at the AMS reading protocol and assuming no extra round-trip overheads from TCP, reading a 32 KB page takes at least 0.25 seconds, so a single client can never pull more than 32/0.25 = 128 KB/s, regardless of the bandwidth of the link.
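The 128 KB/s ceiling follows directly from the stop-and-wait nature of the reading protocol; a minimal sketch of the arithmetic:

```python
def synchronous_read_throughput(page_bytes, rtt_seconds):
    """Best-case throughput of a stop-and-wait reader that requests
    one page per round trip. Ignores TCP overheads, server-side
    processing, and transmission time, so real throughput is lower."""
    return page_bytes / rtt_seconds

# The CERN-Caltech figures from the text: 32 KB pages, 0.25 s RTT.
kb_per_s = synchronous_read_throughput(32 * 1024, 0.25) / 1024
# kb_per_s is 128.0, matching the estimate above.
```

Because each extra round trip adds a full RTT of idle time, this ceiling falls linearly as latency grows, independently of link bandwidth.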
In this small test I could pull about 50 KB/s with an Objectivity client reading from a DB file at Caltech; an ftp client pulled the same file over at about 60 KB/s.
Note that I do not know how much CPU power an AMS server actually needs to serve a given amount of data. I once saw a machine use 100% of its single CPU providing AMS services, but I expect that a very large part of that CPU consumption went to the TCP/IP protocol stack and filesystem code inside the kernel, rather than to the user-level AMS code itself.
Because the current AMS is single-threaded, the way to scale up AMS traffic is to install an additional (small) machine with an additional (small) filesystem, rather than to double the CPU power and filesystem throughput of the existing machine. The AMS processes on different machines do not have to communicate with each other, so I expect linear scaling as long as the network link to these machines is not a bottleneck. Note again that this is the non-FTO/DRO, unreplicated case.
Of course, many small AMS data servers with many small filesystems are more difficult to manage than a few large servers.
The BaBar solution is likely to allow multiple AMS processes (or threads) to run on a single machine, all providing access to databases on that machine's filesystem(s). Each client will connect permanently to one of these AMS instances after a negotiation phase that ensures some degree of load balancing between them. The multithreaded solution will allow all CPU capacity in the machine to be used for AMS tasks. No fine-grained synchronisation between the different AMS processes or threads on a machine will be needed, so I expect linear performance scaling as more CPUs and AMS processes are added, until network or filesystem throughput imposes a ceiling.