March 8, 2012 4:25 PM
Maximising the use of our network resources
The addition of Colt's Pan European Network to Verne Global's network offering at its data centre in Keflavik, Iceland, has certainly increased the options for how our customers connect to their environmentally powered data centre resources.
Recently, the Verne network team had an opportunity to test the throughput capacity of a 1GigE dedicated link from our data centre in Keflavik to one of Colt's many PoPs in central Europe. In this case, we used a link routed over the 5.1 Terabit per second cable that crosses from southern Iceland to Denmark. Because of the direct routing of this cable, the packet delivery latency to this central European location was measured at under 26 milliseconds.
For many customers, though, it isn't about latency. Almost all applications are designed to accommodate latencies much larger than those our customers see from Iceland to North America, Europe and beyond, especially now that most of today's applications are built to run smoothly over both terrestrial and mobile networks. For many customers, it is about filling the pipe - maximising the use of bandwidth to bring large streams of data more quickly to their stakeholders. Is it really possible to ensure that large amounts of data can move rapidly between European, North American and Icelandic data centres? The answer is yes.
For our testing, we wanted to demonstrate that we could achieve near-perfect throughput while transmitting large files over a 1GigE pipe from our facility through to the Colt central European PoP. Our first question: could we rely on the Layer 1 physical installation? So we tested it. Colt's network team reported no Ethernet frame losses over the course of our initial testing. The fabric for our circuit was intact. On with the testing.
A bit of research shows that many users are frustrated that, despite a large pipe, the throughput of large files is ultimately limited. For example, on a 1GigE circuit I can expect to transmit 1,000,000,000 bits of information each second. If I assume about 7% overhead for the Ethernet frames, the gaps between frames, and the IP and TCP headers, somewhere around 930,000,000 bits of payload data should be transmittable each second. With 8 bits in every byte, that works out to roughly 116 Megabytes per second of theoretical maximum data throughput within the TCP segments. That's a whole lot of HD video!
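The overhead arithmetic above can be sketched out with standard 1500-byte MTU Ethernet frames. Note that this simple version lands a little above the ~116 MB/s figure; TCP options such as timestamps, plus the occasional retransmission, shave a bit more off in practice.

```python
# Back-of-the-envelope payload throughput on a 1 GigE link,
# assuming standard 1500-byte MTU Ethernet frames.
LINK_BPS = 1_000_000_000          # 1 GigE line rate

MTU = 1500                        # IP packet carried in each frame
ETH_OVERHEAD = 14 + 4 + 8 + 12    # header + FCS + preamble + inter-frame gap
IP_HDR, TCP_HDR = 20, 20          # basic headers, no TCP options

wire_bytes = MTU + ETH_OVERHEAD           # 1538 bytes on the wire per frame
payload = MTU - IP_HDR - TCP_HDR          # 1460 bytes of TCP payload per frame

efficiency = payload / wire_bytes         # fraction of line rate that is payload
goodput_bps = LINK_BPS * efficiency
print(f"{efficiency:.1%} efficient -> {goodput_bps / 8 / 1e6:.0f} MB/s of payload")
```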
The fact that standard TCP windows are limited to a fairly small size ends up being a problem for throughput - even on really low-latency links.
Regardless of the size of the pipe, the sender periodically stops to confirm that its TCP segments are reaching the destination and that there are no errors or interruptions in the data flow. As the latency grows, even by milliseconds, the standard TCP window sizes cause the network to do more waiting than transmitting.
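The mismatch comes down to the bandwidth-delay product: the amount of data that must be in flight, unacknowledged, to keep the pipe full. A rough calculation for our link, taking the measured ~26 millisecond figure as the round-trip time (an assumption on our part):

```python
# Bandwidth-delay product: how much data must be "in flight" to keep
# a 1 GigE link busy at an assumed ~26 ms round-trip time.
LINK_BPS = 1_000_000_000
RTT = 0.026                               # seconds, Keflavik <-> central Europe

bdp_bytes = LINK_BPS / 8 * RTT            # bytes needed in flight (~3.25 MB)

# Without window scaling, TCP can have at most 64 KiB unacknowledged:
classic_window = 64 * 1024
max_throughput = classic_window / RTT     # bytes per second, best case
print(f"BDP: {bdp_bytes / 1e6:.2f} MB; a 64 KiB window caps us at "
      f"{max_throughput / 1e6:.1f} MB/s ({max_throughput * 8 / LINK_BPS:.1%} of the link)")
```

In other words, with the classic 64 KiB window the link sits idle roughly 98% of the time.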
But there are solutions and, believe it or not, they've been around for a long time. RFC 1323, issued in 1992, defined new TCP options, including a concept called TCP window scaling. As networks became stronger and the equipment transmitting the information gained more RAM and functionality, more and more systems could allow TCP window sizes to grow and thereby improve performance on low- and high-latency links alike. Thankfully for us, with a little knowledge and a few adjustments to our server configuration files, we can grow our TCP windows to match the capacity of our 1GigE pipe. That means we spend much, much more time streaming critical data and much less time waiting for confirmation of receipt. With additional features such as resilient error handling and slow-start suppression, we can push performance to near perfection for TCP data streams.
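On a Linux host, for example, those "few adjustments" live in sysctl. The knobs below are the standard ones; the buffer values are illustrative, not tuned recommendations for any particular workload.

```shell
# Illustrative Linux tuning: allow TCP windows large enough to cover
# a ~3.25 MB bandwidth-delay product (values are examples only).
sysctl -w net.ipv4.tcp_window_scaling=1            # RFC 1323 window scaling on
sysctl -w net.core.rmem_max=16777216               # max socket receive buffer
sysctl -w net.core.wmem_max=16777216               # max socket send buffer
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"  # min/default/max receive window
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"  # min/default/max send window
```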
All of that is fine through Layer 4 (Physical, Data Link, Network, and Transport), but what about the Application layer? For example, what if our customer wants to take advantage of SSH tools and encrypt its data stream?
SSH runs at the application layer and adds yet another flow-control window inside our TCP payload. While there are hardware solutions (WAN optimisers, for example) for many applications, SSH specifically presents an interesting consideration. Firstly, standard WAN optimisers can't really touch the SSH stream, since it is encrypted and non-compressible. Secondly, SSH, much like TCP with its standard window size, limits the amount of data it will transmit at one time without confirmation.
Standard SSH data windows are a mere 64 kilobytes. Once again, a big pipe is throttled back to wait for a confirmation after sending only 64 kilobytes of data. There is a reason for this limitation. SSH is a resilient standard that is supposed to work well across any platform - high-latency, slow, fast, old, new. The window sizes are kept small so that even the slowest computers can keep up with the computational work of encrypting the data stream. While we appreciate the benefits of cross-compatibility, we of the data centre ilk are interested in top performance - matching a Ferrari with a Ferrari and not worrying about the possibility of communicating with a Yugo on the other end.
Even today's lower-end servers, and indeed laptops, have Ferrari performance when it comes to SSH and the CPU requirements for rapid encryption. As you might expect, the solution is available and we need look no further than the experts at the Pittsburgh Supercomputing Center.
A high-performance computing environment, such as a supercomputer, is typically configured as a batch processor that takes parameter files for an engineering or scientific problem and then produces in-depth results, usually by way of a very large output file. Since supercomputing HPC clusters are centralised to reduce costs, they are often accessed by engineers and scientists who are remote from the cluster. It is certainly easier to push photons than to force scientists from different backgrounds and different skill sets to converge on a single geographical location. With this remoteness comes the requirement of shipping large output files from the HPC batches. We appreciate this model, as it is exactly the type of application that works splendidly in our data centre in Keflavik.
The results of our testing show that with proper implementation of readily available software tools, our customers can achieve excellent throughput from their computational environments in Iceland. With our networks in place and our network partners such as Colt ready to serve their needs, the horizon of applications hosted in our Keflavik data centre is growing. And that growth is creating unique TCO savings opportunities for the stakeholders of those applications.