By Charles Chance
Manchester is back and it’s big; in fact, it’s EPYC! Dad jokes aside, I have some exciting tales about flashing lights and shiny boxes to share.
In Simon’s 2022 ‘Christmas letter’ he mentioned how we’d been working on bringing up a new compute site in Newport (South Wales), new optical connectivity between London (Volta), Manchester and Newport, and a rebuild of Manchester to our latest generation of architecture. Today Manchester comes fully back into service as an active Availability Zone (AZ) and has resumed the workload that was shifted elsewhere in the interim.
Firstly, on the network: our prior UK architecture was built around three AZs, namely London, Manchester and Slough. Both Slough and Manchester suffer from Equinix tax, although Slough is the worse of the two, with most networks preferring to interconnect in London and avoid Slough given it is so close; Manchester is far enough away for interconnection to be justifiable in more cases than not. If we want an AZ outside London where nobody cross-connects, therefore, there are other options – and we opted for Newport for a number of reasons. The intention is that Slough will fall away in due course as we build Newport as a new AZ that is local to major cloud providers and geographically more diverse from our other sites than Slough is. Slough already enjoys optical connectivity back into our London sites, but we’ve now completed a new optical ring around Manchester, Newport and London, where it joins our metro ring. We’re running our current-generation Arista switching in these sites, and continue to strongly encourage customers to connect to us directly in at least one, but ideally all, AZs.
Last year we upgraded the reference architecture for what we call our VCNs – Voice Compute Nodes. These are the units that process media, handling any transcoding required, so they’re pretty critical. At the time we specified them such that any single AZ had enough resources to handle 4x the peak load on the entire network with no change. With Manchester coming fully live again to the new spec, this gives 12x our previous media capacity overall. That’ll be handy next time there’s a pandemic, or one of our customers hosts the phone-in for a major fundraiser or music concert; not that we skipped a beat at any time before. We conducted another significant network load-test for a major Government department last week and the VCNs didn’t get above 0.5% load, which is also around the level they sit at during our peak times.
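For anyone who likes to see the sums, the headroom arithmetic can be sketched in a few lines of Python. This is only an illustration: the 4x-per-AZ figure is from the paragraph above, but the count of three AZs built to the new spec is an assumption (one way of arriving at a 12x aggregate), and the function name is made up for the example rather than taken from any real tooling.

```python
def surviving_headroom(az_count: int, per_az_multiple: float, failed_azs: int) -> float:
    """Aggregate capacity left after some AZs fail, as a multiple of network peak load."""
    if not 0 <= failed_azs <= az_count:
        raise ValueError("failed_azs must be between 0 and az_count")
    return (az_count - failed_azs) * per_az_multiple


AZ_COUNT = 3           # assumption: three AZs built to the new VCN spec
PER_AZ_MULTIPLE = 4.0  # from the post: each AZ can carry 4x the network-wide peak

for failed in range(AZ_COUNT + 1):
    remaining = surviving_headroom(AZ_COUNT, PER_AZ_MULTIPLE, failed)
    print(f"{failed} AZ(s) down: {remaining:.0f}x peak load still available")
```

The point of sizing each AZ on its own, rather than sizing the total, is that the last line of defence is still comfortable: under these assumptions, losing two of three AZs would leave 4x peak in the one that remains.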
Finally: compute. Our CNs – Compute Nodes – are the workhorses of the network. It was in 2016/17 that we ditched virtualisation in favour of containerisation on our own network stack, which enabled anycast and other innovations. The CNs run our hundreds of container images, driving everything from billing databases and daemons through to web caches. All of those are deployed from standard images that evolve under our Continuous Integration / Continuous Deployment model. Our old CNs work hard and have done amazingly well, but for Manchester we’ve updated our reference architecture. We’re using AMD EPYC processors, which are incredible. Manchester now enjoys 128 cores per socket and, broadly speaking, has 3.2x the CPU and 2.6x the RAM of other sites, all while using what looks like being 50% of the power! This architecture will be rolled out to the other AZs in the ordinary course of upgrades, giving us loads of headroom for your growth.
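Put another way, those ratios imply a sizeable jump in compute per watt. Here is a quick sketch of the arithmetic in Python, treating the 50% power figure as the provisional number it is; the function and constant names are purely illustrative.

```python
def per_watt_gain(resource_ratio: float, power_ratio: float) -> float:
    """Relative resource-per-watt of the new site versus the old reference."""
    if power_ratio <= 0:
        raise ValueError("power_ratio must be positive")
    return resource_ratio / power_ratio


CPU_RATIO = 3.2    # from the post: Manchester CPU relative to other sites
RAM_RATIO = 2.6    # from the post: Manchester RAM relative to other sites
POWER_RATIO = 0.5  # from the post, provisional: "what looks like being 50%"

print(f"CPU per watt: {per_watt_gain(CPU_RATIO, POWER_RATIO):.1f}x")
print(f"RAM per watt: {per_watt_gain(RAM_RATIO, POWER_RATIO):.1f}x")
```

If the power figure holds up, that works out at roughly 6.4x the CPU and 5.2x the RAM per watt; if real-world draw turns out higher than 50%, the gains shrink in proportion.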
That’s all for now, but as I don’t write here very often, do let me know in our Community Slack if you’d like to hear more from us on technical things or, of course, if you have any feedback or questions!