In this two-part episode, Steve talks to Colin Corbett about his experiences working for companies such as PayPal, Google, Netflix and YouTube.
Colin is the proprietor of 7 Hills Consulting which helps companies with global network and datacenter architecture and design. With a networking and datacenter background spanning over 25 years, Colin has built and led the entire network and datacenter infrastructure for early stage startups including PayPal, YouTube and Dropbox. Additionally he has had various networking and datacenter roles for growth companies such as eBay, Google, Twitch, and Netflix among others.
Colin has rolled out infrastructure across 5 continents for various sized deployments. He has also designed and built numerous large networks, both in terms of egress traffic and locations deployed. Additional specialties include: making repeatable cookie-cutter scalable deployments, site selection, lease/contract negotiation, backbone/transit/peering discussions, hardware selection, vendor negotiation, etc.
This is the second part of a two-part interview; find part one here.
So we've talked about getting in a great data center contract, making sure the data center is gonna have the spec that you really need for your location in the best ways. You talked about having integrators kind of roll in a rack, but in terms of what that rack should look like (I know you're as detail oriented as just about anyone that I've talked to about that), what makes for a really great data center rack and rack installation? How do you optimize? What's your approach there?
So the first thing I try to do is limit it to a certain number of SKUs. So if you have, like, a web, a video, a database and a network, that's basically four types of SKUs. If your integrator only has to build certain types of things, that's great because then they're not having to stock a bunch of different equipment. You've maybe got it down to three or four different types of racks, and then sure, over time you're going to end up having to deal with revision control, but try to do revisions.
Not every single build should be a unique snowflake. You should try to have consistent revisions where everything looks pretty much in lockstep for a three-, six- or nine-month period, let's say. So things like the network rack would have one vendor's worth of gear and it's laid out perfectly, as well as the webs, the videos and the database. And sure, as Intel or AMD release new CPUs, you're going to have to iterate that. As hard drives become obsolete or are end-of-lifed by the vendor, you have to swap those. You're going to have to iterate, but try to do it at a... Certainly never try and say “well this rack’s got half of the old gear and half of the new gear.” Work with your integrator to make sure that you've got a very clear delineation of: this is version A and this is version B, etc.
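[Editor's sketch] The "no half-old, half-new racks" rule above can be expressed as a simple inventory check. This is a minimal illustration only; the `RackBuild` record, its field names, and the sample data are hypothetical and were not described in the interview.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RackBuild:
    """Hypothetical record of one rack as delivered by an integrator."""
    rack_id: str
    sku: str                               # e.g. "web", "video", "database", "network"
    revision: str                          # the revision the rack is labelled as, e.g. "A"
    component_revisions: Tuple[str, ...]   # revision tag of each component in the rack


def mixed_revision_racks(builds: List[RackBuild]) -> List[str]:
    """Return IDs of racks where any component differs from the rack's own revision."""
    return [
        build.rack_id
        for build in builds
        if any(component != build.revision for component in build.component_revisions)
    ]


# Illustrative data: rack r2 mixes revision A gear into a revision B build.
builds = [
    RackBuild("r1", "web", "A", ("A", "A", "A")),
    RackBuild("r2", "web", "B", ("B", "A", "B")),
    RackBuild("r3", "video", "B", ("B", "B")),
]
print(mixed_revision_racks(builds))
```

A check like this could run against whatever inventory the integrator reports, so that a mixed rack is caught before it ships rather than in the cage.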
Yeah. I've seen teams look at that not only to make sure that the rack can be well maintained, but even to get to the point where the cage has incredibly similar detail between multiple cages as you start to have lots of data centers and lots of data center cages, just to make sure that site operations people can move in and efficiently do the right thing and not spend too much time figuring out how the whole cage installation even works.
Right, so at YouTube for instance, we ended up asking for a very specific type of cage from our vendor: it was basically a nine rack cage. It started with the first rack as the network rack, and the rest of the racks [were] all video, since these were video POPs. And those looked [good] and that was all very clean. But it got to the point that we even had bins for all of our spare tools, things like fiber, copper, spare hard drives. Across all of our POPs, you knew that the top left most bin was one meter LC-duplex fiber, and the bottom right bin, you know, eight rows over, five bins down, was spare hard drives.
And it became very clear and it was great because you could teach your technicians what it would look like, but also when you would have to go and call for remote hands at night you knew that every data center was exactly the same, and it was really helpful especially when looking for spare parts or optics or things like that.
And then in terms of the rack itself, what’s your approach for optimizing for power —both power that’s available to the rack and providing that power to the rack?
So for that I usually will ask for a circuit that's going to be bigger than what I need, so that I either have room to burst or I'll never really be in jeopardy of exceeding breakers. That being said, you certainly can have cases where the nameplate rating on all the hardware is far in excess of what you actually will see in standard use. A lot of that you end up refining over time as you actually see the equipment and your workload.
Sometimes though, that does change as you keep having bad pushes, so you need to accommodate headroom for that too, because you can have bad pushes that increase your workload extensively. But yeah, usually you work with the data center provider, and especially in the wholesale world, you can ask to provision whichever circuit you want. And so usually you try to settle on one that has certainly got enough headroom and allows you to scale, and as long as there's sufficient cooling for that, you just try to go as dense as you can within your limitation, because with density, while you end up with [fewer] overall switches, you end up with more rack space free or more rack positions free, etcetera.
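[Editor's sketch] The headroom reasoning above can be put into rough numbers. The snippet below is an illustration under stated assumptions, not Colin's method: the 80% continuous-load derate is a common rule of thumb that the interview does not mention, the `burst_factor` is a hypothetical allowance for events like a bad push, and the 208 V / 30 A / 350 W figures are made up for the example.

```python
import math


def usable_watts(volts: float, amps: float, three_phase: bool = True,
                 derate: float = 0.8) -> float:
    """Power budget for a circuit after derating.

    For three-phase, `volts` is the line-to-line voltage and the circuit
    delivers sqrt(3) * V * I. `derate` is an assumed continuous-load limit.
    """
    factor = math.sqrt(3) if three_phase else 1.0
    return factor * volts * amps * derate


def servers_per_circuit(volts: float, amps: float,
                        measured_watts_per_server: float,
                        burst_factor: float = 1.25,
                        three_phase: bool = True) -> int:
    """How many servers fit if each may burst to measured draw * burst_factor."""
    budget = usable_watts(volts, amps, three_phase)
    return int(budget // (measured_watts_per_server * burst_factor))


# Hypothetical example: a 208 V, 30 A three-phase circuit and servers
# measured (not nameplate) at 350 W under normal load.
print(round(usable_watts(208, 30)))        # derated budget in watts
print(servers_per_circuit(208, 30, 350))   # servers that fit with burst room
```

Using measured draw rather than nameplate rating reflects the point above that nameplate figures tend to overstate real usage, while the burst factor keeps room for a workload spike.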
You mentioned, I believe, that you started working with your teams on standardizing on a demarc rack? Do you want to tell us about that?
Yes. So that was one of the interesting things we ran into with... as you start doing a lot of wholesale deployments: the idea was that eventually we wanted to be in a wholesale deployment, but also have the flexibility to swap other infrastructure out. So if you had, let's say, network and server racks that you knew were going to keep evolving rapidly, could you go and be in a standard data center and not shut the data center down completely, still keep it operating in some fashion, but still be able to go and upgrade it in bulk?
One of the things we came up with, and this was mainly for retail locations, was to land all your cross connects at a unified demarc. You land all your fiber from your network racks at the demarc, and all your cross connects out to network providers and your peers land there too. Then, with enough coordination and planning, you can swap out your server racks, you can swap out your network racks, and the demarc stays the same.
Also, if you plan really far ahead, in a data center that you've never made a delivery to, you send the demarc rack first, because it really only has maybe fiber panels and maybe a couple of power strips and not much else in it. If something bad happens to that rack, let's say it falls off the truck, the overall dollar value of that is actually pretty low. And so you can work out a lot of problems about doing the delivery very far in advance with a rack that is very quickly built and has an overall low dollar value compared to some of the hundred thousand dollar to a million dollar racks you can have out there.
That's cool. How do you think about the vendor solutions that are out there for customized racks, say the APC racks and Dell racks and Rackable? There's kind of a lot of consumer options today.
I’ve had some overall good and bad between them. So the main thing is just making sure that they all meet your need. I've certainly run into questions where the racks look great and I say “OK can I fully load this and roll it into the data center?” They go ‘well this is its static rating,’ which is very different than its rolling rating, so if you actually wanted to put twenty five hundred or three thousand pounds worth of gear in it, you can't roll it in with three thousand pounds worth of gear. So that's one of the interesting things to run into, or you say “hey I have a 60 amp three-phase power plug” and they say “well that's great but there is no hole in the top of the rack to actually fit that out at all. You need to order a special top or you need to run it with no top to kind of make sure that all fits.”
So those are all interesting issues. For me when I've worked with integrators, we've also seen things like racks where the customer has come to them saying “I have three different types of airflow either side to side or side to back, or front to back” and all these kind of stranger things. And a good integrator will certainly help you make sure that all of that's correct. They'll do the air baffling to make sure that all works. And also things like balancing all the phases, especially when you have things like three phase power.
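[Editor's sketch] Phase balancing, mentioned above, can be approximated with a simple greedy assignment: place the largest loads first, each on whichever phase currently carries the least. This is a simplified model that treats every device as drawing from exactly one phase; real three-phase PDUs may wire outlets line-to-line, and an integrator's actual procedure was not described in the interview. The device names and wattages are hypothetical.

```python
from typing import Dict, List, Tuple


def balance_phases(loads_watts: Dict[str, float],
                   phases: Tuple[str, ...] = ("L1", "L2", "L3")
                   ) -> Tuple[Dict[str, List[str]], Dict[str, float]]:
    """Greedily assign each load to the currently lightest phase.

    Returns (assignment, totals): which devices sit on each phase and the
    resulting wattage per phase.
    """
    totals = {phase: 0.0 for phase in phases}
    assignment: Dict[str, List[str]] = {phase: [] for phase in phases}
    # Largest loads first so the small ones can even out the remainder.
    for name, watts in sorted(loads_watts.items(), key=lambda kv: kv[1], reverse=True):
        target = min(phases, key=lambda phase: totals[phase])
        totals[target] += watts
        assignment[target].append(name)
    return assignment, totals


# Hypothetical rack contents (measured watts per device).
loads = {"srv1": 400, "srv2": 400, "srv3": 350,
         "srv4": 350, "srv5": 300, "srv6": 300}
assignment, totals = balance_phases(loads)
print(assignment)
print(totals)
```

Greedy assignment is not guaranteed to be optimal for arbitrary loads, but for racks built from a handful of identical SKUs it usually lands close to even.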
As for working with some of the larger vendors, I've had a lot of strange issues when I want to specify certain types of gear. So if I have a favorite RAID or NIC card that I want to use in my infrastructure, one that we've certified and approved from using our white box solutions, we've definitely had cases where the bigger vendors have not been able to source it or they can't get it in a reasonable amount of time.
Also we've run into cases where they'll sell a standard ‘off the shelf’ hard drive but with their custom firmware, and if they no longer support that drive or that drive's not sold through that vendor, I can't replace that drive or buy new drives straight from the manufacturer anymore because they don't have the custom firmware. So I've certainly run into those issues and that's not been fun. Trying to resolve that... there have been a few interesting issues there.
How about as you move to custom storage arrays, blade servers or even hyper converged infrastructure?
So there's some interesting things there too. A lot of the problems actually end up being physical. Some of the newer arrays I've seen are basically 36 inches deep, but most data centers, especially retail data centers, by standard will only ever give you a 36 inch deep rack, or they have aisles designed around 36 inch deep racks. What that means is that you end up needing like a 42 inch or 48 inch deep rack, which generally does not fit in the data center or doesn't give you the ability to service it. So if you ever had to go and work on the rack in the back, let's say to pull out the switch that you've rear mounted, you might not be able to either open the door or you might not even be able to pull the switch out to go and service it and swap it.
It's really interesting that you can run into those issues. Also you'll see things like shallower cold aisles, where they'll try to make the cold aisle so small that you can't actually pull out that 36 inch deep drive array because there's just not enough room in the aisle. So a lot of physical planning goes into supporting some of the blade servers, a lot more than you would think. But as you talk about blade servers you have things like flexibility questions, things like can you get 10/25/40/50 and 100Gb out of the back of the blade servers? In some cases you do it by just exposing the actual network ports on the chassis or from the back of the chassis, or sometimes if it has dedicated switches in the middle, you end up with limitations in terms of capacity.
For me, I've usually erred on the side of going with dense servers that don't have an integrated switch. Something more like the Supermicro 2U Twins or the equivalent Dells, or some of the Open Compute stuff that looks interesting too.
Right so you bring up that there is a difference between trying to add compute density and then integrating network density into it—as opposed to just using the kind of standard network switching you've chosen for your data center?
Yes very much so. Maybe that's just more of my networking background but I'm very happy with... I would like to dictate the network's capacity and build it and scale it according to what I want, as opposed to having an integrated switch with limited capacity or limited uplink and you're not quite sure about what the redundancy looks like. Whereas this way you can build it your own way and dictate it that way.
What's your take on storage arrays, particularly the alternate networking that becomes available? Is that something that you favor or not? Kind of the iSCSIs of the world and Fibre Channel over Ethernet (FCoE).
I've recently looked into a few things like Fibre Channel, Fibre Channel over Ethernet, RoCE, RoCEv2, things like that. I will say my heart is mainly in the Ethernet world, so I usually predominantly stay on Ethernet, and a lot of the technologies that have come out have been pretty affordable. And now something like 25Gb is really about 20% more expensive than 10Gb, and 100Gb really seems to be only about a 20% uplift over 25Gb. Eventually you might have some density problems or capacity problems at the switch layer, but those seem to be really good choices, and now you are starting to see things like 400Gb out there. And there's some really interesting developments happening there.
So you're saying Ethernet is keeping up?
Ethernet seems to be keeping up, and there's things like priority buffering and data center bridging that seem to try to straddle the world of making you not go with dedicated InfiniBand or Fibre Channel switches on the back end.
So when you're in the data center, connecting it all relies (as you mentioned earlier) on cross connects. I know all of us in the data center world have had a lot of experience with cross connects, portals and making sure we can make those connections. I know you've thought a lot about it. What's your kind of current take on the world there?
One of the hardest things, when you want to order a circuit from a provider, is trying to figure out: is the provider built in, and more importantly, if they are built in and you want multiple circuits from them, are they built in redundantly? To actually work with the provider and say, “I have a circuit that's on path A that lands on equipment A and I want a circuit on path B that's on equipment B and they both take separate paths out of the building.” That's something you need to really work with your vendor to find out, and in a lot of cases, if you ask your data center vendor ‘Is that possible?’ they'll usually just push you back on the provider.
In terms of getting cross connects there's a lot of hurdles. My strong recommendation is to pre wire panels yourself so that you know where everything is on your side, and then you start assigning specific ports as you order cross connects, so that you know exactly what that matches to, exactly what port that goes to, so that everything should theoretically come up sooner. If you can do the pre wiring, the other thing, when you assign a port before you open the cross connect, is to actually go and start sending light down your port, so that then as they're working on the circuit, you can also put in the order a standard note like: ‘I'm sending light. Make sure you see my light all the way down at the far end of this circuit to make sure that it comes up.’
I've certainly run into cases though, well I'll tell you a quick story. I tried working with a large global data center provider to say “How could I do a global deployment with them across multiple continents?” And I said “Is there a globally supported patch panel I can use so that I can do just a standard cookie cutter layout?” And in one case they said, “No, we don't have a global standard. Everything's individually reviewed and everything can be individually rejected.” That was kind of a nightmare, so I did eventually get that fixed but it took a good amount of work to get to that point.
Yeah it's interesting, as their world is evolving to that, maybe at times folks like yourself have had to shape their world—not just the other way around.
Colin, you've given us great insight into the data center and the details about working with it and optimizing it. There are some who would say data centers don't matter to enterprises in the era of the cloud. The data center is someone else's problem, whether that's Amazon or Azure or Google or other. Why do you think data centers still matter for corporations and enterprises?
So I think it still has a lot of fits, depending upon need. It's great at the edge, as you can work directly with the networks you need to connect to. In the case of content, that means working with the eyeball networks and interacting with them to make sure you have the performance you need into their network. Also, with a large amount of egress, it will be cheaper to send your traffic out of your own infrastructure than if you are paying on a... In the cloud world, you pay on what's called a per-gigabyte transfer method, whereas if you are doing it on your own infrastructure, you're doing it either through a settlement-free interconnect through peering, or you're doing paid peering, or you're doing transit, usually on like a 95th percentile basis. That will be cheaper if you have enough traffic to make that count.
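[Editor's sketch] The two billing models contrasted above work quite differently: per-gigabyte pricing charges for every byte sent, while 95th percentile billing sorts the periodic bandwidth samples, discards the top 5%, and charges on the highest remaining rate. The snippet below illustrates the mechanics only. The sample data and both prices are placeholders rather than real quotes, and 5-minute sampling is a common convention assumed here, not something stated in the interview.

```python
from typing import List


def percentile_95_mbps(samples_mbps: List[float]) -> float:
    """Billable rate under 95th-percentile billing (nearest-rank method)."""
    ordered = sorted(samples_mbps)
    # ceil(0.95 * n) computed with integers to avoid float rounding surprises.
    rank = (len(ordered) * 95 + 99) // 100
    return ordered[max(rank, 1) - 1]


def transit_cost(samples_mbps: List[float], price_per_mbps: float) -> float:
    """Cost when billed on the 95th-percentile rate."""
    return percentile_95_mbps(samples_mbps) * price_per_mbps


def per_gb_cost(samples_mbps: List[float], sample_seconds: int,
                price_per_gb: float) -> float:
    """Cost when every transferred gigabyte is billed."""
    # megabits/s * seconds = megabits; /8 = megabytes; /1000 = gigabytes
    total_gb = sum(mbps * sample_seconds / 8 / 1000 for mbps in samples_mbps)
    return total_gb * price_per_gb


# Toy data: 95 samples at 1 Gbps plus 5 short bursts at 5 Gbps.
samples = [1000.0] * 95 + [5000.0] * 5
print(percentile_95_mbps(samples))                  # the bursts fall in the discarded top 5%
print(transit_cost(samples, price_per_mbps=0.5))    # placeholder price
print(per_gb_cost(samples, sample_seconds=300, price_per_gb=0.05))  # placeholder price
```

Note that the bursts are free under percentile billing but fully charged per gigabyte. Which model is cheaper overall depends on actual prices and traffic volume, which is why the comparison only tips toward owned infrastructure once there is enough traffic.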
This is a key point that enterprises should be aware of if they are not, which is that the public clouds make a lot of their money on this line item that you're mentioning—on the data coming out of their data center. So if you're pushing a lot of traffic, it can really become prohibitive.
Very much so. Also some of the other interesting things are if you do something like a lot of expensive cloud processing by running - if you took some of that standardized workload and actually moved it to your infrastructure, you can go and save the money by running it there. That is also kind of a mixed bag. I mean the hardware you're looking at there is on like a three year depreciation cycle. But some types of the hardware: things like GPUs are changing at a faster cycle. But at the same time you've also got the cloud which is charging you a 12 to 18 month depreciation cycle for the hardware anyway.
So depending upon how you do it, it may make sense to go move some of it back to your own infrastructure. But then again, all that being said, you still need to avoid a single point of failure. So, even if you had your own data center and you run your own infrastructure, you still need to be paranoid enough that either the fire marshal could come in, shake your building down and hit the EPO at any point, or a fire could break out that results in the data center having to lose power, or things like that. So you should always try and keep everything to a point where you have redundancy, and that is the main thing to keep an eye on.
Absolutely. And you mentioned edge. Mostly we've been talking about it in the context of, I would say, the network edge, as you talked about how providers that are moving a lot of traffic move closer to their customers. Obviously there's a lot of momentum around just the term “edge” in general and in edge data centers. We're usually talking about moving sometimes closer to customers but also IoT and industrial IoT. How do you view the concept of edge data centers?
So there's a lot of really cool ideas there. Traditionally most people started in the original six or seven Tier 1 markets, and then after they've got that solved, they ended up moving to the Tier 2 markets to get better end user performance. The edge data center seems to be solving a lot of the Tier 3 locations, as well as things like cell towers and some of the backhaul. You know, in the quest to get better performance for users, it's a good idea.
With a lot of things like the offnet caches and all of the updating algorithms, it’ll be very helpful for a lot of the big consumers who can support hundreds of thousands of edge locations to solve their problems, whether it's to serve more ads, get more user telemetry, or maintain a high quality bitrate for video. That'll be huge. It's not clear, though, that a lot of the small consumers will be there at the moment.
And you know, the corollary to that (and this is a problem I ran into) is that if this is super successful, how do you guarantee that the space in those small edge data centers doesn't get gobbled up by the same handful of customers, which then stops the next great startup from getting space there at a reasonable price? And since some of these spaces are as small as a standard semi container, they don't have a lot of rack space to spare, and they also have cooling, power and fiber distribution that... space rapidly becomes a concern.
So Colin, I just kind of met a comment here. I just took a quick look at your answer to ‘composable’ and it's pretty cool. Do you mind if I just ask you [about] that as a final question?
Right. So some would look at the new concept of composable infrastructure as really taking the architecture of the data center apart and reassembling it in a different way. What do you think about composable infrastructure?
So I haven't really tried this extensively. At the moment Kubernetes seems to be the thing and it looks pretty interesting, but yeah, I haven't really had too much time to test it. That being said though, there are many cases where people need both bare metal and VMs, instead of just one or the other. So being able to utilize the cloud, or wheel in racks, and have them available as additional resources, that's a great thing. That being said, I do think the ability to capacity plan, which was a core part of doing data center infrastructure originally, has kind of been lost, and so, yeah, the idea of the bursts is very much needed these days as a result.
So with capacity planning, you're saying, in the past you'd run an exercise looking at those specific systems and the needs for those systems, and now that's all kind of too dynamic to plan for, is that what you mean?
So traditionally (getting data center space) if you needed a rack worth of gear from an integrator, usually the lead time there is about 6-8 weeks, with no warning, no planning let's say. And as a result, usually that means you're running your infrastructure with either some headroom or with the ability to basically say, “If everything goes south, do we have enough to... if I needed to order a rack today, do I have 8 weeks' worth of headroom in my infrastructure?”
Now, I can just go and order additional capacity from the cloud, or just add servers to my cart and all of a sudden I have servers, so the idea around capacity planning has kind of been lost. At scale though, you really need to know what your hardware needs will be, not just two months away if you order servers today, but also months down the line, and maybe even years down the line. And as you get to a really big scale, you run into cases where the cloud providers may not have capacity, or they'll invite you to their capacity planning meetings as well. They'll want to know what your forecast is, if you're big enough for the cloud provider.
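[Editor's sketch] The lead-time question described above ("if I needed to order a rack today, do I have 8 weeks' worth of headroom?") reduces to comparing remaining capacity against growth over the integrator's lead time. The snippet assumes linear growth and uses made-up server counts; the 8-week default reflects the upper end of the 6-8 week lead time mentioned in the interview.

```python
def weeks_of_headroom(capacity_units: float, used_units: float,
                      weekly_growth_units: float) -> float:
    """Weeks until capacity is exhausted, assuming linear growth."""
    if weekly_growth_units <= 0:
        return float("inf")  # flat or shrinking usage never runs out
    return (capacity_units - used_units) / weekly_growth_units


def must_order_now(capacity_units: float, used_units: float,
                   weekly_growth_units: float,
                   lead_time_weeks: float = 8) -> bool:
    """True if remaining headroom would run out before a new order arrives."""
    runway = weeks_of_headroom(capacity_units, used_units, weekly_growth_units)
    return runway <= lead_time_weeks


# Hypothetical fleet: 1000 servers of capacity, 850 in use, growing 25 per week.
print(weeks_of_headroom(1000, 850, 25))   # weeks of runway left
print(must_order_now(1000, 850, 25))      # runway is shorter than the lead time
```

A real forecast would use something richer than linear growth, but even this simple check makes the planning horizon explicit instead of relying on being able to add capacity on demand.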
Interesting. Well great thanks very much Colin. I appreciate your answers today and ongoing from the past and looking forward to continuing the conversation in the future.
Thank you very much. It was a great time chatting.