
Are blade servers right for my environment?

July 15th, 2009

IT, like most industries, has its fads, whether it be virtualization, SANs, or blade servers.  Granted, these three technologies play really nicely together, but once in a while you need to get off the bandwagon for a moment and think about what they really do for us.  While they are very cool overall and can make an extremely powerful team, as with anything, there is a right place, time, and environment for their use.  Blades are clearly the "wave of the future" in many respects, but you must be cautious about the implications of implementing them today.

Please do not read this article and come away thinking I am "anti-blade," as that is certainly not the case.  I just feel blades are all too often pushed into service in situations for which they are not the correct solution, and I would like to point out some potential pitfalls.

Lifecycle co-termination

When you buy a blade center, one of the main selling points is that the network, SAN, and KVM infrastructure is built in.  This is great in terms of ease of deployment and management; however, on the financial side of things you must realize that the life spans of these items are not normally the same.  When buying servers I typically expect them to be in service for 4 years.  KVMs (while becoming less utilized, actually) can last much longer under most circumstances, barring changes in technology from PS/2 to USB and the like.  Network switches I expect to use in some capacity or another for seven years, and SAN switches will probably have a life-cycle similar to that of the storage arrays they are attached to, which I generally target at 5 year life spans.

So what does this mean?  Well, if your servers are showing their age at 4 years, you are likely to end up replacing the entire blade enclosure at that point, which includes the SAN and network switches.  It is possible the vendor will still sell blades that fit that enclosure; however, you are likely to want a SAN or network upgrade before the end of the second set of servers' life-cycle, which will likely result in whole new platforms being purchased anyway.
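
To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python.  The prices and life spans are made-up placeholders, not vendor quotes; the point is just how much switch and chassis value gets stranded when the whole enclosure retires along with the servers:

    # Co-termination penalty sketch.  All costs and life spans are
    # illustrative assumptions, not quotes from any vendor.
    components = {
        # name: (purchase cost, expected stand-alone life span in years)
        "blade servers (x16)": (80000, 4),
        "enclosure chassis":   (6000, 7),
        "network switch pair": (10000, 7),
        "SAN switch pair":     (14000, 5),
    }

    retire_at = 4  # the whole enclosure retires when the servers do

    stranded = 0.0
    for name, (cost, life) in components.items():
        unused_years = max(life - retire_at, 0)
        waste = cost * unused_years / life  # straight-line value left on the table
        stranded += waste
        print(f"{name:22s} stranded value: ${waste:8,.0f}")

    print(f"total stranded value: ${stranded:,.0f}")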

Vendor lock

You have just created vendor lock-in: with all the investment in enclosures, you can't go buy someone else's servers (this really sucks when your vendor fails to innovate on a particular technology).  All the manufacturers realize this situation exists and will surely use it to their advantage down the road.  It is hard to threaten not to buy Dell blades to put in your existing enclosures when that would mean throwing away your investment in SAN and network switches.

SAN design

Think about your SAN design.  Most shops hook servers to a SAN switch that is directly attached to the storage array their data lives on.  Blade enclosures encourage the use of many more, smaller SAN switches, which often requires hooking the blade enclosure switches to aggregation SAN switches which are in turn hooked to the storage processor.  This increases complexity, adds failure points, decreases MTBF, and increases vendor lock-in.  Trunking together SAN switches from different vendors can be problematic and may require putting them in a compatibility mode that turns off useful features.
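
For a rough sense of what that extra hop costs in reliability: the availability of components in series is the product of the individual availabilities, so every device and ISL you add to the path subtracts a little.  The per-component numbers below are assumptions picked purely to show the direction of the effect, not measured figures:

    # Series-availability comparison: direct-attach vs. an extra
    # aggregation hop.  Per-component availabilities are assumed values.
    def path_availability(parts):
        """Availability of components in series is the product of the parts."""
        result = 1.0
        for a in parts:
            result *= a
        return result

    hba, edge_switch, isl, core_switch, array_port = 0.9995, 0.9995, 0.9998, 0.9995, 0.9995

    direct     = path_availability([hba, edge_switch, array_port])
    aggregated = path_availability([hba, edge_switch, isl, core_switch, array_port])

    print(f"direct-attach path: {direct:.5f}")
    print(f"aggregated path:    {aggregated:.5f}")
    print(f"extra downtime/yr:  {(direct - aggregated) * 365 * 24:.1f} hours")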

Vendor compatibility

Vendor compatibility becomes a huge issue.  Say you buy a blade enclosure today with 4 gig Brocade SAN switches in it for use with your existing 2 gig Brocade switches attached to an EMC Clariion CX500, but next year you want to replace that array with a Hitachi array attached to new Cisco SAN switches.  There are still many interop issues between SAN switch vendors that make trunking switches problematic.  If you had bought stand-alone physical servers, you could simply have re-cabled them directly to the new Cisco switches.

Loss of flexibility

Another pitfall I have seen folks fall into with blade servers is the loss of flexibility that comes with having a stand-alone physical server.  You can't hook up that external hard drive array full of cheap disks directly to the server, or hook up that network heartbeat crossover cable for your cluster, or add an extra NIC or two to a given machine that needs to be directly attached to some other network (one that is not available as a VLAN within your switch).

Inter-tying dependencies

You are creating dependencies on the common enclosure infrastructure, so for full redundancy you need servers in multiple blade enclosures.  The argument that the blade enclosures themselves are extremely redundant does not completely hold water for me; I have needed to completely power cycle entire blade enclosures before to recover from certain blade management module failures.

Provisioning for highest common denominator

You must provision the blade enclosure for the maximum amount of SAN connectivity, network connectivity, and redundancy required by any one server within the enclosure.  Say, for instance, you have an authentication server that is super critical but not resource intensive.  This alone requires your blade enclosure to have fully redundant power supplies, network switches, etc.  Then say you have a different server that needs four 1 gig network interfaces, and yet another DB server that needs only two network interfaces but four HBA connections to the SAN.  You now need an enclosure with four network switches and four SAN switches in it just to satisfy the needs of three "special case" servers.  In the case of the Dell M1000e blade enclosure, this configuration would be impossible, since it can only hold six SAN/network modules total.
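
Here is a toy sketch of that "highest common denominator" math, assuming (as is typical of blade fabrics) that each NIC or HBA port on a blade maps to its own I/O module slot in the chassis.  The server profiles are the hypothetical ones from the paragraph above:

    # The enclosure's I/O module count is driven by the most demanding
    # server in it, not by the average.  Profiles are hypothetical.
    servers = {
        "auth server": {"nics": 2, "hbas": 2},  # critical but tiny
        "app server":  {"nics": 4, "hbas": 2},
        "db server":   {"nics": 2, "hbas": 4},
    }

    network_modules = max(s["nics"] for s in servers.values())
    san_modules     = max(s["hbas"] for s in servers.values())
    total_modules   = network_modules + san_modules

    print(f"network modules needed: {network_modules}")
    print(f"SAN modules needed:     {san_modules}")
    print(f"total I/O modules:      {total_modules}  (the M1000e tops out at six)")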

Buying unused infrastructure

If you purchase a blade center that is not completely full of blades, you are wasting infrastructure resources in the form of unused network ports, SAN ports, power supplies, and cooling capacity.  Making the ROI argument for blade centers is much easier if you need to purchase full enclosures.
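
A toy example of why fill level matters so much to the ROI; the chassis, switch, and blade prices are purely illustrative:

    # Per-blade cost at different enclosure fill levels.  Prices are
    # illustrative assumptions only.
    chassis_and_io = 30000   # enclosure plus network and SAN modules, bought up front
    blade_price    = 5000

    for blades_installed in (4, 8, 16):
        total = chassis_and_io + blades_installed * blade_price
        print(f"{blades_installed:2d} blades: ${total / blades_installed:7,.0f} per blade")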

Failing to use existing infrastructure

Most environments have some amount of extra capacity on their existing network and SAN switches, because when those were purchased they were sized with future growth in mind (though probably not with blade enclosures in mind).  Spending money to re-purchase SAN and network hardware inside a blade enclosure just to allow the use of blades can kill the cost advantages of going with a blade solution.

Moving from “cheap” disks to expensive SAN disks

You typically cannot put many local disks into blades.  This is in many cases a huge loss, as not everything needs to be on the SAN (and in fact, certain things would be very stupid to put on the SAN, such as swap files).  I find that these days many people overlook the wonders of locally attached disk.  It is the *cheapest* form of disk you can buy and it can also be extremely fast!  If your application does not require any of the advanced features a SAN can provide, then DON'T PUT IT ON THE SAN!

Over-buying power

In facilities where you are charged for power by the circuit, the key is to manage your utilization such that your unused (but paid for) power capacity is kept to a minimum.  With a blade enclosure, on day one you must provide (in this example) two 30 amp circuits, even though you are only putting in 4 of a possible 16 servers.  You are going to be paying for those circuits even though you are nowhere near fully utilizing them.  The Dell blade enclosures, as an example, require two three-phase 30 amp circuits for full power (though depending on the server configurations you put in them, you can get away with dual 30 amp 208V circuits).
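
Here is a rough illustration of the gap between what you pay to provision and what you actually draw on day one.  The circuit math is the standard 80%-derated three-phase calculation; the per-blade and chassis wattages are guesses, and I am treating watts and volt-amps as roughly equal for simplicity:

    import math

    # Provisioned capacity: two 30 amp three-phase 208V feeds (A and B sides),
    # derated to 80% for continuous load.
    volts, amps, derate = 208, 30, 0.8
    circuit_va  = volts * amps * math.sqrt(3) * derate
    provisioned = 2 * circuit_va

    # Day-one draw: four lightly loaded blades plus chassis overhead (guesses).
    actual_draw = 4 * 400 + 1000

    print(f"provisioned capacity: {provisioned / 1000:5.1f} kVA")
    print(f"actual draw (day 1):  {actual_draw / 1000:5.1f} kW")
    print(f"utilization:          {actual_draw / provisioned:5.1%}")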

Think about the end of the life-cycle

You can’t turn off the power to a blade enclosure until the last server in that enclosure is decommissioned.  You also need to maintain support and maintenance contracts on the SAN switches, network switches, and enclosure until the last server is no longer mission critical.

When are blades the right tools for the job?

  • When the operational costs of the personnel who set up and maintain your servers far outweigh the cost inefficiencies of blades.
  • When you are buying enough servers that you can purchase *full* blade enclosures whose blades have similar connectivity and redundancy requirements (e.g. each needs two 1 gig network ports and two 4 gig SAN connections).
  • When you absolutely need the highest server density offered (note that most datacenters in operation today can't handle the power density and heat output of fully loaded blade enclosures).

An example of a good use of blades would be a huge Citrix farm, a VMware farm, or in some cases a web server farm (though I would argue that very large web farms that can scale out easily should be on some of the cheapest hardware you can buy, which typically does not include blades).

Another good example would be compute farms (or even Lucene cache engines), as long as you have enough nodes to fill enclosures with machines that have the same connectivity and redundancy requirements.

Conclusion

While blades can be great solutions, they need to be implemented in the right environments for the right reasons.  It may indeed be the case that the savings in the operational costs of the employees who set up, manage, and maintain your servers far outweigh all of the points raised above, but it is important to factor all of these into your purchase decision.

As always, if you have any feedback or comments, please post below or feel free to shoot me an email.

-Eric

Categories: Cisco, Dell, HP, IBM, Network, Sun, Systems