Decommissioning of Unused Servers

Description

  • Surveys of data centers often find aged servers that are still running but serve no purpose, so-called "comatose" servers. Estimates of the prevalence of comatose servers vary:
    • Surveys have found that 8 to 10% of running servers have no use: 150 of 1,800 in one study and 354 of 3,500 in another.16
    • According to Kenneth Brill, executive director of the Uptime Institute, "unless you have a rigorous program of removing obsolete servers at the end of their lifecycle...[it] is very likely that between 15% and 30% of the equipment running in your data center is comatose. It consumes electricity without doing any computing."17
    • A recent New York Times article described a large data center near Atlanta that found more than half of its servers were comatose.18
  • In addition to the "one workload, one box" approach, which has led to server sprawl and low server utilization, a Green Grid study points to the following reasons for the "comatose" server phenomenon:19
    • Depleted IT staffs are busy maintaining/upgrading existing devices and deploying new products. They do not have time to address unused servers.
    • Traditional monitoring tools focus on server availability and performance and can overlook unused servers as a result.
    • Worried about service level agreements, IT staff may hesitate to turn off a server.
    • IT staff may not be diligent about tracking virtual machines. For example, staff may spin up a virtual server for a specific test and leave it running after the test is complete.
    • IT staff may like to have spare servers available and running "just in case" there is a spike in user demand, a need to migrate back to the old server, etc.
  • In the same Green Grid study, a survey found that one-third of IT managers had never tried to identify their unused servers; the remaining two-thirds assume a server is unused if:
    • No complaints are received following a server's unplanned outage (1%),
    • No complaints are received after switching a server off (9%),
    • Very low utilization is reported through automated monitoring tools (22%), or
    • Application owners, polled periodically, indicate an application is no longer being used (35%).

Savings and Costs

  • Decommissioning allows you to retire servers and/or defer purchases of new servers, thus decreasing electricity consumption and waste heat. One watt-hour of energy savings at the server level results in roughly 1.9 watt-hours of facility-level energy savings from reducing energy waste in the power infrastructure (power distribution unit, UPS, building transformers) and reducing energy needed to cool the waste heat produced by the server.20
  • According to the Uptime Institute, decommissioning a single 1U rack server can annually save $500 in energy, $500 in operating system licenses, and $1,500 in hardware maintenance costs.21
  • Sun Microsystems cut 8 to 10% of IT equipment load and 11 to 14% of total load simply by decommissioning unused servers.22
  • AOL, winner of the Uptime Institute's first annual "Server Roundup", decommissioned 9,484 servers for total savings of close to $5 million. The effort included eliminating inefficient or abandoned servers, shutting down or merging extraneous applications, and increasing utilization in a private cloud data center.23 The decommissioning saved $1.4 million of AOL's $13-million annual electric bill, $2.2 million in licensing costs, and $62,400 in maintenance costs, and it generated $1.2 million in revenue from recycling, scrapping, and reselling old equipment.
  • "Comatose" virtual servers can be costly as well. One organization found that 42 virtual servers that had been offline for 90 days had cost it $50,000 in disk and licensing costs.24
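
The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustrative sketch only: the function and variable names are hypothetical, the 300 W server draw and $0.10/kWh electricity price in the usage note are assumed values rather than figures from the cited sources, and the 1.9 multiplier is read here as the total facility-level saving per watt-hour saved at the server.

```python
# Illustrative sketch; names and example inputs are assumptions, not from the cited sources.

# Facility-level Wh saved per Wh saved at the server (footnote 20).
# Assumption: 1.9 is the total saving, including the server's own 1 Wh.
FACILITY_MULTIPLIER = 1.9
HOURS_PER_YEAR = 8760  # 24 hours x 365 days


def facility_kwh_saved_per_year(server_watts: float) -> float:
    """Annual facility-level energy (kWh) saved by removing a server
    that draws `server_watts` continuously."""
    server_kwh = server_watts * HOURS_PER_YEAR / 1000.0
    return server_kwh * FACILITY_MULTIPLIER


def annual_energy_cost_saved(server_watts: float, price_per_kwh: float) -> float:
    """Annual electricity cost avoided at the facility level."""
    return facility_kwh_saved_per_year(server_watts) * price_per_kwh


# AOL "Server Roundup" components as reported above (US dollars).
aol_savings = {
    "electricity": 1_400_000,
    "licensing": 2_200_000,
    "maintenance": 62_400,
    "resale_and_recycling_revenue": 1_200_000,
}
aol_total = sum(aol_savings.values())  # 4,862,400, i.e. "close to $5 million"
```

For example, under the assumed inputs, annual_energy_cost_saved(300, 0.10) gives the yearly electricity cost avoided by retiring one hypothetical 300 W server. Actual savings depend on the server's real power draw, the local electricity price, and the facility's own infrastructure overhead.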

Considerations 25

  • Seek and obtain upper-management support with a total cost of ownership message. This will help push IT staff to adjust workloads to accommodate a decommissioning effort.
  • Take inventory. It is easy to lose track of servers, for reasons discussed above. Take inventory of all systems and servers, and identify the equipment types, location, and Service Level Agreements (SLAs).
  • Identify unused servers.
    • Define secondary and tertiary work common across servers in the data center so that baseline utilization can be defined and unused servers identified. For example, if all the inbound network activity to a server is from the backup server, domain controller, and antivirus server, then the server may be unused.
    • Data Center Infrastructure Management (DCIM)26 can be used to identify unused servers by examining CPU utilization of each server.27
    • At Sun Microsystems, systems were turned off but then kept in place for 90 days. If there were no complaints after crossing into the next fiscal quarter, the unused systems were removed.28
    • Examine the possibility of sub-optimal virtualization that leads to unused virtual servers.
  • Mitigate risks. Emphasize that virtualization, with the ability to deploy a virtual server in minutes, can offset the risk of mistakenly decommissioning a server.
  • Develop a lifecycle plan. Develop a clear set of guidelines regarding what to do with a decommissioned server, such as recycling or repurposing.
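
The identification steps above (baseline network activity, CPU utilization, and a holding period before removal) can be combined into a simple screening rule. The following Python is a minimal sketch under stated assumptions, not a DCIM product's API: the host names, the ServerObservation record, the 5% CPU threshold, and the function names are all hypothetical. Only the 90-day holding period comes from the Sun Microsystems example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Set

# Hypothetical names for hosts whose traffic counts only as baseline
# (secondary and tertiary) work: backup, domain control, antivirus.
INFRASTRUCTURE_SOURCES = {"backup-server", "domain-controller", "antivirus-server"}


@dataclass
class ServerObservation:
    name: str
    inbound_sources: Set[str]  # hosts that sent traffic during the observation window
    avg_cpu_percent: float     # average CPU utilization over the same window


def is_candidate_unused(obs: ServerObservation, cpu_threshold: float = 5.0) -> bool:
    """Flag a server for review when all inbound traffic comes from
    infrastructure hosts (or there is none) and CPU utilization is low.
    The 5% threshold is an assumed value, not taken from the sources."""
    only_infrastructure = obs.inbound_sources <= INFRASTRUCTURE_SOURCES
    return only_infrastructure and obs.avg_cpu_percent < cpu_threshold


def removal_eligible_date(powered_off: date, hold_days: int = 90) -> date:
    """Earliest date to physically remove a powered-off server, using the
    90-day holding period from the Sun Microsystems example."""
    return powered_off + timedelta(days=hold_days)
```

A flagged server is only a candidate: confirming with application owners and checking SLAs, as described above, remains necessary before anything is switched off.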

16 Finding the Green in Green Computing, by Mark A. Monroe, Director, Sustainable Computing, Sun Microsystems, Inc. Sep, 2009. Slide 34.
17 Kill Comatose Computers, Forbes, by Ken Brill, 11/19/2008. 
18 Power, Pollution and the Internet, New York Times, by James Glanz, 9/22/2012. 
19 Unused Servers Survey Results Analysis, The Green Grid, 2010. 
20 New Strategies for Cutting Data Center Energy Cost and Boosting Capacity, Emerson Network Power presentation, 2012, p.8. 
21 Important to Recognize the Dramatic Improvement in Data Center Efficiency, Uptime Institute, 9/25/12. 
22 Energy Logic: Reducing Data Center Energy Consumption by Creating Savings that Cascade Across Systems, Emerson Network Power, 2008. 
23 AOL's Strategic Server Decommissioning Sets Out to "Clear the Cruft", Uptime Institute Blog, 7/30/2012. 
24 Virtual Sprawl: A Symptom of a Hidden Ailment, Embotics weblog on CIO.com, March 31, 2010. 
25 This section was developed mainly from Unused Servers Survey Results Analysis, The Green Grid, 2010.
26 DCIM is the integration of information technology (IT) and facility management disciplines to centralize monitoring, management and intelligent capacity planning of a data center's critical systems. Essentially it provides a significantly more comprehensive view of ALL of the resources within the data center.
27 Unlock Your Capacity by Unplugging Your Ghost Servers, Industry Perspectives, 10/11/2012.
28 Energy Logic: Reducing Data Center Energy Consumption by Creating Savings that Cascade Across Systems, Emerson Network Power, 2008.