Better Management of Data Storage
Description
Data storage is one of the fastest growing sectors in IT. In a 2012 Information Week survey of over 300 IT professionals,1 over half said their annual growth rate for storage was between 10% and 24%, and 17% put it at 25% to 49%. The concept of energy-efficient data storage is simple: use less storage to use less energy. This can be achieved by applying the data storage best practices available today through storage resource management tools. In addition, certain storage hardware devices use much less energy. These best practices and technologies are summarized below.2
Best Practices for Using Less Storage
- Automated Storage Provisioning. Automated storage provisioning: 1) improves storage efficiency through right-sizing; 2) identifies and re-allocates unused storage; and 3) increases capacity by improving utilization of existing storage.
- Data Compression. Data compression has long been used to minimize transmission traffic and reduce the amount of data stored. In fact, a recent survey revealed that over half of IT administrators used data compression.3 Be aware that some formats are already compressed (e.g., JPEG, MPEG, and MP3) and gain little from being compressed again, and that data should be compressed before encryption on writes and decrypted before decompression on reads. Compressing and decompressing data also carries its own energy overhead, so files that are rarely accessed are better candidates for compression than files that are regularly accessed. Savings from storage compression have been estimated at 15 to 30%.
- Deduplication. Over half of the total volume of a typical company’s data is in the form of redundant copies. At many organizations, deduplication software can reduce the amount of data stored by more than 95% by finding and eliminating unnecessary copies. Storing less data requires fewer hardware resources, which in turn consume less energy. Deduplication works by retaining unique files or data blocks and providing pointers to duplicates. The percentage of IT administrators using deduplication rose from 38% in 2011 to 45% in 2012.4 Savings from deduplication range from 40 to 50%.
- Snapshots. Formally known as “delta snapshots” and sometimes referred to as “cloning,” snapshots are a form of deduplication and are particularly important when running simulations or modeling on large data sets. Instead of using additional space for complete copies of live data, snapshots create temporary “copies” that include only the data that has changed. A recent survey revealed that 56% of IT administrators used storage-based snapshots.4 Compared to point-in-time copies, snapshots can save 80 to 95% in energy use.
- Thin Provisioning. In the past, servers were allocated storage based on anticipated requirements. Overprovisioning (i.e., “fat” provisioning) of storage would result because applications would suffer performance issues if these limits were then exceeded. Thin provisioning allocates storage on a just-enough, just-in-time basis by centrally controlling capacity and allocating space only as applications require it. Thus you can allocate space for an application whose data storage needs you expect to grow in the future, but power only the storage that is currently in use. A recent survey revealed that 28% of IT administrators used thin provisioning.4 A good example of thin provisioning is Gmail. Every Gmail account has a large amount of allocated capacity but, because most Gmail users use only a fraction of it, this "free space" is "shared" among all Gmail users. With thin provisioning, storage utilization, which typically averages around 30%, can reach over 80%. Savings from thin provisioning have been estimated at between 40 and 60 percent.
- RAID Level. RAID (redundant array of independent disks) is a storage technology that combines multiple disk drive components into a single logical unit. Different RAID levels are defined based on the level of redundancy and performance required. RAID 1 creates a duplicate copy of disk data but also doubles your storage and power consumption. For storage that is not mission critical, RAID 5 guards against a single disk drive failure in your RAID set by reconstructing the failed disk's information from information distributed across the remaining drives. Because it requires only one extra disk's worth of capacity for redundancy, RAID 5 saves energy, although it does sacrifice some reliability and performance. Going to an 11-disk RAID 5 from a 20-disk RAID 1 configuration (both provide ten disks of usable capacity) would save 45% of data storage energy use.
- Tiering Storage. Stored data is typically accessed less often as it ages and therefore does not have to be stored on high-performance drives. Storage tiering (also referred to as Information Lifecycle Management or Hierarchical Storage Management) increases efficiency and lowers costs by storing data according to the relative demand for that data. Using this method, you store low-priority, rarely accessed information on higher-latency equipment that uses less energy. Meanwhile, you store data that is, or could be expected to be, in immediate demand on low-latency storage equipment that consumes more energy. Generally, the larger a drive's capacity and the slower its operating speed, the more efficient its energy use. For example, you can use high-speed drives only where necessary, and use slower drives for applications that do not require instantaneous response. Use of automated tiering increased from 13% in 2011 to 20% in 2012,4 as more IT administrators see the value of relieving themselves of manual data storage tiering.
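To illustrate the Data Compression item above, the following minimal Python sketch compares how well two kinds of data compress. It uses the standard-library zlib module. The sample data is an assumption made for illustration: a repeated log line stands in for text-like files, and random bytes stand in for media formats that are already compressed.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    if not data:
        raise ValueError("data must be non-empty")
    return len(zlib.compress(data, level)) / len(data)


# Highly repetitive text, a hypothetical stand-in for logs or documents.
text_like = b"2012-02-01 INFO storage pool utilization nominal\n" * 2000

# Random bytes, a stand-in for data that is already compressed.
already_dense = os.urandom(len(text_like))

print(f"text-like data: {compression_ratio(text_like):.3f}")
print(f"dense data:     {compression_ratio(already_dense):.3f}")
```

The text-like sample shrinks to a small fraction of its size, while the dense sample does not shrink at all, so the energy spent compressing it is wasted.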
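The Deduplication item above describes retaining unique data blocks and providing pointers to duplicates. A minimal sketch of that idea, assuming fixed-size blocks identified by their SHA-256 digest, might look like the following. The function names and the 4 KiB block size are hypothetical choices, and commercial products typically use more elaborate chunking and indexing.

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and store each unique block once.

    Returns (store, pointers): store maps a SHA-256 digest to the block
    bytes, and pointers lists one digest per original block, in order.
    """
    store = {}
    pointers = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy
        pointers.append(digest)
    return store, pointers


def restore(store, pointers) -> bytes:
    """Rebuild the original data by following the pointers."""
    return b"".join(store[digest] for digest in pointers)


# Ten identical 4 KiB blocks plus one distinct block.
original = b"A" * 4096 * 10 + b"B" * 4096
store, pointers = deduplicate(original)
stored_bytes = sum(len(block) for block in store.values())
print(f"logical: {len(original)} bytes, stored: {stored_bytes} bytes")
```

Here eleven logical blocks are held as two stored blocks plus eleven pointers, and restore() recovers the original data exactly.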
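The 45% figure in the RAID Level item above can be reproduced with simple arithmetic. The sketch below assumes that every drive draws the same power, so energy use scales with drive count, and that both configurations provide ten drives of usable capacity. The function names are hypothetical.

```python
def raid1_drives(data_drives: int) -> int:
    """RAID 1 mirrors every data drive, doubling the drive count."""
    return 2 * data_drives


def raid5_drives(data_drives: int) -> int:
    """RAID 5 adds one drive's worth of parity to the set."""
    return data_drives + 1


def drive_savings(baseline_drives: int, new_drives: int) -> float:
    """Fractional reduction in drive count relative to the baseline."""
    if baseline_drives <= 0 or new_drives < 0:
        raise ValueError("drive counts must be positive")
    return (baseline_drives - new_drives) / baseline_drives


usable = 10  # drives of usable capacity in both configurations
savings = drive_savings(raid1_drives(usable), raid5_drives(usable))
print(f"{raid1_drives(usable)} drives -> {raid5_drives(usable)} drives: "
      f"{savings:.0%} fewer drives to power")
```

Nine of the twenty drives are eliminated, which is where the 45% comes from: (20 - 11) / 20 = 0.45.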
Using Storage Equipment That Uses Less Energy
- Lower Speed Drives. Higher spin speeds on high-performance hard disk drives (HDDs) (e.g., 15K rpm SAS drives) mean faster read/write speeds. All things being equal, power use is proportional to the cube of disk spin speed. To reduce the energy use of storage, the policy should be to look for the slowest drives available (e.g., 7.2K rpm SATA drives) that can accommodate the specific tasks at hand. Note that the performance degradation is still small from the user's perspective, and the data can still be accessed within fractions of a second.
- Massive Array of Idle Disks (MAID). MAID is more energy-efficient than older systems and is often a good solution for tier 3 storage (data accessed infrequently). MAID saves power by shutting down idle disks and powering them back up only when an application needs to access the data. A MAID system can have hundreds or even thousands of individual drives and can often replace tape libraries. The percentage of IT administrators using MAID rose from 9% in 2011 to 11% in 2012.4
- Solid State Drives (SSDs). With no spinning disks to power, energy-saving solid-state storage is increasingly becoming an option because of “read” speeds that are orders of magnitude faster than those of hard disks. SSDs are also becoming more popular: the percentage of IT administrators using SSDs on disk arrays and servers was up to 20% in 2012.4 However, be aware that:
- SSDs “write” more slowly than, or at the same speed as, hard disks, so they are well suited for low-write/high-read applications (the overwhelming percentage of applications) and may not be the best fit for high-“write” applications like email and databases.
- Solid-state storage is substantially more expensive per gigabyte than hard-disk drives. However, SSDs have the potential to be cost-effective for IT shops that were compelled to bring in large amounts of high-performance HDDs or short-stroke disks in order to squeeze out better performance with especially heavy I/O-intensive read workloads.
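As a rough illustration of the cube relationship stated in the Lower Speed Drives item above, the sketch below estimates the spin-related power of a slower drive relative to a faster one. It assumes all else is equal, as the text does, and the 15,000 rpm and 7,200 rpm values are illustrative inputs rather than measurements of any particular drive.

```python
def relative_power(slow_rpm: float, fast_rpm: float) -> float:
    """Power of the slower drive as a fraction of the faster drive's power,
    assuming power is proportional to the cube of spin speed."""
    if slow_rpm <= 0 or fast_rpm <= 0:
        raise ValueError("spin speeds must be positive")
    return (slow_rpm / fast_rpm) ** 3


fraction = relative_power(7200, 15000)  # illustrative spin speeds
print(f"relative power: {fraction:.1%}")
```

Under this assumption a 7,200 rpm drive needs roughly 11% of the spin power of a 15,000 rpm drive. Real drives also draw power for electronics and head movement, so actual savings will be smaller.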
1 “State of Storage 2012”, Information Week, February 2012, by Kurt Marko, Figure 2.
2 Two Storage Networking Industry Association (SNIA) sources were heavily relied upon in the development of this section: 1) “Technologies for Green Storage,” SNIA presentation, 2012, Alan G. Yoder, NetApp; 2) “Best Practices for Energy Efficient Storage Operations Version 1.0”, Tom Clark, Brocade and Alan G. Yoder, NetApp, October 2008. For more information, go to snia.org/emerald.
3 “State of Storage 2012”, Information Week, February 2012, by Kurt Marko, Figure 7.