pchelplinks.com

Data lifecycle management: hard drives are not enough

Posted by: software on: 30 Aug, 2009

Today, no catchall application can satisfy every compliance question and requirement; however, fixed-disk compliance solutions can go a long way toward meeting the initial requirements of the compliance problem.

There are a number of pieces to the compliance puzzle: identifying what data needs to be saved, how fast that data is growing, how fast the data needs to be accessed, how long it should be retained, what federal, state, and local regulations govern the data in question, and whether the data should be disposed of once it has reached its end of life.

Media life: Most of the fixed-disk compliance solutions use low-cost ATA disks. The usable life of these disks is in the range of 3-5 years. All of these solutions use RAID technology to protect the data from failing disks; however, this is a cost that can be mitigated with longer-term removable media. Limiting the number of times data must be migrated reduces the chance of data loss or corruption, as well as the cost of managing the data over time. The use of long-term media limits the number of times data must be migrated within the same media type throughout its life cycle.
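As a rough illustration of the migration point, the number of same-media migrations can be estimated from the usable life of the media. The sketch below assumes one migration each time the media reaches end of life. The function name, the 15-year retention period, and the 30-year media life are hypothetical figures chosen for illustration; only the 3-5 year disk life comes from the text above.

```python
import math


def migrations_needed(retention_years: float, media_life_years: float) -> int:
    """Estimate how many times data must be moved to fresh media of the same type.

    Assumes one migration each time the media reaches the end of its usable
    life, and none once the retention period has ended.
    """
    if retention_years <= 0 or media_life_years <= 0:
        raise ValueError("years must be positive")
    # Media generations needed to cover the retention period, minus the
    # first generation the data was originally written to.
    return max(0, math.ceil(retention_years / media_life_years) - 1)


# Hypothetical 15-year retention requirement; 30 years is an assumed
# life for a long-term removable medium, not a figure from the article.
for life in (3, 5, 30):
    print(f"media life {life} years: {migrations_needed(15, life)} migrations")
```

Under these assumptions, shorter-lived media multiplies the number of migrations, and each migration is another opportunity for loss or corruption.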

Backup: All hard drives fail more often than we would like. That is the reason for redundancy. However, redundancy is not enough to protect critical data on magnetic disk. As a result, fixed-disk solutions should be backed up. Whether the system is backed up to disk or tape is not really an issue; however, there is a cost and management issue regarding backing up data. Backups by definition create a second set of data. This presents a compliance issue: if data is disposed of on the compliance CAS system, how is the data on the backup set disposed of? Most compliance applications do not have a means of dealing with backup sets. Either the administrator or the application must dispose of the data on the backup set. Backups are not archives and should not be treated as such. (I will leave this theme for another article.)
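The disposal problem with backup sets can be made concrete with a small bookkeeping sketch. It assumes a catalog that records every location holding an object; the class, the function names, and the backup set labels are all hypothetical and do not correspond to any particular compliance product.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivedObject:
    """Hypothetical catalog entry recording every place an object is stored."""
    object_id: str
    on_primary: bool = True
    backup_sets: set = field(default_factory=set)

    def fully_disposed(self) -> bool:
        # Disposal is complete only when no copy remains anywhere.
        return not self.on_primary and not self.backup_sets


def dispose_on_primary(obj: ArchivedObject) -> list:
    """Remove the primary copy and return the backup sets still holding the object."""
    obj.on_primary = False
    return sorted(obj.backup_sets)


def purge_from_backup(obj: ArchivedObject, backup_set: str) -> None:
    """Record that the administrator or application purged one backup copy."""
    obj.backup_sets.discard(backup_set)


record = ArchivedObject("invoice-001", backup_sets={"weekly-01", "monthly-03"})
pending = dispose_on_primary(record)
print("still on backup sets:", pending, "| disposed:", record.fully_disposed())
for name in pending:
    purge_from_backup(record, name)
print("disposed:", record.fully_disposed())
```

The point of the sketch is that disposing of the primary copy alone leaves the object recoverable until every backup set has been dealt with.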

For compliance data that is still in a changing state, or for compliance data that needs to be accessed often, fixed disk makes a lot of sense. However, data migration should not stop there. Fixed disk does have its limitations with long-term data. Long-term data is data that needs to be accessible for more than three years, yet has settled into its final version and is unlikely to change further. There are a number of reasons to support a tiered architecture beyond fixed-disk solutions.

Volume: As stated above, fixed-disk solutions must have redundancy to protect the data from disk failure. This is not new to primary storage; however, it is a requirement for fixed-disk solutions in a critical environment. As a result of redundancy, each object must be stored in multiple locations, at least partially. This redundancy drives up the volume of stored objects and, therefore, increases the cost of fixed-disk Content Addressable Storage (CAS).

All vendors marketing disk-based storage describe their solutions in terms of “raw” storage and “usable” storage. Raw storage is the total amount of storage a user must buy as part of the system. Usable storage is the amount of storage that the customer can actually use after redundancy and overhead. Cost per megabyte should be determined in terms of usable storage to get an accurate cost of ownership.
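The difference between raw and usable cost per megabyte is simple arithmetic, sketched below. The function names, the price, the raw capacity, and the 50% redundancy fraction (as with full mirroring) are hypothetical values for illustration, not figures from any vendor.

```python
def usable_capacity_mb(raw_mb: float, redundancy_fraction: float,
                       overhead_fraction: float = 0.0) -> float:
    """Capacity left for customer data after redundancy and overhead.

    redundancy_fraction is the share of raw capacity consumed by redundant
    copies or parity; overhead_fraction is the share of what remains that
    is consumed by system overhead.
    """
    if not (0 <= redundancy_fraction < 1 and 0 <= overhead_fraction < 1):
        raise ValueError("fractions must be in [0, 1)")
    return raw_mb * (1 - redundancy_fraction) * (1 - overhead_fraction)


def cost_per_usable_mb(price: float, raw_mb: float, redundancy_fraction: float,
                       overhead_fraction: float = 0.0) -> float:
    """Price divided by usable, not raw, capacity."""
    return price / usable_capacity_mb(raw_mb, redundancy_fraction, overhead_fraction)


# Hypothetical system: 1,000,000 MB raw for 10,000 currency units, mirrored.
price, raw_mb = 10_000, 1_000_000
print("per raw MB:   ", price / raw_mb)
print("per usable MB:", cost_per_usable_mb(price, raw_mb, redundancy_fraction=0.5))
```

With these assumed numbers the cost per usable megabyte is twice the cost per raw megabyte, which is why quoting raw capacity understates the cost of ownership.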

Overhead: In order to meet write-once requirements for magnetic disk solutions, the data must go through an application programming interface (API) that forms a data protection layer of non-erasability while the data is under management in the compliance system. The tasks of this API can vary greatly; however, one of those tasks is to assign a retention period and lock the data in a non-alterable state for the life of the data based on specific policies. The API can also perform a number of other tasks which are specific to the vendor that produced it, such as reducing redundant objects, encrypting data for security protection, and so on.

This API also takes up disk space in the system, as much as 25% of the usable storage. It can also cause a significant performance hit to the system. Our experience is that data transfer rates for fixed-disk devices are about the same as removable optical transfer rates (2-4 MB/sec), depending upon object size and volume. So, there is no performance advantage to using disk-based solutions over removable storage solutions.
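The retention lock described above can be pictured with a minimal in-memory sketch. It is not any vendor's API: the class, the method names, and the exception are hypothetical, and a real compliance layer would also handle persistence, auditing, and policy management.

```python
from datetime import date, timedelta


class RetentionError(Exception):
    """Raised when an operation would violate write-once or retention rules."""


class RetentionStore:
    """Hypothetical write-once store that locks objects until retention expires."""

    def __init__(self):
        self._objects = {}  # object_id -> (data, retain_until)

    def put(self, object_id: str, data: bytes, retention_days: int, today: date) -> None:
        # Write-once: an existing object can never be overwritten.
        if object_id in self._objects:
            raise RetentionError(f"{object_id} already exists and cannot be altered")
        retain_until = today + timedelta(days=retention_days)
        self._objects[object_id] = (bytes(data), retain_until)

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id][0]

    def delete(self, object_id: str, today: date) -> None:
        # Disposal is refused while the retention period is still running.
        _, retain_until = self._objects[object_id]
        if today < retain_until:
            raise RetentionError(f"{object_id} is locked until {retain_until}")
        del self._objects[object_id]


store = RetentionStore()
store.put("memo-42", b"quarterly figures", retention_days=365, today=date(2009, 8, 30))
try:
    store.delete("memo-42", today=date(2010, 1, 1))
except RetentionError as err:
    print("refused:", err)
store.delete("memo-42", today=date(2010, 8, 30))
print("deleted after retention expired")
```

Every read and write passes through this extra layer, which is one way to see where the storage overhead and the performance cost come from.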

Data growth: Compliance data is growing at an alarming rate. Pick your analyst to get a number; however, all agree archives are growing by more than 100%. This volume growth generally is measured in terabytes or blades of storage. On a fixed-disk compliance system, the only way to add more volume is to add more disks. Blades take up rack space, and rack space takes up room. This will be problematic in the coming months and years as data centers outgrow the fixed-disk storage systems containing CAS information.

The solution will be to either buy more disks in the form of blades, increase the size of the disks when possible with upgrades, or move the data to a next-tier solution that is removable. In any case, blades cost money. The cost then tracks a linear growth curve relative to your future storage needs.
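To see how blade count follows data volume, here is a small projection sketch. It assumes the growth figure is annual (the text does not state a period), and the starting volume, the usable capacity per blade, and the function name are hypothetical.

```python
import math


def blades_required(start_tb: float, annual_growth: float,
                    usable_tb_per_blade: float, years: int) -> list:
    """Project archive volume and the blade count needed to hold it.

    annual_growth is a fraction, so 1.0 means the archive doubles each year.
    Returns a list of (year, volume_tb, blades) tuples.
    """
    projection = []
    volume = start_tb
    for year in range(years + 1):
        blades = math.ceil(volume / usable_tb_per_blade)
        projection.append((year, volume, blades))
        volume *= 1 + annual_growth
    return projection


# Hypothetical: 10 TB today, 100% assumed annual growth, 5 TB usable per blade.
for year, volume, blades in blades_required(10, 1.0, 5, years=3):
    print(f"year {year}: {volume:g} TB -> {blades} blades")
```

Because each blade adds a fixed amount of usable capacity, the number of blades, and with it the spend and the rack space, rises in step with the stored volume.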
