As we look at the many ways to improve storage utilization, data deduplication often comes up as a candidate technique. Data deduplication, sometimes referred to as “intelligent compression” or “single-instance storage”, is a method of reducing storage needs by eliminating redundant data. Deduplication is quite similar to data compression, but it looks for repeating sequences of very large chunks of data across very large comparison windows. Long sequences are compared against the history of sequences already seen, and where a match is found, only one unique instance of the data sequence is actually retained on the storage media. Each redundant copy is replaced with a pointer to that first unique instance.
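The mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the `DedupStore` class, its fixed-size chunking, and the use of SHA-256 digests as the "pointers" are all hypothetical choices made for the example. Real systems often use variable-size chunking and persistent indexes.

```python
import hashlib


class DedupStore:
    """Toy chunk-level deduplicating store (hypothetical, for illustration)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes; one unique copy per chunk
        self.files = {}   # file name -> ordered list of digests ("pointers")

    def write(self, name, data):
        """Split data into fixed-size chunks and keep only unseen chunks."""
        pointers = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if this digest has not been seen before.
            self.chunks.setdefault(digest, chunk)
            pointers.append(digest)
        self.files[name] = pointers

    def read(self, name):
        """Rebuild the whole file by following its pointers."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def physical_bytes(self):
        """Bytes actually held for chunk data (index overhead ignored)."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Writing the same data under two names adds a second pointer list but no new chunk data, and `read` still returns the complete file, which is why the deduplication stays invisible to users and applications.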
For example, a typical email system might contain 300 instances of the same two-megabyte (2 MB) file attachment. If the email platform is backed up or archived, all 300 instances are saved, requiring 600 MB of storage space. With data deduplication, only one instance of the attachment is actually stored; each subsequent instance is just a reference back to the one saved copy. In this example, a 600 MB storage demand could be reduced to just 2 MB, plus a small amount of space for the references. Imagine the economic benefits! Of course, in a storage system this is all hidden from users and applications, so the whole file remains readable after it has been written.
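The arithmetic of the email example can be checked with a short calculation. This is a minimal sketch: the helper name `dedup_savings` is made up for illustration, and it assumes file-level deduplication of identical copies while ignoring the (small) space taken by the references themselves.

```python
def dedup_savings(copies, size_mb):
    """Return (logical MB, physical MB, ratio, fraction saved).

    Assumes every copy is byte-identical and that pointer overhead
    is negligible, as in the email attachment example.
    """
    logical_mb = copies * size_mb   # space needed without deduplication
    physical_mb = size_mb           # a single stored instance
    ratio = logical_mb / physical_mb
    saved_fraction = 1 - physical_mb / logical_mb
    return logical_mb, physical_mb, ratio, saved_fraction


logical, physical, ratio, saved = dedup_savings(copies=300, size_mb=2)
print(f"{logical} MB -> {physical} MB ({ratio:.0f}:1, {saved:.1%} saved)")
# prints "600 MB -> 2 MB (300:1, 99.7% saved)"
```

Ratios this high depend on how much of the data really is duplicated; a data set with few repeated files or chunks would see far smaller savings.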
Posts Tagged ‘Storage’
Understanding Data Deduplication
Thursday, November 12th, 2009
Virtualization – why you cannot ignore Storage?
Friday, October 30th, 2009
There are many ways to optimize the data center. But as with all good things, it is necessary to take each step in moderation. Consolidation and virtualization offer one way to achieve higher utilization of IT assets. But shrinking the IT assets by more than 30% overnight will create sudden over-capacity at the physical facility layer. The way to achieve sustainable overall efficiency in the data center is to find the right balance between optimizing the physical facility and IT consolidation and virtualization.
Virtualization technology coupled with the right management tool is a powerful way to reduce the total number of IT assets: consolidating multiple workloads onto each asset maximizes its utilization and reduces the complexity of managing disparate server hardware and OS platforms. Nonetheless, server virtualization is only one component of a truly virtualized enterprise infrastructure. The other critical component is storage virtualization. The two work in tandem: one addresses the compute side of the equation, while the other addresses the data side.
RESTful APIs for Cloud
Monday, October 12th, 2009
Over the past month, I have put together some thoughts on the components that should make up a Cloud Computing service, from a service producer’s perspective, in what I termed the Big Picture. That posting can be found here, and a more extended description is posted on my web page here.
This week, I’d like to shift my focus to some of the ingredients that make up the Cloud service. Let’s start peeling this onion.