how do the various cloud storage services compare in terms of data durability, and which one offers the highest level of durability?
Cloud Storage Reliability: Availability vs Durability
Knowing the difference between Storage Availability and Durability can help inform your decision on the wide selection of cloud object storage providers.
Storage Reliability: Durability vs. Availability
The differences between storage durability and availability.
Storage is bound to fail. Whether or not you use enterprise-grade media (SSDs or HDDs), data integrity is not guaranteed. Many factors can cause storage to fail or become unavailable: bit-flips, RAID failure, even a fibre cut can disrupt your access to stored information. In web hosting and content delivery networks (CDNs with storage APIs, or cloud storage providers), the terms “storage durability” and “storage availability” are often used incorrectly. At best this causes no loss of data; at worst it means losing valuable, unrecoverable information.
The distinction between the two terms is simple but important: durability refers to how safe data is from being lost, while availability refers to how often you can access your stored data.
When a provider mentions “reliability” or “uptime,” they are often speaking of your guaranteed access. You may be promised, say, that your data is accessible 99.95% of the time during any given month; this is equal to around 43 seconds of downtime a day, or roughly 4.4 hours of downtime a year. Reliability can also refer to the durability of your stored objects, so it is crucial to have this clarified to avoid unintended consequences; otherwise a small fire, or a force majeure event (floods, earthquakes, and similar events that can damage drives), may wipe out a large portion of your data, or even all of it.
One idea that has not been mentioned yet is how storage durability can be improved. When storing important information, cloud platforms often employ “RAID” setups and backups. The most important, but often overlooked, feature is off-site replication.
A simple analogy: imagine that all of your data is stored on a single thumb drive. If there is a flood, the drive might be irrecoverably damaged and the data lost. If you had two mirrored USB drives, with one stored in another state or province, the second drive would still be intact, and your data with it. This can be improved further by adding local mirrors, or a “RAID” setup, at both sites. That way, if one site is lost, the other site still has a layer of redundancy.
Simply put: from data corruption, to complete data loss, data durability is a measure of how safe data is from being lost. If a provider advertises 99.99995% data durability, this means that there will be a 0.00005% chance of data being lost (this does not necessarily have to be a complete loss — data loss refers to a loss of 1 or more bits) during a given year.
The easiest category of “reliability” to understand is availability. While providers often strive to guarantee network stability and have mirrors to ensure uptime during maintenance, there are always factors that can defeat even the most reliable infrastructure.
If “host A” guarantees availability 99.99% of the time, while “host B” only guarantees availability 99% of the time, the difference in possible downtime is roughly 3.6 days a year.
(for easier calculations, you can refer to https://uptime.is)
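The downtime figures quoted in this article can be reproduced with a few lines of arithmetic. The sketch below is only an illustration; the function names are made up for this example, and a year is assumed to be 365 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_seconds_per_day(availability_percent):
    """Maximum downtime per day allowed by an availability guarantee."""
    return (1 - availability_percent / 100) * SECONDS_PER_DAY


def downtime_hours_per_year(availability_percent):
    """Maximum downtime per year allowed by an availability guarantee."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for pct in (99.0, 99.95, 99.99):
    per_day = downtime_seconds_per_day(pct)
    per_year = downtime_hours_per_year(pct)
    print(f"{pct}% -> {per_day:.1f} s/day, {per_year:.2f} h/year")
```

For 99.95% this gives about 43.2 seconds a day and about 4.38 hours a year. The gap between 99% and 99.99% works out to roughly 86.7 hours, or about 3.6 days, a year.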
Depending on the use-case, a single second of lost availability might cost hundreds — even millions of dollars in lost revenue. So, taking storage availability into account is significant and as important as durability.
When browsing for cloud providers, or even building your own server, data durability and availability are key factors in deciding on a suitable solution. Data loss or corruption can occur at any time, even with the biggest cloud providers. Data availability should not be overlooked either: not being able to access your objects can mean not being able to load product images during, for example, a Christmas sale, resulting in significant financial losses.
In essence, while they may seem the same, taking both data availability and data durability into account will ensure that no unexpected surprises occur.
Service availability comes in a few forms, from network availability to a server's guaranteed uptime. Together these are known as "availability." Some companies also offer guarantees, which are set out in an SLA (Service Level Agreement).
HDD refers to a spinning disk drive; a hard drive.
SSD, or Solid State Drive, refers to a form of flash (NAND) based drive.
Cloud Storage Durability vs. Availability: What Are the Differences?
Two key metrics for measuring cloud storage performance are durability and availability. We look at what they are and why they are important.
What’s the Diff: Durability vs Availability
June 20, 2019 by Roderick Bauer
When shopping for a cloud storage provider, customers should ask a few key questions of potential storage providers. In addition to inquiring about storage cost, data center location, and features and capabilities of the service, they’re going to want to know the numbers for two key metrics for measuring cloud storage performance: durability and availability.
We’ve discussed cloud storage costs and data center features in other posts. In this post we’re going to cover the basics of durability and availability.
Add Redundancy, Support Compliance, and More With Backblaze Cloud Replication
Backblaze Cloud Replication is a new service that allows customers to automatically store and sync data in different locations: across regions, across accounts, or in different buckets within the same account. There are three main reasons you might want to use Cloud Replication:
Data Redundancy: Replicating data for security, compliance, and continuity purposes.
Data Proximity: Bringing data closer to distant teams or customers for faster access.
Replication Between Environments: Replicating data between testing, staging, and production environments when developing applications.
What is Cloud Durability?
Think of durability as a measurement of how healthy and resilient your data is. You want your data to be as intact and pristine on the day you retrieve it as it was on the day you stored it.
There are a number of ways that data can lose its integrity.
1. Data loss
Data loss can happen through human accident, natural or manmade disaster, or even malicious action out of your control. Whether you store data in your home, office, or with a cloud provider, that data needs to be protected as much as possible from any event that could damage or destroy it. If your data is on a computer, external drive, or NAS in a home or office, you obviously want to keep the computing equipment away from water sources and other environmental hazards. You also have to consider the likelihood of fire, theft, and accidental deletion.
Data center managers go to great lengths to protect data under their care. That care starts with locating a facility in as safe a geographical location as possible, having secure facilities with controlled access, and monitoring and maintaining the storage infrastructure (chassis, drives, cables, power, cooling, etc.).
2. Data corruption
Data on traditional spinning hard drive systems can degrade with time, have errors introduced during copying, or become corrupted in any number of ways. File systems, operating systems, and utilities have ways to double check that data is handled correctly during common file and data handling operations, but corruption can sneak into a system if it isn’t monitored closely, or if the storage system doesn’t specifically check for such errors the way systems with ECC (Error Correcting Code) RAM commonly do. Object storage systems will commonly monitor for any changes in the data, and often will automatically repair or provide warnings when data has been changed.
How is Durability Measured?
Object storage providers express data durability as an annual percentage in nines, as in two nines before the decimal point and as many nines as warranted after the decimal point. For example, eleven nines of durability is expressed as 99.999999999%. What this means is that the storage vendor is promising that your data will remain intact while it is under their care without losing any more than 0.000000001 percent of your data in a year (in the case of eleven nines annual durability).
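To make the nines more concrete, here is a minimal sketch that turns a nines figure into an expected annual loss. It assumes the advertised figure can be read as an average annual loss rate per object; the function names are invented for this illustration and do not come from any vendor.

```python
def annual_loss_fraction(nines):
    """Fraction of data expected to be lost per year for a given number of nines.

    Eleven nines (99.999999999%) leaves 0.000000001%, which is 1e-11 as a fraction.
    """
    return 10 ** -nines


def expected_objects_lost_per_year(object_count, nines):
    """Average number of objects lost per year under the stated assumption."""
    return object_count * annual_loss_fraction(nines)


def years_per_lost_object(object_count, nines):
    """Average number of years between single-object losses."""
    return 1 / expected_objects_lost_per_year(object_count, nines)


print(expected_objects_lost_per_year(10_000_000, 11))
print(years_per_lost_object(10_000_000, 11))
```

Under that reading, someone storing ten million objects at eleven nines would expect to lose a single object about once every 10,000 years on average.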
Of the major vendors, Azure claims 12 nines and even 16 nines durability for some services, while Amazon S3, Google Cloud Platform, and others claim 11 nines, or 99.999999999% annual durability. Backblaze has calculated durability at 11 nines but, more importantly, we’re the only ones to disclose our math. We’ve also described why it doesn’t really matter.
How is Durability Maintained?
Generally, there are two ways to maintain data durability. The first approach is to use software algorithms and metadata such as checksums to detect corruption of the data. If corruption is found, the data can be healed using redundant information stored alongside it. Erasure codes, such as Reed-Solomon coding, are examples of this approach.
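The detect-then-heal idea can be sketched in a few lines. The example below is a deliberate simplification: it uses SHA-256 checksums to detect a corrupted block and a single XOR parity block to rebuild it, which is a much weaker scheme than Reed-Solomon and can only repair one bad block at a time. The block contents and function names are made up for illustration.

```python
import hashlib


def checksum(block):
    """Checksum stored as metadata and used later to detect corruption."""
    return hashlib.sha256(block).hexdigest()


def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def scrub_and_repair(stored, checksums, parity):
    """Detect blocks whose checksum changed and rebuild a single bad block."""
    bad = [i for i, block in enumerate(stored) if checksum(block) != checksums[i]]
    if len(bad) > 1:
        raise ValueError("single XOR parity cannot repair more than one block")
    repaired = list(stored)
    for i in bad:
        survivors = [block for j, block in enumerate(stored) if j != i]
        # XOR of the parity with all healthy blocks yields the missing block.
        repaired[i] = xor_blocks(survivors + [parity])
    return repaired


# Write path: keep the data blocks, their checksums, and one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
checksums = [checksum(block) for block in data]
parity = xor_blocks(data)

# Simulate silent corruption of one stored block, then scrub.
stored = list(data)
stored[1] = b"BxBB"
print(scrub_and_repair(stored, checksums, parity) == data)
```

Production systems use codes that survive several simultaneous failures, but the flow is the same: verify against stored checksums, then reconstruct from redundant information.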
Another tried and true method to ensure data integrity is to simply store multiple copies of the data in multiple locations. This is known as redundancy. This approach allows data to survive the loss or corruption of data in one or even multiple locations through accident, war, theft, or any manner of natural disaster or alien invasion. All that’s required is that at least one copy of the data remains intact. The odds for data survival increase with the number of copies stored, with multiple locations an important multiplying factor. If multiple copies (and locations) are lost, well, that means we’re all in a lot of trouble and perhaps there might be other things to think about than the data you have stored.
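The effect of extra copies can be put in rough numbers. The sketch below assumes each copy is lost independently with the same annual probability, which is only plausible when the copies sit in separate locations; the 1% figure is an arbitrary illustrative value, not a measured failure rate.

```python
def probability_all_copies_lost(per_copy_loss_probability, copies):
    """Chance that every copy is lost, assuming independent failures."""
    return per_copy_loss_probability ** copies


for copies in (1, 2, 3):
    p = probability_all_copies_lost(0.01, copies)
    print(f"{copies} copies -> {p:.6%} chance of losing everything")
```

The independence assumption is why location matters so much: two copies in the same building share the same flood or fire, so their failures are not independent and the real odds are far worse than the formula suggests.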
Data Availability vs. Durability – An Important Difference
Data Availability vs. Durability: these are two very different aspects of data accessibility, that serve different objectives. Here's what you should know.
Data Availability vs. Durability – An Important Difference You Should Know
November 9, 2016
Things can, and do, go wrong. It’s a fact of life, and businesses spend time and money preparing for unexpected hiccups. The data storage industry has spent decades enhancing the reliability of storage architectures by assuming their components will fail at some point. All elements in the chain—cables, power supplies, cooling, drives, software (and even the sys admins) —can possibly fail, without forewarning, and disrupt access to users’ data.
Reliability has often been equated with data accessibility, i.e. ensuring data access at a given SLA. A more contemporary perspective sees this differently and with two key, separate, measurements: availability and durability.
Let’s explore some of the differences and how they matter for your business.
Data Availability vs. Durability – They’re Not The Same Thing
Availability and durability are two very different aspects of data accessibility. Availability refers to system uptime, i.e. the storage system is operational and can deliver data upon request. Historically, this has been achieved through hardware redundancy so that if any component fails, access to data will prevail. Durability, on the other hand, refers to long-term data protection, i.e. the stored data does not suffer from bit rot, degradation or other corruption. Rather than focusing on hardware redundancy, it is concerned with data redundancy so that data is never lost or compromised.
Availability and durability serve different objectives. For data centers, availability/uptime is a key metric for operations as any minute of downtime is costly. The measurement focuses on storage system availability. But what happens when a component, system or even the data center goes down? Will your data be intact when the fault is corrected?
This illustrates the equal importance of data durability. When an availability fault is corrected, it is essential that access to uncorrupted data is restored. With the explosion of data created, the potential of mining, and growing needs for longer retention rates (for everything) you can imagine how this is paramount for business success.
Consider the potential competitive, financial or even legal impact of not being able to retrieve the archived master/reference copy of data. Hence, both data availability and data durability are essential for short- and long-term business success.
Ensuring Data Availability – RAID or Rateless Erasure Coding?
A common approach to ensuring data availability has been through RAID-based architectures. Striping data across multiple drives can protect against the failure of one or two drives, but performance can fall dramatically during rebuild operations, which can have negative impacts on business operations. Years of data center experience show that drive failures are usually not isolated incidents: when one drive in a RAID group fails, the likelihood of other group members failing increases. An Unrecoverable Read Error during a rebuild operation means data is now permanently lost, which places your business at risk.
As drive capacities have greatly increased, so too have rebuild times. What formerly took minutes can now take hours, or even days. In addition, this requires replacement of the failed drive ASAP, be it weekends, holidays or the middle of the night.
Object storage achieves data availability through advanced erasure coding whereby data is combined with parity information and then sharded and distributed across the storage pool. Since only a subset of the shards are needed to rehydrate the data, there is no rebuild time or degraded performance, and failed storage components can be replaced when convenient.
Data Durability – RAID Alone Doesn’t Deliver
As you have probably surmised, achieving data availability is not quite the same as having access to the data that was originally stored. A media failure such as bit rot, where a portion of the drive surface or other media becomes unreadable, corrupts data and makes it impossible to retrieve the data in its original, unaltered form. Simply protecting against a complete hard drive failure, as RAID does, does not protect against the gradual failure of the bits stored on magnetic media.
The combination of widely distributed erasure coded data (say with an 18/8 coding policy) and data scrubbing technology that continuously validates the data written on the media can enable you to achieve 15 nines of data durability. In simpler terms: for every 1,000 trillion objects, only one would not be readable. How’s that for data durability? It’s not surprising that hyperscale data centers and cloud services providers use object-based storage to meet the needs for the highest data availability and data durability.
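A rough way to see why spreading erasure-coded shards widely helps is to compute the chance that too many shards fail at once. The sketch below assumes that “18/8” means 18 shards of which any 8 can be lost, and that shards fail independently with the same probability. Both assumptions, and the 1% shard failure probability, are illustrative only, so the output should not be read as reproducing the 15 nines figure.

```python
from math import comb


def probability_unreadable(total_shards, tolerated_failures, shard_failure_probability):
    """Chance that more shards fail than the coding policy can tolerate.

    Sums the binomial probabilities of exactly k failures for every k
    above the tolerated number, assuming independent shard failures.
    """
    p = shard_failure_probability
    return sum(
        comb(total_shards, k) * p**k * (1 - p) ** (total_shards - k)
        for k in range(tolerated_failures + 1, total_shards + 1)
    )


# Hypothetical comparison: same shard count, different fault tolerance.
for tolerated in (2, 4, 8):
    print(tolerated, probability_unreadable(18, tolerated, 0.01))
```

Each additional tolerated failure shrinks the probability of loss by orders of magnitude. Continuous scrubbing keeps the per-shard failure probability low by repairing damaged shards before failures accumulate.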