The report examined a large population of commercial hard drives, 80GB and larger in capacity, used at Google. Not very, as Google found, and many
in the industry already knew. This is a fairly surprising result, which could indicate that data center or server designers have more freedom than
previously thought when setting operating temperatures for equipment that contains disk drives. MTBF figures are just like any other storage
performance statistic. And, oddly enough, their definition makes drives look more reliable than what you and I see. MTBF, therefore, says nothing
about how long any particular drive will last. For example, after the first scan error, they found a drive
was 39 times more likely to fail in the next 60 days than normal drives. Since failures are sometimes the result of a combination of components i.
On hard drives we found this: He is absolutely correct. Hard drive failure trends - Google PDF. Surprisingly, we found that temperature and
activity levels were much less correlated with drive failures than previously reported. Monday, 19 February. SMART will alert you to some
issues, but not most, so the industry should get cracking and come up with something more useful. We present data collected from detailed
observations of a large disk drive population in a production Internet services deployment. Back up regularly, and if you do get one of these errors,
get a new drive. As the figure here shows, failure rates do not increase when the average temperature increases. In the lower and middle
temperature ranges, higher temperatures are not associated with higher failure rates. I corrected the post. Despite this high correlation, we
conclude that models based on SMART parameters alone are unlikely to be useful for predicting individual drive failures. Hard disk test 'surprises'
Google. A wide variety of manufacturers and models were included in the
report, but a breakdown was not provided. So while your disk drive might crash without warning at any time, they did find that there are four
SMART parameters where errors are strongly correlated with drive failure. Despite the importance of disk drives, there is relatively little published work on the
failure patterns of disk drives, and the key factors that affect their lifetime. Hard drives less than three years old and used a lot are less likely
to fail than similarly aged hard drives that are used infrequently, according
to the report. Most available data are either based on extrapolation from accelerated aging experiments or from relatively modest sized field
studies. It was also thought that hard drives preferred cool temperatures to hotter environments. We might have an insight about the temperature
vs. Google found surprising results in five areas. Second, vendors look at their
returned unit data. The report said that there was a clear trend showing "that lower temperatures are associated with higher failure rates".
Drive age has an effect, but again, only at very high temperatures. Also, I fixed my
arithmetic, so the vendors look even worse. After the first year, the annualized failure rate (AFR) of high utilization drives is at most moderately higher than that of low
utilization drives. Workload numbers call into question the utility of architectures, like MAID, that rely on turning off disks to extend life. Is that
what Google found? A teenager might want you to believe that, but the Googlers found little correlation between disk workload and failure rates.
Disk MTBF numbers significantly understate failure rates. Vendors typically look at two types of data. Google
buys large quantities of a certain drive model, but only for a few months, until the next good deal comes along. Lower temperatures are associated
with higher failure rates. Good news for internet data center managers. Almost 4 years to the day after I posted this, an alert reader pointed out a
mistake in the AFR calculation above. So shake that new drive out while it is still under warranty.
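
For readers who want to check the MTBF-versus-AFR arithmetic themselves, here is a minimal Python sketch. It is not the calculation from the original post or from Google's report. It assumes a constant failure rate and a drive powered on around the clock, and the function names and input numbers are made up for illustration. Under those assumptions, a spec-sheet MTBF of 1,000,000 hours implies a nominal AFR of a bit under 1% per year, which you can then set against whatever failure rate you actually observe in the field.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 power-on hours, assuming continuous operation


def afr_from_mtbf(mtbf_hours: float) -> float:
    """Nominal AFR implied by an MTBF, assuming a constant failure rate."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def observed_afr(failures: int, drive_years: float) -> float:
    """Field AFR: failures per drive-year of service."""
    if drive_years <= 0:
        raise ValueError("drive_years must be positive")
    return failures / drive_years


# Hypothetical inputs, for illustration only (not figures from the report).
spec_afr = afr_from_mtbf(1_000_000)
field_afr = observed_afr(failures=30, drive_years=1_000)
print(f"spec AFR:  {spec_afr:.2%}")
print(f"field AFR: {field_afr:.2%}")
```

Either number describes a population, not a single drive, which is why MTBF says nothing about how long any particular drive will last.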