Checking the health of your SSD from GNU/Linux


As you already know, SSDs degrade over time, as their memory cells support a limited number of writes (a sacrifice that many people, including myself, find worthwhile in exchange for getting rid of those slow and noisy mechanical disks).

This morning, while talking to a colleague, an interesting question was raised: how do we know when our SSD is nearing the end of its life?

Expensive, enterprise-grade SSD cards, like the ones from TMS, come with monitoring tools and a nice entry in “/proc” that can be easily checked by a script. But I had no idea whether there was something similar for consumer SSDs.

It turned out to be quite easy. Most models provide health info via the S.M.A.R.T. feature, so you can obtain it with smartctl (on Fedora, this utility comes in the smartmontools package):

[slopez@slp-work ~]$ sudo smartctl -a /dev/sda
(...)
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   099   099   ---    Old_age   Always       -       23
 12 Power_Cycle_Count       0x0032   099   099   ---    Old_age   Always       -       34
177 Wear_Leveling_Count     0x0013   099   099   ---    Pre-fail  Always       -       1
178 Used_Rsvd_Blk_Cnt_Chip  0x0013   077   077   ---    Pre-fail  Always       -       458
190 Airflow_Temperature_Cel 0x0022   067   051   ---    Old_age   Always       -       33
235 Unknown_Attribute       0x0012   099   099   ---    Old_age   Always       -       10
(...)

The most relevant attribute when checking the health of an SSD is Wear_Leveling_Count (more info on wear leveling on Wikipedia). But don’t be fooled by its name: the number in the VALUE column is not really a count, but an indicator of how healthy the cells in your disk are, where 100 is the best and 0 the worst.

When this value falls below 20, you should start considering backing up your data and buying a new disk.
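
If you would rather have a script do the checking, the idea can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: it expects the attribute table in the same layout as the output above, it looks for the attribute by the name Wear_Leveling_Count (other vendors may report wear under a different name), and the function names wear_level, needs_replacement and check_device are made up for this example.

```python
import subprocess

WEAR_ATTRIBUTE = "Wear_Leveling_Count"  # name as printed by smartctl on this disk
REPLACE_BELOW = 20                      # threshold suggested in the text above


def wear_level(smartctl_output, attribute=WEAR_ATTRIBUTE):
    """Return the normalized VALUE of `attribute`, or None if it is not listed."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows look like: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH ...
        if len(fields) >= 4 and fields[0].isdigit() and fields[1] == attribute:
            return int(fields[3])  # "099" -> 99
    return None


def needs_replacement(value, threshold=REPLACE_BELOW):
    """True when a known wear value has dropped below the threshold."""
    return value is not None and value < threshold


def check_device(device="/dev/sda"):
    """Run `smartctl -a` on `device` (usually requires root) and return the wear value."""
    # smartctl's exit status is a bit mask, so a non-zero status is not
    # treated as a failure here; we only look at the printed table.
    result = subprocess.run(
        ["smartctl", "-a", device], capture_output=True, text=True
    )
    return wear_level(result.stdout)
```

Running needs_replacement(check_device("/dev/sda")) as root, for example from a cron job, would tell you whether the disk has dropped below the threshold. A result of None from check_device means the attribute was not found, which most likely means your model reports wear under another name.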