26th February 2015
A few days ago smartd (from the smartmontools package) notified me about a +1 increase in the Current_Pending_Sector (197) and Offline_Uncorrectable (198) SMART attributes. The 2.5″ Fujitsu laptop hard drive they appeared on is very old, and it has also been running 24/7 for a little over a year.
Running a short SMART self-test (sudo smartctl -t short /dev/sdc) produced a read error at sector 1289:
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
1    Short offline     Completed: read failure  80%        22339            1289
Looking at the partition table of /dev/sdc, we see that this sector is outside of the only RAID partition on the disk, which starts at sector 2048:
Device     Boot  Start  End        Blocks    Id  System
/dev/sdc1        2048   117209087  58605088  fd  Lnx RAID auto
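The comparison above is simple enough to script. This is only a minimal sketch: the function name sector_in_range is made up for illustration, and the start and end sectors are assumed to be copied by hand from the partition table.

```shell
# Hypothetical helper: report whether LBA $1 lies inside the
# sector range $2..$3 (both ends inclusive).
sector_in_range() {
    lba=$1; start=$2; end=$3
    if [ "$lba" -ge "$start" ] && [ "$lba" -le "$end" ]; then
        echo "inside"
    else
        echo "outside"
    fi
}

# Values taken from the self-test log and the partition table above.
sector_in_range 1289 2048 117209087   # prints "outside"
```

If the failing sector were inside the partition, blindly overwriting it would damage data that is in use, so this check is worth doing before any write.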
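To make sure that sector 1289 is re-allocated, some data needs to be written to it. The simplest way is to overwrite it with zeros, which is fine here because the sector lies outside the partition:
sudo dd if=/dev/zero of=/dev/sdc count=1 seek=1289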
You may also try to read the sector first, then – if successful – write it back to the disk:
i=1289 ; sudo dd if=/dev/sdc of=/tmp/sector count=1 skip=$i && sleep 1 && sudo dd if=/tmp/sector of=/dev/sdc count=1 seek=$i
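After the write it can be useful to confirm that the sector now reads without errors. The following is a rough sketch, not a tested procedure: the function name sector_readable is hypothetical, and 512-byte sectors are assumed, matching the dd defaults used above.

```shell
# Hypothetical helper: succeed (exit status 0) if one 512-byte sector
# at LBA $2 can be read from device or file $1, fail otherwise.
sector_readable() {
    dd if="$1" of=/dev/null bs=512 count=1 skip="$2" 2>/dev/null
}

# Usage (needs root for a real disk):
#   sudo sh -c '. ./sector_readable.sh; sector_readable /dev/sdc 1289 && echo OK'
# Note: a plain read may be served from the page cache; with GNU dd,
# adding iflag=direct to the dd call asks for an uncached read.
```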
Another solution (untested!) would be to read/write a bunch of sectors around the problematic one (this is similar to what badblocks -n does):
i=1280 # example start value (an assumption); choose a range around the bad sector
while [ $i -lt 1300 ]
do echo $i
# read once (count=1) 512 bytes (default ibs/obs values of dd) to a temporary file, skipping first $i ibs-sized blocks (skip=$i);
# if successful, then (wait a bit and) write the same data back to disk, skipping $i obs-sized blocks (seek=$i)
dd if=/dev/sdc of=/tmp/sector count=1 skip=$i && sleep 1 && dd if=/tmp/sector of=/dev/sdc count=1 seek=$i
i=$((i+1)) # move on to the next sector, whether or not this one could be read
done
After sector re-allocation, both the Reallocated_Sector_Ct (5) and Reallocated_Event_Count (196) SMART attributes increased from 0 to 1, while Current_Pending_Sector (197) decreased from 1 to 0. In addition, running badblocks /dev/sdc and diskscan --output diskscan-sdc-out-25-02-2015.json /dev/sdc (both in read-only mode, of course) did not show any read errors, and another short SMART self-test also finished successfully. So, is the problem solved?
Unfortunately, Offline_Uncorrectable (198) stayed at 1, and I kept getting warning emails. Apparently, my HDD simply does not decrease the Offline_Uncorrectable (198) attribute after sector re-allocation.
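To keep an eye on the four attributes mentioned above, the output of smartctl -A can be filtered. This is a small sketch under assumptions: the function name show_sector_attrs is made up, and the attribute table is assumed to have the usual ten columns with the raw value in the last one.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print
# "ID NAME RAW_VALUE" for attributes 5, 196, 197 and 198.
# Assumes the raw value is in column 10 of the attribute table.
show_sector_attrs() {
    awk '$1 == 5 || $1 == 196 || $1 == 197 || $1 == 198 { print $1, $2, $10 }'
}

# Usage (needs root):
#   sudo smartctl -A /dev/sdc | show_sector_attrs
```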
In this case the proper solution is to edit /etc/smartd.conf so that it only sends emails if the Offline_Uncorrectable (198) attribute increases, and not whenever it is non-zero. Just add this option to your HDD scan line in