Simulate a faulty block device with read errors?

Asked 8/12, 2009 at 23:48 Answered 25/6, 2020 at 19:13

I'm looking for an easier way to test my application against faulty block devices that generate i/o read errors when certain blocks are read. Trying to use a physical hard drive with known bad blocks is a pain and I would like to find a software solution if one exists.

I did find the Linux Disk Failure Simulation Driver which allows creating an interface that can be configured to generate errors when certain ranges of blocks are read, but it is for the 2.4 Linux Kernel and hasn't been updated for 2.6.

What would be perfect is a losetup and loop driver that could also be configured to return read errors when reading from a given set of blocks.

Laminitis answered 8/12, 2009 at 23:48 Comment(1)

In addition to the answers see the list of Linux disk fault injection mechanisms over on the Special File that causes I/O error Unix & Linux question. – Kelsi 24/10, 2017 at 20:7

It's not a loopback device you're looking for, but rather device-mapper.

Use dmsetup to create a device backed by the "error" target. It will show up in /dev/mapper/<name>.

Page 7 of the Device mapper presentation (PDF) has exactly what you're looking for:

dmsetup create bad_disk << EOF
  0 8       linear /dev/sdb1 0
  8 1       error
  9 204791 linear /dev/sdb1 9
EOF

Or leave out the sdb1 parts and use the "error" target for blocks 0 - 8 as well (instead of sdb1) to make a pure error disk.

See also The Device Mapper appendix from "RHEL 5 Logical Volume Manager Administration".

There's also a flakey target - a combo of linear and error that sometimes succeeds. There's also a delay target for introducing intentional delays when testing.
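The table fed to dmsetup above is just a list of "start length target args" lines counted in 512-byte sectors, so it can be generated rather than typed by hand. Here is a minimal Python sketch of that idea; the helper name build_table and its parameters are made up for illustration and are not part of dmsetup.

```python
def build_table(device, total_sectors, bad_sectors):
    """Build a device-mapper table that maps `device` linearly,
    except for the listed sectors, which go to the "error" target."""
    lines = []
    pos = 0  # next sector that still needs a mapping
    for sector in sorted(set(bad_sectors)):
        if not 0 <= sector < total_sectors:
            raise ValueError(f"sector {sector} is outside the device")
        if sector > pos:
            # healthy run before the bad sector, passed through unchanged
            lines.append(f"{pos} {sector - pos} linear {device} {pos}")
        lines.append(f"{sector} 1 error")
        pos = sector + 1
    if pos < total_sectors:
        # healthy tail after the last bad sector
        lines.append(f"{pos} {total_sectors - pos} linear {device} {pos}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Reproduces the three-line table from the example above.
    print(build_table("/dev/sdb1", 204800, [8]), end="")
```

The printed table can be piped into dmsetup create in the same way as the here-document above.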

Alyss answered 9/12, 2009 at 1:11 Comment(6)

Worked perfectly -- just what I needed. Thanks! – Laminitis 10/12, 2009 at 1:3

The PDF above doesn't explain (at least I didn't understand) the command line syntax for "dmsetup create". The man page explains it's: dmsetup create dev_name dev_mapper_table. The second arg (dev_mapper_table) is a text file that describes how blocks are mapped. The dmsetup man page is terse and doesn't explain the syntax of this table. Here's a link that explains it... link – Lianaliane 25/8, 2012 at 18:52

Will the disk "error" resolve itself after I restart the machine? – Suzetta 8/11, 2018 at 17:8

@Maksim: Even without restarting, you can still read from /dev/sdb1 directly. But yes, unless you put that dmsetup command or equivalent config in a startup file, that virtual device which gives an error-injected view of /dev/sdb1 won't survive a reboot. – Alyss 8/11, 2018 at 21:7

@PeterCordes what I meant is this: How do you undo this once I forcefully caused bad disk? Will the restart of the machine/vm fix the "error" by making my volume ReadWrite? – Suzetta 8/11, 2018 at 21:52

@Maksim: This doesn't cause a bad disk. To do that, you'd use hdparm --make-bad-sector, which would actually affect your hardware. (The man page warns that that is very dangerous.) Creating a new virtual block device with device mapper doesn't create errors on the underlying device. – Alyss 9/11, 2018 at 0:21

It seems like Linux's built-in fault injection capabilities would be a good idea to use.

Blog: http://blog.wpkg.org/2007/11/08/using-fault-injection/
Reference: https://www.kernel.org/doc/Documentation/fault-injection/fault-injection.txt
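
As a rough illustration of what the kernel documentation linked above describes: on a kernel built with CONFIG_FAIL_MAKE_REQUEST, block I/O failures are controlled through debugfs files under /sys/kernel/debug/fail_make_request plus a per-device make-it-fail attribute in sysfs. The Python sketch below only assembles and writes those values. The device name sdb and the helper names plan_settings and apply_settings are assumptions for illustration, debugfs is assumed to be mounted at /sys/kernel/debug, and writing requires root. Note that this mechanism fails requests by probability and interval, not at chosen block numbers.

```python
from pathlib import Path

# Assumes debugfs is mounted in the usual place.
DEBUGFS_DIR = Path("/sys/kernel/debug/fail_make_request")


def plan_settings(device, probability=100, interval=1, times=-1):
    """Return (path, value) pairs that would enable I/O failure
    injection for `device` (a whole-disk name such as "sdb")."""
    if not 0 <= probability <= 100:
        raise ValueError("probability is a percentage between 0 and 100")
    return [
        (Path("/sys/block") / device / "make-it-fail", "1"),
        (DEBUGFS_DIR / "probability", str(probability)),
        (DEBUGFS_DIR / "interval", str(interval)),
        (DEBUGFS_DIR / "times", str(times)),  # -1 means no limit
    ]


def apply_settings(settings):
    """Write each value to its file. Needs root and a kernel with
    the feature compiled in; not called automatically here."""
    for path, value in settings:
        path.write_text(value)


if __name__ == "__main__":
    # Only show what would be written; nothing is changed.
    for path, value in plan_settings("sdb", probability=10):
        print(f"{value} > {path}")
```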

Hachmin answered 27/4, 2013 at 20:48 Comment(2)

While links are good for further reading, you should explicitly present a concrete answer here, I guess. – Borzoi 5/5, 2017 at 11:12

It also depends which kernel you are using: For example in SLES11 SP4 kernel the feature is not compiled in. – Borzoi 8/5, 2017 at 12:59

The easiest way to play with block devices is using nbd.

Download the userland sources from git://github.com/yoe/nbd.git and modify nbd-server.c to fail at reading or writing on whichever areas you want it to fail on, or to fail in a controllably random pattern, or basically anything you want.
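
To make "fail on chosen areas, or in a controllably random pattern" concrete, here is a small Python sketch of the decision logic only. It is not code from nbd-server.c (which is C), and the class name FailurePolicy and its parameters are hypothetical; the same check would have to be ported into the server's read and write paths by hand.

```python
import random


class FailurePolicy:
    """Decide whether a request covering [offset, offset + length) should fail."""

    def __init__(self, bad_ranges=(), fail_probability=0.0, seed=0):
        # bad_ranges: (start, end) byte offsets, end exclusive
        self.bad_ranges = list(bad_ranges)
        self.fail_probability = fail_probability
        # A seeded generator makes the "random" failures repeatable.
        self.rng = random.Random(seed)

    def should_fail(self, offset, length):
        end = offset + length
        for start, stop in self.bad_ranges:
            if offset < stop and start < end:  # request overlaps a bad range
                return True
        return self.rng.random() < self.fail_probability


if __name__ == "__main__":
    policy = FailurePolicy(bad_ranges=[(4096, 4608)])
    print(policy.should_fail(4000, 200))  # overlaps the bad range
    print(policy.should_fail(0, 4096))    # ends just before it
```

Using the same seed for every test run means a failing pattern can be reproduced exactly.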

Flessel answered 9/12, 2009 at 4:2 Comment(0)

I would like to elaborate on Peter Cordes's answer.

In bash, set up an image on a loopback device with ext4, then write a file named binary.bin to it.

imageName=faulty.img
mountDir=$(pwd)/mount

sudo umount $mountDir ## make sure nothing is mounted here

dd if=/dev/zero of=$imageName bs=1M count=10
mkfs.ext4 $imageName
loopdev=$(sudo losetup -P -f --show $imageName); echo $loopdev
mkdir $mountDir
sudo mount $loopdev $mountDir
sudo chown -R $USER:$USER $mountDir

echo "2ed99f0039724cd194858869e9debac4" | xxd -r -p > $mountDir/binary.bin

sudo umount $mountDir

In python3 (since bash struggles to deal with binary data), search the image for the magic bytes that were written to binary.bin:

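import binascii

# Read the whole image into memory; it is only 10 MiB.
with open("faulty.img", "rb") as fd:
    s = fd.read()

search = binascii.unhexlify("2ed99f0039724cd194858869e9debac4")

beg = 0
# Byte offset of the next match at or after beg (-1 if there is none).
# Re-run this line to step through later matches.
find = s.find(search, beg); beg = find + 1; print(find)

# Device-mapper tables count in 512-byte sectors.
start_sector = find // 512; print(start_sector)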
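
The lookup above takes only the sector in which a match starts. If the magic bytes happen to straddle a sector boundary, or occur more than once in the image, more than one sector is involved. A minimal sketch that collects all of them follows; the function name sectors_containing is made up for illustration, and 512-byte sectors are assumed as above.

```python
def sectors_containing(data, pattern, sector_size=512):
    """Return the sorted sector numbers touched by any occurrence
    of `pattern` in `data`."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    sectors = set()
    pos = data.find(pattern)
    while pos != -1:
        first = pos // sector_size
        last = (pos + len(pattern) - 1) // sector_size
        sectors.update(range(first, last + 1))
        pos = data.find(pattern, pos + 1)  # continue after this match
    return sorted(sectors)


if __name__ == "__main__":
    # In-memory demo; for the image above, pass the bytes read from faulty.img.
    magic = bytes.fromhex("2ed99f0039724cd194858869e9debac4")
    image = bytearray(4096)
    image[1000:1016] = magic  # lies entirely inside one sector
    print(sectors_containing(bytes(image), magic))
```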

Then, back in bash, create and mount the faulty block device:

start_sector=## copy value from variable start_sector in python
next_sector=$(($start_sector+1))
size=$(($(wc -c $imageName|cut -d ' ' -f1)/512))
len=$(($size-$next_sector))

echo -e "0\t$start_sector\tlinear\t$loopdev\t0" > fault_config
echo -e "$start_sector\t1\terror" >> fault_config
echo -e "$next_sector\t$len\tlinear\t$loopdev\t$next_sector" >> fault_config

cat fault_config | sudo dmsetup create bad_drive
sudo mount /dev/mapper/bad_drive $mountDir

Finally, we can test the faulty block device by reading the file:

cat $mountDir/binary.bin

which produces the error:

cat: /path/to/your/mount/binary.bin: Input/output error
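
An application under test sees the same failure as an OSError with errno set to EIO. As a hedged sketch of how a test might check for it, the following Python reads a file chunk by chunk and reports where the first such error appears. The function name first_read_error is hypothetical, and the offset it reports is where the kernel surfaced the error, which may be affected by caching and readahead.

```python
import errno
import os


def first_read_error(path, chunk_size=512):
    """Return the byte offset of the first read that fails with EIO,
    or None if the whole file reads cleanly."""
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            try:
                chunk = os.pread(fd, chunk_size, offset)
            except OSError as exc:
                if exc.errno == errno.EIO:
                    return offset
                raise  # some other failure; let the caller see it
            if not chunk:
                return None  # reached end of file without an error
            offset += len(chunk)
    finally:
        os.close(fd)
```

Called on the binary.bin file inside the mount directory, it would be expected to return an offset rather than None while the bad_drive mapping is in place.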

Clean up when you're done testing:

sudo umount $mountDir
sudo dmsetup remove bad_drive
sudo losetup -d $loopdev
rm fault_config $imageName

Balustrade answered 25/6, 2020 at 19:13 Comment(0)
