Disk replication with dd


Pierre Baco

Hi, guys (and girls).

 

Config: Athlon XP 2800+, 512 MB RAM, 2 x Maxtor 80 GB disks (brand new), and Mandrake (Mdk) 10.0 Official, fully updated.

 

The first disk (IDE 0-0), /dev/hda, is partitioned as follows: 5 GB for / (more than enough), 1 GB for swap (in case I add some RAM), and the rest for /home (it will be a big Samba server). / and /home are ext3.

 

The second disk (IDE1-0), /dev/hdb, was installed blank. It is never mounted.

 

Both disks are in removable IDE racks (no hotplug).

 

I also have a third 80 GB Maxtor in a USB2 hotplug external case for normal file and directory backups, using a tar script run from cron.

 

Objective: in addition to the file and directory backup (full tgz), I want to make a daily (cron) image of /dev/hda on /dev/hdb, so that whenever /dev/hda fails I can simply swap the disks and have a running system within seconds. RAID won't help for this. The solution MUST be simple and affordable (no fancy RAID controllers or complex operations to restart the server after a shutdown). Total hardware cost so far is less than 650 Euros.

 

Right now, I use a simple "dd bs=512 if=/dev/hda of=/dev/hdb" to make a full raw copy of my primary disk, including the MBR, to the second one. I use bs=512 to stay at sector size, so that if there is a read/write error on a given sector, only that sector will be affected (corrupted) in the written image.
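For reference, the copy described above could be wrapped as in the sketch below. The function name `clone_raw` is made up for illustration, and the `conv=noerror,sync,notrunc` options are an assumption added here, not part of the original command. `noerror` tells dd to continue after a read error, and `sync` pads each short or failed input block with zeros, so the sectors that follow keep the same offset on the target.

```shell
#!/bin/sh
# clone_raw SRC DST: raw, sector-sized copy of SRC onto DST (hypothetical helper).
#   bs=512        one sector per block, as in the original command
#   conv=noerror  keep going after a read error instead of stopping
#   conv=sync     pad a short or failed input block with zeros up to 512 bytes
#   conv=notrunc  do not truncate DST first (only matters when DST is a regular file)
clone_raw() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=512 conv=noerror,sync,notrunc
}

# Usage (destructive, must be run as root on the real disks):
#   clone_raw /dev/hda /dev/hdb
```

Trying the function on two scratch image files first is a cheap way to check the options before pointing it at the real disks.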

 

It takes approximately 35 minutes to complete the raw copy.

Then "halt" and switch off, swap the disks, and switch on. Bingo, it boots nicely (except for the usual message about the system not having been shut down cleanly). I've done it many times without any problem.

 

But:

- From time to time (not always), dd complains about "input/output error" on /dev/hda or /dev/hdb, but does the full copy anyway.

- cfdisk fails on /dev/hdb (complains about partition error).
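One way to narrow down the cfdisk complaint is to check whether the first sector, which holds the boot code and the primary partition table, is identical on both disks. The following is a minimal sketch; the helper names `mbr_hex` and `same_mbr` are hypothetical, and it only looks at the first 512 bytes, not at the rest of the disk.

```shell
#!/bin/sh
# mbr_hex DISK: print the first 512-byte sector of DISK as hex text.
# od -v prints every line (no "*" collapsing), so the text comparison is exact.
mbr_hex() {
    dd if="$1" bs=512 count=1 2>/dev/null | od -v -An -tx1
}

# same_mbr DISK_A DISK_B: succeed only if both first sectors are byte-identical.
same_mbr() {
    [ "$(mbr_hex "$1")" = "$(mbr_hex "$2")" ]
}

# Usage (read-only, needs read access to the devices):
#   same_mbr /dev/hda /dev/hdb && echo "first sector matches" || echo "first sector differs"
```

If the first sector matches and cfdisk still rejects /dev/hdb, the cause probably lies elsewhere than a damaged copy of the partition table.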

 

I guess dd may have some problems when reading or writing raw data from/to the disks. Although the disks are strictly identical models, their bad sectors are not (number, location). Unfortunately, dd is not verbose about errors (where, when, how many).
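dd writes its error messages and its final record counts to stderr, so one option is to keep that output in a log together with the exit status. The sketch below assumes a hypothetical helper `logged_copy` and a log path of your choosing, and it adds `conv=noerror,sync,notrunc`, which the original command did not use. The kernel log (`dmesg`) is also worth reading right after a run, since the disk driver usually reports there which device raised the error.

```shell
#!/bin/sh
# logged_copy SRC DST LOG: copy SRC to DST with dd, appending everything dd
# prints on stderr (error messages, "records in/out" totals) to LOG, followed
# by dd's exit status. Returns that exit status to the caller.
logged_copy() {
    src=$1
    dst=$2
    log=$3
    echo "$(date) start $src -> $dst" >>"$log"
    # LC_ALL=C keeps dd's messages in English so they are easy to grep.
    LC_ALL=C dd if="$src" of="$dst" bs=512 conv=noerror,sync,notrunc 2>>"$log"
    status=$?
    echo "$(date) end, dd exit status $status" >>"$log"
    return "$status"
}

# Usage (destructive, as root; the log path is only an example):
#   logged_copy /dev/hda /dev/hdb /var/log/disk-clone.log
#   dmesg | grep -i 'hd[ab]'    # look for driver messages about either disk
```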

 

I've tried the same trick at runlevel 1 or 3, at partition level, with dd if=/dev/hda1 (/), /dev/hda2 (swap), and /dev/hda3 (/home), with the same results (random I/O errors, cfdisk fails). I also tried additional dd options (conv=noerror,notrunc), again with the same results. But once the disks are swapped and the system is restarted, I always have a running system in less than 2 minutes. I can't figure out how or where the I/O errors affect the system's health once restarted.
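To see whether, and how much, a copy actually differs from its source, the two can be compared byte by byte after the run. This is only a sketch with a hypothetical helper name, `count_diffs`. It assumes the source is not being written to during the copy and the comparison (unmounted, or mounted read-only); on a live, writable filesystem some differences are expected even without any I/O error.

```shell
#!/bin/sh
# count_diffs A B: print the number of byte positions at which A and B differ.
# cmp -l lists one line per differing byte; wc -l counts those lines.
# If one input is shorter, cmp also reports EOF on stderr.
count_diffs() {
    cmp -l "$1" "$2" | wc -l | tr -d ' '
}

# Usage (read-only, run per partition):
#   count_diffs /dev/hda1 /dev/hdb1
#   cmp /dev/hda1 /dev/hdb1      # prints the offset of the first difference, if any
```

A result of 0 means the copy is byte-identical to the source, whatever dd reported along the way.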

 

Q: Am I doing something wrong? These I/O errors don't make me feel comfortable with this solution. Is there another (simple and cheap) way or trick to do this (quick restart after a primary disk failure)?

Edited by Pierre Baco