I have been hosting my web site since December 2000. I have a huge collection of photos: scans from film taken before 2002, and digital camera photos since September 2002. I also have a large collection of MP3s ripped from my CDs (3500+ songs). All this data is stored on my home file server. With such a large amount of data (and not-so-infrequent disk failures), I have built a nice data storage solution for myself over the years.
Before I describe my storage setup, let me tell you about the computers I have at home.
Machine Name | Processor | RAM | Hard Disk | Optical Drives | Operating System |
AMD64 | Athlon 64, 3400+ | 1 GB DDR400 | 2x250 GB, SATA (RAID 1) | CD-RW/DVD, DVD Writer | Windows XP Pro |
DHAA | Pentium 4, 2.8 GHz, HT | 512 MB DDR400 | 1x120 GB, ATA100 | CD-RW/DVD, DVD Writer | Windows XP Pro |
AMIT | Athlon XP, 2600+ | 784 MB, DDR266 | 3x60 GB ATA100 (RAID 5), 1x40 GB ATA100 | None | Linux, CentOS 4.3 |
LINUX | Pentium 4, 2.0 GHz | 512 MB RDRAM | 2x80 GB ATA100 | DVD | Linux, Fedora Core 4 |
MUSIC | Core Solo, 1.5 GHz | 1 GB DDR2-667 | 1x60 GB SATA | CD-RW/DVD | Mac OS X, 10.4 |
UBUNTU | Pentium 4, 2.6 GHz | 512 MB, DDR266 | 20 GB ATA100 | DVD | Linux, Ubuntu 6.06 |
'AMIT' is the file/web server and also a NAT firewall. 'LINUX' is the mail server, hosting Zimbra.
All my important data, including my photos, music and documents, resides on the RAID 5 volume on the file server. I have written a nifty backup program that runs nightly and takes a full backup on the 1st, 11th and 21st of each month. On all other days it takes a snapshot of only the files that have changed since the last full backup. This allows me to restore any piece of data from the last month at single-day granularity. The program uses some Perl scripts and rsync to achieve this. 'backup.pl' can also restore the data as of any day's snapshot to a different location. I take a daily mysqldump, which is also included in the backup. All my snapshots are replicated to my AMD64 machine by the Windows SyncToy power tool, which runs every night after the daily backup has finished.
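To make the schedule concrete, here is a minimal sketch of the logic in Python. It is not the actual 'backup.pl' (which is Perl); the function names and the example paths are hypothetical. The rsync command assumes the snapshots are made with rsync's `--compare-dest` option, which skips files that are identical to the copy in the given directory. That is one plausible way to do it, not necessarily how my script works.

```python
from datetime import date, timedelta

# Days of the month on which a full backup is taken.
FULL_BACKUP_DAYS = (1, 11, 21)


def backup_type(day: date) -> str:
    """Return "full" on the 1st, 11th and 21st, otherwise "snapshot"."""
    return "full" if day.day in FULL_BACKUP_DAYS else "snapshot"


def last_full_backup(day: date) -> date:
    """Return the most recent full-backup date on or before `day`."""
    current = day
    # Walking backwards always terminates: every month has a 1st.
    while current.day not in FULL_BACKUP_DAYS:
        current -= timedelta(days=1)
    return current


def restore_plan(day: date) -> list[date]:
    """Backups needed to restore the data as of `day`.

    A snapshot holds everything changed since the last full backup,
    so at most two backups are needed: the full one and that day's snapshot.
    """
    full = last_full_backup(day)
    return [full] if full == day else [full, day]


def rsync_snapshot_command(source: str, full_dir: str, snapshot_dir: str) -> list[str]:
    """Build (but do not run) an rsync command for a snapshot.

    Assumption: --compare-dest is used so that only files differing from
    the full backup are copied. Use absolute paths, because a relative
    --compare-dest is interpreted relative to the destination directory.
    """
    return [
        "rsync",
        "-a",
        f"--compare-dest={full_dir}",
        f"{source}/",
        f"{snapshot_dir}/",
    ]


if __name__ == "__main__":
    day = date(2006, 7, 15)
    print(backup_type(day))
    print(restore_plan(day))
    # Hypothetical paths, for illustration only.
    print(" ".join(rsync_snapshot_command("/data", "/backup/full", "/backup/snap")))
```

Because each snapshot is taken against the last full backup rather than against the previous day, a restore never has to replay a chain of daily snapshots. The price is that snapshots grow through the ten-day cycle.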
With this setup I have heavy redundancy built into my solution. The data is replicated on two different machines. A single disk failure on 'AMIT' does not lose any data; as long as I replace the failed disk quickly, I am fine. Even while the RAID 5 array is degraded, the daily backups and replication continue. So even if a second disk on 'AMIT' fails, I lose at most one day's worth of changes (if any), since the previous day's backup is replicated on AMD64.
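For reference, the usable space on the two arrays follows from the standard RAID arithmetic: RAID 5 gives up one disk's worth of space to parity, and RAID 1 stores the same data on every disk. A small sketch, with a hypothetical helper name and the disk sizes taken from the table above:

```python
def raid_usable_gb(level: int, disks: int, disk_gb: int) -> int:
    """Usable capacity in GB for an array of equally sized disks.

    Only RAID 1 and RAID 5 are handled, since those are the two
    levels used in this setup.
    """
    if level == 1:
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        # Every disk holds a full copy of the data.
        return disk_gb
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of space is used for parity.
        return (disks - 1) * disk_gb
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    print("AMIT  RAID 5:", raid_usable_gb(5, 3, 60), "GB")
    print("AMD64 RAID 1:", raid_usable_gb(1, 2, 250), "GB")
```

Both arrays survive the loss of exactly one disk, which is why a second failure on 'AMIT' before the rebuild finishes is the case the replicated backup has to cover.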
I am planning to write another article about how my backup program works.