Wednesday, April 28, 2010

How Data Recovery Works

Data recovery works in different ways, depending on how the data was lost. Although computer storage can be extremely complex, the simplest case is a home computer with one hard disk that has only a single partition.


Background


A home computer runs an operating system (Windows, Macintosh, etc.). The operating system knows about files. The hard disk does not know about files; it knows about sectors. Sectors are small, fixed-size chunks of data, usually 512 bytes each. A file system (NTFS, HFS+, etc.) translates between files and hard disk sectors. Data loss occurs when the operating system can no longer read one or more files from the hard disk.
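
To make the idea of sectors concrete, here is a small sketch of the arithmetic involved. It assumes the 512-byte sector size mentioned above; the function names are made up for illustration.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed; the usual size mentioned above)


def sectors_needed(file_size, sector_size=SECTOR_SIZE):
    """Number of whole sectors required to hold file_size bytes."""
    return (file_size + sector_size - 1) // sector_size


def sector_offset(sector_number, sector_size=SECTOR_SIZE):
    """Byte position on the disk where the given sector starts."""
    return sector_number * sector_size
```

For example, a 1,000-byte file does not fit in one 512-byte sector, so `sectors_needed(1000)` is 2, and the second half of the last sector is simply unused.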








The File System


In its simplest form, a file system is just a table on the hard disk with one column for the file name and another column for the sector numbers used by the file. When the operating system creates a file, the file system finds an empty row in the table, then writes the file name and the sector numbers into it. When the operating system reads a file, the file system looks up the file name in the table and then reads the listed sectors from the disk.
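
The table described above can be sketched as a toy, in-memory file system. This is only an illustration, not how NTFS or HFS+ are actually laid out. The class name `ToyFileSystem` is invented, and each row also stores the file size (an assumption added here) so that padding in the last sector can be trimmed when reading.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed)


class ToyFileSystem:
    """A made-up file system: one table, no folders, everything in memory."""

    def __init__(self, total_sectors):
        # The "disk" is just a list of fixed-size sectors filled with zeros.
        self.disk = [bytes(SECTOR_SIZE) for _ in range(total_sectors)]
        self.free_sectors = list(range(total_sectors))
        # The table: file name -> (sector numbers, file size in bytes).
        self.table = {}

    def create(self, name, data):
        """Pick free sectors, write the data, and record a row in the table."""
        count = (len(data) + SECTOR_SIZE - 1) // SECTOR_SIZE
        if count > len(self.free_sectors):
            raise OSError("disk full")
        sectors = [self.free_sectors.pop(0) for _ in range(count)]
        for i, sector in enumerate(sectors):
            chunk = data[i * SECTOR_SIZE:(i + 1) * SECTOR_SIZE]
            # Pad the last chunk with zeros so every sector is full size.
            self.disk[sector] = chunk.ljust(SECTOR_SIZE, b"\x00")
        self.table[name] = (sectors, len(data))

    def read(self, name):
        """Look up the row by name, then read its sectors in order."""
        sectors, size = self.table[name]
        raw = b"".join(self.disk[s] for s in sectors)
        return raw[:size]
```

Creating a 600-byte file on an empty toy disk uses sectors 0 and 1, and `read` returns exactly the bytes that were written.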


Deleted Files


When you empty the trash, the operating system tells the file system to delete each file in the trash. Most file systems do not actually erase the information; they simply mark the row in the table as available. That row may be reused by the file system in the future. Eventually, the file's sectors may also be reused, but they are not reused immediately. Until the row and the sectors are reused, the file can be recovered by reading the file system information directly and treating the row as if it were still in use. Saving new files and defragmenting your disk can make deleted files unrecoverable, because both reuse file table rows and hard disk sectors.
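
Here is a minimal sketch of that behavior, again using an invented toy file system rather than any real one. The `in_use` flag, the `recover` method, and the choice to reuse the first available row and the lowest free sectors are all assumptions made for the example; real file systems decide what to reuse in their own ways.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed)


class ToyFileSystem:
    """A made-up file system whose table rows carry an 'in_use' flag."""

    def __init__(self, total_sectors):
        self.disk = [bytes(SECTOR_SIZE) for _ in range(total_sectors)]
        self.rows = []  # each row: name, sectors, size, in_use

    def _used_sectors(self):
        return {s for row in self.rows if row["in_use"] for s in row["sectors"]}

    def create(self, name, data):
        count = (len(data) + SECTOR_SIZE - 1) // SECTOR_SIZE
        used = self._used_sectors()
        free = [s for s in range(len(self.disk)) if s not in used]
        if count > len(free):
            raise OSError("disk full")
        sectors = free[:count]
        for i, sector in enumerate(sectors):
            chunk = data[i * SECTOR_SIZE:(i + 1) * SECTOR_SIZE]
            self.disk[sector] = chunk.ljust(SECTOR_SIZE, b"\x00")
        new_row = {"name": name, "sectors": sectors,
                   "size": len(data), "in_use": True}
        # Reuse the first available row if there is one; this is what
        # destroys the record of a deleted file.
        for i, row in enumerate(self.rows):
            if not row["in_use"]:
                self.rows[i] = new_row
                return
        self.rows.append(new_row)

    def delete(self, name):
        for row in self.rows:
            if row["in_use"] and row["name"] == name:
                row["in_use"] = False  # the sectors are left untouched
                return
        raise FileNotFoundError(name)

    def recover(self, name):
        """Read a deleted file by treating its row as if it were still in use."""
        used = self._used_sectors()
        for row in self.rows:
            if not row["in_use"] and row["name"] == name:
                if used.intersection(row["sectors"]):
                    return None  # sectors now belong to another file
                raw = b"".join(self.disk[s] for s in row["sectors"])
                return raw[:row["size"]]
        return None  # the row itself has been reused
```

In this toy, recovering a file right after deleting it returns the original bytes. Once a new file is saved, the old row and sectors are reused and `recover` returns `None`, which is why you should stop writing to a disk as soon as you notice a file is missing.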








Corrupted Files


A file is corrupted when its contents can no longer be read back correctly. A file can be corrupted by a software bug, a sector failure, or a corrupted file system. Although portions may be recovered, corrupted files generally cannot be recovered completely.


Corrupted File System


The file system information is stored on the disk, so it can be corrupted like any other file. Most modern file systems either keep a duplicate of the file system information or keep a journal that can be played back to recreate the information. Even when a file system cannot be completely restored, files may still be recovered by reading the file system information directly. Some files are usually lost when restoring corrupted file systems.
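
The idea of playing back a journal can be sketched as follows. This is a greatly simplified model with an invented journal format (tuples of an operation, a file name, and sector numbers); real journaling file systems typically record only recent changes and use their own on-disk formats.

```python
def replay(journal):
    """Rebuild a name -> sector-numbers table from a list of journal entries.

    Each entry is a tuple (hypothetical format):
      ("create", name, sectors)  or  ("delete", name)
    """
    table = {}
    for entry in journal:
        operation, name = entry[0], entry[1]
        if operation == "create":
            table[name] = list(entry[2])
        elif operation == "delete":
            table.pop(name, None)
    return table
```

Replaying a journal that creates `a.txt` and `b.txt` and then deletes `a.txt` produces a table that contains only `b.txt`.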


Data Carving


Many file types contain distinctive byte sequences, or signatures, that identify them. JPEG files, for example, begin with the hexadecimal string 0xFFD8FF. Reading the hard disk sectors directly and looking for available sectors that begin with 0xFFD8FF may allow you to recover a JPEG file. However, this technique, called "data carving," suffers from several limitations. It is extremely time-consuming, because every sector has to be examined. A sector that begins with 0xFFD8FF is not necessarily the start of a JPEG, so you may find many false matches. Because sectors are reused and files can be fragmented, sometimes only part of a file is recovered. Finally, not all file types can be recognized by a simple signature, so not all files can be recovered this way.
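
A minimal sketch of signature scanning is shown below. It assumes the disk has already been copied into a `bytes` object (a disk image), uses 512-byte sectors, and checks every sector rather than only the available ones. The function names are invented. The second function also assumes the file is stored in consecutive sectors and stops at the first JPEG end-of-image marker (0xFFD9), which can give a wrong result for fragmented files or for JPEGs that contain an embedded thumbnail.

```python
SECTOR_SIZE = 512                  # bytes per sector (assumed)
JPEG_SIGNATURE = b"\xff\xd8\xff"   # the 0xFFD8FF signature described above
JPEG_END = b"\xff\xd9"             # JPEG end-of-image marker


def find_jpeg_candidates(image):
    """Return the numbers of all sectors that begin with the JPEG signature."""
    return [offset // SECTOR_SIZE
            for offset in range(0, len(image), SECTOR_SIZE)
            if image.startswith(JPEG_SIGNATURE, offset)]


def carve_contiguous(image, sector):
    """Copy bytes from the start of a sector through the first end marker.

    Returns None if no end marker follows. Assumes the file is not
    fragmented, which is often untrue on a real disk.
    """
    start = sector * SECTOR_SIZE
    end = image.find(JPEG_END, start)
    if end == -1:
        return None
    return image[start:end + len(JPEG_END)]
```

Each candidate sector is only a guess. The carved bytes still need to be opened in an image viewer to see whether they form a complete picture.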

Tags: hard disk, file system, operating system, file system information, system information