Wednesday, June 11, 2008

File system verification and recovery

The fsck command checks and interactively repairs inconsistent file systems.
You should run this command before mounting any file system. You must be able
to read the device file on which the file system resides (for example, the /dev/hd0
device).
Normally, the file system is consistent, and the fsck command merely reports on
the number of files, used blocks, and free blocks in the file system. If the file
system is inconsistent, the fsck command displays information about the
inconsistencies found and prompts you for permission to repair them. If the file
system cannot be repaired, restore it from backup.
Mounting an inconsistent file system may result in a system crash. If you do not
specify a file system with the FileSystem parameter, the fsck command will
check all the file systems with the attribute check=TRUE in /etc/filesystems.
Note: By default, the /, /usr, /var, and /tmp file systems have the check
attribute set to false (check=false) in their /etc/filesystems stanzas. The
attribute is set to false for the following reasons:
- The boot process explicitly runs the fsck command on the /, /usr, /var, and
/tmp file systems.
- The /, /usr, /var, and /tmp file systems are mounted when the /etc/rc file is
run. The fsck command will not modify a mounted file system, and fsck
results on mounted file systems are unpredictable.
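As a rough illustration of how the check attribute selects file systems, the following Python sketch parses stanza-formatted text and lists the mount points with check=true. The SAMPLE text and both function names are invented for this example, and the sketch treats only the literal values true and false; it is not how fsck itself is implemented.

```python
# SAMPLE is a made-up fragment in /etc/filesystems stanza style (assumption),
# not copied from a real system.
SAMPLE = """\
/:
        dev   = /dev/hd4
        vfs   = jfs
        check = false

/home:
        dev   = /dev/hd1
        vfs   = jfs
        check = true

/u/testfs:
        dev   = /dev/lv02
        vfs   = jfs
        check = true
"""


def parse_stanzas(text):
    """Return {mount_point: {attribute: value}} for stanza-formatted text."""
    stanzas, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (assumed to start with '*').
        if not line or line.startswith("*"):
            continue
        if line.endswith(":") and "=" not in line:
            # A line such as "/home:" opens a new stanza.
            current = stanzas.setdefault(line[:-1], {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return stanzas


def checked_by_default(text):
    """Mount points whose stanza sets check to true (case-insensitive)."""
    return [mount_point
            for mount_point, attrs in parse_stanzas(text).items()
            if attrs.get("check", "").lower() == "true"]


print(checked_by_default(SAMPLE))
```

With the sample above, / is skipped because its stanza sets check=false, which mirrors the default described in the note.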
Chapter 7. LVM, file system, and disk problem determination 153
Fixing a bad superblock
If the fsck or mount command returns one of the following errors, the
superblock may be corrupted:
fsck: Not an AIX3 file system
fsck: Not an AIXV3 file system
fsck: Not an AIX4 file system
fsck: Not an AIXV4 file system
fsck: Not a recognized file system type
mount: invalid argument
To resolve the problem, copy the backup superblock over the primary
superblock with the following command. Check the latest product documentation
before running it:
# dd count=1 bs=4k skip=31 seek=1 if=/dev/lv00 of=/dev/lv00
The following is an example of when the superblock is corrupted and copying the
backup helps solve the problem:
# mount /u/testfs
mount: 0506-324 Cannot mount /dev/lv02 on /u/testfs: A system call received a
parameter that is not valid.
# fsck /dev/lv02
Not a recognized filesystem type. (TERMINATED)
# dd count=1 bs=4k skip=31 seek=1 if=/dev/lv02 of=/dev/lv02
1+0 records in.
1+0 records out.
# fsck /dev/lv02
** Checking /dev/lv02 (/u/tes)
** Phase 0 - Check Log
log redo processing for /dev/lv02
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Inode Map
** Phase 6 - Check Block Map
8 files 2136 blocks 63400 free
Once the restoration process is complete, check the integrity of the file system by
issuing the fsck command:
# fsck /dev/lv00
154 IBM ^ Certification Study Guide - AIX 5L Problem Determination Tools and Techniques
In many cases, restoration of the backup of the superblock to the primary
superblock will recover the file system. If this does not resolve the problem,
recreate the file system and restore the data from a backup.
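To make the dd arguments concrete, here is a minimal Python sketch of the same byte-offset arithmetic: with bs=4k, skip=31 reads from block 31 and seek=1 writes to block 1, and count=1 copies a single block. It works on a scratch file built by the script, never on a real device; copy_block, read_block, and the fill bytes are assumptions made for the illustration.

```python
import os
import tempfile

BLOCK_SIZE = 4096     # bs=4k
BACKUP_BLOCK = 31     # skip=31: read offset, counted in blocks
PRIMARY_BLOCK = 1     # seek=1: write offset, counted in blocks


def copy_block(path, src_block=BACKUP_BLOCK, dst_block=PRIMARY_BLOCK,
               block_size=BLOCK_SIZE):
    """Copy exactly one block (count=1) from src_block to dst_block in place."""
    with open(path, "r+b") as f:
        f.seek(src_block * block_size)
        data = f.read(block_size)
        if len(data) != block_size:
            raise ValueError("file is too small to contain the source block")
        f.seek(dst_block * block_size)
        f.write(data)


def read_block(path, block, block_size=BLOCK_SIZE):
    """Return the contents of one block of the file."""
    with open(path, "rb") as f:
        f.seek(block * block_size)
        return f.read(block_size)


# Build a 32-block scratch image: block 1 holds junk ("X"), block 31 holds a
# recognizable marker ("B"), and every other block is zero-filled.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    image_path = tmp.name
    for n in range(32):
        if n == BACKUP_BLOCK:
            fill = b"B"
        elif n == PRIMARY_BLOCK:
            fill = b"X"
        else:
            fill = b"\0"
        tmp.write(fill * BLOCK_SIZE)

copy_block(image_path)
restored = read_block(image_path, PRIMARY_BLOCK)
os.remove(image_path)

print(restored == b"B" * BLOCK_SIZE)
```

Because the source and destination are the same file, only the 4096 bytes at offset 4096 change; the rest of the image is left untouched, which is the behavior the dd command relies on.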
7.4.4 Sparse file allocation
Some applications, particularly databases, maintain data in sparse files. Files
that do not have disk blocks allocated for each logical block are called sparse
files. In a file system enabled for large files, data at file offsets greater than
4 MB is allocated in large disk blocks of 128 KB. Applications using sparse files
larger than 4 MB may therefore require more disk blocks in a file system enabled
for large files than in a regular file system.
For a sparse file, the ls command does not show the space actually allocated;
it reports the number of bytes between the first and last blocks allocated to
the file, as shown in the following example:
# ls -l /tmp/sparsefile
-rw-r--r-- 1 root system 100000000 Jul 16 20:57 /tmp/sparsefile
The du command can be used to see the actual allocation, since it reports the
blocks actually allocated and in use by the file. Use du -rs to report the number
of allocated blocks on disk.
# du -rs /tmp/sparsefile
256 /tmp/sparsefile
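The same difference between logical size and allocated space can be observed from a program. The sketch below, which assumes a POSIX system where st_blocks is counted in 512-byte units, creates a file with a large hole and compares the two values; the size_report helper and the temporary file are illustrative only.

```python
import os
import tempfile


def size_report(path):
    """Return (logical_bytes, allocated_bytes) for a file."""
    st = os.stat(path)
    # st_size is the logical length that ls -l prints. st_blocks counts the
    # units actually allocated (assumed here to be 512 bytes each), which is
    # what du adds up.
    return st.st_size, st.st_blocks * 512


fd, path = tempfile.mkstemp()
try:
    # Seek far past the start and write a single byte, leaving a hole behind.
    os.lseek(fd, 100_000_000 - 1, os.SEEK_SET)
    os.write(fd, b"\0")
finally:
    os.close(fd)

logical, allocated = size_report(path)
os.remove(path)

print(f"logical size: {logical} bytes, allocated: {allocated} bytes")
```

On a file system that supports sparse files, the allocated figure is typically far smaller than the logical size; the exact number depends on the file system and its block size.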
Note: The tar command does not preserve the sparse nature of any file that
is sparsely allocated. Any file that was originally sparse before the restoration
will have all space allocated within the file system for the size of the file. New
AIX 5L options for the backup and restore commands are useful for sparse
files.
Using the dd command in combination with your own backup script will solve this
problem.
7.4.5 Unmount problems
A file system cannot be unmounted if any references are still active within that file
system. The following error message will be displayed:
Device busy
or
A device is already mounted or cannot be unmounted
The following situations can leave open references to a mounted file system.
- Files are open within a file system. These files must be closed before the file
system can be unmounted. The fuser command is often the best way to
determine what is still active in the file system. The fuser command will return
the process IDs for all processes that have open references within a specified
file system, as shown in the following example:
# umount /home
umount: 0506-349 Cannot unmount /dev/hd1: The requested resource is busy.
# fuser -x -c /home
/home: 11630
# ps -fp 11630
     UID   PID  PPID   C    STIME    TTY  TIME CMD
   guest 11630 14992   0 16:44:51  pts/1  0:00 -sh
# kill -1 11630
# umount /home
The process holding the open reference can be terminated with the kill
command (here sending SIGHUP, signal 1), after which the unmount succeeds. A
stronger signal, such as SIGKILL, may be required.
- If the file system is still busy and cannot be unmounted, the cause may be
a kernel extension that is loaded but exists within the source file system.
The fuser command does not show this kind of reference, since no user
process is involved. However, the genkex command reports all
loaded kernel extensions.
- Other file systems are still mounted within the file system. Each such mount
leaves an open reference in the source file system at its mount point. Use
the mount command to list the mounted file systems, and unmount every file
system that is mounted within the one you want to unmount.
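For the nested-mount case, a short Python sketch can show which mount points must go first. Given a list of mount points (hard-coded here as a hypothetical example rather than read from the mount command), it returns those located below a target, deepest first, so that each one can be unmounted before its parent. The function name nested_mounts is an assumption of this example.

```python
def nested_mounts(mount_points, target):
    """Mount points strictly below target, deepest first (a safe unmount order)."""
    # Appending "/" prevents "/homework" from matching a target of "/home".
    prefix = target.rstrip("/") + "/"
    below = [m for m in mount_points
             if m != target and m.startswith(prefix)]
    # More path separators means deeper nesting, so unmount those first.
    return sorted(below, key=lambda m: m.count("/"), reverse=True)


# Hypothetical mount table for illustration.
mounted = ["/", "/usr", "/home", "/home/guest", "/home/guest/data", "/homework"]

print(nested_mounts(mounted, "/home"))
```

Unmounting in the returned order, and then the target itself, avoids the busy error caused by a file system that is still mounted inside the one being unmounted.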
7.4.6 Removing file systems
When removing a JFS, the file system must be unmounted before it can be
removed. The command for removing file systems is rmfs.
In the case of a JFS, the rmfs command removes both the logical volume on
which the file system resides and the associated stanza in the /etc/filesystems
file. If the file system is not a JFS, the command removes only the associated
stanza in the /etc/filesystems file, as shown in the following example:
# lsvg -l testvg
testvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
loglv00             jfslog     1     1     1    open/syncd    N/A
lv02                jfs        2     2     1    open/syncd    /u/testfs
# rmfs /u/testfs
rmfs: 0506-921 /u/testfs is currently mounted.
# umount /u/testfs
# rmfs /u/testfs
rmlv: Logical volume lv02 is removed.
# lsvg -l testvg
testvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
loglv00             jfslog     1     1     1    closed/syncd  N/A
This example shows how the file system testfs is removed. The first attempt fails
because the file system is still mounted. The associated logical volume lv02 is
also removed. The jfslog remains defined on the volume group.
7.4.7 Different output from du and df commands
Sometimes du and df commands are used to get a free block value. df is used to
report the total block count, and then the value returned by du -s
/filesystem_name is subtracted from that total to calculate the free block value.
However, this method of calculation yields a value that is greater than the free
block value reported by df. At AIX Version 4.1 and later, both df and du default to
512-byte units. Sample output from the du and df commands is below:
# du -s /tmp
152 /tmp
# df /tmp
Filesystem    512-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd3           24576     23320    6%       33     1% /tmp
Here, (total from df) - (used from du) = (false free block count): 24576 - 152 =
24424.
24424 is greater than the 23320 free blocks reported by df. The reason for this
discrepancy involves the implementation of du and df. du -s traverses the file
tree, adding up the number of blocks allocated to each directory, symlink, and
file as reported by the stat() system call. This is how du arrives at its total
value. df looks at the file system disk block allocation maps to arrive at its
total and free values. Blocks allocated to file system metadata, or to files that
have been removed but are still held open, appear in the allocation maps but
not in the du total.
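The arithmetic behind the discrepancy can be laid out in a few lines of Python. The numbers are taken from the sample output above; the variable names are chosen for this illustration.

```python
UNIT = 512  # both commands report 512-byte blocks

df_total, df_free = 24576, 23320   # from the df /tmp output
du_used = 152                      # from the du -s /tmp output

df_used = df_total - df_free       # blocks that df considers in use
naive_free = df_total - du_used    # the misleading "free" figure
hidden = df_used - du_used         # allocated blocks that du does not count

print(df_used, naive_free, hidden, hidden * UNIT // 1024)
```

In this sample, df accounts for 1256 used blocks while du sees only 152, so 1104 blocks (552 KB) are allocated in the file system without being attributed to any file that du can reach.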
7.4.8 Enhanced journaled file system
The enhanced journaled file system (JFS2) differs architecturally from the
standard JFS in several ways, including:
- Variable number of i-nodes for enhanced journaled file system
JFS2 allocates i-nodes as needed. Therefore, the number of i-nodes
available is limited by the size of the file system itself.
- Specifying file system block size
File system block size is specified during the file system's creation with the
crfs and mkfs commands or by using SMIT. The choice of file system
block size should be based on the projected size of the files contained in the
file system.
- Identifying file system block size
The file system block size value can be identified with the lsfs command or
the System Management Interface Tool (SMIT). For application programs,
the statfs subroutine can be used to identify the file system block size.
- Compatibility and migration
The enhanced journaled file system (JFS2) is a new file system and is not
compatible with AIX Version 4.
- Device driver limitations
A device driver must provide disk block addressability that is the same or
smaller than the file system block size.
- Performance costs
Although file systems that use block sizes smaller than 4096 bytes as their
allocation unit might require substantially less disk space than those using the
default allocation unit of 4096 bytes, the use of smaller block sizes can incur
performance degradation.
- Increased allocation activity
Because disk space is allocated in smaller units for a file system with a block
size other than 4096 bytes, allocation activity can occur more often when files
or directories are repeatedly extended in size. For example, a write operation
that extends the size of a zero-length file by 512 bytes results in the allocation
of one block to the file, assuming a block size of 512 bytes. If the file size is
extended further by another write of 512 bytes, an additional block must be
allocated to the file. Applying this example to a file system with 4096-byte
blocks, disk space allocation occurs only once, as part of the first write
operation. No additional allocation activity is performed as part of the second
write operation since the initial 4096-byte block allocation is large enough to
hold the data added by the second write operation.
- Increased block allocation map size
More virtual memory and file system disk space might be required to hold
block allocation maps for file systems with a block size smaller than 4096
bytes. Blocks serve as the basic unit of disk space allocation, and the
allocation state of each block within a file system is recorded in the file system
block allocation map.
- Understanding enhanced journaled file system size limitations
The maximum size for an enhanced journaled file system is architecturally
limited to 4 petabytes. I-nodes are dynamically allocated by JFS2, so you do
not need to consider how many i-nodes you may need when creating a JFS2
file system. You do, however, need to consider the size of the file system log.
- Enhanced journaled file system log size issues
In most instances, multiple journaled file systems use a common log
configured to be 4 MB in size. When file systems exceed 2 GB or when the
total amount of file system space using a single log exceeds 2 GB, the default
log size might not be sufficient. In either case, scale log sizes upward as the
file system size increases. The JFS log is limited to a maximum size of 256
MB.
- JFS2 file space allocation
File space allocation is the method by which data is apportioned physical
storage space in the operating system. The kernel allocates disk space to a
file or directory in the form of logical blocks. A logical block refers to the
division of the contents of a file or directory into units of 512, 1024, 2048, or
4096 bytes. When a JFS2 file system is created, the logical block size is
specified as one of 512, 1024, 2048, or 4096 bytes. Logical blocks are not
tangible entities; however, the data in a logical block consumes physical
storage space on the disk. Each file or directory consists of zero or more
logical blocks.
- Full and partial logical blocks
A file or directory may contain full or partial logical blocks. A full logical block
contains 512, 1024, 2048, or 4096 bytes of data, depending on the file system
block size specified when the JFS2 file system was created. Partial logical
blocks occur when the last logical block of a file or directory contains less than
the file system block size of data.
For example, in a JFS2 file system with a logical block size of 4096 bytes, a file
of 8192 bytes occupies two logical blocks. The first 4096 bytes reside in the first
logical block and the following 4096 bytes reside in the second logical block.
Likewise, a file of 4608 bytes consists of two logical blocks. However, the last
logical block is a partial logical block containing the last 512 bytes of the file's
data. Only the last logical block of a file can be a partial logical block.
- JFS2 file space allocation
The default block size is 4096 bytes. You can specify smaller block sizes with
the mkfs command during a file system's creation. Allowable block sizes
are 512, 1024, 2048, and 4096 bytes. Only one block size can be used in a
file system.
The kernel allocates disk space so that only the last file system block of data
receives a partial block allocation. As the partial block grows beyond the limits
of its current allocation, additional blocks are allocated.
Block reallocation also occurs if data is added to logical blocks that represent
file holes. A file hole is an "empty" logical block located prior to the last logical
block that stores data. (File holes do not occur within directories.) These
empty logical blocks are not allocated blocks. However, as data is added to
file holes, allocation occurs. Each logical block that was not previously
allocated disk space is allocated a file system block of space.
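The block-counting examples in this section (the 8192-byte and 4608-byte files, and the two 512-byte writes) can be reproduced with a small Python model. It is a simplified sketch that assumes space is handed out one whole file system block at a time for data appended to an empty file; it ignores how JFS2 actually groups blocks on disk, and both function names are invented for the example.

```python
VALID_BLOCK_SIZES = (512, 1024, 2048, 4096)


def logical_blocks(file_size, block_size=4096):
    """Return (block_count, bytes_in_last_block) for a file of file_size bytes."""
    if block_size not in VALID_BLOCK_SIZES:
        raise ValueError("block size must be 512, 1024, 2048, or 4096")
    if file_size == 0:
        return 0, 0
    count = -(-file_size // block_size)            # ceiling division
    last = file_size - (count - 1) * block_size    # < block_size means partial
    return count, last


def allocations_for_appends(write_size, writes, block_size):
    """Count the appends that need at least one new block, starting from empty."""
    events, size = 0, 0
    for _ in range(writes):
        before = logical_blocks(size, block_size)[0]
        size += write_size
        after = logical_blocks(size, block_size)[0]
        if after > before:
            events += 1
    return events


print(logical_blocks(8192))                      # two full blocks
print(logical_blocks(4608))                      # second block is partial
print(allocations_for_appends(512, 2, 512))      # each write allocates
print(allocations_for_appends(512, 2, 4096))     # only the first write allocates
```

In this model, two 512-byte appends trigger two allocations with 512-byte blocks but only one with 4096-byte blocks, which is the increased allocation activity described above.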
