Wednesday, June 11, 2008

LVM, file system, and disk problem determination

The following topics are discussed in this chapter:
• Logical Volume Manager (LVM) problems
• Replacement of physical volumes
• JFS problems and their solutions
• Paging space creation and removal, as well as recommendations about
  paging space
To understand the problems that can happen on an AIX system with volume
groups, logical volumes, and file systems, it is important to have a detailed
knowledge about how the storage is managed by the Logical Volume Manager.
This chapter does not cover the fundamentals of the LVM; they are considered
prerequisite knowledge required to understand the issues addressed in this
chapter.
IBM eServer Certification Study Guide - AIX 5L Problem Determination Tools and Techniques
7.1 LVM data
The data that the Logical Volume Manager (LVM) requires to operate is stored
in a number of structures. Their logical layout is described in the following
sections.
7.1.1 Physical volumes
Each disk is assigned a Physical Volume Identifier (PVID) when it is first
assigned to a volume group. The PVID is a combination of the serial number of
the machine creating the volume group and the time and date of the operation.
The PVID is stored on the physical disk itself and is also stored in the Object
Data Manager (ODM) of a machine when a volume group is created or imported.
You should not use the dd command to copy the contents of one physical volume
to another, since the PVID will also be copied; this will result in two disks having
the same PVID, which can confuse the system.
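The duplicate-PVID condition described above can be checked for once the disk names and their PVIDs have been collected. The following Python fragment is only a minimal sketch: the function name, the input mapping, and the sample PVID strings are hypothetical, and it assumes that disks without a PVID are reported with the value none.

```python
from collections import defaultdict


def find_duplicate_pvids(disk_pvids):
    """Return {pvid: [disk names]} for every PVID used by more than one disk.

    disk_pvids is a hypothetical mapping of disk name to PVID string.
    Disks reported with the value "none" (assumed to mean "no PVID yet")
    are ignored.
    """
    disks_by_pvid = defaultdict(list)
    for disk, pvid in disk_pvids.items():
        if pvid == "none":
            continue
        disks_by_pvid[pvid].append(disk)
    return {
        pvid: sorted(disks)
        for pvid, disks in disks_by_pvid.items()
        if len(disks) > 1
    }


# Made-up sample data: hdisk2 was cloned from hdisk1 with dd.
sample = {
    "hdisk0": "000bc6fdc3dc07a7",
    "hdisk1": "000bc6fdbff92812",
    "hdisk2": "000bc6fdbff92812",
    "hdisk3": "none",
}
duplicates = find_duplicate_pvids(sample)
```

Any entry in the result names a set of disks that the system cannot tell apart by PVID.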
7.1.2 Volume groups
Each volume group has a Volume Group Descriptor Area (VGDA). There are
(commonly) multiple copies of the VGDA in a volume group. A copy of the VGDA
is stored on each disk in the volume group. The VGDA stores information about
the volume group, such as the logical volumes and the disks in the volume
group.
The VGDA is parsed by the importvg command when importing a volume group
into a system. It is also used by the varyonvg command in the quorum voting
process to decide whether a volume group should be varied on.
For a single disk volume group, there are two VGDAs on the disk. When a
second disk is added to make a two disk volume group, the original disk retains
two VGDAs and the new disk gets one VGDA.
Adding a third disk results in the extra VGDA from the first disk moving to the
third disk for a quorum of three with each disk having one vote. Each further
disk added to the volume group receives one VGDA.
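The VGDA placement rule for one, two, and three or more disks can be summarized in a few lines of Python. This is a sketch of the rule as stated in this section, not of any LVM interface; the function name is hypothetical.

```python
def vgda_layout(num_disks):
    """Return the number of VGDAs on each disk, in the order the disks were added."""
    if num_disks < 1:
        raise ValueError("a volume group needs at least one disk")
    if num_disks == 1:
        return [2]        # single disk: two VGDAs on that disk
    if num_disks == 2:
        return [2, 1]     # original disk keeps two, new disk gets one
    return [1] * num_disks  # three or more disks: one VGDA per disk


one_disk = vgda_layout(1)
two_disks = vgda_layout(2)
three_disks = vgda_layout(3)
```

Note that in the two-disk case the votes are uneven: the first disk holds two of the three VGDAs.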
A volume group with quorum checking enabled (the default) must have at least
51 percent of the VGDAs in the volume group available before it can be varied
on. Once varied on, if the number of VGDAs falls below 51 percent, the volume
group will automatically be varied off.
In contrast, a volume group with quorum checking disabled must have 100
percent of the VGDAs available before it can be varied on. Once varied on, only
one VGDA needs to remain available to keep the volume group online.
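The two quorum rules can be compared side by side with a small Python sketch. It takes the 51 percent figure from the text literally and models only the counting; the function and parameter names are hypothetical and do not correspond to any LVM command or library call.

```python
def can_vary_on(available, total, quorum_enabled=True):
    """Decide whether a volume group may be varied on.

    available: number of VGDAs that can currently be read
    total:     number of VGDAs in the volume group
    """
    if quorum_enabled:
        # Quorum checking enabled: at least 51 percent must be available.
        return available * 100 >= 51 * total
    # Quorum checking disabled: every VGDA must be available.
    return available == total


def stays_online(available, total, quorum_enabled=True):
    """Decide whether an already varied-on volume group stays online."""
    if quorum_enabled:
        return available * 100 >= 51 * total
    # Quorum checking disabled: a single remaining VGDA is enough.
    return available >= 1


# Two-disk volume group: the first disk holds 2 VGDAs, the second holds 1.
lose_second_disk = can_vary_on(available=2, total=3)  # 2 of 3 remain
lose_first_disk = can_vary_on(available=1, total=3)   # 1 of 3 remains
```

Under these rules, losing the first disk of a two-disk volume group loses quorum, while losing the second disk does not.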
A volume group also has a Volume Group Identifier (VGID), a soft serial number
for the volume group similar to the PVID for disks.
Each disk in a volume group also has a Volume Group Status Area (VGSA), a
127-byte structure used to track mirroring information for up to the maximum
of 1016 physical partitions on the disk.
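The two numbers are related: 127 bytes hold 127 x 8 = 1016 bits, which is one bit per physical partition. The Python sketch below illustrates that arithmetic with a simple bitmap. The bit ordering and the meaning of a set bit are assumptions made for illustration only, and the helper names are hypothetical.

```python
VGSA_BYTES = 127
MAX_PPS = VGSA_BYTES * 8  # 1016 bits, one per physical partition


def new_status_area():
    """Return an all-clear bitmap of 127 bytes."""
    return bytearray(VGSA_BYTES)


def _check(pp_index):
    if not 0 <= pp_index < MAX_PPS:
        raise IndexError(f"partition index must be in 0..{MAX_PPS - 1}")


def mark(bitmap, pp_index):
    """Set the bit for one physical partition (assumed layout: low bit first)."""
    _check(pp_index)
    bitmap[pp_index // 8] |= 1 << (pp_index % 8)


def is_marked(bitmap, pp_index):
    """Report whether the bit for one physical partition is set."""
    _check(pp_index)
    return bool(bitmap[pp_index // 8] & (1 << (pp_index % 8)))


area = new_status_area()
mark(area, 0)
mark(area, 1015)  # the last partition that fits in 127 bytes
```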
7.1.3 Logical volumes
Each logical volume has a Logical Volume Control Block (LVCB) that is stored in
the first 512 bytes of the logical volume. The LVCB holds important details about
the logical volume, including its creation time, mirroring information, and mount
point (if it contains a journaled file system [JFS]).
Each logical volume has a Logical Volume Identifier (LVID) that is used to
represent the logical volume to the LVM libraries and low-level commands. The
LVID is made up of VGID.num, where num is the order in which it was created in
the volume group.
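The VGID.num structure can be illustrated with a short Python sketch that builds and splits an identifier of this form. The helper names and the sample VGID value are hypothetical; only the VGID.num layout comes from the text.

```python
def make_lvid(vgid, num):
    """Build an LVID of the form VGID.num."""
    if num < 1:
        raise ValueError("num is a creation order and starts at 1 in this sketch")
    return f"{vgid}.{num}"


def split_lvid(lvid):
    """Split an LVID into its (VGID, num) parts."""
    vgid, dot, num = lvid.rpartition(".")
    if not dot or not vgid or not num.isdigit():
        raise ValueError(f"not of the form VGID.num: {lvid!r}")
    return vgid, int(num)


sample_vgid = "000bc6fd00004c00"  # made-up value for illustration
lvid = make_lvid(sample_vgid, 3)  # third logical volume created in the group
parts = split_lvid(lvid)
```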
7.1.4 Object Data Manager (ODM)
The Object Data Manager is used by the LVM to store information about the
volume groups, physical volumes, and logical volumes on the system. The
information held in the ODM is placed there when the volume group is imported
or when each object in the volume group is created.
There exists an ODM object known as the vg-lock. Whenever an LVM
modification command is started, the LVM command will lock the vg-lock for the
volume group being modified. If for some reason the lock is inadvertently left
behind, the volume group can be unlocked by running the varyonvg -b
command, which can be run on a volume group that is already varied on.
7.2 LVM problem determination
The most common LVM problems are related to disk failures. Depending on the
extent of the failure, you may be able to recover the situation with little or no data
loss. However, a failed recovery attempt may leave the system in a worse
condition, in which case restoring from backup is the only way to recover.
Therefore, always take frequent backups of your system.
7.2.1 Data relocation
When a problem occurs with a disk drive, data relocation may take place. There
are three types of data relocation, namely:
• Internal to the disk
• Hardware relocation ordered by LVM
• Software relocation
Relocation typically occurs when the system fails to perform a read or write due
to physical problems with the disk platter. In some cases, the data I/O request
completes but with warnings. Depending on the type of recovered error, the LVM
may be wary of the success of the next request to that physical location, and it
orders a relocation to be on the safe side.
The lowest logical layer of relocation is the one that is internal to the disk. These
types of relocations are typically private to the disk and there is no notification to
the user that a relocation occurred.
The next level up in terms of relocation complexity is a hardware relocation
called for by the LVM device driver. This type of relocation will instruct the disk to
relocate the data on one physical partition to another (reserved) portion of the
disk. The disk takes the data in physical location A and copies it to a reserved
portion of the disk (location B). However, after this is complete, the LVM device
driver will continue to reference physical location A, with the understanding that
the disk itself will handle the true I/O to the real location B.
The top layer of data relocation is the soft relocation handled by the LVM device
driver. In this case, the LVM device driver maintains a bad block directory.
Whenever it receives a request to access logical location A, the LVM device
driver looks up the bad block directory and translates the request so that it is
actually sent to the disk drive at physical location B.
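The lookup performed in a soft relocation can be modelled as a simple table from bad locations to their replacements. The following Python sketch shows only that translation step; the class and method names are hypothetical and the block numbers are made up. It is not a description of the real device driver.

```python
class BadBlockDirectory:
    """Toy model of a bad block directory: bad location -> replacement."""

    def __init__(self):
        self._relocated = {}

    def relocate(self, bad_location, new_location):
        """Record that requests for bad_location must go to new_location."""
        if bad_location == new_location:
            raise ValueError("replacement must differ from the bad location")
        self._relocated[bad_location] = new_location

    def resolve(self, location):
        """Translate a requested location to the one actually sent to the disk."""
        return self._relocated.get(location, location)


directory = BadBlockDirectory()
directory.relocate(bad_location=1200, new_location=98000)  # A -> B

healthy = directory.resolve(500)     # not in the directory: used unchanged
redirected = directory.resolve(1200)  # in the directory: sent to B instead
```

The contrast with hardware relocation is where the translation happens: here the driver rewrites the request, whereas in a hardware relocation the driver keeps addressing location A and the disk redirects the I/O itself.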
7.2.2 Backup data
The first step you should perform if you suspect a problem with LVM is to make a
backup of the affected volume group and save as much data as possible. This
may be required for data recovery. The integrity of the backup should be
compared with the last regular backup taken before the problem was detected.
7.2.3 ODM resynchronization
Problems with the LVM tend to occur when a physical disk problem causes the
ODM data to become out of sync with the VGDA, VGSA, and LVCB information
stored on disk.
