Wednesday, June 11, 2008

Extending the maximum number of physical partitions

When adding a new disk to a volume group, you may encounter an error because
the disk would require more physical partitions (PPs) than the volume group's
current limit of PPs per physical volume allows. This may occur when the new
disk has a much higher capacity than the existing disks in the volume group.
This situation is typical on older installations, due to the rapid growth of storage
technology. To overcome this, a change of the volume group LVM metadata is
required.
The chvg command is used for this operation, with the -t flag specifying a
factor value, as shown in the following example:
# lsvg testvg
VOLUME GROUP:   testvg          VG IDENTIFIER:  000bc6fd5a177ed0
VG STATE:       active          PP SIZE:        16 megabyte(s)
VG PERMISSION:  read/write      TOTAL PPs:      542 (8672 megabytes)
MAX LVs:        256             FREE PPs:       42 (672 megabytes)
LVs:            1               USED PPs:       500 (8000 megabytes)
OPEN LVs:       0               QUORUM:         2
TOTAL PVs:      1               VG DESCRIPTORS: 2
STALE PVs:      0               STALE PPs:      0
ACTIVE PVs:     1               AUTO ON:        yes
MAX PPs per PV: 1016            MAX PVs:        32
# chvg -t 2 testvg
0516-1193 chvg: WARNING, once this operation is completed, volume group testvg
cannot be imported into AIX 430 or lower versions. Continue (y/n) ?
y
0516-1164 chvg: Volume group testvg changed. With given characteristics testvg
can include upto 16 physical volumes with 2032 physical partitions
each.
This example shows that the volume group testvg with a current 9.1 GB disk has
a maximum number of 1016 PPs per physical volume. Adding a larger 18.2 GB
disk would not be possible; the maximum size of the disk is limited to 17 GB
unless the maximum number of PPs is increased. Using the chvg command to
increase the maximum number of PPs by a factor of 2 to 2032 PPs allows the
volume group to be extended with physical volumes of up to approximately 34
GB.
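The -t factor trades the maximum number of PVs against the maximum number of
PPs per PV: with a factor of 2, the limit of 1016 PPs per PV doubles to 2032,
while the maximum number of PVs in this standard volume group drops from 32 to
16. The new limits can be confirmed by rerunning lsvg; the following is only a
sketch, and the values shown depend on the volume group:
# lsvg testvg | grep MAX
MAX LVs:        256             FREE PPs:       42 (672 megabytes)
MAX PPs per PV: 2032            MAX PVs:        16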
7.3 Disk replacement
AIX, like all operating systems, can be problematic when you have to change a
disk. AIX provides the ability to prepare the system for the change using the
LVM. You can then perform the disk replacement and afterwards use the LVM to
restore the system to how it was before the disk was changed. This process
handles not only the data on the disk itself, but also keeps the
Object Data Manager (ODM) intact.
The ODM within AIX is a database that holds device configuration details and
AIX configuration details. The function of the ODM is to store the information
between reboots, and also provide rapid access to system data, eliminating the
need for AIX commands to interrogate components for configuration information.
Since this database holds so much vital information regarding the configuration
of a machine, any changes made to the machine, such as the changing of a
defective disk, need to be done in such a way as to preserve the integrity of the
database.
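For example, the customized device entry that the ODM keeps for a disk can be
displayed with the odmget command. The following is only a sketch; hdisk4 and
the attribute values shown are illustrative and depend on the actual system:
# odmget -q "name=hdisk4" CuDv

CuDv:
        name = "hdisk4"
        status = 1
        chgstatus = 2
        ddins = "scdisk"
        location = "10-60-00-12,0"
        parent = "scsi1"
        connwhere = "12,0"
        PdDvLn = "disk/scsi/scsd"
Replacing a disk through the LVM commands and cfgmgr, rather than by editing
such entries directly, keeps this information consistent.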
7.3.1 Replacing a disk
The following scenario shows a system that has a hardware error on a physical
volume. However, since the system uses a mirrored environment, which has
multiple copies of the logical volume, it is possible to replace the disk while the
system is active. The disks in this scenario are hot-swappable SCSI disks,
which permit the replacement of a disk in a production environment.
One important factor is detecting the disk error. Normally, mail is sent to the
system administrator (root account) from the Automatic Error Log Analysis
(diagela). Figure 7-1 on page 145 shows the information in such a diagnostics
mail.
Figure 7-1 Disk problem mail from Automatic Error Log Analysis (diagela)
Automatic Error Log Analysis (diagela) provides the capability to do error log
analysis whenever a permanent hardware error is logged. Whenever a
permanent hardware resource error is logged, the diagela program is invoked.
Automatic Error Log Analysis is enabled by default on all platforms.
The diagela message shows that the hdisk4 has a problem. Another way of
locating a problem is to check the state of the logical volume using the lsvg
command, as in the following example:
# lsvg -l mirrorvg
mirrorvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvdb01 jfs 500 1000 2 open/syncd /u/db01
lvdb02 jfs 500 1000 2 open/stale /u/db02
loglv00 jfslog 1 1 1 open/syncd N/A
The logical volume lvdb02 in the volume group mirrorvg is marked with the status
stale, indicating that the copies in this LV are not synchronized. Look at the error
log using the error-reporting errpt command, as in the following example:
# errpt
EAA3D429 0713121400 U S LVDD PHYSICAL PARTITION MARKED STALE
F7DDA124 0713121400 U H LVDD PHYSICAL VOLUME DECLARED MISSING
41BF2110 0713121400 U H LVDD MIRROR WRITE CACHE WRITE FAILED
35BFC499 0713121400 P H hdisk4 DISK OPERATION ERROR
This error information displays the reason why the LV lvdb02 is marked stale:
hdisk4 had a DISK OPERATION ERROR, and the LVDD could not write the mirror
write cache.
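The full detail for a single entry, including probable causes and recommended
actions, can be displayed with the -a and -j flags of the errpt command; the
following sketch uses the error identifier of the disk error shown above:
# errpt -a -j 35BFC499 | more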
Based on the information in the example, hdisk4 needs to be replaced. Before
taking any action on the physical disks of the mirrored LV, it is recommended
that you take a file system backup in case anything should go wrong. Since the
other disk of the mirrored LV is still functional, all the data should be
present. If the LV contains a database, then the respective database tools
should be used to back up the data.
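For a JFS such as /u/db02, a full (level 0) backup by i-node to tape could be
taken as shown in the following sketch; /dev/rmt0 is an assumed tape device and
should be replaced with the actual backup target:
# backup -0 -u -f /dev/rmt0 /u/db02
Unmounting the file system first, or at least quiescing the application, gives
a consistent backup image.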
Removing a bad disk
If the system is a high-availability (24x7) system, you might decide to keep the
system running while performing the disk replacement, provided that the
hardware supports an online disk exchange with hot-swappable disks. However,
the procedure should be agreed upon by the system administrator or customer
before continuing. Use the following steps to remove a disk:
1. To remove the physical partition copy of the mirrored logical volume from the
erroneous disk, use the rmlvcopy command as follows:
# rmlvcopy lvdb02 1 hdisk4
The logical volume lvdb02 is now left with only one copy, as shown in the
following:
# lslv -l lvdb02
lvdb02:/u/db02
PV COPIES IN BAND DISTRIBUTION
hdisk3 500:000:000 21% 109:108:108:108:067
2. Reduce the volume group by removing the disk you want to replace from its
volume group:
# reducevg -f mirrorvg hdisk4
# lsvg -l mirrorvg
mirrorvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvdb01 jfs 500 1000 2 open/syncd /u/db01
lvdb02 jfs 500 500 1 open/syncd /u/db02
loglv00 jfslog 1 1 1 open/syncd N/A
3. Remove the disk as a device from the system and from the ODM database
with the rmdev command:
# rmdev -d -l hdisk4
hdisk4 deleted
This command is valid for any SCSI disk. If your system is using SSA, then an
additional step is required. Since SSA disks also define the device pdisk, the
corresponding pdisk device must be deleted as well. Use the SSA menus in
SMIT to display the mapping between hdisk and pdisk and to delete the pdisk
device; a command-line alternative is sketched after this list.
4. The disk can now be safely removed from your system.
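If a command-line check of the hdisk-to-pdisk mapping is preferred over the
SMIT menus mentioned in step 3, the ssaxlate command can be used. This is only
a sketch; it assumes the SSA utilities are installed, and the pdisk name
returned depends on the system:
# ssaxlate -l hdisk4
pdisk4
# rmdev -d -l pdisk4
pdisk4 deleted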
Adding a new disk
Continuing the scenario from the previous section, this section describes how to
add a new disk into a running environment. After hdisk4 has been removed, the
system is now left with the following disks:
# lsdev -Cc disk
hdisk0 Available 30-58-00-8,0 16 Bit SCSI Disk Drive
hdisk1 Available 30-58-00-9,0 16 Bit SCSI Disk Drive
hdisk2 Available 10-60-00-8,0 16 Bit SCSI Disk Drive
hdisk3 Available 10-60-00-9,0 16 Bit SCSI Disk Drive
Use the following steps to add a new disk:
1. Plug in the new disk and run the configuration manager cfgmgr command.
The cfgmgr command configures devices controlled by the Configuration
Rules object class, which is part of the device configuration database. The
cfgmgr command will see the newly inserted SCSI disk and create the
corresponding device. Although the command requires no option, the -v flag
specifies verbose output, which helps in troubleshooting, as shown in the
following:
# cfgmgr -v
cfgmgr is running in phase 2
----------------
Time: 0 LEDS: 0x538
Invoking top level program -- "/etc/methods/cfgprobe -c
/etc/drivers/coreprobe.ext"
Time: 0 LEDS: 0x539
Return code = 0
*** no stdout ****
*** no stderr ****
----------------
Time: 0 LEDS: 0x538
Invoking top level program -- "/etc/methods/defsys"
Time: 0 LEDS: 0x539
Return code = 0
***** stdout *****
sys0
.....
.....
The result is a new hdisk4 added to the system:
# lsdev -Cc disk
hdisk0 Available 30-58-00-8,0 16 Bit SCSI Disk Drive
hdisk1 Available 30-58-00-9,0 16 Bit SCSI Disk Drive
hdisk2 Available 10-60-00-8,0 16 Bit SCSI Disk Drive
hdisk3 Available 10-60-00-9,0 16 Bit SCSI Disk Drive
hdisk4 Available 10-60-00-12,0 16 Bit SCSI Disk Drive
2. The new hdisk must now be assigned to the volume group mirrorvg by using
the LVM extendvg command:
# extendvg mirrorvg hdisk4
3. To re-establish the mirror copy of the LV, use the mklvcopy command.
# mklvcopy lvdb02 2 hdisk4
The number of copies of the LV is now two, but the LV state is still marked as stale
because the LV copies are not synchronized with each other:
# lsvg -l mirrorvg
mirrorvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvdb01 jfs 500 1000 2 open/syncd /u/db01
lvdb02 jfs 500 1000 2 open/stale /u/db02
loglv00 jfslog 1 1 1 open/syncd N/A
4. To get a fully synchronized set of copies of the LV lvdb02, use the syncvg
command:
# syncvg -p hdisk4
The syncvg command can be used with logical volumes, physical volumes, or
volume groups; the three forms are sketched after this procedure. The
synchronization process can be quite time consuming, depending on the hardware
characteristics and the amount of data.
After the synchronization is finished, verify the logical volume state using
either the lsvg or lslv command:
# lsvg -l mirrorvg
mirrorvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvdb01 jfs 500 1000 2 open/syncd /u/db01
lvdb02 jfs 500 1000 2 open/syncd /u/db02
loglv00 jfslog 1 1 1 open/syncd N/A
The system is now back to normal.
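As noted in step 4, the syncvg command accepts a logical volume, a physical
volume, or a volume group as its target; the -l, -p, and -v flags select the
object type. The following sketch uses the names from this scenario:
# syncvg -l lvdb02
# syncvg -p hdisk4
# syncvg -v mirrorvg
In the scenario above, the physical volume form was used, which synchronizes
all stale partitions residing on hdisk4.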
7.3.2 Recovering an incorrectly removed disk
If a disk was incorrectly removed from the system, and the system has been
rebooted, the synclvodm command will need to be run to rebuild the logical
volume control block, as shown in the following examples.
In the examples, a disk has been incorrectly removed from the system and the
logical volume control block needs to be rebuilt.
The disks in the system before the physical volume was removed are shown in
the following command output:
# lsdev -Cc disk
hdisk0 Available 30-58-00-8,0 16 Bit SCSI Disk Drive
hdisk1 Available 30-58-00-9,0 16 Bit SCSI Disk Drive
hdisk2 Available 10-60-00-8,0 16 Bit SCSI Disk Drive
hdisk3 Available 10-60-00-9,0 16 Bit SCSI Disk Drive
The allocation of the physical volumes before the disk was removed is shown
as follows:
# lspv
hdisk0 000bc6fdc3dc07a7 rootvg
hdisk1 000bc6fdbff75ee2 volg01
hdisk2 000bc6fdbff92812 volg01
hdisk3 000bc6fdbff972f4 volg01
The logical volumes in the volume group:
# lsvg -l volg01
volg01:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
logvol01 jfs 1000 1000 2 open/syncd /userfs01
loglv00 jfslog 1 1 1 open/syncd N/A
The logical volume distribution on the physical volumes is shown using the lslv
command:
# lslv -l logvol01
logvol01:/userfs01
PV COPIES IN BAND DISTRIBUTION
hdisk1 542:000:000 19% 109:108:108:108:109
hdisk3 458:000:000 23% 109:108:108:108:025
The system after a reboot has the following physical volumes:
# lspv
hdisk0 000bc6fdc3dc07a7 rootvg
hdisk1 000bc6fdbff75ee2 volg01
hdisk3 000bc6fdbff972f4 volg01
When trying to mount the file system on the logical volume, the error may look
similar to the following example:
# mount /userfs01
mount: 0506-324 Cannot mount /dev/logvol01 on /userfs01: There is an input or
output error.
To synchronize the device configuration database (ODM) and the logical volume
control blocks with the volume group, run the following command:
# synclvodm -v volg01
synclvodm: Physical volume data updated.
synclvodm: Logical volume logvol01 updated.
synclvodm: Warning, lv control block of loglv00 has been over written.
0516-622 synclvodm: Warning, cannot write lv control block data.
synclvodm: Logical volume loglv00 updated.
The system can now be repaired. If the file system data was spread across all
the disks, including the failed disk, it may need to be restored from the last
backup.
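If a restore is needed, a file system archive created with the backup command
(as recommended earlier in this chapter) can be reloaded with the restore
command. The following is only a sketch; /dev/rmt0 is an assumed tape device,
and the target file system must be re-created and mounted before restoring:
# cd /userfs01
# restore -r -q -f /dev/rmt0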
7.4 The AIX JFS
Similar to the LVM, most JFS problems can be traced to problems with the
underlying physical disk.
As with volume groups, various JFS features have been added at different levels
of AIX, which prevent those file systems from being mounted if the volume group
is imported on an earlier version of AIX. Such features include large file
enabled file systems, file systems with a non-default allocation group size,
and JFS2.
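Whether a file system uses such features can be checked with the -q flag of
the lsfs command, which also queries the superblock. The following is only a
sketch; the file system name and the values reported are illustrative:
# lsfs -q /u/db01
Name        Nodename  Mount Pt  VFS  Size      Options  Auto  Accounting
/dev/lvdb01 --        /u/db01   jfs  16384000  rw       yes   no
  (lv size: 16384000, fs size: 16384000, frag size: 4096, nbpi: 4096,
   compress: no, bf: true, ag: 64)
A value of bf: true indicates a large file enabled file system, and ag shows
the allocation group size in megabytes.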
7.4.1 Creating a JFS
In a journaled file system (JFS), files are stored in blocks of contiguous bytes.
The default block size, also referred to as the fragment size in AIX, is 4096
bytes (4 KB). The JFS i-node contains an information structure for the file
with an array of eight pointers to data blocks, so a file smaller than
8 x 4 KB = 32 KB is referenced directly from the i-node.
A larger file uses a 4-KB block, referred to as an indirect block, for the addressing
of up to 1024 data blocks. Using an indirect block, a file size of 1024 x 4 KB = 4
MB is possible.
For files larger than 4 MB, a second block, the double indirect block, is used. The
double indirect block points to 512 indirect blocks, providing the possible
addressing of 512 x 1024 x 4 KB = 2 GB files. Figure 7-2 on page 151 illustrates
the addressing using double indirection.
