Wednesday, June 11, 2008

Configuring Volume Groups, Logical Volumes, and Filesystems as Resources

You define volume groups, logical volumes, and filesystems in AIX 5L and then configure
them as resources for HACMP. Plan and note them on the worksheets before configuring
them in HACMP. For more information, see Chapter 5 in the Installation Guide and
Chapter 11: Managing Shared LVM Components in this guide.
Server Name
    Enter an ASCII text string that identifies the server. You will use this name to
    refer to the application server when you define resources during node
    configuration. The server name can include alphabetic and numeric characters and
    underscores. Use no more than 64 characters.
Start Script
    Enter the pathname of the script (followed by arguments) called by the cluster
    event scripts to start the application server. (Maximum 256 characters.) This
    script must be in the same location on each cluster node that might start the
    server. The contents of the script, however, may differ.
Stop Script
    Enter the pathname of the script called by the cluster event scripts to stop the
    server. (Maximum 256 characters.) This script must be in the same location on each
    cluster node that may start the server. The contents of the script, however, may
    differ.
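The length and character limits in the field descriptions above can be checked before the values are entered in SMIT. The following is a minimal sketch, not part of HACMP: the function names are hypothetical, and it assumes that the 256-character limit applies to the whole entry (script path plus any arguments).

```sh
# Hypothetical helpers; the limits come from the field descriptions above.

# Succeeds if the name uses only letters, digits, and underscores
# and is 1 to 64 characters long.
valid_server_name() {
    name=$1
    [ -n "$name" ] || return 1
    [ "${#name}" -le 64 ] || return 1
    case $name in
        *[!A-Za-z0-9_]*) return 1 ;;   # contains a disallowed character
    esac
    return 0
}

# Succeeds if a start or stop script entry is 1 to 256 characters long.
# Assumption: the limit covers the path together with its arguments.
valid_script_entry() {
    entry=$1
    [ -n "$entry" ] && [ "${#entry}" -le 256 ]
}
```

For example, `valid_server_name app_server_1` succeeds, while `valid_server_name app-server` fails because of the hyphen.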
Configuring HACMP Cluster Topology and Resources (Extended)
Configuring HACMP Resources (Extended)
102 Administration Guide
4
Configuring Concurrent Volume Groups, Logical Volumes, and
Filesystems as Resources
Concurrent volume groups, logical volumes, and filesystems must be defined in AIX 5L and
then configured as resources for HACMP. Plan and note them on the worksheets before
configuring them in HACMP. For more information, see Chapter 5: Planning Shared LVM
Components in the Planning Guide and Chapter 12: Managing Shared LVM Components in a
Concurrent Access Environment in this guide.
Configuring Multiple Application Monitors
HACMP can monitor specified applications using application monitors. These application
monitors can:
• Check if an application is running before HACMP starts it.
• Watch for the successful startup of the application.
• Check that the application runs successfully after the stabilization interval has passed.
• Monitor both the startup and the long-running process.
• Automatically take action to restart applications upon detecting process termination or
other application failures.
In HACMP 5.2 and up, you can configure multiple application monitors and associate them
with one or more application servers.
By supporting multiple monitors per application, HACMP can support more complex
configurations. For example, you can configure one monitor for each instance of an Oracle
parallel server in use. Or, you can configure a custom monitor to check the health of the
database along with a process termination monitor to instantly detect termination of the
database process.
Note: If a monitored application is under control of the system resource
controller, ensure that the action and multi values are -O and -Q. The -O
value specifies that the subsystem is not restarted if it stops abnormally.
The -Q value specifies that multiple instances of the subsystem are not
allowed to run at the same time. These values can be checked using the
following command:
lssrc -Ss Subsystem | cut -d : -f 10,11
If the values are not -O and -Q, change them using the chssys command.
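The check in the note above can be wrapped in a small function. This is a sketch only: the function name and the subsystem name myapp are placeholders, and it simply looks for the pair -O:-Q in fields 10 and 11 of whatever lssrc -Ss prints.

```sh
# Reads the output of `lssrc -Ss <subsystem>` on stdin and succeeds only
# if some line has -O and -Q in colon-separated fields 10 and 11
# (the action and multi fields). The function name is hypothetical.
src_flags_ok() {
    cut -d : -f 10,11 | grep -q -x -e '-O:-Q'
}

# Example use on a cluster node ("myapp" is a placeholder subsystem name):
#   lssrc -Ss myapp | src_flags_ok || chssys -s myapp -O -Q
```

The commented example changes the subsystem definition with chssys only when the check fails.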
Process and Custom Monitoring
You can select either of two application monitoring methods:
• Process application monitoring detects the termination of one or more processes of an
application, using RSCT Resource Monitoring and Control.
• Custom application monitoring checks the health of an application with a custom monitor
method at user-specified polling intervals.
Process monitoring is easier to set up, as it uses the built-in monitoring capability provided by
RSCT and requires no custom scripts. However, process monitoring may not be an appropriate
option for all applications. Custom monitoring can monitor more subtle aspects of an
application’s performance and is more customizable, but it takes more planning, as you must
create the custom scripts.
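To give a sense of what a custom monitor method might look like, here is a minimal sketch. It assumes that the application writes its process ID to a PID file and that an exit status of zero means healthy; both the function name and the PID file path are hypothetical, and a real monitor would usually test something more application-specific.

```sh
# Hypothetical custom monitor check: exit status 0 means healthy,
# non-zero means a problem was detected.
app_is_healthy() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1      # no readable PID file: unhealthy
    pid=$(cat "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;       # empty or non-numeric contents
    esac
    # Succeeds only if the process exists and this user may signal it.
    kill -0 "$pid" 2>/dev/null
}

# A monitor script built on this might end with (placeholder path):
#   app_is_healthy /var/run/myapp.pid
#   exit $?
```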
Fallover and Notify Actions
In both process and custom monitoring methods, when the monitor detects a problem, HACMP
attempts to restart the application on the current node and continues the attempts until the
specified retry count has been exhausted.
When an application cannot be restarted within the retry count, HACMP takes one of two
actions, which you specified when configuring the application monitor:
• Choosing fallover causes the resource group containing the application to fall over to the
node with the next highest priority according to the nodelist. (See Note on the Fallover
Option and Resource Group Availability for more information.)
• Choosing notify causes HACMP to generate a server_down event, which informs the
cluster of the failure.
Monitor Modes
When you configure process monitor(s) and custom monitor(s) for the application server, you
can also specify the mode in which the application monitor is used:
• Startup Monitoring Mode. In this mode, the monitor checks the application server’s
successful startup within the specified stabilization interval and exits after the stabilization
period expires. The monitor in startup mode may run more than once, but it always runs
during the time specified by the stabilization interval value in SMIT. If the monitor
returns a zero code within the stabilization interval, the application has started
successfully. If the monitor returns a non-zero code within the stabilization interval,
this is interpreted as a failure of the application to start.
Use this mode for applications in parent resource groups. If you configure dependencies
between resource groups in the cluster, the applications in these resource groups are started
sequentially as well. To ensure that this process goes smoothly, we recommend configuring
several application monitors, and, especially, a monitor that checks the application startup
for the application that is included in the parent resource group. This ensures that the
application in the parent resource group starts successfully.
• Long-Running Mode. In this mode, the monitor periodically checks that the application is
running successfully. Checking begins after the stabilization interval expires, when the
application server is assumed to have started and the cluster to have stabilized. The
monitor in long-running mode runs repeatedly, at the monitoring interval that you specify
in SMIT.
Configure a monitor in this mode for any application server. For example, applications
included in child and parent resource groups can use this mode of monitoring.
• Both. In this mode, the monitor checks for the successful startup of the application server
and periodically checks that the application is running successfully.
Retry Count and Restart Interval
The restart behavior depends on two parameters, the retry count and the restart interval, that
you configure in SMIT.
• Retry count. The retry count specifies how many times HACMP should try restarting before
considering the application failed and taking subsequent fallover or notify action.
• Restart interval. The restart interval dictates the number of seconds that the restarted
application must remain stable before the retry count is reset to zero, thus completing the
monitor activity until the next failure occurs.
Note: Do not specify both of these parameters if you are creating an
application monitor that will be used only in startup monitoring
mode.
If the application successfully starts up before the retry count is exhausted, the restart interval
comes into play. By resetting the retry count, it prevents unnecessary fallover action that could
occur when applications fail several times over an extended time period. For example, a
monitored application with a retry count set to three (the default) could fail to restart twice,
and then successfully start and run cleanly for a week before failing again. This third failure
should be counted as a new failure with three new restart attempts before invoking the fallover
policy. The restart interval, set properly, would ensure the correct behavior: it would have reset
the count to zero when the application was successfully started and found in a stable state after
the earlier failure.
Be careful not to set the restart interval too short. If the interval is too short, the count
could be reset to zero too soon, before the immediate next failure, and the fallover or notify
activity will never occur.
See the instructions for setting the retry count and restart intervals later in this chapter for
additional details.
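The interaction between the retry count and the restart interval described above can be illustrated with a small simulation. This is a simplified model and not how HACMP is implemented: the function name is hypothetical, failure times are given in seconds in ascending order, and every restart is assumed to succeed at the moment of the failure.

```sh
# Simplified model of retry count and restart interval (not HACMP code).
# Usage: simulate_restarts RETRY_COUNT RESTART_INTERVAL FAILURE_TIME...
simulate_restarts() {
    retry_count=$1
    restart_interval=$2
    shift 2
    used=0            # restart attempts used since the last reset
    last_restart=""   # time of the most recent restart, if any
    for t in "$@"; do
        # The application stayed up for a full restart interval:
        # reset the count to zero before handling this failure.
        if [ -n "$last_restart" ] &&
           [ $((t - last_restart)) -ge "$restart_interval" ]; then
            used=0
        fi
        if [ "$used" -ge "$retry_count" ]; then
            echo "t=$t: retry count exhausted, take fallover or notify action"
            return 0
        fi
        used=$((used + 1))
        last_restart=$t
        echo "t=$t: restart attempt $used of $retry_count"
    done
    echo "no fallover or notify action"
}
```

In this model, `simulate_restarts 3 600 0 60 120 180` reaches the fallover or notify action at the fourth failure, while `simulate_restarts 3 600 0 60 604860` does not, because the week of stable running resets the count. With an interval shorter than the gap between failures, as in `simulate_restarts 3 30 0 60 120 180 240`, the count is reset every time and the action never occurs.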
Application Monitoring Prerequisites and Considerations
Keep the following in mind when planning and configuring application monitoring:
• Any application to be monitored must be defined to an application server in an existing
cluster resource group.
• If you have configured dependent resource groups, we recommend configuring multiple
monitors, both for applications in parent resource groups and for applications in child
resource groups. For example, a monitor for a parent resource group can monitor the
successful startup of the application, and a monitor for a child resource group can monitor
the process for an application. For more information, see Monitor Modes.
• Multiple monitors can be configured for the same application server. Each monitor can be
assigned a unique name in SMIT.
• The monitors that you configure must conform to existing configuration rules. For more
information, see Configuring a Process Application Monitor and Configuring a Custom
Application Monitor.
• We recommend that you configure all your application servers first, and then configure
the monitors and associate them with the servers. You can go back at any time and change
the association of monitors to servers.
• You can configure no more than 128 monitors per cluster. No limit exists on the number of
monitors per application server, as long as the total number of all monitors in the cluster
does not exceed 128.
• When multiple monitors are configured that use different fallover policies, each monitor
specifies a failure action of either “notify” or “fallover”. HACMP processes actions in the
order in which the monitors indicate an error. For example, if two monitors are configured
for an application server and one monitor uses the “notify” method and the other uses the
“fallover” method, the following occurs:
• If a monitor with “fallover” action indicates an error first, HACMP moves the
resource group to another node, and the remaining monitor(s) are shut down and
restarted on another node. HACMP takes no actions specified in any other monitor.
• If a monitor with “notify” action indicates an error first, HACMP runs the “notify”
method and shuts down that monitor, but any remaining monitors continue to
operate as before. You can manually restart the “notify” monitor on that node using
the Suspend/Resume Application Monitoring SMIT panel.
• If multiple monitors are used, HACMP does not use a particular order for monitor
startup or shutdown. All monitors for an application server are started at the same time. If
two monitors are configured with different fallover policies, and they fail at precisely the
same time, HACMP does not guarantee that it processes the methods specified for one monitor
before the methods for the other.
• The same monitor can be associated with multiple application servers using the
Application Monitor(s) field in the Change/Show an Application Server SMIT panel.
You can select a monitor from the picklist.
• If you remove an application monitor, HACMP removes it from the server definition for all
application servers that were using the monitor, and indicates which servers are no longer
using the monitor.
• If you remove an application server, HACMP removes that server from the definition of all
application monitors that were configured to monitor the application. HACMP also sends
a message about which monitor will no longer be used for the application. If you remove
the last application server in use for any particular monitor, that is, if the monitor will no
longer be used for any application, verification issues a warning that the monitor will no
longer be used.
Note on the Fallover Option and Resource Group Availability
Be aware that if you select the fallover option of application monitoring in the Customize
Resource Recovery SMIT panel—which could cause a resource group to migrate from its
original node—the possibility exists that while the highest priority node is up, the resource
group remains inactive. This situation occurs when an rg_move event moves a resource group
from its highest priority node to a lower priority node, and then you stop the cluster services on
the lower priority node with the option to take all the resources offline. Unless you bring the
resource group up manually, it remains in an inactive state.
