About this task
CAUTION: QoS and fault recovery should not be enabled at the same time.
Important: If QoS (re)starts a server that has a password on the server.id file, the server will not start until an administrator connects to the console on that server and enters the password. Therefore, if you want QoS to be capable of (re)starting Domino without intervention on a specific server, for example at inconvenient times when an administrator is not available for a manual password entry, do not use a password on the server.id file on that server.
QoS requires that the Domino server be run under the Java controller using the Java console.
The qosprobe add-in task can be configured with the following settings on the Domino server:
The probe interval, in minutes. Set this value in the server NOTES.INI file. The default is 1 minute.
The probe timeout, in minutes. Set this value in the dcontroller.ini file. The default is 5 minutes.
The server controller monitors a message queue to which the qosprobe add-in reports its probing results (SUCCESS, ERROR, or TIMEOUT). The messages are captured in the qoscntrlrtimestamp.out file in the server data directory. The following is an example of a SUCCESS message:
2010/01/07 07:42:56 QoS Probe: SUCCESS (88ms)
The following is an example of an error message:
2010/01/07 08:05:59 QoS Probe: ERROR: ProbeError=4803
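The two sample messages above can be read programmatically when reviewing the .out file. The following Python sketch is illustrative only: the pattern is inferred from those two sample lines, the names parse_probe_line and ProbeResult are hypothetical, and other message shapes (for example, a TIMEOUT line) are not shown in this topic and are therefore not matched.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Pattern inferred from the two sample lines in this topic only.
# It is an assumption, not a documented log format.
PROBE_LINE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) QoS Probe: "
    r"(?P<status>SUCCESS|ERROR)(?P<rest>.*)$"
)


class ProbeResult(NamedTuple):
    timestamp: datetime
    status: str                 # "SUCCESS" or "ERROR"
    elapsed_ms: Optional[int]   # present on the SUCCESS sample
    error_code: Optional[int]   # present on the ERROR sample


def parse_probe_line(line: str) -> Optional[ProbeResult]:
    """Return a ProbeResult, or None if the line is not a probe message."""
    match = PROBE_LINE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("ts"), "%Y/%m/%d %H:%M:%S")
    rest = match.group("rest")
    elapsed = re.search(r"\((\d+)ms\)", rest)
    error = re.search(r"ProbeError=(\d+)", rest)
    return ProbeResult(
        timestamp=timestamp,
        status=match.group("status"),
        elapsed_ms=int(elapsed.group(1)) if elapsed else None,
        error_code=int(error.group(1)) if error else None,
    )
```

Lines that do not match, such as the QoS Controller messages, return None, so the function can be applied to every line of the file and the None results filtered out.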
When QoS is enabled and a probe results in TIMEOUT, the controller smart kills the server and restarts it. A timeout can happen in either of the following cases:
2010/01/07 07:42:56 QoS Controller: The controller has received a probe timeout.
2010/01/07 07:42:56 QoS Controller: There are long running applications - probing will pause until they have completed.
If this condition is detected, the controller will then allow the lengthy ("long-running") operation more time to complete. If any lengthy operation fails to complete within that amount of time, the controller will then proceed with the smart kill/restart. You see a message like the one in the following example in the qoscntrlrtimestamp.out file:
2010/01/07 07:42:56 QoS Controller: Applications are not making progress.
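The timeout handling described above can be summarized as a small decision function. This Python sketch is a simplified model of the behavior described in this topic, not the controller's actual code; the names Action and controller_action and both boolean parameters are hypothetical. The topic describes controller action only for TIMEOUT, so treating every other result as "keep probing" is an assumption.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue probing"
    PAUSE = "pause probing and wait for long-running applications"
    RESTART = "smart kill and restart the server"


def controller_action(
    probe_result: str,
    long_running: bool = False,
    grace_expired: bool = False,
) -> Action:
    """Model the controller's reaction to one probe result.

    long_running:  long-running applications were detected.
    grace_expired: the extra time allowed for those applications has
                   passed without them completing.
    """
    # Assumption: only TIMEOUT leads to a kill/restart decision.
    if probe_result != "TIMEOUT":
        return Action.CONTINUE
    # Long-running applications get more time before any restart.
    if long_running and not grace_expired:
        return Action.PAUSE
    # Plain timeout, or applications that are not making progress.
    return Action.RESTART
```

For example, controller_action("TIMEOUT", long_running=True) models the "probing will pause" message, and the same call with grace_expired=True models the "Applications are not making progress" case.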
Important: For the following six NOTES.INI values, if you do not configure the value, or configure it as less than the default, the default value applies. You can only change the value to be greater than the default.
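The rule in the preceding note amounts to using the larger of the configured value and the default. A minimal Python sketch follows, assuming integer-valued settings; the function name effective_value is hypothetical and the numbers in the usage note are illustrative, not actual defaults.

```python
from typing import Optional


def effective_value(configured: Optional[int], default: int) -> int:
    """Return the value that applies for one setting.

    configured is None when the setting is absent from the INI file.
    A missing value, or one below the default, falls back to the default.
    """
    if configured is None or configured < default:
        return default
    return configured
```

With a hypothetical default of 5, a configured value of 3 yields 5, a missing value yields 5, and a configured value of 10 yields 10.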
Perform the following tasks:
2. Limiting QoS restarts
QoS provides the option to limit the number of QoS restarts during one interval. When the number of restarts reaches the limit, the QoS service is deactivated.
3. Pausing and resuming QoS
QoS provides a mechanism to pause or resume the QoS service at a specific time. Pausing QoS prevents the server from being killed during an operation that is expected to take a long time or that is critical to server operation; examples are backups or other maintenance operations. Temporarily disabling QoS allows these operations to complete without being misinterpreted by QoS as a server problem.
4. Running QoS with a no kill option
You can run QoS with a no kill option. When QoS detects server exceptions, it sends a single email notifying a specified administrator of the exception instead of killing and restarting the server directly. (You can also set QoS to send mail to an administrator whether or not you enable the no kill option.)
5. Running QoS with other configuration options
You can run QoS with several additional options to disable probing and manage timeouts.
6. Verifying that QoS is running
If you are not the administrator who enabled QoS, you can verify the correct setup.
Related tasks Setting up automatic diagnostic data collection on the server
Related reference Understanding Quality of Service (QoS) behavior and logging