The RecoveryManager scans the ObjectStore and other information sources, looking for transactions and resources that require, or may require, recovery. The scans and the recovery processing are performed by recovery modules, which are instances of classes that implement the com.arjuna.ats.arjuna.recovery.RecoveryModule interface. Each module is responsible for a particular category of transaction or resource. The set of recovery modules is loaded dynamically, using properties found in the RecoveryManager property file.

The interface has two methods: periodicWorkFirstPass and periodicWorkSecondPass. At an interval defined by the property com.arjuna.ats.arjuna.recovery.periodicRecoveryPeriod, the RecoveryManager calls the first pass method on each module, then waits for a brief period defined by the property com.arjuna.ats.arjuna.recovery.recoveryBackoffPeriod. Next, it calls the second pass method on each module. Typically, in the first pass, the module scans the relevant part of the ObjectStore to find transactions or resources that are in-doubt. An in-doubt transaction may be part of the way through the commitment process, for instance. On the second pass, if any of the same items are still in-doubt, the original application process may have crashed, and the item is a candidate for recovery.

An attempt by the RecoveryManager to recover a transaction that is still progressing in the original process is likely to break consistency. Accordingly, the recovery modules use a mechanism, implemented by com.arjuna.ats.arjuna.recovery.TransactionStatusManager, to check whether the original process is still alive and whether the transaction is still in progress. The RecoveryManager only proceeds with recovery if the original process has gone or, if it is still alive, the transaction has completed.

If a server process or machine crashes, but the transaction-initiating process survives, the transaction completes, usually generating a warning.
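
The two-pass scan and the liveness rule described above can be sketched in plain Java. This is only an illustration under stated assumptions: RecoveryModuleSketch is a local stand-in that mirrors the two method names from the text, and InDoubtScanner, its in-memory "store" of transaction ids, and the mayRecover helper are hypothetical names that are not part of ArjunaCore.

```java
import java.util.HashSet;
import java.util.Set;

// Local stand-in mirroring the two methods named in the text. A real module
// would implement com.arjuna.ats.arjuna.recovery.RecoveryModule instead.
interface RecoveryModuleSketch {
    void periodicWorkFirstPass();
    void periodicWorkSecondPass();
}

// Hypothetical module: remembers what looked in-doubt on the first pass and
// treats only items that are still in-doubt on the second pass as candidates.
class InDoubtScanner implements RecoveryModuleSketch {
    private final Set<String> store;                  // stand-in for the ObjectStore
    private Set<String> seenOnFirstPass = new HashSet<>();
    private final Set<String> candidates = new HashSet<>();

    InDoubtScanner(Set<String> store) {
        this.store = store;
    }

    @Override
    public void periodicWorkFirstPass() {
        seenOnFirstPass = new HashSet<>(store);       // snapshot of in-doubt ids
        candidates.clear();
    }

    @Override
    public void periodicWorkSecondPass() {
        for (String id : seenOnFirstPass) {
            if (store.contains(id)) {                 // still in-doubt after the back-off
                candidates.add(id);
            }
        }
    }

    Set<String> candidates() {
        return candidates;
    }

    // Rule from the text: recover only if the original process has gone,
    // or it is still alive but the transaction is no longer in progress.
    static boolean mayRecover(boolean originalProcessAlive, boolean stillInProgress) {
        return !originalProcessAlive || !stillInProgress;
    }
}

class TwoPassSketch {
    public static void main(String[] args) {
        Set<String> store = new HashSet<>(Set.of("tx-1", "tx-2"));
        InDoubtScanner module = new InDoubtScanner(store);

        module.periodicWorkFirstPass();
        store.remove("tx-2");   // completes normally during the back-off period
        store.add("tx-3");      // appears after the first pass, so not yet a candidate
        module.periodicWorkSecondPass();

        System.out.println(module.candidates());
        System.out.println(InDoubtScanner.mayRecover(true, true));
    }
}
```

In this run only tx-1 is left as a candidate, and mayRecover(true, true) is false because the original process is alive and the transaction is still in progress.
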
Recovery of a transaction whose server process or machine crashed while the initiating process survived is the responsibility of the RecoveryManager.

It is important to set the interval periods appropriately. The total iteration time is the sum of the periodicRecoveryPeriod and recoveryBackoffPeriod properties, plus the time it takes to scan the stores and to attempt recovery of any in-doubt transactions found, for all the recovery modules. The recovery attempt time may include connection timeouts while trying to communicate with processes or machines that have crashed or are inaccessible. There are mechanisms in the recovery system to avoid trying to recover the same transaction indefinitely.

The total iteration time affects how long a resource remains inaccessible after a failure, so periodicRecoveryPeriod should be set accordingly. Its default is 120 seconds. The recoveryBackoffPeriod can be comparatively short, and defaults to 10 seconds. Its purpose is mainly to reduce the number of transactions that are candidates for recovery and which thus require a call to the original process to see whether they are still in progress.

Note: In previous versions of ArjunaCore, there was no contact mechanism, and the back-off period needed to be long enough to avoid catching transactions in flight at all. From version 3.0 onward, there is no such risk.
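
The iteration-time arithmetic above can be worked through with a small Java sketch. It assumes both properties are expressed in seconds, as the stated defaults suggest, and it uses an in-memory java.util.Properties object purely for illustration; it does not read the real RecoveryManager property file. RecoveryTimingSketch and the scanAndRecoverSeconds estimate are hypothetical.

```java
import java.util.Properties;

class RecoveryTimingSketch {
    // Property names as given in the text.
    static final String PERIOD_KEY =
            "com.arjuna.ats.arjuna.recovery.periodicRecoveryPeriod";
    static final String BACKOFF_KEY =
            "com.arjuna.ats.arjuna.recovery.recoveryBackoffPeriod";

    // Defaults as given in the text, in seconds.
    static final int DEFAULT_PERIOD_SECONDS = 120;
    static final int DEFAULT_BACKOFF_SECONDS = 10;

    // Reads an integer number of seconds, falling back to the default if unset.
    static int seconds(Properties props, String key, int defaultSeconds) {
        String raw = props.getProperty(key, Integer.toString(defaultSeconds));
        return Integer.parseInt(raw.trim());
    }

    // Estimated length of one full iteration: recovery period + back-off
    // + time spent scanning and attempting recovery (caller's own estimate).
    static int iterationSeconds(Properties props, int scanAndRecoverSeconds) {
        return seconds(props, PERIOD_KEY, DEFAULT_PERIOD_SECONDS)
                + seconds(props, BACKOFF_KEY, DEFAULT_BACKOFF_SECONDS)
                + scanAndRecoverSeconds;
    }

    public static void main(String[] args) {
        Properties props = new Properties();

        // With the defaults and an assumed 5 seconds of scan and recovery work.
        System.out.println(iterationSeconds(props, 5));   // 120 + 10 + 5 = 135

        // Halving the recovery period shortens the iteration accordingly.
        props.setProperty(PERIOD_KEY, "60");
        System.out.println(iterationSeconds(props, 5));   // 60 + 10 + 5 = 75
    }
}
```

The scan and recovery term is the unpredictable part: connection timeouts to crashed or unreachable machines can dominate it, so an estimate like this is better treated as a lower bound than as a guarantee.
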