Kalman Fit Convergence

Gerry Lynch

February 9, 2004


Introduction

The Kalman track fits that we rely on in BaBar track reconstruction sometimes need to iterate to get a satisfactory fit. For the conditions that we have been using, iteration is begun for about 20% of the fits. The decision for doing another step in the iteration is made by checking to see if the difference between the last fit and the preceding one satisfies some tests.

1. A distance test that requires that the maximum separation of the trajectories be less than some tolerance. This tolerance has been set at 2 mm. This seems to me to be a huge value for the tolerance, but I was not able to improve the convergence by reducing it. On rare occasions this test failed when it should not have, because the test was sometimes done at a place where the trajectory is discontinuous. This was fixed by making the distance tests only at places that are not material sites.

2. A momentum test that requires that the momentum change be small. This tolerance has been 0.5 MeV/c. Again I found no benefit from changing this tolerance.

3. A pair of parameter tests that check the chi-squared for the comparison of the five fitted track parameters, separately at the first hit and at the last hit. The errors used for this chi-squared are the fitted track parameter errors at the hit. I regard this as the best test of the set. Up to now it has not been used because it seemed to be unreliable.
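The three tests above can be sketched as follows. This is only an illustration, not the KalRep code: the names FitComparison, param_chisq and has_converged are hypothetical, the chi-squared uses only the diagonal parameter errors (correlations are ignored for brevity), and the chi-squared tolerance of one is the value suggested later in this note rather than a production setting.

```python
from dataclasses import dataclass
from typing import Sequence

# Tolerances quoted in the text. The chi-squared tolerance is the value
# proposed in this note, not one that has been used in production.
DISTANCE_TOL_MM = 2.0
MOMENTUM_TOL_MEV = 0.5
PARAM_CHISQ_TOL = 1.0


def param_chisq(prev: Sequence[float], curr: Sequence[float],
                sigma: Sequence[float]) -> float:
    """Chi-squared of the change in the fitted track parameters.

    Assumption: only the diagonal errors are used, so correlations
    between the five parameters are ignored in this sketch.
    """
    return sum(((c - p) / s) ** 2 for p, c, s in zip(prev, curr, sigma))


@dataclass
class FitComparison:
    """Hypothetical summary of the change between two successive fits."""
    max_separation_mm: float    # largest trajectory separation, away from material sites
    momentum_change_mev: float  # change in momentum between the two fits
    chisq_first_hit: float      # parameter chi-squared at the first hit
    chisq_last_hit: float       # parameter chi-squared at the last hit


def has_converged(delta: FitComparison, use_param_test: bool = True) -> bool:
    """Apply the distance, momentum and (optionally) parameter tests."""
    if delta.max_separation_mm >= DISTANCE_TOL_MM:
        return False
    if abs(delta.momentum_change_mev) >= MOMENTUM_TOL_MEV:
        return False
    if use_param_test:
        worst = max(delta.chisq_first_hit, delta.chisq_last_hit)
        if worst >= PARAM_CHISQ_TOL:
            return False
    return True
```

With the parameter test switched off, a step that passes the distance and momentum tests is accepted even when its parameter chi-squared is large, which is the situation described in the rest of this note.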

At present, when a fit does not converge, the track is not failed; it is labeled as a successful fit.

Improvement in the convergence process is wanted because we notice that different Kalman fits (with starting parameters that are a little different) often result in noticeable differences in the fitted track parameters.

Track History Problems

When I started this investigation, I used the information in the track history as an indicator of track convergence. This procedure did not work very well for two reasons.

When TrkDchRadHitAdder fits an Svt track and then decides not to put any Dch hits on the track, this fit is put in the Default track list, but no record of this is put into the history. So the convergence failures are not recorded for many Svt-only tracks.

The test on convergence in KalRep that is done to decide what to put in the track status and history is done too late, after the track has been extended through the material between the track and the origin. It is an invalid test, especially for low momentum tracks. The consequence of this has been that many track fits that passed the convergence tests have been labeled as non-converging. Another consequence is that it misled us about the effectiveness of the convergence tests. This flaw in the code is easily fixed.

In my investigation I have put debug statements in KalRep that trace the convergence progress and have not relied on the recorded history information.

Details of Convergence Observations

This study used 1000 Elf events (of which 371 passed the Elf filter). In a typical test about 480 potential fits failed before the iterator because the track lost all of its energy before the last hit. About 5800 fit attempts entered the iterator. Ten to 20 of these (depending on how many iterations were done) were failed because the iterator diverged. Under present conditions about 200 of these fit attempts do not converge. For a typical track that has a fit that does not converge, the electron, muon, and pion fits converge, the kaon fit does not converge and the proton fit fails. For most tracks that have a non-converging fit, only one mass assignment does not converge.

Using the Parameter Chi-squared Test

As far as I can see, the parameter chi-squareds are reliable numbers. They show that the convergence criteria that we have been using are quite weak. In our production running, the maximum number of iterations allowed is three. At the last iteration the maximum of the parameter chi-squareds has a median value of about 23; only about 16% of the time is it less than one, and about 13% of the time it is greater than 100. It seems to me that we need to use these parameter chi-squared tolerances, and my intuition says that the tolerance should be at about one.

If we use the parameter chi-squareds with a reasonable tolerance, we need to allow more steps in the iterator. Actually I have been testing with 40 iterations allowed. Such a choice is not absurd. In one test, I found only a 5% decrease in Kalman fit time when I changed from 40 to 10 for the maximum number of iterations.

Step Size Adjustment

When an attempted fit does not converge, it often gets into a mode in which every other step or every third step is the same. One can try to avoid this by changing the rules of iteration. I have tried one such change. The Kalman fit iterates by changing the reference momentum for the track and doing the fit again. The first change that I made was that, if the fit has not converged in three steps, the step size after the third iteration is multiplied by 0.5*(1+exp(-0.1*(n-3))), where n is the iteration number. This factor reduces the step size from one to close to one-half as the iteration proceeds. This reduced the number that did not converge in 40 iterations by a factor of two. Perhaps someone can come up with a better method of speeding up the iterations. It's worth trying. I'm working on one that shows promise, but so far does not look better.
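The size of this step factor at a few iterations can be checked with a short sketch. The function names are hypothetical, n is taken to be the iteration number, and the way the factor is applied to the reference momentum in damped_update is an assumption about what "step size" means here, not a description of the actual KalRep code.

```python
import math


def step_scale(n: int, start: int = 3, rate: float = 0.1) -> float:
    """Factor applied to the step size at iteration n.

    Full steps are taken for the first `start` iterations; after that
    the factor falls smoothly from one toward one-half.
    """
    if n <= start:
        return 1.0
    return 0.5 * (1.0 + math.exp(-rate * (n - start)))


def damped_update(p_ref: float, p_fit: float, n: int) -> float:
    """Move the reference momentum only part of the way to the new fit.

    Assumption: the step is the difference between the newly fitted
    momentum and the current reference momentum.
    """
    return p_ref + step_scale(n) * (p_fit - p_ref)


if __name__ == "__main__":
    for n in (3, 5, 10, 20, 40):
        print(n, round(step_scale(n), 3))
```

A fit that alternates between two values with full steps no longer returns to exactly the same point once the factor drops below one, which is the motivation for trying this kind of damping.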

Testing with Different Events

It is unreliable to test a method with the same sample that was used to set the parameters used. The decision not to change the distance tolerance and the momentum tolerance, and the setting of the exponential coefficient in the step size modification, were all based on their effects on the convergence and the program speed. So I tested the performance of the revised program with a 2000-event sample from the same run. My change in the stepping method was also effective in this set of events: the number of non-converging tracks was cut from 70 to 43 by this change, a reduction of about 40%, compared with the factor of two seen in the first sample.

Many things can be tuned to control the performance of the iterator, but the most important one is the tolerance on the parameter chi-squareds. The figures in http://costard.lbl.gov/~grl/Kalman/parchisq.ps show the running time (TrkMassFitter+TrkSvtHitAdder+TrkDchRadHitAdder+TrackMerge+DchKalFinalFit+SvtKalFinalFit) on noric and also the convergence rate as functions of the parameter chi-squared tolerance, keeping the two tolerances the same. The abscissa on these figures is a convenient arbitrary function of the tolerance value [tolerance/(3+tolerance)]. We see that the addition of a chi-squared tolerance of one increases the running time by about 30%. We also see that as the tolerance is made smaller, fewer fits are found to converge. Some of these that do not converge will converge if more iterations are allowed, but I think that many of them are tracks that deserve to be failed.
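For reading the figures, the mapping between tolerance and abscissa can be written out as a small sketch. The function names are hypothetical; the inverse follows from solving x = tolerance/(3+tolerance) for the tolerance.

```python
def tolerance_to_abscissa(tol: float) -> float:
    """Map a chi-squared tolerance onto the plot abscissa, in [0, 1)."""
    return tol / (3.0 + tol)


def abscissa_to_tolerance(x: float) -> float:
    """Inverse mapping: recover the tolerance from an abscissa x < 1."""
    return 3.0 * x / (1.0 - x)


if __name__ == "__main__":
    for tol in (0.1, 1.0, 3.0, 10.0, 100.0):
        print(tol, round(tolerance_to_abscissa(tol), 3))
```

A tolerance of one sits at 0.25 on this scale and a tolerance of three at 0.5, while very loose tolerances pile up near one.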

Summary of Proposed Changes

The proposed changes to the Kalman fits are:

1. Fix the logic in KalRep so that the reporting of non-convergence accurately reports what is done.

2. Change the distance test so that the testing is not done where there are discontinuities in the trajectory.

3. Activate the parameter chi-squared tests, perhaps with a tolerance of one.

4. Greatly increase the number of iterations allowed.

5. Add a convergence speed-up method.

6. Call a fit that does not converge a failure.

Yet to do

The main thing that is lacking in this study is some measure of how accurate the procedure is. We intend soon to look at Monte Carlo events and/or compare reco and mini Kalman fits to get a better accuracy measurement.